Gnu-morello October 2022

gnu-morello@op-lists.linaro.org

3 participants
4 discussions

binary release of the GNU Toolchain for Morello (version: 10.1.Morello-Alp1

by Vadim Zaliva

I am very excited about the binary release of gcc-toolchain, but I ran into problems trying to compile something as simple as:

int main() { return 0; }

$ ~/morello-gnu/bin/aarch64-none-elf-gcc -march=morello -O0 -nostdlib hello.c
/home/lord/morello-gnu/bin/../lib/gcc/aarch64-none-elf/10.1.0/../../../../aarch64-none-elf/bin/ld: warning: cannot find entry symbol _start; defaulting to 0000000000400000

$ ~/morello-gnu/bin/aarch64-none-elf-gcc -march=morello -O0 hello.c
/home/lord/morello-gnu/bin/../lib/gcc/aarch64-none-elf/10.1.0/../../../../aarch64-none-elf/bin/ld: /home/lord/morello-gnu/bin/../lib/gcc/aarch64-none-elf/10.1.0/../../../../aarch64-none-elf/lib/hybridcap/libc.a(lib_a-exit.o): in function `exit':
/data/jenkins/workspace/GNU-toolchain/morello-trunk/src/newlib-cygwin/newlib/libc/stdlib/exit.c:70: undefined reference to `_exit'
/home/lord/morello-gnu/bin/../lib/gcc/aarch64-none-elf/10.1.0/../../../../aarch64-none-elf/bin/ld: /home/lord/morello-gnu/bin/../lib/gcc/aarch64-none-elf/10.1.0/../../../../aarch64-none-elf/lib/hybridcap/crt0.o: in function `_start':
/data/jenkins/workspace/GNU-toolchain/morello-trunk/src/newlib-cygwin/libgloss/aarch64/crt0.S:360: undefined reference to `__init_global_caps'
/home/lord/morello-gnu/bin/../lib/gcc/aarch64-none-elf/10.1.0/../../../../aarch64-none-elf/bin/ld: /data/jenkins/workspace/GNU-toolchain/morello-trunk/src/newlib-cygwin/libgloss/aarch64/crt0.S:361: undefined reference to `__processRelocs'
/home/lord/morello-gnu/bin/../lib/gcc/aarch64-none-elf/10.1.0/../../../../aarch64-none-elf/bin/ld: /data/jenkins/workspace/GNU-toolchain/morello-trunk/src/newlib-cygwin/libgloss/aarch64/crt0.S:365: undefined reference to `initialise_monitor_handles'
collect2: error: ld returned 1 exit status

Any suggestions on what I am doing wrong?

Also, I think it mentions that pure and hybrid capability modes are supported, but I could not find an option to specify which one to use.

Thanks!
Vadim

--
Senior Research Associate
Department of Computer Science and Technology
University of Cambridge
http://zaliva.org/

3 years, 1 month

morello-binutils: Only check for valid Morello bounds on non-exec syms

by Matthew Malcomson

Capabilities pointing to symbols in SEC_CODE sections are given the bounds of the entire PCC. We ensure that the PCC bounds are padded and aligned as needed in the linker.

Capabilities pointing to other symbols (e.g. in data sections) are given the bounds of the symbol that they point to. It is the responsibility of the assembly generator (i.e. usually the compiler) to ensure these bounds are correctly aligned and padded as necessary.

We emit a warning for imprecise bounds in the second case. Until this patch, that warning also looked at the first case. This was a mistake and is rectified in this commit.

############### Attachment also inlined for ease of reply ###############

diff --git a/bfd/elfnn-aarch64.c b/bfd/elfnn-aarch64.c
index 3e0562b72bc69a9c43f01c80c604b85c582f4f7b..da3c629fcdbfbceed633d9af7c5751dd358492fc 100644
--- a/bfd/elfnn-aarch64.c
+++ b/bfd/elfnn-aarch64.c
@@ -6965,17 +6965,6 @@ c64_fixup_frag (bfd *input_bfd, struct bfd_link_info *info,
   bfd_vma base = value, limit = value + size;
   unsigned align = 0;
 
-  if (!bounds_ok && !c64_valid_cap_range (&base, &limit, &align))
-    {
-      /* Just warn about this.  It's not a requirement that bounds on
-         objects should be precise, so there's no reason to error out on
-         such an object.  */
-      /* xgettext:c-format */
-      _bfd_error_handler
-        (_("%pB: capability range for '%s' may exceed object bounds"),
-         input_bfd, sym_name);
-    }
-
   if (perm_sec && perm_sec->flags & SEC_CODE)
     {
       /* Any symbol pointing into an executable section gets bounds according
@@ -6994,6 +6983,16 @@ c64_fixup_frag (bfd *input_bfd, struct bfd_link_info *info,
          data or jump to other functions.  */
       size = pcc_high - pcc_low;
     }
+  else if (!bounds_ok && !c64_valid_cap_range (&base, &limit, &align))
+    {
+      /* Just warn about this.  It's not a requirement that bounds on
+         objects should be precise, so there's no reason to error out on
+         such an object.  */
+      /* xgettext:c-format */
+      _bfd_error_handler
+        (_("%pB: capability range for '%s' may exceed object bounds"),
+         input_bfd, sym_name);
+    }
 
   if (perm_sec != NULL)
     {
diff --git a/ld/testsuite/ld-aarch64/aarch64-elf.exp b/ld/testsuite/ld-aarch64/aarch64-elf.exp
index 4200c1d17e97ac9b54f2a0a051f15f0d51d47c06..a07a6c50bb8b601761cf38607afbc0a4dcc9ef0c 100644
--- a/ld/testsuite/ld-aarch64/aarch64-elf.exp
+++ b/ld/testsuite/ld-aarch64/aarch64-elf.exp
@@ -390,6 +390,8 @@ run_dump_test_lp64 "morello-illegal-tls"
 run_dump_test_lp64 "morello-illegal-tls-pie"
 run_dump_test_lp64 "morello-illegal-tls-shared"
 
+run_dump_test_lp64 "morello-large-function"
+
 run_dump_test "no-morello-syms-static"
 
 run_dump_test "reloc-overflow-bad"
diff --git a/ld/testsuite/ld-aarch64/morello-large-function.d b/ld/testsuite/ld-aarch64/morello-large-function.d
new file mode 100644
index 0000000000000000000000000000000000000000..dc0af92e12f9ab7203a400997ded14c8bdf4172c
--- /dev/null
+++ b/ld/testsuite/ld-aarch64/morello-large-function.d
@@ -0,0 +1,23 @@
+# Mainly here to check that this actually links.
+# This testcase used to complain that the function capability may have
+# imprecise bounds, but since such capabilities are given PCC bounds that error
+# was invalid.
+#
+# Even though the only point is to check that the testcase links, we still
+# ensure that the dump of the .data section contains a relocation with the
+# correct permissions.
+#as: -march=morello+c64
+#ld: -pie -static
+#objdump: -DR -j .data
+
+.*: file format .*
+
+
+Disassembly of section \.data:
+
+[0-9a-f]+ <__data_start>:
+ *[0-9a-f]+: .*
+ .*: R_MORELLO_RELATIVE \*ABS\*\+.*
+ *[0-9a-f]+: .* udf #0
+ *[0-9a-f]+: .*
+ *[0-9a-f]+: 04000000 .*
diff --git a/ld/testsuite/ld-aarch64/morello-large-function.s b/ld/testsuite/ld-aarch64/morello-large-function.s
new file mode 100644
index 0000000000000000000000000000000000000000..46cb51e0c68cba982e8bf2b83b696399a6225f21
--- /dev/null
+++ b/ld/testsuite/ld-aarch64/morello-large-function.s
@@ -0,0 +1,9 @@
+.data
+ .chericap _start
+.text
+ .globl _start
+ .type _start,@function
+_start:
+ ret
+ .zero 0x8000
+ .size _start, .-_start
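The control flow after this patch can be sketched in a few lines of C. This is a simplified illustration and not the code in bfd/elfnn-aarch64.c: `struct toy_sym`, `toy_valid_cap_range` and `toy_fixup_bounds` are hypothetical names, and the representability rule is a made-up stand-in (ranges larger than 4 KiB must be 16-byte aligned at both ends), not the real Morello bounds encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the symbol information the linker has.  */
struct toy_sym
{
  uint64_t value;        /* Address of the symbol.  */
  uint64_t size;         /* Size of the symbol in bytes.  */
  bool in_code_section;  /* Models a symbol in a SEC_CODE section.  */
};

/* Made-up representability rule, NOT the Morello encoding: small ranges
   are always exact, larger ones need a 16-byte aligned base and limit.  */
static bool
toy_valid_cap_range (uint64_t base, uint64_t limit)
{
  assert (limit >= base);
  if (limit - base <= 4096)
    return true;
  return (base % 16 == 0) && (limit % 16 == 0);
}

/* Compute the bounds for a capability to SYM.  Returns true when the
   "may exceed object bounds" warning would be emitted.  */
static bool
toy_fixup_bounds (const struct toy_sym *sym, uint64_t pcc_low,
                  uint64_t pcc_high, uint64_t *base, uint64_t *size)
{
  if (sym->in_code_section)
    {
      /* Code symbols get the bounds of the whole PCC.  The linker pads
         and aligns that span itself, so no warning is needed here.  */
      *base = pcc_low;
      *size = pcc_high - pcc_low;
      return false;
    }

  /* Other symbols keep their own bounds; warn if they are imprecise.  */
  *base = sym->value;
  *size = sym->size;
  return !toy_valid_cap_range (sym->value, sym->value + sym->size);
}
```

Under this toy model, a large function such as the `_start` in morello-large-function.s produces no warning, while an equally large, misaligned data object still does.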

3 years, 5 months

[To Be Committed] Various fixes for capability IFUNCs

by Matthew Malcomson

1) Enable having a CAPINIT relocation against an IFUNC.

   We update the `final_link_relocate` switch case around IFUNC's to also handle CAPINIT relocations. The handling of CAPINIT relocations is slightly different than for AARCH64_NN (i.e. ABS64) relocations since we generally need to emit a dynamic relocation.

   Handling this relocation also needs to manage the PDE case when a hard-coded address has been put into code to satisfy something like an `adrp`. In these cases the canonical address of the IFUNC becomes its PLT stub rather than the result of the resolver. We then need to use a RELATIVE relocation rather than an IRELATIVE one.

   N.b. unlike the ABS64 relocation, since a CAPINIT will always emit a dynamic relocation we do not require pointer equality adjustments on a symbol from having seen a CAPINIT. That means we do not need to request that the PLT stub of an IFUNC is treated as the canonical address just from having seen a CAPINIT relocation.

   A CAPINIT relocation against an IFUNC needs to be recorded internally so that _bfd_elf_allocate_ifunc_dyn_relocs does not garbage collect the PLT stub and associated IRELATIVE relocation.

   See changes in the CAPINIT case of the IFUNC switch of elfNN_aarch64_final_link_relocate, and in the CAPINIT case of elfNN_aarch64_check_relocs.

2) Ensure that GOT relocations against an IFUNC have their fragment populated with the LSB set.

   For GOT relocations against a capability IFUNC we need to introduce a relocation for the runtime to provide us with a valid capability.

   See changes in the GOT cases of the IFUNC switch of elfNN_aarch64_final_link_relocate, changes in the elfNN_aarch64_allocate_ifunc_dynrelocs function, and changes around handling an IFUNC GOT entry in elfNN_aarch64_finish_dynamic_symbol.

3) Ensure that mapping symbols are emitted for the .iplt.

   Without this many of the testcases here are disassembled incorrectly.

   See changes in elfNN_aarch64_output_arch_local_syms.

4) IRELATIVE relocations are against symbols which are not in the dynamic symbol table, hence they need their fragment populated to inform the dynamic linker the bounds and permissions to call the associated resolver with.

   See part of the CAPINIT IFUNC handling in elfNN_aarch64_final_link_relocate, and the IRELATIVE handling in elfNN_aarch64_create_small_pltn_entry.

5) Disallow an ABS64 relocation against a purecap IFUNC.

   Such a relocation is expecting a 64-bit value but the function will return a capability. Some handling could be implemented by some communication method to the dynamic linker that this particular value should be 64-bit (maybe by emitting an AARCH64_IRELATIVE relocation rather than a MORELLO_IRELATIVE one), but as yet GCC doesn't generate such a relocation and we believe it's unlikely to be needed.

   See new error check in AARCH64_NN clause of elfNN_aarch64_final_link_relocate.

6) Ensure that for statically linked PDE's, we segregate IRELATIVE and RELATIVE relocations.

   IRELATIVE relocs should be in the .rela.iplt section, while RELATIVE relocs should be in the .rela.dyn section. Correspondingly all RELATIVE relocations should be between the __rela_dyn_{start,end} symbols, and all IRELATIVE relocations should be between the __rela_iplt_{start,end} symbols.

   This segregation is made based on dynamic relocation type rather than static relocation that generates it. The segregation allows the static libc to more easily handle relocations.

Update testcases accordingly.

We introduce some new testcases. morello-ifunc.s contains uses of an IFUNC which has been referenced directly in code. When compiling a PDE this triggers the pointer equality requirement and hence the canonical address for this symbol becomes the PLT stub rather than the result of the resolver. morello-ifunc1.s does not use the IFUNC directly in code so that the address used everywhere is the result of the resolver. Both of these have testcases assembled and linked for static, dynamically linked PDE, and PIE. The testcase without a hard-coded access also has a testcase for -shared.

morello-ifunc2.s is written to check that a CAPINIT relocation does indeed stop the garbage collection of an IFUNC's PLT and IRELATIVE relocation.

morello-ifunc3.s tests that we error on an ABS64 relocation against a C64 IFUNC.

morello-ifunc-dynlink.s tests that a CAPINIT relocation against an IFUNC symbol defined in a shared library behaves the same way as one against a FUNC symbol defined in a shared library.

Implementation note:

When segregating IRELATIVE and RELATIVE relocs the change for relocations against IFUNC symbols populated in the GOT is straightforward. For CAPINIT relocations the change is not as straightforward. The problem is that on sight of CAPINIT relocations in check_relocs we immediately allocate space in the srelcaps section. In trying to satisfy the above we need to know whether we're going to be emitting an IRELATIVE relocation or RELATIVE one in order to know which section it should go in.

The determining factor between these two kinds of relocations is whether there is a text relocation to this IFUNC symbol, since that determines whether we need to make this CAPINIT relocation a RELATIVE relocation pointing to the PLT stub (in order to satisfy pointer equality) or an IRELATIVE relocation pointing to the resolver. Whether such a relocation occurs is recorded against each symbol in the pointer_equality_needed member. This can only be known after all relocations have been seen in check_relocs.

Hence, when coming across a CAPINIT relocation in check_relocs we do not in general know whether this CAPINIT relocation should end up as an IRELATIVE or RELATIVE relocation. This patch postpones the decision by recording the number of CAPINIT relocations against a given symbol in a hash table while going through check_relocs and allocating the relevant space in the required section in size_dynamic_sections.

N.b. this is similar in purpose to the dyn_relocs linked list on a symbol. We do not use that existing member which is on every symbol since the structure does not allow any indication of what kind of relocation triggered the need. Moreover the structure is used for different purposes throughout the linker and disentangling the new meaning from the existing ones seems overly confusing.

Overall, the decisions about which sections relocations against an IFUNC should go in are:

CAPINIT relocations:
  - If this is a static PDE link, and the symbol does not need pointer equality handling, then this should emit an IRELATIVE relocation and that should go in the .rela.iplt section.
  - If this is a PIC link, then this should go in the .rela.ifunc section (along with all other dynamic relocations against the IFUNC, as commented in _bfd_elf_allocate_ifunc_dyn_relocs).
  - Otherwise this relocation should go in the srelcaps section (which goes in .rela.dyn).

GOT relocations:
  - If this is a static PDE link, and the symbol does not need pointer equality, then this should emit an IRELATIVE relocation into the .rela.iplt section.
  - Otherwise, if this is a static PDE link (i.e. the symbol does need pointer equality), then this should emit a RELATIVE relocation and that should go in the srelcaps section (which is in .rela.dyn).
  - Otherwise this should go in .rela.got section.
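The section choices listed at the end of this message can be written down as a small decision function. The sketch below is illustrative only: the enum names, `struct toy_link` and `toy_ifunc_reloc_section` are hypothetical and do not appear in the patch, and the srelcaps section is represented by the output section it ends up in (.rela.dyn).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names used only for this illustration.  */
enum toy_reloc_kind { TOY_CAPINIT, TOY_GOT };

enum toy_section
{
  TOY_RELA_IPLT,   /* .rela.iplt: IRELATIVE relocs in a static PDE.  */
  TOY_RELA_IFUNC,  /* .rela.ifunc: dynamic relocs against the IFUNC.  */
  TOY_RELA_DYN,    /* .rela.dyn: where srelcaps is placed.  */
  TOY_RELA_GOT     /* .rela.got.  */
};

struct toy_link
{
  bool static_pde;  /* Statically linked position-dependent executable.  */
  bool pic;         /* PIE or shared object.  */
};

/* Pick the section for a dynamic relocation against an IFUNC, following
   the rules listed in the message above.  */
static enum toy_section
toy_ifunc_reloc_section (enum toy_reloc_kind kind,
                         const struct toy_link *link,
                         bool pointer_equality_needed)
{
  /* A position-dependent executable cannot also be PIC.  */
  assert (!(link->static_pde && link->pic));

  /* Static PDE without pointer equality: IRELATIVE to the resolver.  */
  if (link->static_pde && !pointer_equality_needed)
    return TOY_RELA_IPLT;

  if (kind == TOY_CAPINIT)
    return link->pic ? TOY_RELA_IFUNC : TOY_RELA_DYN;

  /* GOT relocation: RELATIVE to the PLT stub in a static PDE,
     otherwise an ordinary GOT relocation.  */
  return link->static_pde ? TOY_RELA_DYN : TOY_RELA_GOT;
}
```

Because `pointer_equality_needed` is only final once every relocation has been seen, a function like this can only be evaluated at the size_dynamic_sections stage, which is why the patch counts CAPINIT relocations in check_relocs and allocates space later.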

3 years, 5 months

[PATCH] ld, aarch64: Account for stubs in bounds sizing

by Alex Coplan

Hi,

This patch deals with the interaction between the code that attempts to make bounds precise (for both the PCC bounds and for some individual sections) and the code that adds stubs (e.g. long-branch veneers and interworking stubs) in the AArch64 backend.

We aim to set precise bounds for the PCC span and some individual sections in elfNN_c64_resize_sections. However, it transpires that elfNN_aarch64_size_stubs can change the layout in ways that extend sections that should be covered under the PCC span outside of the bounds set in elfNN_c64_resize_sections. The introduction of stubs can also change (even reduce) the amount of padding required to make the bounds on any given section precise.

To address this problem, we move the core logic from elfNN_c64_resize_sections into a new function, c64_resize_sections, that is safe to be called repeatedly. Similarly, we move the core logic from elfNN_aarch64_size_stubs into a new function aarch64_size_stubs which again can be called repeatedly. We then adjust elfNN_aarch64_size_stubs to call aarch64_size_stubs and c64_resize_sections in a loop, stopping when c64_resize_sections no longer makes any changes to the layout.

An important observation made above is that the introduction of stubs can change the amount of padding needed to make bounds precise. Likewise, introducing padding can in theory necessitate the introduction of stubs (e.g. if the change in layout necessitates a long-branch veneer). This is why we run the resizing/stubs code in a loop until no further changes are necessary.

Since the amount of padding needed to achieve precise bounds for a section can change (indeed reduce) with the introduction of stubs, we need a mechanism to update the amount of padding applied to a section in a subsequent iteration of c64_resize_sections. We achieve this by introducing a new interface in ld/emultempl/aarch64elf.em. We have the functions:

  static void c64_set_section_padding (asection *osec, bfd_vma padding, void **cookie);
  static bfd_vma c64_get_section_padding (void *cookie);

Here, the "cookie" value is, to consumers of this interface (i.e. bfd/elfnn-aarch64.c), an opaque handle used to refer to the padding that was introduced for a given section. The consuming code then passes back the cookie to later query the amount of padding already installed or to update the amount of padding.

Internally, within aarch64elf.em, the "cookie" is just a pointer to the node in the ldexp tree containing the integer amount of padding inserted.

In the AArch64 ELF backend, we then maintain a (lazily-allocated) mapping between output sections and cookies in order to be able to update the padding we installed in subsequent iterations of c64_resize_sections.

While working on this patch, an edge case became apparent: the case where pcc_high_sec requires precise bounds (i.e. where we call ensure_precisely_bounded_section on pcc_high_sec). As it stands, in this case, the code to ensure precise PCC bounds may in fact make the bounds on pcc_high_sec itself no longer representable (even if we previously ensured this by calling ensure_precisely_bounded_section). In general, it is not always possible to choose an amount of padding to add to the end of pcc_high_sec to make both pcc_high_sec and the PCC span itself have precise bounds (without introducing an unreasonably large alignment requirement on pcc_high_sec).

To handle the edge case above, we decouple these two problems by adding a separate amount of padding *after* pcc_high_sec to make the PCC bounds precise. If pcc_high_sec is required to have precise bounds, then that can be done in the usual way by adding padding to pcc_high_sec in ensure_precisely_bounded_section. The new mechanism for adding padding after an output section is implemented in aarch64elf.em:c64_pad_after_section.

To avoid having to add yet another mechanism to update the padding *after* pcc_high_sec, we avoid adding this padding until all other resizing / bounds-setting work is done. This is not possible for individual sections since padding introduced there may have a knock-on effect requiring further work, but we believe this isn't the case for the padding added after pcc_high_sec to make the PCC bounds precise.

This patch also reveals a pre-existing issue whereby we end up calling ensure_precisely_bounded_section on the *ABS* section. Without a further change to prevent this, this can lead to a null pointer dereference in ensure_precisely_bounded_section, since the "owner" field on the *ABS* pointer is NULL, and we use this field to obtain a pointer to the output BFD in the new c64_get_section_padding_info function.

Of course, it doesn't make sense for ensure_precisely_bounded_section to be called on the *ABS* section in the first place. This can happen when there are relocations against ldscript-defined symbols which are defined at the top level of the ldscript (i.e. not in a particular output section). Those symbols initially have their output section set to the *ABS* section. Later, we resolve such symbols to their correct output section in ldexp_finalize_syms, but the code in c64_resize_sections is running in ldemul_after_allocation, which comes before the call to ldexp_finalize_syms in the lang_process flow.

For now, we just skip such symbols when looking for sections that need precise bounds in c64_resize_sections, but this issue will later need fixing properly. We choose to avoid fixing the pre-existing issue in this patch to avoid over-complicating an already complex change.

Tested on aarch64-none-elf and aarch64-none-linux-gnu, OK for the Morello branch?

Thanks,
Alex
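The stub/padding loop described in this message is a fixed-point iteration. The sketch below models it in C with a deliberately tiny layout. Everything in it is hypothetical: `struct toy_layout`, the `toy_*` functions and the constants are invented for illustration, a single 16-byte stub stands in for all veneers, and "precise bounds" is modelled as a simple alignment requirement. It also assumes stubs are never removed once added, which keeps this toy loop from oscillating.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Made-up constants for the toy model.  */
#define TOY_BRANCH_RANGE 0xf00u  /* A stub is needed beyond this span.  */
#define TOY_STUB_BYTES   16u     /* Size of the single toy stub.  */
#define TOY_ALIGN        0x800u  /* Stand-in for "precise bounds".  */

struct toy_layout
{
  uint64_t text_size;  /* Bytes of code, excluding stubs.  */
  uint64_t stub_size;  /* Bytes of stubs currently in the layout.  */
  uint64_t padding;    /* Padding currently installed.  */
};

/* Models aarch64_size_stubs: add a stub if the padded span is too long.
   Stubs are sticky in this toy, they are never removed again.  */
static void
toy_size_stubs (struct toy_layout *l)
{
  if (l->text_size + l->padding > TOY_BRANCH_RANGE)
    l->stub_size = TOY_STUB_BYTES;
}

/* Models c64_resize_sections: recompute the padding from scratch so that
   code plus stubs plus padding is a multiple of TOY_ALIGN.  Returns true
   if the layout changed.  Safe to call repeatedly.  */
static bool
toy_resize_sections (struct toy_layout *l)
{
  uint64_t used = l->text_size + l->stub_size;
  uint64_t padding = (TOY_ALIGN - used % TOY_ALIGN) % TOY_ALIGN;
  bool changed = padding != l->padding;

  l->padding = padding;
  return changed;
}

/* Run both steps until resizing no longer changes anything.  Returns
   the number of iterations taken.  */
static unsigned
toy_layout_fixed_point (struct toy_layout *l)
{
  unsigned iterations = 0;

  do
    {
      toy_size_stubs (l);
      iterations++;
      assert (iterations < 100);  /* Guard against a non-terminating model.  */
    }
  while (toy_resize_sections (l));

  return iterations;
}
```

Starting from `text_size = 0xc00` with no stubs or padding, the first pass installs 0x400 bytes of padding, that padding pushes the span past the toy branch range and triggers a stub, and the stub in turn reduces the padding to 0x3f0. This mirrors the observation above that stubs can reduce the padding required.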

3 years, 5 months
