On 24 Jan 2022, at 12:04, Matthew Malcomson Matthew.Malcomson@arm.com wrote:
Hi Jess, Thanks for the heads-up. I hadn't made the connection with the loss of information about symbol type (we had seen the size issue and agree it's an important bug).
Can I just double-check that I understand where the type of a local symbol is used by the linker or runtime w.r.t. CHERI? The places I know of are all around handling the Morello LSB for STT_FUNC symbols and I wanted to make sure I wasn't missing anything (just got the impression from your email that there were more cases).
That’s Morello-specific. All CHERI implementations need to know the symbol type. The symbol type governs the permissions of the object; STT_FUNC is RX and STT_OBJECT is either RW or R, depending on the permissions of the symbol’s section (where .data.rel.ro is read-only). STT_NOTYPE gets conservatively treated as STT_OBJECT; this doesn’t affect anything except if you have an STT_NOTYPE function, as then that will be R rather than RX (given .text is AX not AWX), and of course on Morello will lose the LSB.
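A minimal sketch of the permission rule described above, in C. The STT_* and SHF_WRITE values are the standard ELF ones; the PERM_* bits and the function name cap_perms are hypothetical and exist only for illustration, they are not taken from any linker or rtld source.

```c
#include <stdint.h>

/* ELF symbol types (standard gABI values). */
enum { STT_NOTYPE = 0, STT_OBJECT = 1, STT_FUNC = 2 };

/* ELF section flag for writable sections (standard gABI value). */
#define SHF_WRITE 0x1u

/* Hypothetical permission bits, for illustration only. */
enum { PERM_R = 1, PERM_W = 2, PERM_X = 4 };

/* Choose capability permissions from the symbol type and the flags of
   the section that contains the symbol. */
static unsigned cap_perms(unsigned st_type, uint64_t sh_flags)
{
    if (st_type == STT_FUNC)
        return PERM_R | PERM_X;

    /* STT_OBJECT, and STT_NOTYPE conservatively treated as an object:
       RW if the section is writable, otherwise R. */
    return (sh_flags & SHF_WRITE) ? (PERM_R | PERM_W) : PERM_R;
}
```

With this rule an STT_NOTYPE function in .text (AX, so not writable) comes out as R rather than RX, which is the failure mode mentioned above.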
On Morello any capability fragment relocations will also have their bounds based on the type; STT_FUNCs cover the entire text segment, including GOT, STT_OBJECTs get tight bounds. CheriBSD ignores the bounds for executable capability relocations, as historically they were also tight (which is what the old __cap_relocs format had and, for CHERI-MIPS and CHERI-RISC-V, still has) and because it’s a waste of time to set dynamic bounds that you already know up-front.
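The bounds rule for Morello capability fragments could be sketched roughly as follows. The struct and function names are hypothetical, and treating STT_NOTYPE like STT_OBJECT for bounds is an assumption carried over from the permissions rule; it is not stated above.

```c
#include <stdint.h>

/* ELF symbol types (standard gABI values). */
enum { STT_NOTYPE = 0, STT_OBJECT = 1, STT_FUNC = 2 };

/* Hypothetical representation of capability bounds. */
struct bounds {
    uint64_t base;
    uint64_t length;
};

/* Pick fragment bounds from the symbol type: functions span the whole
   text segment (including the GOT), objects get tight bounds.
   Assumption: anything that is not STT_FUNC is bounded like an object. */
static struct bounds frag_bounds(unsigned st_type,
                                 uint64_t st_value, uint64_t st_size,
                                 uint64_t seg_base, uint64_t seg_length)
{
    struct bounds b;
    if (st_type == STT_FUNC) {
        b.base = seg_base;
        b.length = seg_length;
    } else {
        b.base = st_value;
        b.length = st_size;
    }
    return b;
}
```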
Also rtld has similar logic for relocations against dynamic symbols and for dlsym, though you’re not going to end up with relocations against section symbols in those cases, just risk bad hand-written assembly that forgets to set the type.
Jess
Thanks, Matthew
From: jessica.clarke@cl.cam.ac.uk
Sent: 18 January 2022 11:49
To: Matthew Malcomson Matthew.Malcomson@arm.com
Cc: gnu-morello@op-lists.linaro.org; Kyrylo Tkachov Kyrylo.Tkachov@arm.com
Subject: Re: [Gnu-morello] [Committed] morello-binutils: Bugfixes in MORELLO GOT relocations

On 18 Jan 2022, at 11:11, Matthew Malcomson via Gnu-morello gnu-morello@op-lists.linaro.org wrote:
Trying to link code against newlib with the current BFD Morello linker we get quite a lot of cases of the error below. "relocation truncated to fit: R_MORELLO_LD128_GOT_LO12_NC against symbol `<whatever>' defined in .text.<whatever> section in <filename>"
This happens because the relocation gets transformed into a relocation pointing into the GOT in elfNN_aarch64_final_link_relocate, but the h->target_internal flag that indicates whether this is a C64 function symbol or not is then added to the *end* value rather than the value that is stored in the GOT.
This then correctly falls foul of a check in _bfd_aarch64_elf_put_addend that ensures the value we get from this relocation is 8-byte aligned since it must be pointing to the start of a valid entry in the GOT.
Here we ensure that this LSB is set on the value newly added into the GOT rather than on the offset pointing into the GOT. This both means that loading function symbols from the GOT will have the LSB correctly set (hence we stay in C64 mode when branching to this function as we should) and it means that the error about a misaligned GOT address is fixed.
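To make the difference concrete, here is a small C sketch. The helper names are hypothetical and are not the BFD functions; the addresses in the usage note are the ones from the emit-relocs-morello-6 testcase in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the alignment check: the address of a GOT
   entry used by the relocation must be 8-byte aligned. */
static bool got_address_ok(uint64_t got_entry_addr)
{
    return (got_entry_addr & 7u) == 0;
}

/* Old behaviour: the C64 LSB was ORed into the address of the GOT
   entry, which misaligns it. */
static uint64_t old_reloc_value(uint64_t got_entry_addr, unsigned c64_lsb)
{
    return got_entry_addr | c64_lsb;
}

/* New behaviour: the C64 LSB is ORed into the value stored in the GOT
   entry; the GOT entry address is left untouched. */
static uint64_t new_got_contents(uint64_t sym_value, unsigned c64_lsb)
{
    return sym_value | c64_lsb;
}
```

For _start at 0x1e8 with its GOT entry at 0x10310, the old scheme yields 0x10311, which fails the alignment check, while the new scheme stores 0x1e9 in the entry and keeps 0x10310 as the address.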
In this patch we also ensure that we add a dynamic relocation to initialise the correct GOT entry when we are resolving a MORELLO relocation that requires an entry in the GOT. This was already handled in the case of a global symbol, but had not been handled in the case of a local symbol. This is why we set `relative_reloc` to TRUE when resolving a MORELLO GOT relocation in a static executable.
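The condition for a local symbol can be sketched as a standalone predicate. The function name and its boolean parameters are hypothetical; they stand in for bfd_link_pic (info), bfd_link_executable (info) and the c64_reloc flag used in the patch.

```c
#include <stdbool.h>

/* Decide whether a local symbol's GOT entry needs a RELATIVE dynamic
   relocation. Mirrors the condition in the patch: always under PIC,
   and additionally for Morello GOT relocations in non-PIC executables. */
static bool needs_relative_reloc(bool link_pic, bool link_executable,
                                 bool c64_reloc)
{
    return link_pic || (!link_pic && link_executable && c64_reloc);
}
```

A non-Morello GOT relocation in a non-PIC executable still gets no dynamic relocation, so existing AArch64 behaviour should be unchanged.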
In writing the testcase for this patch we found an existing bug to do with static relocations of this kind (that is, relocations handled in this case statement). The assembler often chooses to create the relocation against the section symbol rather than the original symbol, and make up for that by giving the relocation an addend. The
Capability relocations against section symbols aren’t valid for CHERI; they break the ability to determine the type of the symbol (function vs data, which can’t be inferred from the section type alone as both tend to exist in .text) and, in the case of data, its size. So this sounds like an important bug in your assembler, regardless of whether your GOT implementation includes an addend in the key.
Jess
linker does not have any mechanism to create "symbol plus addend" entries in the GOT -- it indexes into the GOT based on the symbol only. Hence all relocations which are a section symbol plus addend end up pointing at one value in the GOT just containing the value of the symbol. We do not fix this existing bug, but just note it given that this is in the same area.
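A toy model of why this goes wrong, in C. This is not the BFD data structure; got_table and got_slot_for are hypothetical names, and the only point is that the lookup key is the symbol index alone, so the addend cannot distinguish entries.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_SLOTS 8

/* Toy GOT: each slot remembers only which symbol it belongs to. */
struct got_table {
    uint32_t sym[MAX_SLOTS];
    size_t used;
};

/* Return the slot for a symbol, allocating one on first use.
   The addend is deliberately ignored, as in the behaviour described
   above. Returns (size_t)-1 if the toy table is full. */
static size_t got_slot_for(struct got_table *got, uint32_t symndx,
                           int64_t addend)
{
    (void)addend; /* not part of the key */
    for (size_t i = 0; i < got->used; i++)
        if (got->sym[i] == symndx)
            return i;
    if (got->used == MAX_SLOTS)
        return (size_t)-1;
    got->sym[got->used] = symndx;
    return got->used++;
}
```

Two relocations against the same section symbol with addends 0 and 16 resolve to the same slot here, so both would load the bare section address.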
############### Attachment also inlined for ease of reply ###############
diff --git a/bfd/elfnn-aarch64.c b/bfd/elfnn-aarch64.c
index 5c78fb54f919ddc7877b69da21f4594ade8ee98b..5fabfa32117fb20364aeefe949e21043c4731f24 100644
--- a/bfd/elfnn-aarch64.c
+++ b/bfd/elfnn-aarch64.c
@@ -7091,6 +7091,9 @@ elfNN_aarch64_final_link_relocate (reloc_howto_type *howto,
     case BFD_RELOC_AARCH64_MOVW_GOTOFF_G1:
       off = symbol_got_offset (input_bfd, h, r_symndx);
       base_got = globals->root.sgot;
+      bfd_boolean c64_reloc =
+        (bfd_r_type == BFD_RELOC_MORELLO_LD128_GOT_LO12_NC
+         || bfd_r_type == BFD_RELOC_MORELLO_ADR_GOT_PAGE);
       if (base_got == NULL)
         BFD_ASSERT (h != NULL);
 
@@ -7099,9 +7102,6 @@ elfNN_aarch64_final_link_relocate (reloc_howto_type *howto,
       if (h != NULL)
         {
           bfd_vma addend = 0;
-          bfd_boolean c64_reloc =
-            (bfd_r_type == BFD_RELOC_MORELLO_LD128_GOT_LO12_NC
-             || bfd_r_type == BFD_RELOC_MORELLO_ADR_GOT_PAGE);
 
           /* If a symbol is not dynamic and is not undefined weak, bind it
              locally and generate a RELATIVE relocation under PIC mode.
@@ -7122,7 +7122,8 @@ elfNN_aarch64_final_link_relocate (reloc_howto_type *howto,
               && !symbol_got_offset_mark_p (input_bfd, h, r_symndx))
             relative_reloc = TRUE;
 
-          value = aarch64_calculate_got_entry_vma (h, globals, info, value,
+          value = aarch64_calculate_got_entry_vma (h, globals, info,
+                                                   value | h->target_internal,
                                                    output_bfd,
                                                    unresolved_reloc_p);
           /* Record the GOT entry address which will be used when generating
@@ -7136,7 +7137,6 @@ elfNN_aarch64_final_link_relocate (reloc_howto_type *howto,
           value = _bfd_aarch64_elf_resolve_relocation (input_bfd, bfd_r_type,
                                                        place, value,
                                                        addend, weak_undef_p);
-          value |= h->target_internal;
         }
       else
         {
@@ -7160,14 +7160,17 @@ elfNN_aarch64_final_link_relocate (reloc_howto_type *howto,
 
           if (!symbol_got_offset_mark_p (input_bfd, h, r_symndx))
             {
-              bfd_put_64 (output_bfd, value, base_got->contents + off);
+              bfd_put_64 (output_bfd, value | sym->st_target_internal,
+                          base_got->contents + off);
 
               /* For local symbol, we have done absolute relocation in static
                  linking stage.  While for shared library, we need to update the
                  content of GOT entry according to the shared object's runtime
                  base address.  So, we need to generate a R_AARCH64_RELATIVE reloc
                  for dynamic linker.  */
-              if (bfd_link_pic (info))
+              if (bfd_link_pic (info)
+                  || (!bfd_link_pic (info) && bfd_link_executable (info)
+                      && c64_reloc))
                 relative_reloc = TRUE;
 
               symbol_got_offset_mark (input_bfd, h, r_symndx);
@@ -7183,8 +7186,6 @@ elfNN_aarch64_final_link_relocate (reloc_howto_type *howto,
           value = _bfd_aarch64_elf_resolve_relocation (input_bfd, bfd_r_type,
                                                        place, value,
                                                        addend, weak_undef_p);
-
-          value |= sym->st_target_internal;
         }
 
       if (relative_reloc)
@@ -7198,8 +7199,7 @@ elfNN_aarch64_final_link_relocate (reloc_howto_type *howto,
 
           /* For a C64 relative relocation, also add size and permissions into
              the frag.  */
-          if (bfd_r_type == BFD_RELOC_MORELLO_LD128_GOT_LO12_NC
-              || bfd_r_type == BFD_RELOC_MORELLO_ADR_GOT_PAGE)
+          if (c64_reloc)
             {
               bfd_reloc_status_type ret;
 
diff --git a/ld/testsuite/ld-aarch64/aarch64-elf.exp b/ld/testsuite/ld-aarch64/aarch64-elf.exp
index 8de7bdfc6376e3ee0b750eb6429f18d6f4a92017..6b42dcfeddeb8426ce0c3aff19cd6ba904ddf2a8 100644
--- a/ld/testsuite/ld-aarch64/aarch64-elf.exp
+++ b/ld/testsuite/ld-aarch64/aarch64-elf.exp
@@ -246,6 +246,8 @@ run_dump_test_lp64 "emit-relocs-morello-3"
 run_dump_test_lp64 "emit-relocs-morello-3-a64c"
 run_dump_test_lp64 "emit-relocs-morello-4"
 run_dump_test_lp64 "emit-relocs-morello-5"
+run_dump_test_lp64 "emit-relocs-morello-6"
+run_dump_test_lp64 "emit-relocs-morello-6b"
 run_dump_test_lp64 "emit-morello-reloc-markers-1"
 run_dump_test_lp64 "emit-morello-reloc-markers-2"
 run_dump_test_lp64 "emit-morello-reloc-markers-3"
diff --git a/ld/testsuite/ld-aarch64/emit-relocs-morello-6.d b/ld/testsuite/ld-aarch64/emit-relocs-morello-6.d
new file mode 100644
index 0000000000000000000000000000000000000000..d97a59a916bda27854933362ba00afa174bb6886
--- /dev/null
+++ b/ld/testsuite/ld-aarch64/emit-relocs-morello-6.d
@@ -0,0 +1,44 @@
+# Check handling relocations into the got that require a GOT entry.
+# This case handles PIE binaries.
+#
+# This testcase uses exact values in order to check that of the two GOT entries
+# created, the one that is referenced by the first instruction in _start is
+# the one which has the LSB set in its value.
+#
+# It's difficult to check this in the DejaGNU testsuite without checking for
+# specific values that we know are good. However this is susceptible to
+# defaults changing where the .text and .got sections end up.
+#
+# If this testcase prooves to be too flaky while the linker gets updated then
+# we should look harder for some solution, but for now we'll take this
+# tradeoff.
+#source: emit-relocs-morello-6.s
+#as: -march=morello+c64
+#ld: -Ttext-segment 0x0 -pie -static
+#objdump: -DR -j .got -j .text
+.*: file format .*
+Disassembly of section .text:
+00000000000001e8 <_start>:
+ 1e8: c240c400 ldr c0, [c0, #784]
+ 1ec: c240c000 ldr c0, [c0, #768]
+Disassembly of section .got:
+00000000000102f0 <.got>:
+ 102f0: 000101f0 .inst 0x000101f0 ; undefined
+ \.\.\.
+ 10300: 000001e8 udf #488
+ 10300: R_MORELLO_RELATIVE \*ABS\*
+ 10304: 00000000 udf #0
+ 10308: .*
+ 1030c: .*
+ 10310: 000001e9 udf #489
+ 10310: R_MORELLO_RELATIVE \*ABS\*
+ 10314: .*
+ 10318: .*
+ 1031c: .*
diff --git a/ld/testsuite/ld-aarch64/emit-relocs-morello-6.s b/ld/testsuite/ld-aarch64/emit-relocs-morello-6.s
new file mode 100644
index 0000000000000000000000000000000000000000..eafc9968c522450d832ec0b0ac68df9ada5cb446
--- /dev/null
+++ b/ld/testsuite/ld-aarch64/emit-relocs-morello-6.s
@@ -0,0 +1,20 @@
+# Checking
+# - LD128 relocation has been resolved to GOT location.
+# - Relocation at that GOT location is introduced.
+# - GOT fragment contains address required.
+# - GOT fragment has LSB set if relocation is a function symbol.
+.arch morello+c64
+ .text
+ .global _start
+ .type _start,@function
+_start:
+ .size _start,12
+ .type obj,@object
+ .global obj
+ .size obj,1
+obj:
+ ldr c0, [c0, :got_lo12:_start]
+ ldr c0, [c0, :got_lo12:obj]
diff --git a/ld/testsuite/ld-aarch64/emit-relocs-morello-6b.d b/ld/testsuite/ld-aarch64/emit-relocs-morello-6b.d
new file mode 100644
index 0000000000000000000000000000000000000000..3d2ca260156ea4e83d99cce8962cf42fe9b19151
--- /dev/null
+++ b/ld/testsuite/ld-aarch64/emit-relocs-morello-6b.d
@@ -0,0 +1,56 @@
+# Check handling relocations into the got that require a GOT entry.
+# This case handles non-PIE binaries.
+#
+# This testcase uses exact values in order to check that of the two GOT entries
+# created, the one that is referenced by the first instruction in _start is
+# the one which has the LSB set in its value.
+#
+# It's difficult to check this in the DejaGNU testsuite without checking for
+# specific values that we know are good. However this is susceptible to
+# defaults changing where the .text and .got sections end up.
+#
+# If this testcase prooves to be too flaky while the linker gets updated then
+# we should look harder for some solution, but for now we'll take this
+# tradeoff.
+#
+# Here we have to use a format which dumps the hex of the relocation section
+# since objdump does not show us dynamic relocations on a non-dynamic binary.
+#source: emit-relocs-morello-6.s
+#as: -march=morello+c64
+#ld: -Ttext-segment 0x0 -static
+#objdump: -D -j .rela.dyn -j .got -j .text
+.*: file format .*
+Disassembly of section .rela.dyn:
+0000000000000000 <__rela_dyn_start>:
+ 0: 00010060 .*
+ 4: 00000000 .*
+ 8: 0000e803 .*
+ \.\.\.
+ 18: 00010050 .*
+ 1c: 00000000 .*
+ 20: 0000e803 .*
+ \.\.\.
+Disassembly of section .text:
+0000000000000030 <_start>:
+ 30: c2401800 ldr c0, [c0, #96]
+ 34: c2401400 ldr c0, [c0, #80]
+Disassembly of section .got:
+0000000000010040 <_GLOBAL_OFFSET_TABLE_>:
+ \.\.\.
+ 10050: 00000030 .*
+ 10054: 00000000 .*
+ 10058: 00000101 .*
+ 1005c: 00000000 .*
+ 10060: 00000031 .*
+ 10064: 00000000 .*
+ 10068: 00000c01 .*
+ 1006c: 00000000 .*
<got-relocations-bugfixes.patch>