I am very excited about the binary release of gcc-toolchain, but I ran into problems when trying to compile something as simple as:
int main()
{
return 0;
}
$ ~/morello-gnu/bin/aarch64-none-elf-gcc -march=morello -O0 -nostdlib hello.c
/home/lord/morello-gnu/bin/../lib/gcc/aarch64-none-elf/10.1.0/../../../../aarch64-none-elf/bin/ld: warning: cannot find entry symbol _start; defaulting to 0000000000400000
$ ~/morello-gnu/bin/aarch64-none-elf-gcc -march=morello -O0 hello.c
/home/lord/morello-gnu/bin/../lib/gcc/aarch64-none-elf/10.1.0/../../../../aarch64-none-elf/bin/ld: /home/lord/morello-gnu/bin/../lib/gcc/aarch64-none-elf/10.1.0/../../../../aarch64-none-elf/lib/hybridcap/libc.a(lib_a-exit.o): in function `exit':
/data/jenkins/workspace/GNU-toolchain/morello-trunk/src/newlib-cygwin/newlib/libc/stdlib/exit.c:70: undefined reference to `_exit'
/home/lord/morello-gnu/bin/../lib/gcc/aarch64-none-elf/10.1.0/../../../../aarch64-none-elf/bin/ld: /home/lord/morello-gnu/bin/../lib/gcc/aarch64-none-elf/10.1.0/../../../../aarch64-none-elf/lib/hybridcap/crt0.o: in function `_start':
/data/jenkins/workspace/GNU-toolchain/morello-trunk/src/newlib-cygwin/libgloss/aarch64/crt0.S:360: undefined reference to `__init_global_caps'
/home/lord/morello-gnu/bin/../lib/gcc/aarch64-none-elf/10.1.0/../../../../aarch64-none-elf/bin/ld: /data/jenkins/workspace/GNU-toolchain/morello-trunk/src/newlib-cygwin/libgloss/aarch64/crt0.S:361: undefined reference to `__processRelocs'
/home/lord/morello-gnu/bin/../lib/gcc/aarch64-none-elf/10.1.0/../../../../aarch64-none-elf/bin/ld: /data/jenkins/workspace/GNU-toolchain/morello-trunk/src/newlib-cygwin/libgloss/aarch64/crt0.S:365: undefined reference to `initialise_monitor_handles'
collect2: error: ld returned 1 exit status
Any suggestions on what I am doing wrong?
Also, I think it mentions that both pure and hybrid capability modes are supported, but I could not find an option to specify which one to use.
Thanks!
Vadim
--
Senior Research Associate
Department of Computer Science and Technology
University of Cambridge
http://zaliva.org/
The symbol that objdump reports for the start of the data section is not
important and differs between Linux and bare-metal builds.
Not naming this symbol in our testcase fixes a testsuite failure in the
Linux build.
############### Attachment also inlined for ease of reply ###############
diff --git a/ld/testsuite/ld-aarch64/morello-large-function.d b/ld/testsuite/ld-aarch64/morello-large-function.d
index dc0af92e12f9ab7203a400997ded14c8bdf4172c..55f87d3a4db8d0af011bd5c10ebca289d679e868 100644
--- a/ld/testsuite/ld-aarch64/morello-large-function.d
+++ b/ld/testsuite/ld-aarch64/morello-large-function.d
@@ -15,7 +15,7 @@
Disassembly of section \.data:
-[0-9a-f]+ <__data_start>:
+[0-9a-f]+ <.*>:
*[0-9a-f]+: .*
.*: R_MORELLO_RELATIVE \*ABS\*\+.*
*[0-9a-f]+: .* udf #0
The GNU bfd linker removes duplicate CIE and FDE entries in the exception
handling information. When it does this, entries in the .eh_frame
section end up in different positions from where they were originally.
In order to account for that, when the linker removes an FDE/CIE entry
from one object's .eh_frame section in order to prefer an equivalent
entry in another object's .eh_frame section, the linker adjusts symbols
which were pointing to the first entry to point to the second.
If the assembler has changed symbols pointing into the .eh_frame section
such that they are now described by a section symbol plus offset, the
linker cannot perform this transformation. This means that symbols can
end up pointing at different information from what they originally
pointed at.
NOTE1: This changes the assembler's behaviour on *all* targets. As it
stands that seems like the correct approach since the linker behaviour
that we are accounting for is a general behaviour. On top of that, this
translation should not make a change in functionality if the linker
behaviour were not enabled for some target (since without the linker
behaviour this transformation should not affect anything -- which is why
it's believed to be safe in the first place). However it is still
important to note that we have not actually tested these changes on
other architectures.
NOTE2: Our patch cannot be 100% accurate and robust when choosing which
sections to exempt from this adjustment. The GNU linker decides whether
to look for items to merge based on the *output* section name, the
mapping between input sections and output sections can be modified by
the user, and we may not even be using the GNU linker in the first
place.
It is still desirable to avoid the problematic adjustment in the common
case of using the GNU linker with the default mapping between input
sections and output sections, although it may not be desirable to
hard-code into GAS a feature of the default linker script as it stands
at the time of writing this patch.
Here we use the same check that the assembler uses in gas/dw2gencfi.c to
identify a .eh_frame section. This has the benefits of being a check
the assembler is using already (so the assembler is internally
consistent) and matching the split that the default bfd linker scripts
make between all input sections that said scripts name.
E.g. the default linker script for aarch64-none-elf matches the
following patterns for an output section named .eh_frame_hdr
{ *(.eh_frame_hdr) *(.eh_frame_entry .eh_frame_entry.*) }
and matches the below for an output section named .eh_frame
{ KEEP (*(.eh_frame)) *(.eh_frame.*) }.
The linker then applies the problematic transformation to the .eh_frame
output section and not to the .eh_frame_hdr section.
The check we use here makes the corresponding decision for all sections
which would be caught by the above patterns. I.e. it avoids adjusting
symbols in sections which would end up in the .eh_frame output section
and does not avoid adjusting symbols in sections which would end up in
the .eh_frame_hdr output section.
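As an illustration of the name-based check described above, here is a
minimal standalone sketch. The helper name names_eh_frame_section is
hypothetical and is not part of gas; it only mirrors the prefix test
plus the "next character is not '_'" test on a plain section name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a gas function): return true when NAME would
   be treated as an .eh_frame input section by the name-based check.  */
bool
names_eh_frame_section (const char *name)
{
  const size_t prefix_len = sizeof ".eh_frame" - 1;	/* 9 characters */

  assert (name != NULL);

  /* The name must start with ".eh_frame" ...  */
  if (strncmp (name, ".eh_frame", prefix_len) != 0)
    return false;

  /* ... and must not continue with '_', which rules out names such as
     ".eh_frame_hdr" and ".eh_frame_entry".  */
  return name[prefix_len] != '_';
}
```

Under this sketch ".eh_frame" and ".eh_frame.foo" are accepted, while
".eh_frame_hdr", ".eh_frame_entry" and ".text" are rejected, which is
the same split the two linker script patterns above make.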
NOTE3: This behaviour by itself is not causing any problems for us. The
trigger for making this change (especially in morello binutils) is that
when crtbeginT.o "registers" an object's exception handling information
with a static glibc, it uses `adrp` and `add` to access the
__EH_FRAME_BEGIN__ symbol. Currently the relocation on `adrp` is
adjusted into a "section symbol plus offset" transformation (which ends
up as exactly the start of the section symbol in the crtbeginT.o
object), but the `add` instruction is not adjusted in this way.
It is this *difference* that is problematic.
It means that we can end up with a broken pointer that combines the
page of .eh_frame with the page offset of __EH_FRAME_BEGIN__.
With base AArch64, both these instructions would be adjusted to point to
the .eh_frame section of crtbeginT.o. This is still a buggy behaviour
in the assembler due to the reasons given above, but it at least meant
that the static glibc got a sensible pointer (though one starting after
any exception frame information on the crti.o and crt1.o object files).
With this change, both instructions stay pointing at __EH_FRAME_BEGIN__
in the object file. That means that the linker will leave both
instructions pointing at the same place after de-duplication of
exception information. That place is not guaranteed to be the start of
the total exception frame information, but in practice it is always
closer to the start of the debug frame than without having made this
patch.
We could have stopped static glibc from crashing by making sure that we
accessed the .eh_frame section symbol for both instructions rather than
using __EH_FRAME_BEGIN__ for both. This would behave in the same way as
stock AArch64.
This would mean that static glibc would not be affected by the
particulars of how the GNU bfd linker merges CIE and FDE entries
together. On the other hand it would mean that static glibc would never
have the ability to unwind through start code in crt1.o and crti.o.
I don't have a particularly strong opinion on which of these is the best
approach. I chose this one since it gives the static glibc access to the
full debug information for the moment.
############### Attachment also inlined for ease of reply ###############
diff --git a/gas/testsuite/gas/aarch64/eh-frame-symbols.d b/gas/testsuite/gas/aarch64/eh-frame-symbols.d
new file mode 100644
index 0000000000000000000000000000000000000000..8276648ea1c5ea1ce7cd945ecfdf952cd71a0911
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/eh-frame-symbols.d
@@ -0,0 +1,24 @@
+# Checking that our relocations against the symbol __EH_FRAME_BEGIN__ are not
+# transformed into relocations against the section symbol .eh_frame.
+#as: -march=morello+c64
+#objdump: -dr
+
+.*: file format .*
+
+
+Disassembly of section \.text:
+
+.* <get_eh_frame_begin>:
+ *[0-9a-f]+: ........ stp c29, c30, \[csp, #-64\]!
+ *[0-9a-f]+: ........ adrp c0, 0 <get_eh_frame_begin>
+ 4: R_MORELLO_ADR_PREL_PG_HI20 __EH_FRAME_BEGIN__
+ *[0-9a-f]+: ........ adrp c0, 0 <get_eh_frame_begin>
+ 8: R_MORELLO_ADR_PREL_PG_HI20 \.eh_frame
+ *[0-9a-f]+: ........ add c0, c0, #0x0
+ c: R_AARCH64_ADD_ABS_LO12_NC __EH_FRAME_BEGIN__
+ *[0-9a-f]+: ........ add c0, c0, #0x0
+ 10: R_AARCH64_ADD_ABS_LO12_NC \.eh_frame
+ *[0-9a-f]+: ........ ldp c29, c30, \[csp\], #64
+ *[0-9a-f]+: ........ ret c30
+
+#...
diff --git a/gas/testsuite/gas/aarch64/eh-frame-symbols.s b/gas/testsuite/gas/aarch64/eh-frame-symbols.s
new file mode 100644
index 0000000000000000000000000000000000000000..a10b59e034e15d048c5aef4ee1a82da9acd9e69e
--- /dev/null
+++ b/gas/testsuite/gas/aarch64/eh-frame-symbols.s
@@ -0,0 +1,31 @@
+.section .eh_frame,"a"
+.type __EH_FRAME_BEGIN__, %object
+__EH_FRAME_BEGIN__:
+
+.text
+.type get_eh_frame_begin, %function
+get_eh_frame_begin:
+ .cfi_startproc purecap
+ stp c29, c30, [csp, -64]!
+ .cfi_def_cfa_offset 64
+ .cfi_offset 227, -64
+ .cfi_offset 228, -48
+ adrp c0, __EH_FRAME_BEGIN__
+ adrp c0, .eh_frame
+ add c0, c0, :lo12:__EH_FRAME_BEGIN__
+ add c0, c0, :lo12:.eh_frame
+ ldp c29, c30, [csp], 64
+ .cfi_restore 228
+ .cfi_restore 227
+ .cfi_def_cfa_offset 0
+ ret
+ .cfi_endproc
+
+.global _start
+.type _start, %function
+_start:
+ mov x0, #0
+ ret
+
+# .zero 0xff0 - 0x38 - 0x78
+.zero 0xff0 - 0x38
diff --git a/gas/write.c b/gas/write.c
index 054f27987d51f1732afdc81bbc9e19a95160c53f..5301bbd465048208c51f9b12fb6479c765c9a5f4 100644
--- a/gas/write.c
+++ b/gas/write.c
@@ -878,6 +878,28 @@ adjust_reloc_syms (bfd *abfd ATTRIBUTE_UNUSED,
continue;
}
+ /* Avoid adjusting a relocation against a symbol pointing into a
+ ".eh_frame" section in an ELF binary. The GNU bfd linker attempts
+ to de-duplicate CIE/FDE's in *output* .eh_frame sections in ELF
+ binaries and adjust any relocations pointing at the now removed
+ duplicate to point to the remaining entry.
+ Hence a symbol which had been adjusted to a section symbol plus
+ offset would end up pointing at something completely different.
+
+ We can not robustly match the exact linker-input sections that will
+ end up in the .eh_frame output section, since users may provide
+ their own linker scripts. However it does seem useful to avoid
+ transforming symbols in the sections that are expected to end up in
+ this linker output section. Rather than add a clause here to match
+ the current default linker script we use the same check based on
+ section name as is used in `gas/dw2gencfi.c` to check for .eh_frame
+ sections. */
+ if (IS_ELF
+ && strncmp (segment_name (symsec),
+ ".eh_frame", sizeof ".eh_frame" - 1) == 0
+ && segment_name (symsec)[9] != '_')
+ continue;
+
/* Never adjust a reloc against local symbol in a merge section
with non-zero addend. */
if ((symsec->flags & SEC_MERGE) != 0
We're disabling the assembler's transformation of relocations against
symbols into relocations against a section symbol plus offset when the
relocation is against something in the GOT and when the relocation is
against something which generates a capability.
For entries in the GOT we disable this transformation since the GNU bfd
linker relies on indexing into its internal representation of the GOT
using symbols and does not distinguish between entries using the same
symbol but different offsets. Hence transforming multiple symbols into
the same section symbol with different offsets would mean that at least
one will get an incorrect value.
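To make the GOT problem concrete, here is a toy model. It is only an
illustration under the assumption stated above (slots are looked up by
symbol alone); struct toy_got and toy_got_slot are hypothetical names
and do not reflect the actual BFD data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_GOT_MAX 8

/* Toy GOT: one slot per distinct symbol name.  The addend is stored
   only when the slot is first created.  */
struct toy_got
{
  const char *symbol[TOY_GOT_MAX];
  long addend[TOY_GOT_MAX];
  size_t used;
};

/* Return the slot index for SYMBOL, allocating a slot on first use.
   The lookup key is the symbol alone, so a later request with the same
   symbol but a different ADDEND silently reuses the existing slot.  */
size_t
toy_got_slot (struct toy_got *got, const char *symbol, long addend)
{
  for (size_t i = 0; i < got->used; i++)
    if (strcmp (got->symbol[i], symbol) == 0)
      return i;

  assert (got->used < TOY_GOT_MAX);
  got->symbol[got->used] = symbol;
  got->addend[got->used] = addend;
  return got->used++;
}
```

In this model, requesting slots for "cap" and "ptr" yields two distinct
slots, whereas requesting ".data" with addend 0x10 and then ".data"
with addend 0x20 yields the same slot holding only the first addend, so
one of the two references would see the wrong value.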
Relocations which require the static linker to emit dynamic relocations
in order to generate capabilities (CAPINIT and capability relocations
into the GOT) require symbol information so that the dynamic linker can
put correct permissions and bounds on those relocations.
NOTE: We get to use an existing testcase for this change, but it showed
up something strange about objdump. One `adrp` instruction has changed
in the output so that it shows as pointing to a different location.
This happens to be an `objdump` quirk. Objdump looks at the relocation
associated with an address and attempts to include that relocation when
determining what address to print out. This mechanism has two problems.
One is that objdump does not account for the offset in that relocation
(only the symbol). The other is that in an object file (i.e. not a final
executable) the virtual memory address of all sections is zero. These
combined mean that the vma is miscalculated, and the translation from
vma to symbol is not injective. In other words: the extra change in
morello-ldst-reloc.d on top of switching the relocation symbols is there
to account for an objdump bug and not a problem with this gas change.
############### Attachment also inlined for ease of reply ###############
diff --git a/gas/config/tc-aarch64.c b/gas/config/tc-aarch64.c
index a1bd6e1a769d968ed0bdb2b3dca7bc87732cc498..c8373b311dd14f3c53ce4099ab1ffee28d360e8a 100644
--- a/gas/config/tc-aarch64.c
+++ b/gas/config/tc-aarch64.c
@@ -9382,13 +9382,37 @@ check_mapping_symbols (bfd * abfd ATTRIBUTE_UNUSED, asection * sec,
bfd_boolean
aarch64_fix_adjustable (struct fix *fixP)
{
- /* We need size information of the target symbols to initialise
- capabilities. */
- if (fixP->fx_r_type == BFD_RELOC_MORELLO_CAPINIT)
- return FALSE;
-
switch (fixP->fx_r_type)
{
+ /* The AArch64 GNU bfd linker can not handle 'symbol + offset' entries in the
+ GOT (it internally uses a symbol to reference a GOT slot). Hence we can't
+ emit any "section symbol + offset" relocations for the GOT. */
+ case BFD_RELOC_AARCH64_GOT_LD_PREL19:
+ case BFD_RELOC_AARCH64_ADR_GOT_PAGE:
+ case BFD_RELOC_AARCH64_LD64_GOT_LO12_NC:
+ case BFD_RELOC_AARCH64_LD32_GOT_LO12_NC:
+ case BFD_RELOC_AARCH64_MOVW_GOTOFF_G0_NC:
+ case BFD_RELOC_AARCH64_MOVW_GOTOFF_G1:
+ case BFD_RELOC_AARCH64_LD64_GOTOFF_LO15:
+ case BFD_RELOC_AARCH64_LD32_GOTPAGE_LO14:
+ case BFD_RELOC_AARCH64_LD64_GOTPAGE_LO15:
+ case BFD_RELOC_AARCH64_TLSIE_ADR_GOTTPREL_PAGE21:
+ case BFD_RELOC_AARCH64_TLSIE_LD64_GOTTPREL_LO12_NC:
+ case BFD_RELOC_AARCH64_TLSIE_LD32_GOTTPREL_LO12_NC:
+ case BFD_RELOC_AARCH64_TLSIE_LD_GOTTPREL_PREL19:
+ case BFD_RELOC_AARCH64_TLSIE_MOVW_GOTTPREL_G0_NC:
+ case BFD_RELOC_AARCH64_TLSIE_MOVW_GOTTPREL_G1:
+ case BFD_RELOC_AARCH64_LD_GOT_LO12_NC:
+ case BFD_RELOC_AARCH64_TLSIE_LD_GOTTPREL_LO12_NC:
+ return FALSE;
+
+ /* We need size information of the target symbols to initialise
+ capabilities. */
+ case BFD_RELOC_MORELLO_CAPINIT:
+ case BFD_RELOC_MORELLO_ADR_GOT_PAGE:
+ case BFD_RELOC_MORELLO_LD128_GOT_LO12_NC:
+ return FALSE;
+
/* We need to retain symbol information when jumping between A64 and C64
states or between two C64 functions. In the C64 -> C64 situation it's
really only a corner case that breaks when symbols get replaced with
diff --git a/gas/testsuite/gas/aarch64/morello-ldst-reloc.d b/gas/testsuite/gas/aarch64/morello-ldst-reloc.d
index d2fb08ec9d13c28e6fa70f12c8f49f88fc37f34f..2199e90dd9091367ff6b8f237fa806ab642386ca 100644
--- a/gas/testsuite/gas/aarch64/morello-ldst-reloc.d
+++ b/gas/testsuite/gas/aarch64/morello-ldst-reloc.d
@@ -21,13 +21,13 @@ Disassembly of section \.text:
.*: R_AARCH64_ADD_ABS_LO12_NC ptr
.* <f1>:
- .*: 90800002 adrp c2, 0 <_start>
- .*: R_MORELLO_ADR_GOT_PAGE \.data\+0x10
+ .*: 90800002 adrp c2, 10 <add>
+ .*: R_MORELLO_ADR_GOT_PAGE cap
.*: c2400042 ldr c2, \[c2\]
- .*: R_MORELLO_LD128_GOT_LO12_NC \.data\+0x10
+ .*: R_MORELLO_LD128_GOT_LO12_NC cap
.*: 82600042 ldr c2, \[x2\]
- .*: R_MORELLO_LD128_GOT_LO12_NC \.data\+0x20
+ .*: R_MORELLO_LD128_GOT_LO12_NC ptr
.*: f9400042 ldr x2, \[c2\]
- .*: R_AARCH64_LD64_GOT_LO12_NC \.data\+0x20
+ .*: R_AARCH64_LD64_GOT_LO12_NC ptr
.*: 82600c42 ldr x2, \[x2\]
- .*: R_AARCH64_LD64_GOT_LO12_NC \.data\+0x20
+ .*: R_AARCH64_LD64_GOT_LO12_NC ptr
diff --git a/gas/testsuite/gas/aarch64/reloc-insn.d b/gas/testsuite/gas/aarch64/reloc-insn.d
index 0f3b4143d964ed08ef1d435fd518a7e2b13a80f8..8898a88bca0fe8f8e2755741c9d466636bf4aaff 100644
--- a/gas/testsuite/gas/aarch64/reloc-insn.d
+++ b/gas/testsuite/gas/aarch64/reloc-insn.d
@@ -157,9 +157,9 @@ Disassembly of section \.text:
18c: 39400001 ldrb w1, \[x0\]
190: d65f03c0 ret
194: f94001bc ldr x28, \[x13\]
- 194: R_AARCH64_LD64_GOTPAGE_LO15 \.data
+ 194: R_AARCH64_LD64_GOTPAGE_LO15 dummy
198: f9400000 ldr x0, \[x0\]
- 198: R_AARCH64_LD64_GOTOFF_LO15 .data
+ 198: R_AARCH64_LD64_GOTOFF_LO15 dummy
000000000000019c <llit>:
19c: deadf00d \.word 0xdeadf00d
Hi,
The calculation of OBJALLOC_ALIGN in include/objalloc.h ensures that
allocations are sufficiently aligned for doubles, but on CHERI
architectures it is possible that void * has a greater alignment
requirement than double.
Instead of deriving the alignment requirement from double alone, this
patch uses a union to compute the maximum alignment between double and
void *.
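For readers unfamiliar with the technique, a minimal sketch follows. The
names below (toy_max_align, toy_align_probe, TOY_OBJALLOC_ALIGN,
toy_round_up) are hypothetical and chosen for illustration; this is not
the text of the patch to include/objalloc.h.

```c
#include <assert.h>
#include <stddef.h>

/* A union is aligned to the strictest requirement of its members, so
   this has the maximum alignment of double and void *.  */
union toy_max_align
{
  double d;
  void *p;
};

/* Placing the union after a single char forces the compiler to insert
   padding up to the union's alignment.  */
struct toy_align_probe
{
  char x;
  union toy_max_align u;
};

/* The offset of the union within the probe is the required alignment.  */
#define TOY_OBJALLOC_ALIGN offsetof (struct toy_align_probe, u)

/* Round SIZE up to a multiple of TOY_OBJALLOC_ALIGN.  */
size_t
toy_round_up (size_t size)
{
  /* The mask trick below relies on the alignment being a power of two.  */
  assert ((TOY_OBJALLOC_ALIGN & (TOY_OBJALLOC_ALIGN - 1)) == 0);
  return (size + TOY_OBJALLOC_ALIGN - 1) & ~(TOY_OBJALLOC_ALIGN - 1);
}
```

On a target where void * is a capability with a stricter alignment than
double, the computed value follows the pointer's requirement. On
targets where the two alignments are equal, nothing changes.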
This fixes alignment faults seen when compiling the binutils for
pure-capability Morello. With this patch applied, the majority of
binutils tests pass when the binutils themselves are compiled for
purecap.
This patch is a backport of commit
a8af417a8a1559a3ebceb0c761cf26ebce5eab7f, initially upstreamed to
Morello GCC.
OK for morello branch?
Thanks,
Alex