On 11/12/20 12:28 PM, J.R.T. Clarke wrote:
On 12 Nov 2020, at 12:45, Luis Machado luis.machado@linaro.org wrote:
Hi,
On 11/11/20 6:15 PM, J.R.T. Clarke wrote:
Hi Luis, On 11 Nov 2020, at 18:45, Luis Machado via Gnu-morello gnu-morello@op-lists.linaro.org wrote:
From: Luis Machado luis.machado@arm.com
Support the data_capability and code_capability types, which are the capability counterparts of the data_ptr and code_ptr types.
Adjust the Morello C registers to be of data_capability and code_capability types.
Out of interest, what's the justification behind this? In general the C
This is mostly a change to improve how GDB handles/displays the C registers. Given that Morello C registers are 129 bits in size in order to hold a capability, it is convenient to have a data/code capability type so GDB can display the decoded capability for some commands, for example "info reg".
Being a pointer, it will also try to show data the capability points to.
It is true that the content of a C register may not always be a capability, but in that case GDB will also show you the full hex value (minus the tag bit).
GDB has a 64-bit limitation for int types, so we can't have a 128-bit int type for these registers. And that would require users to use a new print modifier to print the data as a capability, which may or may not be convenient. It's up for debate.
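To illustrate the "full hex value (minus the tag bit)" point, here is a minimal sketch in Python. It is not GDB code. It assumes, purely for illustration, that the tag is stored as bit 128 of the 129-bit raw register value; the function names are hypothetical.

```python
from typing import Tuple

TAG_BIT = 128                    # assumption: tag is the top bit of the 129-bit value
PAYLOAD_MASK = (1 << 128) - 1    # low 128 bits: address plus capability metadata


def split_c_register(raw: int) -> Tuple[bool, int]:
    """Split a 129-bit raw register value into (tag, 128-bit payload)."""
    if raw < 0 or raw >> 129:
        raise ValueError("value does not fit in 129 bits")
    tag = bool((raw >> TAG_BIT) & 1)
    payload = raw & PAYLOAD_MASK
    return tag, payload


def format_c_register(raw: int) -> str:
    """Render the payload as full-width hex and report the tag separately."""
    tag, payload = split_c_register(raw)
    return f"0x{payload:032x} [tag={int(tag)}]"
```

With the tag clear, the payload is still shown in full, so an untagged register that merely holds data remains readable as a raw 128-bit quantity.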
So there are two issues here:
- That it might hold an integer rather than a capability
- That it might be a code capability rather than a data capability
I agree they need a type that means they are interpreted as a capability, but I disagree that that should be data_ptr; some of them will be code pointers, some will be integers, and some will just be arbitrary untagged 128-bit quantities because you memcpy'd plain data. The right type to express that in CHERI C is (u)intcap_t, so your (u)intcap should also be the appropriate type, surely?
I understand your concern in terms of type coherence and differences compared to the formal CHERI C spec, but debuggers don't strictly adhere to language specs in terms of their user interface; some operations would be awkward if the tool followed strict language rules.
Debuggers focus instead on making it easier for the user to see what's important to them. From this perspective, code_cap/data_cap/__intcap_t/__uintcap_t are pretty much the same here, except for small user conveniences.
For example:
(gdb) info symbol &foo
foo in section .bss of /tmp/hello
(gdb) ptype foo
type = int
(gdb) p $sp
$1 = (void *) 0x7fffffffe348
(gdb) set $sp=&foo
(gdb) p $sp
$2 = (void *) 0x555555558014 <foo>
This is what [data|code]_[ptr|cap] accomplishes in GDB.
If we turn SP into a __intcap_t or __uintcap_t, GDB won't automatically fetch symbol information. It will be just a raw hex/integer dump when you print it.
As for the issues you listed, I'm not sure I understand how big an impact they would have.
For (1), the tag bit will make it obvious we're dealing with an invalid capability or data.
For (2), there wouldn't be user-visible differences. GDB would be happy to display either function or data symbols associated with the value stored in the register, regardless of the register being of data/code type.
registers could be either data or code capabilities depending on whether they're storing a function pointer (at least without a descriptor-based ABI). Surveying the AArch64 and RISC-V XML files I see that data_ptr and code_ptr are only used in cases where the registers are definitely only used for that, i.e. PC is code_ptr and SP is data_ptr on AArch64, and on RISC-V RA also gets code_ptr and GP, TP and FP get data_ptr, with everything else being left as an integer. Should the CHERI equivalents not mirror that, with the general-purpose registers remaining (u)intcap and only PCC and CSP having more-specific types?
We could make the C registers (u)intcap again, sure. But that wouldn't change things much for GDB. (u)intcap types are treated in much the same way as capability pointers.
Which is fine, we do want them to be printed as capabilities. Maybe code_ptr/data_ptr/uintcap are all equivalent at the moment, but they may not be in future, so we should ensure that the registers are accurately typed.
I see the CUCL Morello port uses data_ptr/code_ptr types for all the C registers, which I suppose is due to GDB's limitation of 64-bit ints. So my approach is also aligned with that.
I take it this isn't a problem for architectures using 65-bit capabilities, since we can just use a regular 64-bit int then. But for Morello this is a bit more tricky.
We use data_ptr specifically for the general-purpose ones because we don't have data_cap nor (u)intcap types and thus have no other way to do it. That should change though once we merge in your changes to the internals. GDB's 64-bit int limitation is irrelevant; even on RV32 we still need a non-plain-integer type, as you don't want to interpret the metadata as part of a 64-bit integer (and you also *do* want to interpret the tag). Our RV32 and RV64 have identical XML other than the bit width.
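A small Python sketch of why a plain 64-bit integer type is the wrong fit even when the capability is only 64 bits plus a tag. The field layout here (address in the low 32 bits, metadata in the next 32 bits, tag as bit 64) is an assumption chosen for illustration and is not taken from any specification; decode_cap64 is a hypothetical helper.

```python
def decode_cap64(raw65: int) -> dict:
    """Decode a 65-bit value into tag, metadata and address fields.

    Assumed layout (illustrative only): bit 64 = tag,
    bits 63..32 = metadata, bits 31..0 = address.
    """
    if raw65 < 0 or raw65 >> 65:
        raise ValueError("value does not fit in 65 bits")
    return {
        "tag": (raw65 >> 64) & 1,
        "metadata": (raw65 >> 32) & 0xFFFFFFFF,
        "address": raw65 & 0xFFFFFFFF,
    }


raw = (1 << 64) | (0xABCD0000 << 32) | 0x80001000
fields = decode_cap64(raw)

# Reading the same register as a plain 64-bit integer folds the metadata
# into the value and silently drops the tag.
naive_u64 = raw & 0xFFFFFFFFFFFFFFFF
```

Under this assumed layout, naive_u64 is neither the address nor anything a user would want printed, and the tag is lost entirely, which is the case for a dedicated capability type regardless of GDB's integer width limit.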
Understood. We should discuss this further in the debugger-focused meeting.