| 1 | +== Quick Start |
| 2 | + |
| 3 | +This document describes the RISC-V extensions for supporting CHERI capabilities in hardware. |
| 4 | +Capabilities can be used to provide memory safety, mitigating up to 70% of memory safety issues cite:[msrc-cheri-eval], as well as to provide efficient compartmentalisation. |
| 5 | +The extensions are split into the core features required for a working capability system ({cheri_base_ext_name}), and features required to support a mix-and-match of binaries compiled for CHERI and unchanged binaries ({cheri_default_ext_name}). |
| 6 | +Some other smaller extensions are described that provide additional functionality relevant to CHERI. |
| 7 | + |
| 8 | +=== Capability Properties |
| 9 | + |
| 10 | +Capabilities are CLEN-bit structures, where CLEN is 2*XLEN, containing all the information required to identify and authorise access to a region of memory. |
| 11 | +This includes: |
| 12 | + |
| 13 | + * An XLEN-bit address, describing where the capability currently points. |
| 14 | + |
| 15 | + * Bounds: a _base_ and a _top_ address, describing the range of addresses the capability can be used to access. |
| 16 | + |
| 17 | + * Permissions (read, write, execute, read capability, ...) describing the kinds of accesses the capability can be used for. |
| 18 | + |
| 19 | + * Sealing information: a capability can be _sealed_, restricting it to only be used or modified in particular ways. |
| 20 | + |
| 21 | +A one-bit integrity tag is stored alongside a capability: this is maintained by hardware and cannot be directly modified by software. |
| 22 | +It indicates whether the capability is valid. |
| 23 | +An initial <<infinite-cap>> capability with access to all of memory with all permissions is provided in system registers on reset: all valid capabilities are derived from it. |
| 24 | +This is the only way to obtain a valid capability: no software, not even software running in machine mode, can _forge_ a capability. |
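
As an illustration only, the properties listed above can be modelled as a small record. This is a minimal sketch, not the architectural encoding: it assumes XLEN = 64, stores the bounds uncompressed, and uses made-up permission names. `Capability`, `INFINITE`, and `from_integer` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass

XLEN = 64          # assumed register width for this sketch
CLEN = 2 * XLEN    # capability width, as defined in the text


@dataclass(frozen=True)
class Capability:
    address: int            # XLEN-bit address the capability currently points at
    base: int               # lowest address that may be accessed
    top: int                # one past the highest address that may be accessed
    perms: frozenset        # e.g. {"R", "W", "X", "C"}; names are illustrative
    sealed: bool = False    # sealed capabilities can only be used in restricted ways
    tag: bool = False       # validity tag; maintained by hardware, not by software

    @property
    def length(self) -> int:
        return self.top - self.base


# Root capability available at reset in this model: all of memory, all permissions.
INFINITE = Capability(address=0, base=0, top=1 << XLEN,
                      perms=frozenset({"R", "W", "X", "C"}), tag=True)


def from_integer(value: int) -> Capability:
    """Model plain integer data placed in a capability register: never tagged."""
    # The metadata is a placeholder; it carries no authority once the tag is clear.
    return Capability(address=value % (1 << XLEN), base=0, top=0,
                      perms=frozenset(), tag=False)
```

In this model the only tagged value that exists at start-up is `INFINITE`; anything built from integer data, such as `from_integer(0x80000000)`, is untagged and therefore cannot authorise an access. Python itself cannot enforce that rule, which in a real system is the job of the hardware.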
| 25 | + |
| 26 | +=== Added State |
| 27 | + |
| 28 | +A CHERI core adds state to allow capabilities to be used from within registers, and to ensure they are not corrupted as they flow through the system. |
| 29 | +This means the following state is added: |
| 30 | + |
| 31 | +* Metadata within architectural registers: XLEN-wide integer registers (e.g. `sp`, `a0`) are all extended with another XLEN bits of capability metadata, including bounds and permissions. |
| 32 | + Together, the full CLEN bits form a capability, which we refer to by prefixing the register name with a `c`, e.g. `csp`, `ca0`. |
| 33 | + The integer part of the register is interpreted as the address field of the capability. |
| 34 | + The zero register is extended with zero metadata and a cleared tag: this is called the <<null-cap>> capability. |
| 35 | + As well as general purpose registers, system registers that store addresses are extended to contain capabilities. |
| 36 | + For example, <<mtvec>> is extended to a capability version <<mtvecc>> (the machine trap vector capability) to allow the code bounds to be changed on an exception. |
| 37 | + |
| 38 | +* Tags in registers, caches, and memory: |
| 39 | + |
| 40 | +** Every register has a one-bit tag, indicating whether the capability in the register is valid to be dereferenced. |
| 41 | + This tag is cleared if the register is written as an integer. |
| 42 | + |
| 42 | +** The tags are also tracked through the memory subsystem: every aligned CLEN-bit-wide region has a non-addressable one-bit tag, which the hardware manages atomically with the data. |
| 43 | + The tag is cleared if the memory region is ever written by anything other than a capability store from a tagged capability register. |
| 45 | + Any caches must preserve this abstraction. |
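
The memory tagging rule above can be sketched as a toy memory model. This is an assumption-laden illustration rather than a description of any implementation: it assumes CLEN = 128 bits (16 bytes), and `TaggedMemory` and its methods are hypothetical names.

```python
CLEN_BYTES = 16  # assumed: XLEN = 64, so CLEN = 128 bits


class TaggedMemory:
    """Byte-addressable memory with one out-of-band tag per aligned CLEN granule."""

    def __init__(self, size: int):
        if size % CLEN_BYTES != 0:
            raise ValueError("size must be a multiple of CLEN_BYTES")
        self.data = bytearray(size)
        self.tags = [False] * (size // CLEN_BYTES)

    def store_bytes(self, addr: int, payload: bytes) -> None:
        """Non-capability store: clears the tag of every granule it touches."""
        if not payload:
            return
        if addr < 0 or addr + len(payload) > len(self.data):
            raise IndexError("store outside memory")
        self.data[addr:addr + len(payload)] = payload
        first = addr // CLEN_BYTES
        last = (addr + len(payload) - 1) // CLEN_BYTES
        for granule in range(first, last + 1):
            self.tags[granule] = False

    def store_cap(self, addr: int, raw: bytes, tag: bool) -> None:
        """Capability store: data and tag are written together."""
        if addr % CLEN_BYTES != 0 or len(raw) != CLEN_BYTES:
            raise ValueError("capability stores are CLEN-aligned and CLEN wide")
        if addr < 0 or addr + CLEN_BYTES > len(self.data):
            raise IndexError("store outside memory")
        self.data[addr:addr + CLEN_BYTES] = raw
        # The memory tag is set only if the source register was tagged.
        self.tags[addr // CLEN_BYTES] = tag

    def load_cap(self, addr: int):
        """Capability load: returns the CLEN bits and their tag together."""
        if addr % CLEN_BYTES != 0:
            raise ValueError("capability loads are CLEN-aligned")
        return bytes(self.data[addr:addr + CLEN_BYTES]), self.tags[addr // CLEN_BYTES]
```

For example, storing a tagged capability at address 16 and then overwriting a single byte at address 20 with `store_bytes` leaves the data in place but clears the tag, so a later `load_cap(16)` returns an untagged value.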
| 46 | + |
| 47 | +=== Checking Memory |
| 48 | + |
| 49 | +Every memory access performed by a CHERI core must be authorised by a capability. |
| 50 | +Every instruction explicitly defines which capability authorises its memory accesses. |
| 51 | +In _purecap_ code, where all pointers are individual capabilities, the capability and address are used together, so e.g. `lw t0, 16(csp)` loads a word from memory, getting the address and bounds from the `csp` register. |
| 52 | +For code that has not yet been fully adapted to CHERI (_hybrid_ code), the processor can run in a pointer mode (not to be confused with a privilege mode) where the authorising capability is instead taken from a special CSR: the default data capability (<<ddc>>). |
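
The choice of authorising capability described above can be sketched as follows. This is a simplified model under stated assumptions: the order of the checks, the fault names, and the use of the integer address unmodified in integer pointer mode are choices made for this example, and `Cap`, `CheriFault`, `check_access`, and `load_word` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cap:
    address: int
    base: int
    top: int            # exclusive upper bound
    perms: frozenset
    tag: bool = True
    sealed: bool = False


class CheriFault(Exception):
    """Stands in for a CHERI exception raised on a failed access check."""


def check_access(cap: Cap, addr: int, size: int, perm: str) -> None:
    # The order of these checks is an assumption of this sketch.
    if not cap.tag:
        raise CheriFault("tag")
    if cap.sealed:
        raise CheriFault("seal")
    if perm not in cap.perms:
        raise CheriFault("permission")
    if addr < cap.base or addr + size > cap.top:
        raise CheriFault("bounds")


def load_word(regs: dict, ddc: Cap, cap_mode: bool, rs1: str, offset: int) -> int:
    """Return the checked address of a 4-byte load such as `lw t0, 16(csp)`."""
    addr = regs[rs1].address + offset
    # Capability pointer mode: the address register authorises the access.
    # Integer pointer mode: the default data capability (ddc) authorises it.
    auth = regs[rs1] if cap_mode else ddc
    check_access(auth, addr, 4, "R")
    return addr
```

With `csp` bounded to a 256-byte stack frame, a load at offset 16 passes in capability pointer mode, while a load at offset 256 raises a bounds fault. In integer pointer mode the same addresses are checked against `ddc` instead, so they pass whenever `ddc` covers them.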
| 53 | + |
| 54 | +Instruction fetch is also authorised by a capability: the program counter capability (<<pcc>>) which extends PC. |
| 55 | +This allows code fetch to be bounded, preventing a wide range of attacks that subvert control flow with integer data. |
| 56 | +Where {cheri_default_ext_name} is supported, the <<pcc>> also contains the <<m_bit,mode bit>> indicating whether the processor is running in integer or capability pointer mode. |
| 57 | +Changing the bounds used for instruction fetch or the pointer mode can be as easy as performing a capability-based jump (<<JALR>> in capability pointer mode). |
| 58 | +A <<MODESW>> instruction and a compressed version of it are also added to allow cheap mode switching. |
| 59 | + |
| 60 | +Exception codes are added for CHERI-specific exceptions on fetch, jumps, and memory access. |
| 61 | +No other exception paths are added: in particular, capability manipulations do not trap, but may clear the tag on the result capability if the operation is not permitted. |
| 62 | + |
| 63 | +=== Added Instructions |
| 64 | + |
| 65 | +The added instructions can be split into the following categories: |
| 66 | + |
| 67 | +* Capability manipulations (e.g. <<CADD>>, <<SCBNDS>>): for security, capabilities can only be modified in restricted ways. |
| 68 | + Special instructions are provided to perform these allowed operations, for example _shrinking_ the bounds or _reducing_ the permissions. |
| 69 | + Any attempt to manipulate capabilities without using these instructions clears the tag, rendering them unusable for accessing memory. |
| 70 | + |
| 71 | +* Capability inspection (e.g. <<GCBASE>>, <<GCPERM>>): capability fields (for example the _bounds_ describing what addresses the capability gives access to) are stored compressed in registers and memory. |
| 72 | + These instructions decode the fields so that software can query them conveniently. |
| 73 | + |
| 74 | +* Memory access instructions (e.g. <<LC>>, <<SC>>): capabilities must be read from and written to memory atomically along with their tag. |
| 75 | + Instructions are added to perform these wider accesses, allowing capability flow between the memory and the register file. |
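
The manipulation and inspection categories above can be illustrated with a small model in which a disallowed operation clears the tag instead of trapping. This is a hedged sketch: it ignores bounds compression (so bounds are always exact), and `set_bounds`, `and_perms`, `get_base`, and `get_perms` are hypothetical helpers loosely inspired by <<SCBNDS>>, <<GCBASE>>, and <<GCPERM>>, not definitions of those instructions.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Cap:
    address: int
    base: int
    top: int            # exclusive upper bound
    perms: frozenset
    tag: bool = True
    sealed: bool = False


def set_bounds(cap: Cap, length: int) -> Cap:
    """Shrink the bounds to [address, address + length); never traps."""
    new_base, new_top = cap.address, cap.address + length
    ok = (cap.tag and not cap.sealed and length >= 0
          and cap.base <= new_base and new_top <= cap.top)
    # A request that would grow the bounds yields an untagged result.
    return replace(cap, base=new_base, top=new_top, tag=ok)


def and_perms(cap: Cap, mask) -> Cap:
    """Reduce permissions to those also present in `mask`; never traps."""
    ok = cap.tag and not cap.sealed
    return replace(cap, perms=cap.perms & frozenset(mask), tag=ok)


def get_base(cap: Cap) -> int:
    """Inspection: return the base address as an integer."""
    return cap.base


def get_perms(cap: Cap) -> frozenset:
    """Inspection: return the permission set."""
    return cap.perms
```

Shrinking a 4 KiB buffer capability to its first 256 bytes keeps the tag, while asking the result for 4 KiB again returns an untagged capability. Permissions can only be intersected, so requesting a permission the source lacks silently drops it.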
| 76 | + |
| 77 | +=== Existing Instructions |
| 78 | + |
| 79 | +Existing RISC-V instructions are largely unmodified: in {cheri_int_mode_name}, there is binary compatibility. |
| 80 | +Instructions that access memory, as well as branches and jumps, are automatically checked against <<ddc>> and <<pcc>>, raising an exception if the checks fail. |
| 81 | +However, <<ddc>> and <<pcc>> are reset to <<infinite-cap>> capabilities, meaning the checks will always pass on systems that have not written to CHERI system registers. |
| 82 | + |
| 83 | +In {cheri_cap_mode_name}, these instructions are instead modified to check against the full capability from the address register (e.g. `lw t0, 16(csp)`). |
| 84 | +In some cases, they are also changed to return a full capability value, e.g. <<AUIPC>> will return the full <<pcc>> including the metadata. |