diff --git a/specification/appendix_d.adoc b/specification/appendix_d.adoc new file mode 100644 index 0000000..7c770bb --- /dev/null +++ b/specification/appendix_d.adoc @@ -0,0 +1,86 @@ +[[appendix_d]] +== Appendix D: M-mode TSM based deployment model + +This deployment model targets high-assurance systems whose design might be constrained +by real-time and formal verification requirements. It trades off a feature-rich design supporting +dynamic resource allocation for a simpler implementation whose correctness can be formally verified. + +[id=dep3] +[caption="Figure {counter:image}"] +[title= ": M-mode TSM based deployment model for CoVE"] +image::img_11.png[align=center] + +=== Overview +<> shows that the deployment model supports a single confidential supervisor domain in which +the TSM runs along with the TSM-driver in M-mode. This single confidential supervisor domain can run multiple +TVMs that are isolated from each other using the MMU, i.e., G-stage page tables managed by the TSM. The TSM isolates the +hosting supervisor domain (i.e., OS/VMM and non-confidential applications and VMs) from the confidential supervisor +domain (TSM and TVMs) using a hardware memory isolation mechanism, like PMP. +The Supervisor Domain Access Protection (Smmtt) extension is therefore not required in this model but not precluded. +IO accesses to confidential memory must be prevented, for example, with IOPMP. + +[NOTE] +==== +Since the TSM is not required to run in the HS-mode, this deployment model supports systems that emulate the +hypervisor extension or run TVMs and OS/VMM in S-mode. The latter requires use of a hardware memory isolation mechanism +that restricts accesses to confidential memory and is controlled only by the TSM, e.g., PMP. +==== + +=== Static memory partitioning
The deployment model proposes static partitioning of memory into confidential and non-confidential to simplify +formal reasoning about the correctness of the TVM execution and isolation. The TSM performs this partitioning early +during the boot of the platform, resulting in the following advantages: (1) simplified formal reasoning about the +ownership of memory, (2) attestation that covers static system configuration (e.g., values of PMP registers), +(3) reduced attack surface between OS/VMM and TSM due to a narrower ABI. A possible negative consequence of +static partitioning is underutilization of resources. Specifically, confidential memory created at platform +initialization might be larger than the required amount of memory utilized by TVMs and the TSM during runtime. +The lack of a mechanism to convert confidential memory pages back to non-confidential memory (enabled for example by Smmtt) +prevents the hosting supervisor domain (OS/VMM, applications, and VMs) from using the memory over-provisioned by +the confidential supervisor domain (TSM, TVMs). + +=== TVM creation
To reduce the complexity of the TSM implementation, the TSM creates a TVM as a result of a single operation triggered with +the `sbi_covh_promote_to_tvm()` call. Specifically, the OS/VMM initializes and starts a regular VM, which early during the +boot process requests to be promoted to a TVM. This call traps in the OS/VMM and is reflected to the TSM, which copies +the VM's data, the page table configuration, and the vCPU state into confidential memory. If the request fails, the promotion +of a VM to a TVM fails and an error is returned to the VM. When the request succeeds, the OS/VMM marks the VM as a TVM, +so that it can then properly resume its execution via the TSM.
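The following non-normative sketch illustrates how an OS/VMM in this deployment model might handle the VM's promotion request. The way the guest's request is delivered to the OS/VMM, the `struct vm` descriptor, the SBI wrapper functions, and the parameter list of `sbi_covh_run_tvm_vcpu()` are illustrative assumptions only; the normative definition of `sbi_covh_promote_to_tvm()` is given in the COVH ABI chapter.

[source, C]
-----
/* Non-normative sketch: host-side handling of a guest's request to be
 * promoted to a TVM. All helper types and the run-vCPU parameter list are
 * assumptions made for illustration only. */
struct sbiret { long error; long value; };

/* Wrappers around the COVH ECALLs (assumed to exist in the host). */
extern struct sbiret sbi_covh_promote_to_tvm(unsigned long fdt_addr,
                                             unsigned long tap_addr);
extern struct sbiret sbi_covh_run_tvm_vcpu(unsigned long tvm_guest_id,
                                           unsigned long vcpu_id);

struct vm {
    int is_tvm;                   /* hypothetical VMM-side VM descriptor */
    unsigned long tvm_guest_id;
};

/* Invoked when the guest's early-boot promotion request traps to the OS/VMM.
 * fdt_gpa and tap_gpa are guest physical addresses supplied by the guest;
 * tap_gpa == 0 selects remote attestation. */
long handle_promotion_request(struct vm *vm, unsigned long fdt_gpa,
                              unsigned long tap_gpa)
{
    struct sbiret ret = sbi_covh_promote_to_tvm(fdt_gpa, tap_gpa);

    if (ret.error != 0)
        return ret.error;         /* promotion failed; report the error to the VM */

    vm->is_tvm = 1;               /* from now on, manage this VM as a TVM via COVH */
    vm->tvm_guest_id = (unsigned long)ret.value;

    /* Resume the TVM Boot vCPU through the TSM. */
    return sbi_covh_run_tvm_vcpu(vm->tvm_guest_id, 0).error;
}
-----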
+ +=== Local attestation
Embedded systems might operate without access to a network, which prevents use of remote attestation. For this +reason, this deployment model also supports local attestation, in which the TSM attests to the integrity of the TVM image +during its creation and allows its creation only when it contains a specific `TVM attestation payload` (TAP). This +payload carries a cryptographic proof issued with the expected attestation key specific to the TSM integrity +and platform configuration. The pointer to the TAP is passed in the call to promote a VM to a TVM. If it is zero, +then remote attestation is used; otherwise, local attestation is used. Local attestation is strongest when it is hardware enforced. + +Attestation for embedded systems utilizes one or both of the following properties: + +. The embedded platform must be able to verify that the TVM is authorized to run on the platform. +. The TVM must be able to verify that the configuration of the hardware platform is acceptable/correct. + +Separate mechanisms may be used to achieve these goals. + +==== TVM Authorization
The TSM must have a list of public keys of those authorized to sign VMs (TVMs) for execution on the platform. The attestation payload associated with the TVM will be +signed with the corresponding private key. When the VM is being promoted to a TVM, the TSM checks the signature inside the TAP. +If the signature is not valid, the TSM will not convert the VM and will terminate execution of the +VM. The method for provisioning these public keys into the TSM is outside the scope of this specification. + +==== Verifying Platform configuration
When the creator of the (authorized) TVM does not want it to execute on improperly configured or unauthorized hardware, there should be a mechanism supported by hardware (and firmware) for verification. +Assuming the presence of a hardware root-of-trust for measurement and a hardware root-of-trust for storage, the VM can be created with an encrypted disk, and the key that is used to decrypt the disk can be sealed to the measurements of the platform. +The creator of the VM, using the specifications for the platform, decides what values are required in order for the key to be released. +When the request to promote the VM to a TVM is made and local attestation is successful, the TSM unseals the key with the help of the hardware root-of-trust. At the point when the TVM needs to decrypt its disk (e.g., for mounting the filesystem), the TVM utilizes an ABI call (`covg_retrieve_secret()`) to retrieve the decryption key from the TSM. + +=== Further recommendations
Embedded systems with real-time requirements must have a fixed upper bound on execution time. This requires determining +the maximal number of instructions that can execute between TVM context switches. For this reason, this deployment model +recommends an uninterruptible TSM. <> shows this operation mode, in which the TSM running in M-mode exposes the COVH and +COVG ABIs to the OS/VMM and TVMs, respectively. VM ECALLs trap directly to the TSM due to the `medeleg` configuration and all +interrupts during TVM execution trap to the TSM due to the `mideleg` configuration.
+ +[id=depd2] +[caption="Figure {counter:image}"] +[title= ": TSM operation"] +image::img_12.png[align=center] \ No newline at end of file diff --git a/specification/attestation.adoc b/specification/attestation.adoc index 3529fa4..cae39ea 100644 --- a/specification/attestation.adoc +++ b/specification/attestation.adoc @@ -3,16 +3,21 @@ [[attestation]] == TVM Attestation and Measurements -The CoVE TVM attestation framework allows for CoVE workload owners to assert -the trustworthiness of the hardware and software environment their workload is -running in. +The CoVE TVM attestation framework allows CoVE workload owners to assert +the trustworthiness of the hardware and software environment in which their workload is +running. This framework allows for *remote* and *local attestation*. -Attestation relies on the ability for the SoC to generate a cryptographic evidence +Remote attestation relies on the ability for the system on chip (SoC) to generate a cryptographic evidence for a workload executing in a CoVE TVM. The workload executing in a TVM may -request this cryptographic evidence to relay to a remote relying party which can -then verify that the evidence is valid (per some appraisal policy), and thus attest -to the trustworthiness of the TVM. The relying party can then accept to release -secrets or attestation result tokens back to the trusted workload. +request this cryptographic evidence to relay to a remote relying party which +verifies that the evidence is valid (per some appraisal policy). If valid, the +evidence attests to the trustworthiness of the TVM, so that the relying party +can release secrets or attestation result tokens to the TVM. + +Local attestation is a model where the TVM presents cryptographic evidence to the TSM +that enables the TSM to release a secret to the TVM if it is authorized to +run on the platform. This form of attestation meets the needs of systems that +have limited or no access to the network. This section describes the CoVE attestation evidence content, format and generation interface. @@ -65,7 +70,7 @@ The TCB elements for each of them is summarized in the following table: .4+.^|Platform <| HW RoT for boot, measurement and storage .4+<| All M-mode firmwares, including the TSM-driver - <| All CPU hardware logic, including MMU and caches + <| All CPU hardware logic <| All SoC subsystems, including memory confidentiality, integrity and replay-protection for volatile memory <| IOMMU and translation agents @@ -88,7 +93,7 @@ measurements and a runtime one. TVM initial measurements are generated from the CoVE workload TCB elements involved in the TVM construction. Any TCB element that directly or indirectly supports a TVM must be measured into the TVM initial measurement registers. Once -a TVM is finalized, i.e. after the `sbi_tee_host_finalize_tvm()` TH-ABI is +a TVM is finalized, i.e., after the `sbi_tee_host_finalize_tvm()` TH-ABI is called, the TVM initial measurements must no longer be extended. 
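A non-normative sketch of this ordering constraint is shown below: measured pages may be added only while the TVM is being assembled, and finalization freezes the initial measurements. The COVH function names follow the naming used later in this specification, but the parameter lists shown here are assumptions made for illustration only; the normative definitions appear in the COVH ABI chapter.

[source, C]
-----
/* Non-normative sketch: TVM initial measurements may only be extended while
 * the TVM is being assembled; finalization freezes them. The parameter lists
 * below are assumptions made for illustration only. */
struct sbiret { long error; long value; };

extern struct sbiret sbi_covh_add_tvm_measured_pages(unsigned long tvm_guest_id,
                                                     unsigned long source_addr,
                                                     unsigned long dest_addr,
                                                     unsigned long num_pages);
extern struct sbiret sbi_covh_finalize_tvm(unsigned long tvm_guest_id,
                                           unsigned long entry_pc,
                                           unsigned long entry_arg);

void measure_and_finalize(unsigned long tvm_id, unsigned long payload_src,
                          unsigned long payload_gpa, unsigned long num_pages,
                          unsigned long entry_pc, unsigned long entry_arg)
{
    /* Adding measured pages extends the TVM initial measurement registers. */
    sbi_covh_add_tvm_measured_pages(tvm_id, payload_src, payload_gpa, num_pages);

    /* Finalization moves the TVM to the TVM_RUNNABLE state and freezes the
     * initial measurements. */
    sbi_covh_finalize_tvm(tvm_id, entry_pc, entry_arg);

    /* From this point on, any further attempt to extend the initial
     * measurements must be rejected by the TSM. */
}
-----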
Each TVM's initial measurements are stored in dedicated measurement registers and @@ -141,7 +146,7 @@ CoVE implementation may look like the following table: | 5 | TVM Configuration - <| TVM Entry Point and Initial Arguments + <| TVM Entry Point, Initial Arguments, and the vCPU state | TSM |=== @@ -162,7 +167,7 @@ The TVM measurement extension interface is exposed through the optional TG-ABI [NOTE] ==== -if an implementation uses UEFI firmware to initialize the CoVE TVM guest +If an implementation uses UEFI firmware to initialize the CoVE TVM guest environment, then refer to UEFI specification <> chapter 38 on confidential computing for UEFI ABI related to runtime measurement extension and event log creation. @@ -172,10 +177,10 @@ event log creation. All above described TCB elements measurements are added to an attestation evidence and then reported to relying parties. The attestation mechanism -and protocol that take place between the attester (i.e. the TVM) and the -remote attestation service are out of this document scope. +and protocol that take place between the attester (i.e., the TVM) and the +remote attestation service are out of the scope of this document. -In this section we describe the high level attestation model for CoVE, +In this section, we describe the high level model of remote attestation for CoVE, together with the attestation evidence content, format and generation process. ==== Model @@ -222,13 +227,13 @@ TCB layer measurements: [.center] [stem] ++++ -CDI_{0} = KDF(UDS_{Len},\ UDS\ ||\ H_{alg}(Meas(TCB_{0})) +CDI_{0} = KDF(UDS_{Len},\ UDS\ ||\ H_{alg}(Meas(TCB_{0}))) ++++ :stem: asciimath [.center] [stem] ++++ -CDI_{N} = KDF(CDI_{Len},\ CDI_{N-1}\ ||\ H_{alg}(Meas(TCB_{N})) +CDI_{N} = KDF(CDI_{Len},\ CDI_{N-1}\ ||\ H_{alg}(Meas(TCB_{N}))) ++++ Asymmetric key pairs can be derived from a CDI in order to generate the @@ -649,9 +654,9 @@ The TVM identity claim value is a `sbi_tee_host_finalize_tvm()` provided argument. It is an optional claim and is not included in the TVM token when the TVM identity argument is set to 0. -It is used by the host TVM creator (e.g. the host VMM) to bind a TVM to an +It is used by the host TVM creator (e.g., the host VMM) to bind a TVM to an identity or more generically a specific piece of data (e.g. an Attestation -Service public key, a configuration blob, etc) through its hash value. +Service public key, a configuration blob, etc.) through its hash value. TVM identity allows for untrusted hosts to provide a TVM with unmeasured but attestable pieces of data. A Relying Party can then verify the TVM measurements diff --git a/specification/glossary.adoc b/specification/glossary.adoc index fe74d8a..ecfe39d 100644 --- a/specification/glossary.adoc +++ b/specification/glossary.adoc @@ -62,6 +62,8 @@ that hosts multiple mutually distrusting software owned by different tenants. | MTT | Memory Tracking Table (MTT). +| Relying party | An entity that depends on the validity of information about another entity, typically for purposes of authorization cite:[RATS]. + | RISC-V Supervisor Domains | RISC-V privileged architecture <> defines the S-mode for execution of supervisor software. S-mode software may optionally enable the Hypervisor extension to host virtual machines. Typically, there is a @@ -90,6 +92,8 @@ information as part of the evidence of the TCB in use. The SVN is typically combined with other meta-data elements when evaluating the attestation information. 
+| TAP | TVM attestation payload (TAP) is a block of memory in a VM that the TSM uses to perform local attestation as part of promoting a VM to a TVM. + | TSM | TEE security manager (TSM) is a software module that enforces TEE security guarantees on a platform. It acts as the trusted intermediary between the VMM and the TVM. TSM extends the TCB chain on the CoVE platform and is therefore subject to attestation. @@ -100,14 +104,15 @@ software, and firmware elements that are trusted by a relying party to protect the confidentiality and integrity of the relying parties' workload data and execution against a defined adversary model. In a system with separate processing elements within a package on a socket, the TCB -boundary is the package. In a multi-socket system the TCB extends across -the socket-to-socket interface, and is managed as one system TCB. +boundary is the package. In a multi-socket system the hardware TCB extends across +the socket-to-socket interface, and is managed as one system TCB. The software TCB may also extend +across multiple sockets. | TEE | Trusted execution environment (TEE) is a set of hardware and software mechanisms that allow creating attestable and isolated execution environment. -| TVM | TEE VM (TVM) also known as Confidential VM. It is a VM instantiation of an confidential workload. +| TVM | TEE VM (TVM), also known as Confidential VM, is a VM instantiation of a confidential workload. -| Virtual Machine (VM) | Guest operating system hosted by a VMM. +| VM | Virtual Machine (VM) is a guest operating system hosted by a VMM. | VMM | Virtual machine monitor (VMM) is used interchangeably with the term hypervisor in this document.
diff --git a/specification/header.adoc b/specification/header.adoc index 6c54567..0cd94dc 100644 --- a/specification/header.adoc +++ b/specification/header.adoc @@ -72,4 +72,5 @@ include::sbi_cove.adoc[] include::appendix_a.adoc[] include::appendix_b.adoc[] include::appendix_c.adoc[] +include::appendix_d.adoc[] include::bibliography.adoc[]
diff --git a/specification/images/img_11.png b/specification/images/img_11.png new file mode 100644 index 0000000..7aca957 Binary files /dev/null and b/specification/images/img_11.png differ
diff --git a/specification/images/img_12.png b/specification/images/img_12.png new file mode 100644 index 0000000..1781e7e Binary files /dev/null and b/specification/images/img_12.png differ
diff --git a/specification/intro.adoc b/specification/intro.adoc index 5221821..18e6724 100644 --- a/specification/intro.adoc +++ b/specification/intro.adoc @@ -7,7 +7,7 @@ a scalable Trusted Execution Environment (TEE) for hardware virtual-machine-base workloads on RISC-V-based platforms. This CoVE interface specification enables application workloads that require confidentiality to reduce the Trusted Computing Base (TCB) to a minimal TCB, specifically, keeping the host OS/VMM, -devices and other software outside the TCB. Admitting devices into the TCB of CoVE +devices and other software outside the TCB. Admitting devices into the TCB of CoVE TEE VMs is outside the scope of this specification and is described in the CoVE-IO specification.
The proposed specification supports an diff --git a/specification/overview.adoc b/specification/overview.adoc index 2606601..148886a 100644 --- a/specification/overview.adoc +++ b/specification/overview.adoc @@ -18,15 +18,20 @@ of hardware-attested trusted execution environment called TEE Virtual Machines execution state and memory are run-time-isolated from the host OS/VMM and other platform software not in the TCB of the TVM. TVMs are protected from a broad set of software-based and hardware-based threats per the threat model described -in <>. The design describes an isolated (Confidential) Supervisor +in <>. The architecture describes an isolated (Confidential) Supervisor Domain to enforce TCB and confidentiality properties, while using an isolated (Hosting) Supervisor Domain for the host domain, thus maintaining the OS/VMMs -role as the resource manager (for both legacy VMs and TVMs). The resources +role as the resource manager (for both legacy VMs and TVMs). + +On processors supporting multiple supervisor domains (the Smmtt extension), the resources managed by the hosting supervisor domain (OS/VMM) include memory, CPU, I/O -resources and platform capabilities to host the TVM workload. The terms +resources and platform capabilities required to host the TVM workload. The terms hosting supervisor domain and OS/VMM are used interchangeably in this specification. The underlying memory isolation mechanisms for supervisor domains -(Smmtt) is agnostic of the number of supervisor domains. +(Smmtt) is agnostic of the number of supervisor domains. On processors that do not support +multiple supervisor domains where Smmtt is not mandated, a single confidential supervisor +domain and a single hosting supervisor domain can be supported (for example <>, +other deployment models are also possible). [id=dep1] [caption="Figure {counter:image}", reftext="Figure {image}"] @@ -36,25 +41,28 @@ image::img_0.png[] As shown in <>, the Confidential Supervisor Domain is managed by software that operates in HS-mode and manages resources granted to it by the Hosting Supervisor Domain Manager (the OS/VMM). The Confidential Supervisor Domain -Manager is called the " *TEE Security Manager* " or *(TSM)* - it acts as the +Manager is called the " *TEE Security Manager* " or *(TSM)* and acts as the trusted intermediary between TEE and non-TEE workloads on the same platform. The TSM should have a minimal hardware-attested footprint. The TCB (which includes the TSM and hardware) enforces strict confidentiality and integrity security -properties for workloads in this supervisor domain. The Root Security Manager -is an M-mode software module (called the " *TSM-driver* ") which isolates the -Confidential Supervisor Domain from all other Supervisor domains and other -platform components (non-confidential and +properties for workloads in this supervisor domain. The Root Security Manager, +also called the " *TSM-driver* ", isolates the Confidential Supervisor Domain +from all other Supervisor domains and other platform components (non-confidential and confidential). The responsibility of the TSM is to enforce the security -objectives accorded to TEE workloads assigned to that supervisor domain. The +objectives accorded to TEE workloads (TVMs) assigned to that supervisor domain. The VMM is expected to continue to manage the security for non-confidential workloads, and importantly the resource-assignment and scheduling management functions for all confidential and non-confidential workloads. 
+Note that CoVE implementations may partition the TSM and TSM-driver functionality +as required by implementations. For example <> shows the TSM and TSM-driver +functionality in a single deployment unit. -In this scheme, compute resources, such as memory, start off as traditional +CoVE supports models with dynamic resource allocations, where +compute resources, such as memory, start off as traditional untrusted resources owned by the non-confidential/hosting supervisor domain, and are expected to be donated/transitioned to the confidential supervisor domain via application binary interface (ABI) supported by the TSM. Once the conversion process is complete, -confidential memory may be assigned to one or more TVMs by the TSM. +TSM may assign confidential memory to one or more TVMs. A converted confidential resource may be freely assigned to another TVM within the same supervisor domain when it is no longer in use. However, an unused confidential resource must be explicitly reclaimed for use in the @@ -64,12 +72,12 @@ properties). The hosting supervisor domain may use the reclaimed memory for itself or for non-confidential VMs. Each TVM's address space can be comprised of confidential and non-confidential -regions. The former includes both measured pages (that are part of the initial -TVM payload), and confidential zero-pages that can be mapped-in on demand by +regions. The former may include measured pages (that are part of the initial +TVM payload) and confidential zero-pages that can be mapped-in on demand by the VMM following runtime accesses by the TVM. These zero'ed confidential pages are pages that are demand-paged in and are expected to be zero'ed by the TSM to prevent attacks from the host software on the TVM. The TSM also enforces that -the host does not overlap them with existing (present) G-stage mappings for the +the host does not overlap them with the existing (present) G-stage mappings for the TVM. The non-confidential TVM-defined regions include those for shared-pages and memory-mapped I/O (MMIO). @@ -87,10 +95,12 @@ TVMs may be hosted by the host OS/VMM via confidential supervisor domains. Each TVM may consist of the guest firmware, a guest OS and applications. The software components included in the TVM are implementation specific. -As shown in <>, the M-mode firmware TSM-driver is in the TCB of all -Supervisor domains and hence in the TCB for all CoVE workloads hosted on the -platform. The TSM-driver (operating in M-mode) uses -the hardware capabilities to provide: +<> shows the deployment model in which the TSM-driver runs in M-mode and TSM runs +in HS-mode. Systems without Smmtt or the requirement for multiple supervisor domains +can combine the functions of the TSM-driver and TSM in M-mode, see <>. +When supervisor domains are in use, the TSM-driver is in the TCB of all supervisor +domains and hence in the TCB for all CoVE workloads hosted on the platform. +The TSM-driver, which always operates in M-mode, uses the hardware capabilities to provide: * Isolation of memory associated with TEEs (including the TSM). We describe *Confidential memory* as memory that is subject to access-control, @@ -107,21 +117,21 @@ The TSM-driver delegates parts of the TEE management functions to the TSM, specifically isolation across confidential memory assigned to TVMs. The TSM is designed to be portable across RISC-V platforms (that support CoVE) and interact with the machine specific capabilities in the platform through the TEEI. 
The TSM -provides an ABI to the OS/VMM which has two aspects: A set of host ABIs known +provides an ABI to the OS/VMM which has two aspects. First, a set of host ABIs known as *COVH* that includes functions to manage the lifecycle of the TVM, such as creating, adding pages to a TVM, scheduling a TVM for execution, etc., in an -OS/platform agnostic manner. The TSM also provides an ABI to the TVM contexts: +OS/platform agnostic manner. Second, the TSM also provides an ABI to the TVM contexts: A set of guest ABIs known as *COVG* that enables the TVM workload to request -attestation functions, memory management functions, or paravirtualized IO. +attestation functions, memory management functions, or paravirtualized IO. In order to isolate the TVMs from the host OS/VMM and non-confidential VMs, -the supervisor domains (that contain the TSM state) must be isolated first - -this is achieved by enforcing isolation for memory assigned to the supervisor -domain that the TSM occupies - this is called the *TSM-memory-region.* The -TSM-memory-region is expected to be a static region of memory that holds the TSM -code and data. This region must be access-controlled from all software outside -the TCB (e.g., using Smmtt), and may be additionally protected against physical -access via cryptographic mechanisms. +the supervisor domains (that contain the TSM state) must be isolated first. +This is achieved by enforcing isolation for memory assigned to the supervisor +domain that the TSM occupies. This memory region is called *TSM-memory-region* and +is expected to be a static region of memory that holds the TSM code and data. +It must be access-controlled from all software outside the TCB (e.g., using Smmtt +or PMP), and may be additionally protected against physical access with help of +cryptographic mechanisms. Access to the TSM-memory-region and execution of code from the TSM-memory-region (for the TSM ABIs) is enforced in hardware via the maintenance @@ -129,9 +139,9 @@ of the execution context (ASID, VMID and SDID) maintained per hart. This context is enabled per-hart via the TEECALL interface to context switch into the confidential supervisor domain context via the TSM-driver and disabled via the TEERET interface to context restore to the hosting supervisor domain. -Access to TEE-assigned memory is allowed for the hart when the access is -permitted as per the active permissions enforced by the memory management unit (MMU) for the supervisor -domain active on the hart (enforced through Sv and Smmtt for CoVE). This +Access to TEE-assigned memory is allowed for the hart when access is +permitted as per the active permissions enforced by the memory management unit (MMU) +for the supervisor domain active on the hart (enforced through Sv and Smmtt for CoVE). This per-hart execution context is used by the processor to enforce access-control properties on memory accessed by TEE workloads managed by the TSM. The details of the supervisor domain access protection is specified in the Smmtt @@ -143,12 +153,13 @@ the security of the TVMs through the resource management actions of the OS/VMM. These security primitives require the TSM to enforce TVM virtual-hart state save and restore, as well as enforcing invariants for memory assigned to the TVM, including G-stage translation. The host OS/VMM provides the -typical VM resource management functionality for memory, IO, etc. +typical VM resource management functionality for memory, IO, and VM's lifecycle +management. 
-<> shows Confidential VMs managed by a VMM and <> shows Confidential -applications managed by an untrusted host OS. As evident from the architecture, the difference -between these two scenarios is the software TCB (owned by the tenant within -the TVM) for the tenant workload - in the application TEE case, a minimal +<> shows TVMs (a.k.a. confidential VMs) managed by a VMM and <> shows Confidential +applications managed by an untrusted host (OS/VMM). As evident from the architecture, the +difference between these two scenarios is the software TCB (owned by the tenant within +the TVM) for the tenant workload. In the application TEE case, a minimal guest runtime may be used; whereas in the VM TEE case, an enlightened guest OS is expected in the TVM TCB. Other software models that map to the VU/VS modes of operation are also possible as TEE workloads. Importantly, the hardware @@ -161,6 +172,6 @@ CoVE ABI. image::img_1.png[] The detailed architecture is described in the Section <>. Note that the -architecture described above may have various implementations, however the goal -of this specification is to propose a reference architecture and ratify a -normative CoVE ABI for Confidential VMs as a RISC-V non-ISA specification. +architecture described above may have various implementations (e.g., see <> and <>). +However, the goal of this specification is to propose a reference architecture and ratify a +normative CoVE ABI for TVMs as a RISC-V non-ISA specification.
diff --git a/specification/refarch.adoc b/specification/refarch.adoc index 151dd22..1a4ed89 100644 --- a/specification/refarch.adoc +++ b/specification/refarch.adoc @@ -9,62 +9,71 @@ the properties of the TSM, its instantiation, isolation and operational model for the TVM life cycle. The description in this section refers to the reference architecture in Figure 1. -=== CoVE Memory Isolation +=== CoVE Deployment Models +There are three deployment models described below (1, 2, and 3). The CoVE ABI is applicable to +all of them. This specification focuses mainly on the first deployment model (1) where a +primary host supervisor domain is used to host confidential workloads in a +secondary confidential domain. In all models, memory assigned to a Confidential +supervisor domain is called *Confidential* memory and memory accessible to the hosting +supervisor domain is called *Non-Confidential*. + +. The TSM operates in S/HS mode as a peer supervisor domain manager to the +hosting supervisor domain which operates in S/HS mode. <> shows this model, which utilizes +the Memory Tracking Table (MTT) and the G-stage page tables (PT) for TVM isolation (the 1st +stage PT is normally used by the Guest OS). The MTT is used to assign physical +memory to the Confidential supervisor domain. The MTT allows dynamic programming of the +per-domain access permissions. + +. The TSM is the only root HS-mode component on the platform, hence, G-stage +page tables (PT) can be used to enforce isolation between TVMs and +ordinary VMs. In this model the host VMM must execute in the de-privileged VS +mode and the TSM must provide nested virtualization of the H-extension controls. +This model may be suitable for client/embedded systems and is shown in <>. + +. The TSM runs alongside the TSM-driver in M-mode, thus allowing only for a single confidential +supervisor domain. This model enforces isolation between TVMs using the MMU (G-stage PT) and +between the hosting supervisor domain (OS/VMM, VMs) and the confidential supervisor domain using, e.g., PMPs.
+This model might be suitable for client/embedded systems running in a constrained hardware and software +environment (e.g., it does not require Smmtt). <> discusses this model in detail. +=== CoVE Memory Isolation Memory isolation for TVMs is orchestrated by the TSM-driver and the TSM in two phases: the conversion of memory to confidential memory and the assignment of confidential memory (alongwith the enforcement of properties on use) to TVMs. To enforce isolation across Host and Confidential supervisor domains, CoVE -requires isolation of physical memory (that supports paging when enabled). There -are two deployment models described below (1 and 2). CoVE ABI is applicable for both -modes - this specification focuses on the first deployment model (1) where a -primary host supervisor domain is used to host confidential workloads in a -secondary confidential domain. - -. The TSM operates in S/HS mode as a peer supervisor domain manager to the -hosting supervisor domain which operates in S/HS mode as well. This model uses -the Memory Tracking Table (MTT) along with G-stage page tables (PT) for confidential TVM isolation (where the 1st -stage PT is used by the Guest OS normally). The MTT is used to assign physical -memory to the Confidential supervisor domain called *Confidential* memory and -memory accessible to the hosting supervisor domain called *Non-Confidential*. -MTT allows dynamic programming of the per-domain access permissions. This model -is shown in <> - -. The TSM is the only root HS mode component on the platform, hence, G-stage -page tables (PT) can be used to enforce isolation between confidential TVMs and -ordinary VMs. In this model the host VMM must execute in the de-privileged VS -mode and the TSM must provide nested virtualization of the H-extension controls. -This model may be suitable for client/embedded systems and is shown in <>. +requires isolation of physical memory (that supports paging when enabled). -A TVM and/or TSM needs to access both types of memory: +CoVE defines two types of memory: * Confidential memory - used for TVM/TSM code and security-sensitive data; including state such as 1st-stage, G-stage page tables. * Non-confidential memory - used only for shared data, e.g., communication between the TVM/TSM and the non-TCB host software and/or non-TCB IO devices. -The TSM COVH ABI provides interfaces to the OS/VMM to convert / donate -memory from the hosting supervisor domain to the confidential supervisor domain. +The split of memory into confidential and non-confidential may be static or dynamic. +Static partitioning occurs during platform initialization, while dynamic partitioning +occurs at runtime. Dynamic partitioning allows for better resource utilization, +but imposes additional hardware and software requirements. + +To support dynamic memory partitioning, the TSM enables the OS/VMM to +convert / donate memory from the hosting supervisor domain to the confidential supervisor domain using a dedicated COVH ABI. Similarly, a separate ABI intrinsic is used to reclaim memory back from the confidential supervisor domain to the hosting supervisor domain. Once physical -memory is converted to confidential - it is accessible only to the confidential +memory is converted to confidential, it is accessible only to the confidential supervisor domain. By default, TVM memory is assigned by the TSM (which operates in the confidential supervisor domain context) from confidential -physical memory regions.
Note that a TVM may be assigned non-confidential -(shared) memory regions as well explicitly under the TVMs control. The TSM -manages the type and accessibility of all memory assigned to the TVM, to -mitigate attacks from non-TCB software. The TSM enforces isolation between TVMs -by using the G-stage page table. - -* Hart operating with the confidential supervisor domain context has MTT -permissions to access Confidential and Non-confidential memory -* Hart not operating in a Confidential supervisor domain has access permissions -only for Non-confidential memory - -The RDSM configures the MTT such that a hart executing in the hosting domain -does not have access to any confidential memory regions. The RDSM configures the -MTT for the confidential domain to allow access to confidential memory -exclusively to that domain, but may also allow access to non-confidential +physical memory regions. Note that a TVM may request assignment of non-confidential +(shared) memory regions to enable communication channels to hosting supervisor domain (e.g., VirtIO). +The TSM manages the type and accessibility of all memory assigned to the TVM, to mitigate attacks +from non-TCB software. The TSM enforces isolation between TVMs by using the G-stage page table. + +* Hart operating with the confidential supervisor domain context has permissions to access Confidential and Non-confidential memory, +* Hart not operating in a Confidential supervisor domain has access permissions only for Non-confidential memory. + +In CoVE implementations supporting the Smmtt extension, the root domain security manager (RDSM) +configures the MTT such that a hart executing in the hosting domain does not have access to any +confidential memory regions. The RDSM configures the MTT for the confidential domain to allow access +to confidential memory exclusively to that domain, but may also allow access to non-confidential (shared) memory regions to one or more secondary domains. [caption="Figure {counter:image}: ", reftext="Figure {image}"] @@ -80,16 +89,21 @@ unique memory encryption key. These additional protection aspects are platform and implementation dependent. ==== -Confidential and non-confidential memory are both always assigned by the VMM, -i.e., the hosting supervisor domain - the TSM-driver is expected to manage the +In deployment models that allow for dynamic memory partitioning, +confidential and non-confidential memory are both always assigned by the VMM, +i.e., the hosting supervisor domain. The TSM-driver is expected to manage the isolation for confidential memory assigned to any of the secondary supervisor domains by programming the Memory Tracking Table (MTT). The desired security properties of memory tracking are discussed below. The TSM (within a supervisor domain) manages page-based allocation using the G-stage page table from the set of confidential memory regions that are enforced by the MTT. -Four aspects of memory isolation are impacted due to this dynamic configurable -property of the MTT: +Four aspects of memory isolation are impacted due to this dynamically configurable +property of the MTT and are discussed next: +(1) address translation/page walk, +(2) management of isolation for confidential physical memory, +(3) handling implicit & explicit memory accesses, and +(4) cached translations/TLB management. ==== Address Translation/Page Walk Figure 2 describes a reference model for memory tracking lookup where @@ -103,21 +117,21 @@ the paging sizes/modes supported by the hart. 
[title= "Memory Tracking for Supervisor Domains"] image::https://github.com/riscv/riscv-smmtt/blob/main/images/fig2.png?raw=true[] -==== Management of isolation for Confidential Physical Memory +==== Management of Isolation for Confidential Physical Memory -The software TCB (specifically TSM) manages the assignment of physical memory to the Confidential +The software TCB (specifically TSM) manages the assignment of physical memory to the confidential supervisor domain, while the hardware TCB (specifically the hart MMU including virtual memory system, MTT Extensions) enforces the access-control for confidential memory against other supervisor domains. The region sizes at which the memory tracking enforces isolation may be multiples of the architectural page sizes supported by the hart MMU. The IOMMU is expected to support a similar memory tracking lookup to enable a device/function trusted by the TVM to directly access -TVM confidential memory regions. For the CoVE reference architecture this TCB +TVM confidential memory regions. For the CoVE reference architecture, this TCB consists of the hardware (e.g., MMU, IOMMU, Memory Controller) and the software/firmware elements - TSM-driver and the TSM. The TSM-driver is responsible for enforcing isolation of -confidential memory regions (consisting of multiple pages via MTT) and the TSM +confidential memory regions (e.g., via PMP or MTT) and the TSM is responsible for enforcing isolation of confidential memory pages among TVMs -(via G-stage translation) - pages assigned to the TVM may be exclusively +(e.g., via G-stage page tables). Pages assigned to the TVM may be exclusively accessible to the condidential supervisor domain or may be shared with the hosting supervisor domain (e.g., to allow for paravirtualized IO access). @@ -138,7 +152,7 @@ from untrusted/shared memory - enforced by the hart * TEE Paging structure walk - security property: TVM and TSM must not locate page tables in untrusted shared memory. * TEE data fetch - security property: The TVM via the TSM may be allowed to -relax data accesses to non-confidential memory (via MTT) to allow for IO +relax data accesses to non-confidential memory (e.g., via MTT) to allow for IO accesses. ==== Cached translations/TLB management @@ -160,7 +174,11 @@ reclaiming memory assigned to a TVM, the TSM must perform scrubbing of confidential memory before returning control of the memory to the host (via the MTT) or assigning to another TVM. If the TVM is converting memory from confidential to non-confidential, then the TVM must scrub the confidential -memory being returned to the host via `sbi_covg_share_memory_region`. +memory being returned to the host via `sbi_covg_share_memory_region()`. + +With a fixed partitioning of memory into confidential and non-confidential, +memory conversion or reclamation cannot occur. The TSM remains responsible for +scrubbing memory when being assigned to a TVM. ==== === TSM initialization @@ -184,11 +202,12 @@ this TSM-memory-region must be in confidential memory. The TSM binary may be provided by the OS/VMM which may independently authenticate the binary before loading the binary into the TSM-memory-region via the TSM-driver. Alternatively, the platform firmware may pre-load the RoT-authenticated TSM -binary via the TSM-driver. +binary via the TSM-driver or, as in case of some embedded systems, both TSM-driver and +TSM might be loaded as part of the secure boot process (see <>). 
-In both cases, the TSM binary loaded must be measured and may be +In all cases, the loaded TSM binary must be measured and may be authenticated (per cryptographic signature mechanisms) by the TSM-driver -during the loading process, so that the TSM used is reflected in the +during the loading process, so that the loaded TSM is reflected in the attestation rooted in a hardware RoT. The authentication process provides additional control to restrict TSM binaries that can be loaded on the platform based on policies such as version, vendor, etc. In addition to the @@ -198,15 +217,15 @@ to the TSM-driver and higher privilege components. The measurements and versions of the hardware RoT, the TSM-driver and the TSM will subsequently be provided as evidence of a specific TSM being loaded on a specific platform. -During initialization, the TSM-driver will initialize a TSM-data region -within the TSM-memory region. The TSM-data region may hold per-hart TSM +During initialization, the TSM-driver will initialize a TSM-data-region +within the TSM-memory-region. The TSM-data-region may hold per-hart TSM state, memory assignment tracking structures and additional global data for -TSM management. The TSM-data region is confidential memory that is apriori +TSM management. The TSM-data-region is confidential memory that is apriori access-control-restricted by the TSM-driver to allow only the TSM to access this memory. The per-hart TSM state is used to start TSM execution from a known-good state for security routines invoked by the OS/VMM. The per-hart TSM state should be stored in confidential memory in TSM Hart Control Structures -(THCS - See <>) which is initialized as part of the TSM memory +(THCS, see <>) which is initialized as part of the TSM memory initialization. The THCS structure definition is part of the COVH ABI and may be extended by an implementation, with the minimum state shown in the structure. Isolating and establishing the execution state of the TSM is the @@ -240,7 +259,9 @@ per-hart control sub-structure THCS.hssa (See <>). Figure 3 shows th flow. Beyond the basic operation described above, the following different -operational models of the TSM may be supported by an implementation: +operational models of the TSM may be supported by an implementation. +Interruptible TSM implementations must run the TSM-driver and the TSM in different +processor privilege modes. * *Uninterruptible* *TSM* - In this model, the TSM security routines are executed in an uninterruptible manner for S-mode interrupts (M-mode @@ -262,10 +283,8 @@ interrupted TSM security routine is initiated by the OS/VMM on the same hart. The TSM hart context restore is enforced by the TSM to allow for the resumed TSM security routine operation to complete. Intermediate state of the operation must be saved and restored by the TSM for such -flows. - -**__This specification describes the operation of the TSM in this -mode of operation.__** +flows. **__This specification primarily describes the operation of the TSM +in this mode of operation.__** * *Interruptible and re-entrant TSM* - In this model, similar to the previous case, the TSM security routines are executed in an interruptible @@ -282,7 +301,7 @@ concurrency controls on internal data structures and per-TVM global data structures (such as the G-stage page table structures).
[caption="Figure {counter:image}: ", reftext="Figure {image}"] -[title= "TSM operation - Interruptible and non-reentrant TSM model shown."] +[title= "TSM operation: Interruptible and non-reentrant TSM model according to the deployment model 1."] image::img_3.png[] A TSM entry triggered by an ECALL (with CoVE extension type) by the OS/VMM @@ -343,7 +362,7 @@ the TEECALL / TEERESUME) The TSM-driver is stateless across TEECALL invocations, however a security routine invoked in the TSM via a TEECALL may be interrupted and must be resumed -via a TEERESUME i.e. _the TSM is preemptable but non-reentrant_. These +via a TEERESUME, i.e., _the TSM is preemptable but non-reentrant_. These properties are enforced by the TSM-driver, and other models described above may be implemented. The TSM does not perform any dynamic resource management, scheduling, or interrupt handling of its own. The TSM is not @@ -353,7 +372,7 @@ host OS/VMM to track that the required security checks are performed on each physical hart (or virtual hart context) as required by specific COVH/G flows. When the TSM is entered via the TSM-driver (as part of the ECALL [TEECALL] -- MRET), the TSM starts with sstatus.sie set to 0 i.e. interrupts disabled. +- MRET), the TSM starts with sstatus.sie set to 0, i.e., interrupts disabled. The sstatus.sie does not affect HS interrupts from being seen when mode = U/VS/VU. The OS/VMM sip and sie will be saved by the TSM in the HSSA and will retain the state as it existed when the host OS/VMM invoked the TSM. @@ -388,7 +407,7 @@ interrupt, VS-mode software interrupt, and VS-mode timer interrupts to the TVM. S-mode Software/Timer/External interrupts are delegated to the TSM (with the behavior described above). _All other interrupts_ , M-mode Software/Timer/External, bus error, high temp, RAS etc. are not delegated and -delivered to M-mode/TSM-driver. Under these circumstances the saving of the +delivered to the TSM-driver. Under these circumstances, the saving of the state is the TSM-driver responsibility. Also since scrubbing the TVM state is the TSM responsibility, the TSM-driver may pend an S-mode interrupt to the TSM to allow cleanup on such events. See <> for a table of @@ -410,21 +429,23 @@ TVMs are prevented from execution after that point. TSM (and all TVMs) memory is granted by the host OS/VMM but is isolated (via access-control and/or confidentiality-protection) by the hardware and TCB -elements. The TSM, TVM and hardware isolation methods used must be evident in the +elements. The TSM, TVM, and hardware isolation methods used must be evident in the attestation evidence provided for the TVM since it identifies the hardware and the TSM-driver. There are two facets of TVM and TSM memory isolation that are implementation-specific: -*a)* *Isolation from host software access* - For the deployment model 2, -the CPU must enforce hardware-based access-control of TSM memory via -the G-stage page tables to prevent the guest VMM from accessing TSM -memory. For the deployment model 1, the CPU must also similarly enforce +*a)* *Isolation from host software access* - For deployment model 3, +the CPU must enforce hardware-based access-control of TSM memory via a hardware +memory isolation mechanism (e.g., PMP) configurable only by TCB. +For deployment model 2, this isolation is enforced via the G-stage page tables, +preventing the guest VMM from accessing TSM memory. 
+For deployment model 1, the CPU must also similarly enforce access-control of TSM memory to prevent access from host supervisor domain components (VMM and host OS that operate in V=0, HS-mode) software. -Since in this deployment model, other supervisor domains have access to 1st -and G-stage paging hardware, the root security manager (TSM-driver) must use MTT +In this deployment model, other supervisor domains have access to 1st +and G-stage paging hardware, so the root security manager (TSM-driver) must use the MTT to isolate supervisor domain memory. In this deployment model, TEE and TVM address spaces are identified by supervisor domain identifiers (Smsdid) to maintain the isolation during access and in internal @@ -439,10 +460,10 @@ table. The memory tracking table may be enforced at the memory controller, or in a page table walker. *b)* *Isolation against physical/out-of-band access* - The platform TCB may -provide confidentiality, integrity and replay-protection. This may be +provide confidentiality, integrity, and replay-protection. This may be achieved via a Memory Encryption Engine (MEE) to prevent TEE state being exposed in volatile memory during execution. The use of an MEE and the -number of encryption domains supported is implementation-specific. For +number of supported encryption domains is implementation-specific. For example, The hardware may use the Supervisor Domain Identifier during execution (and memory access) to cryptographically isolate memory associated with a TEE which may be encrypted and additionally cryptographically @@ -452,7 +473,7 @@ of cache blocks. *TVM isolation* is the responsibility of the TSM via the G-stage address translation table (hgatp). The TSM must track memory assignment of -TVMs (by the untrusted VMM/OS) to ensure memory assignment is +TVMs (by the untrusted OS/VMM) to ensure memory assignment is non-overlapping, along with additional security requirements. The security requirements/invariants for enforcement of the memory access-control for memory assigned to the TVMs is described in <>. === TVM Execution -As described above, TVMs can access both classes of memory - isolated memory -- which has confidentiality and access-control properties for memory exclusive -to the TVM, and non-confidential memory which is memory accessible to the host -OS/VMM and is used for untrusted operations (e.g., virtio, gRPC communication -with the host). If the confidential memory is access-controlled only, the TSM +As described above, TVMs can access both classes of memory: (1) confidential memory +which has confidentiality and access-control properties for memory exclusive +to the TVM, and (2) non-confidential memory which is memory accessible to the +OS/VMM and is used for untrusted operations, such as virtio or gRPC communication +with the host. If the confidential memory is access-controlled only, the TSM and TSM-driver are the authority over the access-control enforcement. If the confidential memory is using memory encryption (instead or in addition), the encryption keys used for confidential memory must be different from non-confidential memory. All TVM memory is mapped in the second-stage page tables controlled by the -TSM explicitly - the allocation of memory for the G-stage paging -structures pages used for the G-stage mapping is also performed by the -OS/VMM but the security properties of the G-stage mapping are enforced -by the TSM. By default any memory mapped to a TVM is confidential.
A TVM -may then explicitly request that confidential memory be converted to -non-confidential memory regions using services provided by the TSM. More +TSM explicitly. CoVE implementations that support dynamic conversions between confidential +and non-confidential memory might delegate the allocation of memory for the G-stage paging +structures to the OS/VMM, while relying on the TSM to enforce the security properties of the G-stage mapping. +By default, any memory mapped to a TVM is confidential. A TVM may then explicitly request that +confidential memory be converted to non-confidential memory regions using services provided by the TSM. More information about TVM Execution and the lifecycle of a TVM is described in the <> section of this document.
diff --git a/specification/sbi_cove.adoc b/specification/sbi_cove.adoc index fdde08e..d071823 100644 --- a/specification/sbi_cove.adoc +++ b/specification/sbi_cove.adoc @@ -2,7 +2,7 @@ [[sbi_tee]] == Confidential VM Extension (CoVE) SBI extension proposal -This section describes the normative Confidential VM Extension(CoVE) SBI +This section describes the normative Confidential VM Extension (CoVE) SBI extension. This specification introduces four new extensions: * Supervisor Domains Enumeration Extension (EXT_SUPD) @@ -37,9 +37,8 @@ Other future specifications (e.g., CoVE-IO) may need to extend one of the three CoVE SBI extensions with domain specific functions. In order to support that requirement each one of the CoVE extensions SBI function IDs (`FID`) in the availabe 64K range is split into separate namespaces. -% what 64K above means? 64KB? -The main CoVE specification uses FIDs from 0 to 1023 (inclusive), and other +The main CoVE specification uses SBI function identifiers (FIDs) from 0 to 1023 (inclusive), and other specifications can extend the CoVE SBI by reserving a FID range after 1024. Below are the reserved CoVE FID namespaces: @@ -69,7 +68,7 @@ entity like the OS/VMM (host) in conjunction with the TSM. . Platform TSM detection and capability enumeration. . Conversion of non-confidential memory to confidential memory. -. Trusted VM (TVM) creation. +. TEE VM (TVM) creation. . Donating confidential memory to the TSM for TVM page management. . Defining TVM confidential memory regions. . Mapping TVM code and data payload to confidential-memory regions. @@ -94,10 +93,25 @@ the current status of the TSM. The TSM must be in `TSM_READY` in order to process further ECALLs. ===== TVM creation -TVMs are created using the sbi_covh_create_tvm(). This creates a TVM with state -set to `TVM_INITIALIZING`. -The host must assign confidential memory for page tables, payload mapping, and -vCPUs before it can be transitioned into the `TVM_RUNNABLE` state. +A TVM might be created in one of two ways: +(1) the OS/VMM requests the TSM to promote an existing VM to a TVM using a single COVH call, +(2) the OS/VMM assembles a TVM under the TSM's supervision in a sequence of COVH calls. + +To promote a non-confidential VM to a TVM, the OS/VMM invokes `sbi_covh_promote_to_tvm()`, +presenting the VM's state (the CPU state, VM's data, page tables) to the TSM. The TSM then +measures and copies the entire VM's state into confidential memory. This is performed as a single operation after +which the newly created TVM is in the `TVM_RUNNABLE` state. + +[NOTE] +==== +Creating a TVM in a single step reduces the number of ABI calls, thus simplifying the TSM and OS/VMM implementation. +However, it will block the CPU for the duration of the TVM creation.
+==== + +In the second approach, the VMM assembles the TVM in a sequence of calls to the TSM. The first +call is `sbi_covh_create_tvm()`, which creates a TVM in the `TVM_INITIALIZING` state. With +subsequent calls to the TSM, the VMM requests assignment of confidential memory for page tables, +payload mapping, and vCPUs. The last call transitions the TVM into the `TVM_RUNNABLE` state. ===== TVM memory management The host is responsible for the following memory management functions: @@ -109,6 +123,15 @@ The host is responsible for the following memory management functions: . Mapping zero-page confidential pages to the TVM regions. . Mapping non-confidential pages TVM-defined regions for shared-pages / MMIO. +[NOTE] +==== +The division into confidential and non-confidential memory might be done statically or dynamically. +CoVE implementations that support only static partitioning of confidential and +non-confidential memory (for example <>) +might partition the memory during platform initialization before execution of untrusted code, +and thus do not need to implement the above ABI. +==== + ===== Converting non-confidential memory to confidential memory Platform memory is non-confidential by default, and must be converted to confidential memory before use with TVMs. The conversion process is initiated by @@ -170,13 +193,16 @@ example, the host needs `htval` to determine the fault address, `a0`-`a7` GPRs to handle forwarded ECALLs and so on. For this purpose, the host and TSM use the Nested Acceleration (NACL) extension based shared memory interface <>, from now on called NACL shared memory to avoid confusion with shared memory pages between TVM and -the host. +the host. The NACL shared memory interface is between TSM and the host and TSM is responsible for writing any trap-related CSRs and GPRs needed by the host to -handle the exception. TSM is also responsible for reading the returned result -and forwarding it to the TVM. Further details about which CSRs and GPRs are used -by the TSM and the host can be found in <>. +handle the exception. The TSM is also responsible for reading the returned results +from NACL shared memory and forwarding them to the TVM. +For single-step TVM creation, the OS/VMM also uses NACL shared memory to reflect the +VM's state to the TSM. Further details about which CSRs and GPRs are used by the +TSM and the host can be found in <>. + The layout of NACL shared memory is shown below as `struct nacl_shmem` and `scratch` space layout for TSM is shown as `struct tsm_shmem_scratch`. @@ -260,6 +286,8 @@ interrupt ticking. to check which interrupts are enabled. This is useful in waking up a guest's vcpu when it is sleeping due to a `WFI` instruction. +| hgatp | R | W | Host reflects the address of the page directory to the TSM during the single-step TVM creation. +| vs* | R | W | Host reflects the vCPU state to the TSM during the single-step TVM creation. | *GPRs* | | | | a0 | RW | RW | Used for both passing argument and returning the result for ECALLs forwarded to the host. @@ -278,7 +306,7 @@ interrupt ticking. | a7 | W | R | Used for passing an argument for ECALLs forwarded to the host. | x0-x31 | RW | RW | Any of the GPR used in load/store instruction - trapped for MMIO emulation. + trapped for MMIO emulation. All GPRs are reflected from the host to the TSM during the single-step TVM creation. |=== [TIP] ==== @@ -290,21 +318,38 @@ accessible to the host.
 ====
 
 ===== vCPU creation
-The host must register CPUs/harts with the TSM before they can be used for TVM
+During assembly of a TVM, the host must register CPUs/harts with the TSM before they can be used for TVM
 execution by calling `sbi_covh_create_tvm_vcpu()`. The NACL shared memory
 interface is used between the host and the TSM for processing TVM exits from
-`sbi_covh_run_tvm_vcpu()`.
+`sbi_covh_run_tvm_vcpu()`.
 
-===== TVM execution
-Following the assignment of memory and vCPU resources, the host can transition
-the guest into a `TVM_RUNNABLE` state by calling `sbi_covh_finalize_tvm()`.
+[NOTE]
+====
+The vCPU creation procedure is not required for TVMs created in a single step
+(via `sbi_covh_promote_to_tvm()`) because the TSM creates all of the TVM's vCPUs at once.
+All of the TVM's vCPUs are in the reset state, except for the vCPU whose state was presented to the TSM
+at the time of promotion. The vCPUs' initial states are part of the TVM's measurement.
+====
+
+===== Finalization of TVM creation
+Once the OS/VMM finishes the assembly of a TVM, i.e., the assignment of memory and
+vCPU resources to the TVM, it transitions the guest into the `TVM_RUNNABLE` state by
+calling `sbi_covh_finalize_tvm()`.
 The host must set up TVM Boot vCPU execution parameters like the entrypoint
 (`ENTRY_PC`) and boot argument (`ENTRY_ARG`) using arguments to
 `sbi_covh_finalize_tvm()`. Note that some COVH calls are no longer permissible
-after this transition.
+after this transition.
+
+[NOTE]
+====
+A TVM created via `sbi_covh_promote_to_tvm()` does not require finalization,
+because it is already in the `TVM_RUNNABLE` state with a valid entrypoint, which
+corresponds to the vCPU state presented to the TSM at the time of promotion to a TVM.
+====
 
-The host can then call sbi_covh_run_tvm_vcpu()` to begin execution. The host
-must boot vCPU `0` first otherwise `sbi_covh_run_tvm_vcpu()` call will fail.
+===== TVM execution
+The host can then call `sbi_covh_run_tvm_vcpu()` to begin execution. The host
+must run the TVM Boot vCPU first; otherwise the `sbi_covh_run_tvm_vcpu()` call will fail.
 TVM execution continues until there is an event like an interrupt, or fault
 that cannot be serviced by the TSM. Some interrupts and exceptions are
 resumable, and the host can determine specific reason by examining the `scause`
@@ -375,7 +420,7 @@ shared memory region. The expectation is that the host will
 service a subsequent page-fault that results from a TVM-access to the
 non-confidential region.
 
-===== TVM-defined Shared memory regions
+===== TVM-defined shared memory regions
 TVMs can choose to yield access to confidential memory at runtime and request
 shared (non-confidential) memory.
 The TVM must communicate its request to the host to convert confidential to
@@ -811,8 +856,53 @@ The possible error codes returned in `sbiret.error` are shown below.
 | SBI_ERR_FAILED | The operation failed for unknown reasons.
 |===
 
+[#sbi_covh_promote_to_tvm]
+=== Function: COVE Host Promote to TVM (FID #7)
+[source, C]
+-----
+struct sbiret sbi_covh_promote_to_tvm(unsigned long fdt_addr,
+                                      unsigned long tap_addr);
+-----
+This function is used by the host to promote a VM to a TVM. It is primarily intended for CoVE deployment
+models that require single-step TVM creation (e.g., <>). Deployment models that offer multi-step TVM creation
+(e.g., <>) may, but are not required to, support this ABI as an additional mechanism for creating TVMs.
+
+The `fdt_addr` is the 8-byte aligned guest physical address of the guest flattened device tree (FDT).
+The `tap_addr` is the 8-bytes aligned guest physical address of the `TVM attestation payload` used for local attestation. +For VMs that do not require local attestation (only the remote attestation), `tap_addr` must be set to `0`. + +The VM should be promoted as early in the boot process as possible to minimize changes in memory contents so +that the resulting integrity measurement hashes are deterministic. +The VM must not use registers associated with RISC-V extensions, e.g., vector, floating point, before promotion, because these registers will be zeroized during the promotion. +The TSM recreates TVM vCPUs in confidential memory. All TVM vCPUs are in the reset state, except the +TVM Boot vCPU, which is the state of the reflected VM's vCPU. This VM's vCPU state is reflected using +NACL shared memory. The TSM recreates in confidential memory the reflected VM's data and the VM's page +tables, following the page table configuration defined in HGATP. + +Once the TVM's image is completed, the TSM calculates the TVM measurement. +If `tap_addr` is defined, the TSM uses this TVM measurement to authenticate and authorize the TVM +as part of the local attestation procedure. + +If successful, the TSM sets the TVM state to `TVM_RUNNABLE` and returns a unique TVM identifier (`tvm_guest_id`) to the OS/VMM. +The OS/VMM should free the contents of non-confidential memory that contains the VM's data and the page tables. +After this call, the OS/VMM must interact with this TVM via the TSM using the COVH ABI, i.e., resuming the TVM using the `sbi_covh_run_tvm_vcpu()` call. + +If the call fails, the TSM returns the SBI error code in `sbiret.error` to the OS/VMM. The possible error codes are shown below. + +[#table_sbi_covh_promote_to_tvm_errors] +.COVE Host Promote to TVM Errors +[cols="2,3", width=90%, align="center", options="header"] +|=== +| Error code | Description +| SBI_SUCCESS | The operation completed successfully. +| SBI_ERR_INVALID_ADDRESS | `fdt_addr` was invalid. +| SBI_ERR_AUTH | Local attestation failed. +| SBI_ERR_OUT_OF_MEMORY | Not enough confidential memory to store TVM. +| SBI_ERR_FAILED | The operation failed for unknown reasons. +|=== + [#sbi_covh_destroy_tvm] -=== Function: COVE Host Destroy TVM (FID #7) +=== Function: COVE Host Destroy TVM (FID #8) [source, C] ------- struct sbiret sbi_covh_destroy_tvm(unsigned long tvm_guest_id); @@ -843,7 +933,7 @@ The possible error codes returned in `sbiret.error` are shown below. |=== [#sbi_covh_add_tvm_memory_region] -=== Function: COVE Host Add TVM Memory Region (FID #8) +=== Function: COVE Host Add TVM Memory Region (FID #9) [source, C] ----- struct sbiret sbi_covh_add_tvm_memory_region(unsigned long tvm_guest_id, @@ -874,7 +964,7 @@ TVM wasn't |=== [#sbi_covh_add_tvm_page_table_pages] -=== Function: COVE Host Add TVM Page Table Pages (FID #9) +=== Function: COVE Host Add TVM Page Table Pages (FID #10) [source, C] ----- struct sbiret sbi_covh_add_tvm_page_table_pages(unsigned long tvm_guest_id, @@ -903,7 +993,7 @@ The possible error codes returned in `sbiret.error` are shown below. |=== [#sbi_covh_add_tvm_measured_pages] -=== Function: COVE Host Add TVM Measured Pages (FID #10) +=== Function: COVE Host Add TVM Measured Pages (FID #11) [source, C] ----- struct sbiret sbi_covh_add_tvm_measured_pages(unsigned long tvm_guest_id, @@ -950,7 +1040,7 @@ The possible error codes returned in `sbiret.error` are shown below. 
|=== [#sbi_covh_add_tvm_zero_pages] -=== Function: COVE Host Add TVM Zero Pages (FID #11) +=== Function: COVE Host Add TVM Zero Pages (FID #12) [source, C] ----- struct sbiret sbi_covh_add_tvm_zero_pages(unsigned long tvm_guest_id, @@ -985,7 +1075,7 @@ The possible error codes returned in `sbiret.error` are shown below. |=== [#sbi_covh_add_tvm_shared_pages] -=== Function: COVE Host Add TVM Shared Pages (FID #12) +=== Function: COVE Host Add TVM Shared Pages (FID #13) [source, C] ----- struct sbiret sbi_covh_add_tvm_shared_pages(unsigned long tvm_guest_id, @@ -1021,7 +1111,7 @@ The possible error codes returned in `sbiret.error` are shown below. |=== [#sbi_covh_create_tvm_vcpu] -=== Function: COVE Host Create TVM vCPU (FID #13) +=== Function: COVE Host Create TVM vCPU (FID #14) [source, C] ----- struct sbiret sbi_covh_create_tvm_vcpu(unsigned long tvm_guest_id, @@ -1049,7 +1139,7 @@ The possible error codes returned in `sbiret.error` are shown below. |=== [#sbi_covh_run_tvm_vcpu] -=== Function: COVE Host Run TVM vCPU (FID #14) +=== Function: COVE Host Run TVM vCPU (FID #15) [source, C] ----- struct sbiret sbi_covh_run_tvm_vcpu(unsigned long tvm_guest_id, @@ -1171,7 +1261,7 @@ enum Exception { ------- [#sbi_covh_tvm_fence] -=== Function: COVE Host Initiate TVM Fence (FID #15) +=== Function: COVE Host Initiate TVM Fence (FID #16) [source, C] ----- struct sbiret sbi_covh_tvm_fence(unsigned long tvm_guest_id); @@ -1198,7 +1288,7 @@ The possible error codes returned in `sbiret.error` are shown below. |=== [#sbi_covh_tvm_invalidate_pages] -=== Function: COVE Host TVM Invalidate Pages (FID #16) +=== Function: COVE Host TVM Invalidate Pages (FID #17) [source, C] ----- struct sbiret sbi_covh_tvm_invalidate_pages(unsigned long tvm_guest_id, @@ -1239,7 +1329,7 @@ The possible error codes returned in `sbiret.error` are shown below. |=== [#sbi_covh_tvm_validate_pages] -=== Function: COVE Host TVM Validate Pages (FID #17) +=== Function: COVE Host TVM Validate Pages (FID #18) [source, C] ----- struct sbiret sbi_covh_tvm_validate_pages(unsigned long tvm_guest_id, @@ -1271,7 +1361,7 @@ The possible error codes returned in `sbiret.error` are shown below. |=== [#sbi_covh_tvm_remove_pages] -=== Function: COVE Host TVM Remove Pages (FID #18) +=== Function: COVE Host TVM Remove Pages (FID #19) [source, C] ----- struct sbiret sbi_covh_tvm_remove_pages(unsigned long tvm_guest_id, @@ -1725,6 +1815,11 @@ completed. Attempts to run it with `sbi_covh_run_tvm_vcpu()` will fail. Any guest page faults taken by other TVM vCPUs in the invalidated pages continue to be reported to the host. +In CoVE implementations that do not support dynamic page conversions between confidential +and non-confidential memory, the TSM reflects this call to the OS/VMM, which then +allocates contiguous non-confidential pages and returns the host physical address of the first +page to the TSM. The TSM maps the non-confidential pages to the TVM's address space. + Both `tvm_gpa_addr` and `region_len` must be 4KB-aligned. The possible error codes returned in sbiret.error are: @@ -1756,7 +1851,7 @@ shared to confidential. The requested range must lie within an existing region of non-confidential address space, and may or may not be populated. This ECALL results in an exit to the TSM which -enforces the security properties on the mapping and exits to the VMM host. The +enforces the security properties on the mapping and exits to the OS/VMM. 
The host then removes any non-confidential pages already populated in the region and inserts confidential pages on page-faults. @@ -2042,8 +2137,33 @@ confidential memory. | SBI_ERR_FAILED | The operation failed for unknown reasons. |=== +[#covg_retrieve_secret] +=== Function: COVE Guest Retrieve Secret (FID #9) +[source, C] +----- +struct sbiret covg_retrieve_secret(); +----- +Requests TSM for a secret available after successful local attestation. TSM reads this secret during +local attestation from the `TVM attestation payload` (TAP). TAP is part of the VM image and is +presented to TSM during the TVM creation via `sbi_covh_promote_to_tvm()`. Only the TVMs that were correctly +authenticated and authorized during local attestation can receive the secret embedded in TAP. + +This ABI will become part of the `Sealing Interface` planned for the CoVE in version 2.0. + +If the call fails, the TSM returns SBI error code in `sbiret.error` to the VM. The possible error codes +are shown below. + +[#table_covg_retrieve_secret_errors] +.COVE Guest Retrieve Secret Errors +[cols="2,3", width=90%, align="center", options="header"] +|=== +| Error code | Description +| SBI_SUCCESS | The operation completed successfully. +| SBI_ERR_AUTH | Local attestation failed. +|=== + [#sbi_covg_read_measurement] -=== Function: COVE Guest Read Measurement (FID #9) +=== Function: COVE Guest Read Measurement (FID #10) [source, C] ------- struct sbiret sbi_covg_read_measurememt(unsigned long msmt_buf_addr_out, @@ -2169,7 +2289,7 @@ considered trustworthy. | <> | Add a zero page for an existing mapping for a TVM page (post initialization). This operation adds a zero page into a mapping and keeps the mapping as -pending (i.e. access from the TVM will fault until the TVM accepts that GPA. +pending, i.e., access from the TVM will fault until the TVM accepts that GPA. | <> | Maps the given number of pages of non-confidential memory into the TVM's physical @@ -2249,7 +2369,7 @@ This interface specification is TBD for version 2 of the ABI. |=== -=== Summary of CoVE Interrupt Extension(COVI) +=== Summary of CoVE Interrupt Extension (COVI) |=== | <> | This diff --git a/specification/swlifecycle.adoc b/specification/swlifecycle.adoc index bae2b5b..2a681db 100644 --- a/specification/swlifecycle.adoc +++ b/specification/swlifecycle.adoc @@ -8,13 +8,17 @@ including the OS/VMM interactions with the TSM. === TVM build and initialization -The host OS/VMM must be capable of hosting many TVMs on a CoVE-capable -platform (limited only by the practical limits of the number of cpus and +The host OS/VMM should be capable of hosting many TVMs on a CoVE-capable +platform (limited only by the practical limits of the number of CPUs and the amount of memory available on the system). To that end, the TVM should be able to use all of the system memory as confidential memory, as long as the platform access-control mechanisms are applicable to all the available -memory on the system. The TSM allows the OS/VMM to manage the assignment of -confidential memory by providing a two stage TEE memory management model: +memory on the system. In CoVE implementations that support dynamic conversion +between confidential and non-confidential memory, the TSM might allow the +OS/VMM to manage the assignment of confidential memory by providing a two stage +TEE memory management model, described below. However, this memory management +model is not required for CoVE implementations that statically partition memory into +confidential and non-confidential. 1. 
Creation of confidential memory regions - this process converts memory
 pages from non-confidential to confidential memory (and in that process
@@ -25,13 +29,22 @@ controls described earlier).
 confidential memory regions for various purposes like creating TVM workloads
 etc.
 
-The host OS/VMM may create a new TVM by allocating and initializing a TVM
-using the `sbi_covh_create_tvm()` function. An initial set of memory pages are
-granted to the TSM and tracked as TEE pages associated with that TVM from
-that point onwards until the TVM is destroyed via the `sbi_covh_destroy_tvm()`
-function.
+This specification defines two ways of creating a TVM; an implementation must
+provide at least one of them: (1) Multi-step TVM creation, (2) Single-step TVM creation.
+In multi-step TVM creation, the OS/VMM constructs a TVM in a sequence of calls to the TSM.
+During this process, the TVM is in an intermediate state and cannot be executed until
+the explicit call that finalizes the TVM creation. In single-step TVM creation, the TSM
+converts an existing VM into a TVM in a single operation by copying the VM's data and state
+to confidential memory.
 
-A TVM context may be created and initialized by using the
+==== Multi-step TVM Creation
+
+The host OS/VMM may create a new TVM by allocating and initializing a TVM using the
+`sbi_covh_create_tvm()` function. An initial set of memory pages is granted to the
+TSM and tracked as TEE pages associated with that TVM from that point onwards until
+the TVM is destroyed via the `sbi_covh_destroy_tvm()` function.
+
+When a TVM context is created and initialized using the
 `sbi_covh_create_tvm()` function - this global init function allocates a set
 of pages for the TVM global control structure and resets the control fields
 that are immutable for the lifetime of the TVM, e.g., configuration of
@@ -65,11 +78,29 @@ measurement of the TVM via `sbi_covh_finalize_tvm()`.
 The TSM prevents any TVM virtual harts from being entered until the TVM
 initialization is finalized.
 
+==== Single-step TVM Creation
+
+The OS/VMM can create a new TVM in a single operation by requesting the TSM to construct a TVM
+from an existing non-confidential VM.
+First, the OS/VMM creates a regular VM by allocating memory for the VM's data and vCPUs and creating
+the G-Stage page table configuration. This is a standard process for creating non-confidential VMs
+implemented by existing OS/VMMs. In response to a request from a VM, `sbi_promote_to_tvm()`,
+the OS/VMM requests the TSM to promote the VM by executing the `sbi_covh_promote_to_tvm()` call.
+The TSM creates a new TVM from the reflected VM state (VM data, page table configuration, vCPU state).
+After promotion, the OS/VMM releases the previously allocated VM resources and begins interacting with the TVM through the TSM as described in the next sections.
+
+[NOTE]
+====
+The promotion should be done early during the TVM's bootstrap so that the integrity measurements
+calculated by the TSM over the TVM data in confidential memory are deterministic and therefore meaningful for
+attestation.
+====
+
 === TVM execution
 
 The VMM uses `sbi_covh_run_tvm_vcpu()` to (re)activate a virtual hart for
 a specific TVM (identified by the unique identifier). This TEECALL traps into
-the TSM-driver which affects the context switch to the TSM - The TSM then
+the TSM-driver, which effects the context switch to the TSM. The TSM then
 manages the activation of the virtual hart on the calling physical hart.
During this activation the TCB's firmware can enforce that stale TLB entries that govern guest physical to system physical page access @@ -78,9 +109,11 @@ the virtual-harts due to VS-stage translation changes (guest virtual to guest physical) performed by the TVM OS - these are initiated by the TVM OS to cause IPIs to the virtual-harts managed by the TVM OS (and verified by the TVM OS to ensure the IPIs are received by the TVM OS to invalidate the -TLB lazily). This reference architecture requires use of AiA IMSIC <> -to ensure these IPIs are delivered through the IMSIC associated with the -guest TVM. Each TVM is allocated a guest interrupt file during TVM +TLB lazily). + +For implementations using AIA <>, this reference architecture requires +use of AIA IMSIC to ensure these IPIs are delivered through the IMSIC associated +with the guest TVM. Each TVM is allocated a guest interrupt file during TVM initialization. During TVM execution, the hardware enforces TSM-driven policies for memory @@ -113,8 +146,8 @@ confidential memory may be only lazily added via `sbi_covh_add_tvm_zero_pages()` after the TVM measurement has been finalized. The TVM manages its internal memory database to indicate which guest physical page frames are confidential for mapping into VS-stage mappings. There are at -least two use scenarios for this ABI - first, late addition of memory to enable -TVM boot with the minimal measured state, and second, if some memory pages were +least two use scenarios for this ABI: (1) the late addition of memory to enable +TVM boot with the minimal measured state, and (2) if some memory pages were converted to non-confidential by the TVM via `sbi_covg_share_memory_region()`, and at a later point they are converted back to confidential, the VMM may add zero pages for those mappings. @@ -148,26 +181,29 @@ HFENCE.GVMA for the TVM VMID. This sequence is described in more detail in === TVM memory management -The RISC-V architecture supports page types of 4KB, 2MB, 1GB and 512GB. The untrusted OS/VMM may assign memory to the TVM at any architecture-supported -page size. This assignment is enforced via the TSM-driver and the TSM. -Specifically, the TSM-driver configures the memory tracking table (MTT) after -enforcing the security requirements to track the assignment of memory pages to -a supervisor domain/TSM. The TSM manages subsequent assignment of memory to -TVMs. - -Thus, memory access-control is enforced at two levels: - -* Isolation of memory assigned to TEEs - this includes memory assigned to the -TSM as well as any TVMs - this tracking is configured by the firmware TCB -(TSM-driver) via the Memory Tracking Table structure and is enforced by the CPU -MMU. The MTT tracks the access permissions for confidential supervisor domains +page size, i.e., 4KB, 2MB, 1GB and 512GB, according to RISC-V architecture. +This assignment is supervised by the TSM-driver and the TSM and enforced using +a specific hardware memory isolation component. Specifically, memory access-control +is enforced at two levels: + +* Isolation of memory assigned to the confidential supervisor domain (TSM and TVMs). +This tracking is configured by the firmware TCB (TSM-driver) and enforced using a +hardware memory isolation mechanism, e.g., Memory Tracking Table (MTT), PMP. +These mechanisms track access permissions for confidential supervisor domains and hosting supervisor domains for all software-accessible physical memory addresses. 
-* Isolation of memory between TVMs - memory tracking is augmented by the TSM -via the G-stage translation structures to maintain compatibility with OS/VMM -memory management, and is also enforced by the CPU MMU. The correct operation of -this access-control level is dependent on trusted enforcement of item 1 above. +* Isolation of memory between TVMs within a confidential supervisor domain. +The memory tracking is augmented by the TSM via the G-stage translation structures to +maintain compatibility with OS/VMM memory management, and is also enforced by the CPU's +memory management unit (MMU). The correct operation of this access-control level is dependent +on trusted enforcement of item 1 above. + +In CoVE implementations that support MTT, the TSM-driver configures the MTT +after enforcing the security requirements to track the assignment of memory pages to +a supervisor domain/TSM. The TSM manages subsequent assignment of memory to TVMs. In implementations +that do not implement MTT, memory must be statically partitioned into confidential and +non-confidential and the TSM is required to track assignment of pages in confidential memory to TVMs. ==== Security requirements for TVM memory mappings @@ -199,7 +235,8 @@ the GPA and enforced for memory access for the TVM by the hardware. ==== Information tracked per physical page -The Extended Memory Tracking Table (EMTT) information managed by the TSM +For implementations that utilize MTT the Extended Memory Tracking Table (EMTT) +information managed by the TSM is used to track additional fields of metadata associated with physical addresses. The page size is implicit in the MTT and EMTT lookup - 4KB, 2MB, 1GB, 512GB. @@ -229,26 +266,25 @@ per the global TLB management. If the page is assigned to a TVM, it is versioned the TVM-local TLB management. | Additional meta-data | Locking state |=== -% HGAT above what does it stand for, hypervisor guest address translation? should it be HGATP, or HGATP should be HGAT? ==== Page walk and Translation caching considerations Any caching of the address translation information when the memory tracking for confidential memory is enabled must cache whether the address translation is for -a TEE context or not. A miss in the cached MTT information is expected to cause -a lookup of the MTT structure using the PA and the resolved page size for TEE -access evaluation - which results in the TEE access information that is cached. +a TEE context or not. A miss in the cached address translation information is expected to cause +a lookup of the address translation structure using the physical address (PA) and the resolved page +size for TEE access evaluation - which results in the TEE access information that is cached. -The MTT lookups are performed using the physical address, and must be enforced -for all modes of operation i.e., with paging disabled, one-level paging and -guest-stage paging. - -Any MTT cached information may be flushed as part of HFENCE.GVMA. The TSM and +In CoVE implementations with MTT, the MTT lookups are performed using the physical address, +and must be enforced for all modes of operation i.e., with paging disabled, one-level paging and +guest-stage paging. Any MTT cached information may be flushed as part of HFENCE.GVMA. The TSM and VMM may both issue this operation. TSM issues this fence when memory access -is transferred between TEE and non-TEE domains via sbi_covh_convert_pages. +is transferred between TEE and non-TEE domains via `sbi_covh_convert_pages()`. 
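+
+The following non-normative sketch summarizes the lookup order described above. The helper
+names (`mtt_cached_lookup()`, `mtt_walk()`, `mtt_cache_fill()`) are illustrative only and are
+not part of any RISC-V or CoVE interface; real implementations perform these steps in hardware.
+
+[source, C]
+-----
+#include <stdbool.h>
+#include <stdint.h>
+
+typedef enum { PA_NON_CONFIDENTIAL, PA_CONFIDENTIAL } mtt_class_t;
+
+/* Illustrative helpers standing in for the hardware MTT cache and walker. */
+extern bool mtt_cached_lookup(uint64_t pa, mtt_class_t *out);
+extern mtt_class_t mtt_walk(uint64_t pa);
+extern void mtt_cache_fill(uint64_t pa, mtt_class_t cls);
+
+bool tee_access_check(uint64_t pa, bool tee_context)
+{
+    mtt_class_t cls;
+
+    /* A miss in the cached MTT information causes a lookup of the MTT
+     * structure using the PA and the resolved page size; the resulting
+     * TEE access information is then cached. */
+    if (!mtt_cached_lookup(pa, &cls)) {
+        cls = mtt_walk(pa);
+        mtt_cache_fill(pa, cls);
+    }
+
+    /* Non-TEE contexts must never reach confidential memory. Accesses from
+     * a TEE context are additionally subject to the TSM-managed G-stage
+     * mapping (the second level of access control, not modeled here). */
+    if (!tee_context && cls == PA_CONFIDENTIAL)
+        return false;
+
+    return true;
+}
+-----
+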
 ==== Page conversion
 
+This section applies to CoVE implementations that support page conversion, i.e., those implementing MTT.
+
 Post measured boot, the system memory map must be available to the TSM on load
 (accessed as part of initialization of the TSM). This memory map structure may be
 placed in the memory that is accessible only to the hardware and software TCB. VMM-chosen
@@ -304,9 +340,9 @@ TLB version. A similar TLB version is managed associated with the physical
 address in the EMTT.
 
 If the VMM initiates memory conversion to confidential, or any change to an
-assigned confidential and present GPA mapping for a TVM, e.g., remove, relocate,
-promote etc., then it must execute the following sequence (enforced by TSM) to
-affect that change:
+assigned confidential and present guest physical address (GPA) mapping for a TVM,
+e.g., remove, relocate, promote etc., then it must execute the following sequence
+(enforced by the TSM) to effect that change:
 
 * Invalidate the mapping it wants to modify (page or range of pages). This step
 prevents new cached mappings from being populated in the TLB.
@@ -337,6 +373,9 @@ mapping and validate the mapping.
 
 ==== Page Mapping Page Assignment
 
+This section applies to CoVE implementations that support OS/VMM-initiated page assignment
+to a TVM.
+
 The VMM uses this operation to add a hgatp structure page to be used for
 mapping a guest physical address (GPA) to a physical address (PA). The inputs
 to this operation are the TVM identifier and the physical address(es) for the new
@@ -358,6 +397,9 @@ as valid).
 
 ==== Measured page assignment into a TVM memory map
 
+This section applies to CoVE implementations that support OS/VMM-initiated page assignment
+to a TVM.
+
 VMM uses the sbi_covh_add_tvm_zero/measured_pages interfaces to add a
 4KB/2MB/1GB page to the TVM. The page assigned to the TVM is identified by its
 PA. A source page (also PA) may be provided to initialize the page contents. In
@@ -611,15 +653,14 @@ executing, the VMM stops TVM execution by issuing an asynchronous interrupt
 that yields the virtual hart and taking control back into the VMM (without any
 TVM state leakage as that is context saved by the TSM on the trap due to the
 interrupt). Once the TVM virtual harts are stopped, the VMM must issue a
-sbi_covh_destroy_tvm that can verify that no TVM harts are executing and
+`sbi_covh_destroy_tvm()` that can verify that no TVM harts are executing and
 unassigns all memory assigned to the TVM.
 
-The VMM may grant the confidential memory to another TVM or may
-reclaim all memory granted to the TVM via sbi_covh_reclaim_pages which will
+CoVE implementations supporting dynamic memory conversion between confidential
+and non-confidential memory may allow the VMM to grant the confidential memory to another
+TVM or to reclaim all memory granted to the TVM via `sbi_covh_reclaim_pages()`, which will
 verify the TSM hgatp mapping and tracking for the page and restore it as
-a VMM-available page to grant to a non-confidential VM.
-
-*Reclaim TSM operation*:
+a VMM-available page to grant to a non-confidential VM. This TSM reclaim operation (a non-normative sketch follows the list below):
 
 * Verifies that the PAs referenced are either Non-confidential (No-operation) or
 Confidential-Unassigned state.
@@ -629,6 +670,10 @@ Confidential-Unassigned state.
 and returns the PA as an Non-Conf page to the VMM.
 * VMM translations to the PA (via 1st or G stage mappings) may be created now.
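+
+A minimal, non-normative host-side sketch of the teardown and reclaim sequence described above
+is shown below. The argument list of `sbi_covh_reclaim_pages()` (base physical address and
+number of pages) is assumed here purely for illustration; the normative signatures are those
+given in the COVH function reference of this specification.
+
+[source, C]
+-----
+/* Non-normative sketch: tear down a TVM and return its confidential memory
+ * to the VMM on implementations that support dynamic page conversion. */
+void vmm_teardown_tvm(unsigned long tvm_guest_id,
+                      unsigned long base_page_addr,
+                      unsigned long num_pages)
+{
+    struct sbiret ret;
+
+    /* All TVM virtual harts are assumed to be stopped already, e.g., kicked
+     * out of sbi_covh_run_tvm_vcpu() by an asynchronous interrupt. */
+
+    /* The TSM verifies that no TVM harts are executing and unassigns all
+     * memory assigned to the TVM. */
+    ret = sbi_covh_destroy_tvm(tvm_guest_id);
+    if (ret.error != SBI_SUCCESS)
+        return;
+
+    /* Assumed signature: reclaim the now-unassigned confidential pages so
+     * that they can again back non-confidential VMs; the TSM scrubs their
+     * contents and clears the confidential tracking before returning them. */
+    ret = sbi_covh_reclaim_pages(base_page_addr, num_pages);
+    (void)ret;
+}
+-----
+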
+In CoVE implementations that do not support dynamic memory conversion between confidential +and non-confidential, the TSM must scrub page contents and should make these pages available +for future assignments to new TVMs. + === RAS interaction The TSM performs minimal fail-safe tasks when handling RAS events. diff --git a/specification/threatmodel.adoc b/specification/threatmodel.adoc index e83d389..9e6d31b 100644 --- a/specification/threatmodel.adoc +++ b/specification/threatmodel.adoc @@ -80,7 +80,7 @@ memory) via hardware approaches, including via exposed interface/links to other CPU sockets, memory and/or devices assigned to a TVM. T13: Downgrading TEE TCB elements (example TSM-driver, TSM) to older -versions or loading Invalid TEE TCB elements on the platform to enable +versions or loading invalid TEE TCB elements on the platform to enable confidentiality, integrity attacks. T14: Leveraging transient execution side-channel attacks in TSM-driver, @@ -141,16 +141,16 @@ non-normative Reference* Required | MMU, xPMP, MTT | Confidential memory should be dynamically allocated/ unallocated as required | RISC-V Priv. ISA, Supervisor Domains (Sdid, Smmtt) -| TEE CPU State Protection | State Isolation | Required | Priv. levels (M,S,HS, -U) and Execution context (ASID, VMID, SDID) | Prevent non-TCB components from +| TEE CPU State Protection | State Isolation | Required | Priv. levels (M, S, HS, U) +and execution context (ASID, VMID, SDID) | Prevent non-TCB components from arbitrarily accessing/modifying TEE CPU state | Priv ISA w/ virtual memory system, Supervisor Domains (Sdid, Smmtt) -| Memory Confidentiality | Memory isolation (read) | Required | cryptography +| Memory Confidentiality | Memory isolation (read) | Required | Cryptography and/or MMU, xPMP, MTT extension | Prevent non-TCB components from reading TEE memory | Priv ISA w/ virtual memory system, Supervisor Domains (Smmtt) -| Memory Confidentiality | Cipher text read prevention | Required | cryptography +| Memory Confidentiality | Cipher text read prevention | Optional | Cryptography and/or MMU, xPMP, MTT extension | Prevent non-TCB components from accessing encrypted TEE memory | Supervisor Domains @@ -177,16 +177,16 @@ specific | cryptography and/or MMU, xPMP, MTT extension | Prevent hardware attac DRAM-bus attacks and physical attacks that replace TEE memory with tampered / old data | Security Model -| Memory Integrity | Memory isolation (Write exec) | Required | cryptography +| Memory Integrity | Memory isolation (Write exec) | Required | Cryptography and/or MMU, xPMP, MTT | Prevent TEE from executing from normal memory; Enforce integrity of TEE data on writes | Supervisor Domains (Sdid, Smmtt) | Memory Integrity | Rowhammer attack prevention | Implementation-specific | -cryptography and/or memory-specific extension | Prevent non-TCB components from +Cryptography and/or memory-specific extension | Prevent non-TCB components from flipping bits of TEE memory | Security Model | Shared Memory | TEE controls data shared with non-TCB components | Required | -cryptography and/or MMU, xPMP, MTT | Prevent non-TCB code from exfiltrating +Cryptography and/or MMU, xPMP, MTT | Prevent non-TCB code from exfiltrating information without TEE consent/opt-in | Supervisor Domains (Sdid, Smmtt) | Shared Memory | TEE controls data shared with another TEE | Implementation @@ -317,7 +317,7 @@ Workloads | Malicious host tampers with nested VMM policies | Future CoVE ABI Interoperability with security features for TVM workload | Unauthorised security 
TVM | Future CoVE ABI -| Operational Features | QOS interoperability | Implementation-specific | +| Operational Features | QoS interoperability | Implementation-specific | Interoperability with QoS features for TVM workload | Malicious host uses QoS capabilities as a side-channel | Security Model