docs: add release v2.16.0

This commit is contained in:
malt3 2024-02-29 08:59:55 +00:00 committed by Malte Poll
parent 93eb8f0694
commit c4f27f62ee
72 changed files with 8740 additions and 0 deletions

View file

@ -0,0 +1,62 @@
# Feature status of clouds
What works on which cloud? Currently, Confidential VMs (CVMs) are available in varying quality on the different clouds and software stacks.
For Constellation, the ideal environment provides the following:
1. Ability to run arbitrary software and images inside CVMs
2. CVMs based on AMD SEV-SNP (available in EPYC CPUs since the Milan generation) or Intel TDX (available in Xeon CPUs since the Sapphire Rapids generation)
3. Ability for CVM guests to obtain raw hardware attestation statements
4. Reviewable, open-source firmware inside CVMs
5. Capability of the firmware to attest the integrity of the code it passes control to, e.g., with an embedded virtual TPM (vTPM)
(1) is a functional must-have. (2)--(5) are required for remote attestation that fully keeps the infrastructure/cloud out. Constellation can work without them or with approximations, but won't protect against certain privileged attackers anymore.
The following table summarizes the state of features for different infrastructures as of June 2023.
| **Feature** | **Azure** | **GCP** | **AWS** | **OpenStack (Yoga)** |
|-----------------------------------|-----------|---------|---------|----------------------|
| **1. Custom images** | Yes | Yes | Yes | Yes |
| **2. SEV-SNP or TDX** | Yes | Yes | Yes | Depends on kernel/HV |
| **3. Raw guest attestation** | Yes | Yes | Yes | Depends on kernel/HV |
| **4. Reviewable firmware** | No | No | Yes | Depends on kernel/HV |
| **5. Confidential measured boot** | Yes | No | No | Depends on kernel/HV |
## Microsoft Azure
With its [CVM offering](https://docs.microsoft.com/en-us/azure/confidential-computing/confidential-vm-overview), Azure provides the best foundations for Constellation.
Regarding (3), Azure provides direct access to remote-attestation statements.
The firmware runs in an isolated domain inside the CVM and exposes a vTPM (5), but it's closed source (4).
On SEV-SNP, Azure uses VM Privilege Level (VMPL) isolation for the separation of firmware and the rest of the VM; on TDX, they use TD partitioning.
This firmware is signed by Azure.
The signature is reflected in the remote-attestation statements of CVMs.
Thus, the Azure closed-source firmware becomes part of Constellation's trusted computing base (TCB).
## Google Cloud Platform (GCP)
The [CVMs Generally Available in GCP](https://cloud.google.com/compute/confidential-vm/docs/create-confidential-vm-instance) are based on AMD SEV but don't have SNP features enabled.
CVMs with SEV-SNP enabled are currently in [public preview](https://cloud.google.com/blog/products/identity-security/rsa-snp-vm-more-confidential). Regarding (3), with their SEV-SNP offering Google provides direct access to remote-attestation statements.
However, regarding (5), attestation is partially based on the [Shielded VM vTPM](https://cloud.google.com/compute/shielded-vm/docs/shielded-vm#vtpm) for [measured boot](../architecture/attestation.md#measured-boot), which is a vTPM managed by Google's hypervisor.
Hence, the hypervisor is currently part of Constellation's TCB.
Regarding (4), the CVMs still include closed-source firmware.
In the past, Intel and Google have [collaborated](https://cloud.google.com/blog/products/identity-security/rsa-google-intel-confidential-computing-more-secure) to enhance the security of TDX.
Recently, Google has announced a [private preview for TDX](https://cloud.google.com/blog/products/identity-security/confidential-vms-on-intel-cpus-your-datas-new-intelligent-defense?hl=en).
With TDX on Google, Constellation has a similar TCB and attestation flow as with the current SEV-SNP offering.
## Amazon Web Services (AWS)
Amazon EC2 [supports AMD SEV-SNP](https://aws.amazon.com/de/about-aws/whats-new/2023/04/amazon-ec2-amd-sev-snp/).
Regarding (3), AWS provides direct access to remote-attestation statements.
However, regarding (5), attestation is partially based on the [NitroTPM](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/nitrotpm.html) for [measured boot](../architecture/attestation.md#measured-boot), which is a vTPM managed by the Nitro hypervisor.
Hence, the hypervisor is currently part of Constellation's TCB.
Regarding (4), the [firmware is open source](https://github.com/aws/uefi) and can be reproducibly built.
## OpenStack
OpenStack is an open-source cloud and infrastructure management software. It's used by many smaller CSPs and datacenters. In the latest *Yoga* version, OpenStack has basic support for CVMs. However, much depends on the employed kernel and hypervisor. Features (2)--(4) are likely to be a *Yes* with Linux kernel version 6.2. Thus, going forward, OpenStack on corresponding AMD or Intel hardware will be a viable underpinning for Constellation.
## Conclusion
The different clouds and software like the Linux kernel and OpenStack are in the process of building out their support for state-of-the-art CVMs. Azure has already most features in place. For Constellation, the status quo means that the TCB has different shapes on different infrastructures. With broad SEV-SNP support coming to the Linux kernel, we soon expect a normalization of features across infrastructures.

View file

@ -0,0 +1,42 @@
# Confidential Kubernetes
We use the term *Confidential Kubernetes* to refer to the concept of using confidential-computing technology to shield entire Kubernetes clusters from the infrastructure. The three defining properties of this concept are:
1. **Workload shielding**: the confidentiality and integrity of all workload-related data and code are enforced.
2. **Control plane shielding**: the confidentiality and integrity of the cluster's control plane, state, and workload configuration are enforced.
3. **Attestation and verifiability**: the two properties above can be verified remotely based on hardware-rooted cryptographic certificates.
Each of the above properties is equally important. Only with all three in conjunction, an entire cluster can be shielded without gaps.
## Constellation security features
Constellation implements the Confidential Kubernetes concept with the following security features.
* **Runtime encryption**: Constellation runs all Kubernetes nodes inside Confidential VMs (CVMs). This gives runtime encryption for the entire cluster.
* **Network and storage encryption**: Constellation augments this with transparent encryption of the [network](../architecture/networking.md), [persistent storage](../architecture/encrypted-storage.md), and other managed storage like [AWS S3](../architecture/encrypted-storage.md#encrypted-s3-object-storage). Thus, workloads and control plane are truly end-to-end encrypted: at rest, in transit, and at runtime.
* **Transparent key management**: Constellation manages the corresponding [cryptographic keys](../architecture/keys.md) inside CVMs.
* **Node attestation and verification**: Constellation verifies the integrity of each new CVM-based node using [remote attestation](../architecture/attestation.md). Only "good" nodes receive the cryptographic keys required to access the network and storage of a cluster.
* **Confidential computing-optimized images**: A node is "good" if it's running a signed Constellation [node image](../architecture/images.md) inside a CVM and is in the expected state. (Node images are hardware-measured during boot. The measurements are reflected in the attestation statements that are produced by nodes and verified by Constellation.)
* **"Whole cluster" attestation**: Towards the DevOps engineer, Constellation provides a single hardware-rooted certificate from which all of the above can be verified.
With the above, Constellation wraps an entire cluster into one coherent and verifiable *confidential context*. The concept is depicted in the following.
![Confidential Kubernetes](../_media/concept-constellation.svg)
## Contrast: Managed Kubernetes with CVMs
In contrast, managed Kubernetes with CVMs, as it's for example offered in [AKS](https://azure.microsoft.com/en-us/services/kubernetes-service/) and [GKE](https://cloud.google.com/kubernetes-engine), only provides runtime encryption for certain worker nodes. Here, each worker node is a separate (and typically unverified) confidential context. This only provides limited security benefits as it only prevents direct access to a worker node's memory. The large majority of potential attacks through the infrastructure remain unaffected. This includes attacks through the control plane, access to external key management, and the corruption of worker node images. This leaves many problems unsolved. For instance, *Node A* has no means to verify if *Node B* is "good" and if it's OK to share data with it. Consequently, this approach leaves a large attack surface, as is depicted in the following.
![Concept: Managed Kubernetes plus CVMs](../_media/concept-managed.svg)
The following table highlights the key differences in terms of features.
| | Managed Kubernetes with CVMs | Confidential Kubernetes (Constellation✨) |
|-------------------------------------|------------------------------|--------------------------------------------|
| Runtime encryption | Partial (data plane only)| **Yes** |
| Node image verification | No | **Yes** |
| Full cluster attestation | No | **Yes** |
| Transparent network encryption | No | **Yes** |
| Transparent storage encryption | No | **Yes** |
| Confidential key management | No | **Yes** |
| Cloud agnostic / multi-cloud | No | **Yes** |

View file

@ -0,0 +1,33 @@
# License
## Source code
Constellation's source code is available on [GitHub](https://github.com/edgelesssys/constellation) under the [GNU Affero General Public License v3.0](https://github.com/edgelesssys/constellation/blob/main/LICENSE).
## Binaries
Edgeless Systems provides ready-to-use and [signed](../architecture/attestation.md#chain-of-trust) binaries of Constellation. This includes the CLI and the [node images](../architecture/images.md).
These binaries may be used free of charge within the bounds of Constellation's [**Community License**](#community-license). An [**Enterprise License**](#enterprise-license) can be purchased from Edgeless Systems.
The Constellation CLI displays relevant license information when you initialize your cluster. You are responsible for staying within the bounds of your respective license. Constellation doesn't enforce any limits so as not to endanger your cluster's availability.
## Terraform provider
Edgeless Systems provides a [Terraform provider](https://github.com/edgelesssys/terraform-provider-constellation/releases), which may be used free of charge within the bounds of Constellation's [**Community License**](#community-license). An [**Enterprise License**](#enterprise-license) can be purchased from Edgeless Systems.
You are responsible for staying within the bounds of your respective license. Constellation doesn't enforce any limits so as not to endanger your cluster's availability.
## Community License
You are free to use the Constellation binaries provided by Edgeless Systems to create services for internal consumption, evaluation purposes, or non-commercial use. You must not use the Constellation binaries to provide commercial hosted services to third parties. Edgeless Systems gives no warranties and offers no support.
## Enterprise License
Enterprise Licenses don't have the above limitations and come with support and additional features. Find out more at the [product website](https://www.edgeless.systems/products/constellation/).
Once you have received your Enterprise License file, place it in your [Constellation workspace](../architecture/orchestration.md#workspaces) in a file named `constellation.license`.
## CSP Marketplaces
Constellation is available through the Marketplaces of AWS, Azure, GCP, and STACKIT. This allows you to create self-managed Constellation clusters that are billed on a pay-per-use basis (hourly, per vCPU) with your CSP account. You can still get direct support by Edgeless Systems. For more information, please [contact us](https://www.edgeless.systems/enterprise-support/).

View file

@ -0,0 +1,102 @@
# Application benchmarks
## HashiCorp Vault
[HashiCorp Vault](https://www.vaultproject.io/) is a distributed secrets management software that can be deployed to Kubernetes.
HashiCorp maintains a benchmarking tool for vault, [vault-benchmark](https://github.com/hashicorp/vault-benchmark/).
Vault-benchmark generates load on a Vault deployment and measures response times.
This article describes the results from running vault-benchmark on Constellation, AKS, and GKE.
You can find the setup for producing the data discussed in this article in the [vault-benchmarks](https://github.com/edgelesssys/vault-benchmarks) repository.
The Vault API used during benchmarking is the [transits secret engine](https://developer.hashicorp.com/vault/docs/secrets/transit).
This allows services to send data to Vault for encryption, decryption, signing, and verification.
## Results
On each run, vault-benchmark sends requests and measures the latencies.
The measured latencies are aggregated through various statistical features.
After running the benchmark n times, the arithmetic mean over a subset of the reported statistics is calculated.
The selected features are arithmetic mean, 99th percentile, minimum, and maximum.
Arithmetic mean gives a general sense of the latency on each target.
The 99th percentile shows performance in (most likely) erroneous states.
Minimum and maximum mark the range within which latency varies each run.
The benchmark was configured with 1300 workers and 10 seconds per run.
Those numbers were chosen empirically.
The latency was stabilizing at 10 seconds runtime, not changing with further increase.
Increasing the number of workers beyond 1300 leads to request failures, marking the limit Vault was able to handle in this setup.
All results are based on 100 runs.
The following data was generated while running five replicas, one primary, and four standby nodes.
All numbers are in seconds if not indicated otherwise.
```
========== Results AKS ==========
Mean: mean: 1.632200, variance: 0.002057
P99: mean: 5.480679, variance: 2.263700
Max: mean: 6.651001, variance: 2.808401
Min: mean: 0.011415, variance: 0.000133
========== Results GKE ==========
Mean: mean: 1.656435, variance: 0.003615
P99: mean: 6.030807, variance: 3.955051
Max: mean: 7.164843, variance: 3.300004
Min: mean: 0.010233, variance: 0.000111
========== Results C11n ==========
Mean: mean: 1.651549, variance: 0.001610
P99: mean: 5.780422, variance: 3.016106
Max: mean: 6.942997, variance: 3.075796
Min: mean: 0.013774, variance: 0.000228
========== AKS vs C11n ==========
Mean: +1.171577 % (AKS is faster)
P99: +5.185495 % (AKS is faster)
Max: +4.205618 % (AKS is faster)
Min: +17.128781 % (AKS is faster)
========== GKE vs C11n ==========
Mean: -0.295851 % (GKE is slower)
P99: -4.331603 % (GKE is slower)
Max: -3.195248 % (GKE is slower)
Min: +25.710886 % (GKE is faster)
```
**Interpretation**: Latencies are all within ~5% of each other.
AKS performs slightly better than GKE and Constellation (C11n) in all cases except minimum latency.
Minimum latency is the lowest for GKE.
Compared to GKE, Constellation had slightly lower peak latencies (99th percentile and maximum), indicating that Constellation could have handled slightly more concurrent accesses than GKE.
Overall, performance is at comparable levels across all three distributions.
Based on these numbers, you can use a similarly sized Constellation cluster to run your existing Vault deployment.
### Visualization
The following plots visualize the data presented above as [box plots](https://en.wikipedia.org/wiki/Box_plot).
The whiskers denote the minimum and maximum.
The box stretches from the 25th to the 75th percentile, with the dividing bar marking the 50th percentile.
The circles outside the whiskers denote outliers.
<details>
<summary>Mean Latency</summary>
![Mean Latency](../../_media/benchmark_vault/5replicas/mean_latency.png)
</details>
<details>
<summary>99th Percentile Latency</summary>
![99th Percentile Latency](../../_media/benchmark_vault/5replicas/p99_latency.png)
</details>
<details>
<summary>Maximum Latency</summary>
![Maximum Latency](../../_media/benchmark_vault/5replicas/max_latency.png)
</details>
<details>
<summary>Minimum Latency</summary>
![Minimum Latency](../../_media/benchmark_vault/5replicas/min_latency.png)
</details>

View file

@ -0,0 +1,204 @@
# I/O performance benchmarks
To assess the overall performance of Constellation, this benchmark evaluates Constellation v2.6.0 in terms of storage I/O using [`fio`](https://fio.readthedocs.io/en/latest/fio_doc.html) and network performance using the [Kubernetes Network Benchmark](https://github.com/InfraBuilder/k8s-bench-suite#knb--kubernetes-network-be).
This benchmark tested Constellation on Azure and GCP and compared the results against the managed Kubernetes offerings AKS and GKE.
## Configurations
### Constellation
The benchmark was conducted with Constellation v2.6.0, Kubernetes v1.25.7, and Cilium v1.12.
It ran on the following infrastructure configurations.
Constellation on Azure:
- Nodes: 3 (1 Control-plane, 2 Worker)
- Machines: `DC4as_v5`: 3rd Generation AMD EPYC 7763v (Milan) processor with 4 Cores, 16 GiB memory
- CVM: `true`
- Region: `West US`
- Zone: `2`
Constellation on GCP:
- Nodes: 3 (1 Control-plane, 2 Worker)
- Machines: `n2d-standard-4`: 2nd Generation AMD EPYC (Rome) processor with 4 Cores, 16 GiB of memory
- CVM: `true`
- Zone: `europe-west3-b`
### AKS
On AKS, the benchmark used Kubernetes `v1.24.9` and nodes with version `AKSUbuntu-1804gen2containerd-2023.02.15`.
AKS ran with the [`kubenet`](https://learn.microsoft.com/en-us/azure/aks/concepts-network#kubenet-basic-networking) CNI and the [default CSI driver](https://learn.microsoft.com/en-us/azure/aks/azure-disk-csi) for Azure Disk.
The following infrastructure configurations was used:
- Nodes: 2 (2 Worker)
- Machines: `D4as_v5`: 3rd Generation AMD EPYC 7763v (Milan) processor with 4 Cores, 16 GiB memory
- CVM: `false`
- Region: `West US`
- Zone: `2`
### GKE
On GKE, the benchmark used Kubernetes `v1.24.9` and nodes with version `1.24.9-gke.3200`.
GKE ran with the [`kubenet`](https://cloud.google.com/kubernetes-engine/docs/concepts/network-overview) CNI and the [default CSI driver](https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/gce-pd-csi-driver) for Compute Engine persistent disk.
The following infrastructure configurations was used:
- Nodes: 2 (2 Worker)
- Machines: `n2d-standard-4` 2nd Generation AMD EPYC (Rome) processor with 4 Cores, 16 GiB of memory
- CVM: `false`
- Zone: `europe-west3-b`
## Results
### Network
This section gives a thorough analysis of the network performance of Constellation, specifically focusing on measuring TCP and UDP bandwidth.
The benchmark measured the bandwidth of pod-to-pod and pod-to-service connections between two different nodes using [`iperf`](https://iperf.fr/).
GKE and Constellation on GCP had a maximum network bandwidth of [10 Gbps](https://cloud.google.com/compute/docs/general-purpose-machines#n2d_machines).
AKS with `Standard_D4as_v5` machines a maximum network bandwidth of [12.5 Gbps](https://learn.microsoft.com/en-us/azure/virtual-machines/dasv5-dadsv5-series#dasv5-series).
The Confidential VM equivalent `Standard_DC4as_v5` currently has a network bandwidth of [1.25 Gbps](https://learn.microsoft.com/en-us/azure/virtual-machines/dcasv5-dcadsv5-series#dcasv5-series-products).
Therefore, to make the test comparable, both AKS and Constellation on Azure were running with `Standard_DC4as_v5` machines and 1.25 Gbps bandwidth.
Constellation on Azure and AKS used an MTU of 1500.
Constellation on GCP used an MTU of 8896. GKE used an MTU of 1450.
The difference in network bandwidth can largely be attributed to two factors.
- Constellation's [network encryption](../../architecture/networking.md) via Cilium and WireGuard, which protects data in-transit.
- [AMD SEV using SWIOTLB bounce buffers](https://lore.kernel.org/all/20200204193500.GA15564@ashkalra_ubuntu_server/T/) for all DMA including network I/O.
#### Pod-to-Pod
In this scenario, the client Pod connects directly to the server pod via its IP address.
```mermaid
flowchart LR
subgraph Node A
Client[Client]
end
subgraph Node B
Server[Server]
end
Client ==>|traffic| Server
```
The results for "Pod-to-Pod" on Azure are as follows:
![Network Pod2Pod Azure benchmark graph](../../_media/benchmark_net_p2p_azure.png)
The results for "Pod-to-Pod" on GCP are as follows:
![Network Pod2Pod GCP benchmark graph](../../_media/benchmark_net_p2p_gcp.png)
#### Pod-to-Service
In this scenario, the client Pod connects to the server Pod via a ClusterIP service. This is more relevant to real-world use cases.
```mermaid
flowchart LR
subgraph Node A
Client[Client] ==>|traffic| Service[Service]
end
subgraph Node B
Server[Server]
end
Service ==>|traffic| Server
```
The results for "Pod-to-Pod" on Azure are as follows:
![Network Pod2SVC Azure benchmark graph](../../_media/benchmark_net_p2svc_azure.png)
The results for "Pod-to-Pod" on GCP are as follows:
![Network Pod2SVC GCP benchmark graph](../../_media/benchmark_net_p2svc_gcp.png)
In our recent comparison of Constellation on GCP with GKE, Constellation has 58% less TCP bandwidth. However, UDP bandwidth was slightly better with Constellation, thanks to its higher MTU.
Similarly, when comparing Constellation on Azure with AKS using CVMs, Constellation achieved approximately 10% less TCP and 40% less UDP bandwidth.
### Storage I/O
Azure and GCP offer persistent storage for their Kubernetes services AKS and GKE via the Container Storage Interface (CSI). CSI storage in Kubernetes is available via `PersistentVolumes` (PV) and consumed via `PersistentVolumeClaims` (PVC).
Upon requesting persistent storage through a PVC, GKE and AKS will provision a PV as defined by a default [storage class](https://kubernetes.io/docs/concepts/storage/storage-classes/).
Constellation provides persistent storage on Azure and GCP [that's encrypted on the CSI layer](../../architecture/encrypted-storage.md).
Similarly, upon a PVC request, Constellation will provision a PV via a default storage class.
For Constellation on Azure and AKS, the benchmark ran with Azure Disk storage [Standard SSD](https://learn.microsoft.com/en-us/azure/virtual-machines/disks-types#standard-ssds) of 400 GiB size.
The [DC4as machine type](https://learn.microsoft.com/en-us/azure/virtual-machines/dasv5-dadsv5-series#dasv5-series) with four cores provides the following maximum performance:
- 6400 (20000 burst) IOPS
- 144 MB/s (600 MB/s burst) throughput
However, the performance is bound by the capabilities of the [512 GiB Standard SSD size](https://learn.microsoft.com/en-us/azure/virtual-machines/disks-types#standard-ssds) (the size class of 400 GiB volumes):
- 500 (600 burst) IOPS
- 60 MB/s (150 MB/s burst) throughput
For Constellation on GCP and GKE, the benchmark ran with Compute Engine Persistent Disk Storage [pd-balanced](https://cloud.google.com/compute/docs/disks) of 400 GiB size.
The N2D machine type with four cores and pd-balanced provides the following [maximum performance](https://cloud.google.com/compute/docs/disks/performance#n2d_vms):
- 3,000 read IOPS
- 15,000 write IOPS
- 240 MB/s read throughput
- 240 MB/s write throughput
However, the performance is bound by the capabilities of a [`Zonal balanced PD`](https://cloud.google.com/compute/docs/disks/performance#zonal-persistent-disks) with 400 GiB size:
- 2400 read IOPS
- 2400 write IOPS
- 112 MB/s read throughput
- 112 MB/s write throughput
The [`fio`](https://fio.readthedocs.io/en/latest/fio_doc.html) benchmark consists of several tests.
The benchmark used [`Kubestr`](https://github.com/kastenhq/kubestr) to run `fio` in Kubernetes.
The default test performs randomized access patterns that accurately depict worst-case I/O scenarios for most applications.
The following `fio` settings were used:
- No Cloud caching
- No OS caching
- Single CPU
- 60 seconds runtime
- 10 seconds ramp-up time
- 10 GiB file
- IOPS: 4 KB blocks and 128 iodepth
- Bandwidth: 1024 KB blocks and 128 iodepth
For more details, see the [`fio` test configuration](https://github.com/edgelesssys/constellation/blob/main/.github/actions/e2e_benchmark/fio.ini).
The results for IOPS on Azure are as follows:
![I/O IOPS Azure benchmark graph](../../_media/benchmark_fio_azure_iops.png)
The results for IOPS on GCP are as follows:
![I/O IOPS GCP benchmark graph](../../_media/benchmark_fio_gcp_iops.png)
The results for bandwidth on Azure are as follows:
![I/O bandwidth Azure benchmark graph](../../_media/benchmark_fio_azure_bw.png)
The results for bandwidth on GCP are as follows:
![I/O bandwidth GCP benchmark graph](../../_media/benchmark_fio_gcp_bw.png)
On GCP, the results exceed the maximum performance guarantees of the chosen disk type. There are two possible explanations for this. The first is that there may be cloud caching in place that isn't configurable. Alternatively, the underlying provisioned disk size may be larger than what was requested, resulting in higher performance boundaries.
When comparing Constellation on GCP with GKE, Constellation has similar bandwidth but about 10% less IOPS performance. On Azure, Constellation has similar IOPS performance compared to AKS, where both likely hit the maximum storage performance. However, Constellation has approximately 15% less read and write bandwidth.
## Conclusion
Despite the added [security benefits](../security-benefits.md) that Constellation provides, it only incurs a slight performance overhead when compared to managed Kubernetes offerings such as AKS and GKE. In most compute benchmarks, Constellation is on par with it's alternatives.
While it may be slightly slower in certain I/O scenarios due to network and storage encryption, there is ongoing work to reduce this overhead to single digits.
For instance, storage encryption only adds between 10% to 15% overhead in terms of bandwidth and IOPS.
Meanwhile, the biggest performance impact that Constellation currently faces is network encryption, which can incur up to 58% overhead on a 10 Gbps network.
However, the Cilium team has conducted [benchmarks with Cilium using WireGuard encryption](https://docs.cilium.io/en/latest/operations/performance/benchmark/#encryption-wireguard-ipsec) on a 100 Gbps network that yielded over 15 Gbps.
We're confident that Constellation will provide a similar level of performance with an upcoming release.
Overall, Constellation strikes a great balance between security and performance, and we're continuously working to improve its performance capabilities while maintaining its high level of security.

View file

@ -0,0 +1,25 @@
# Performance analysis of Constellation
This section provides a comprehensive examination of the performance characteristics of Constellation, encompassing various aspects, including runtime encryption, I/O benchmarks, and real-world applications.
## Impact of runtime encryption on performance
All nodes in a Constellation cluster are executed inside Confidential VMs (CVMs). Consequently, the performance of Constellation is inherently linked to the performance of these CVMs.
### AMD and Azure benchmarking
AMD and Azure have collectively released a [performance benchmark](https://community.amd.com/t5/business/microsoft-azure-confidential-computing-powered-by-3rd-gen-epyc/ba-p/497796) for CVMs that utilize 3rd Gen AMD EPYC processors (Milan) with SEV-SNP. This benchmark, which included a variety of mostly compute-intensive tests such as SPEC CPU 2017 and CoreMark, demonstrated that CVMs experience only minor performance degradation (ranging from 2% to 8%) when compared to standard VMs. Such results are indicative of the performance that can be expected from compute-intensive workloads running with Constellation on Azure.
### AMD and Google benchmarking
Similarly, AMD and Google have jointly released a [performance benchmark](https://www.amd.com/system/files/documents/3rd-gen-epyc-gcp-c2d-conf-compute-perf-brief.pdf) for CVMs employing 3rd Gen AMD EPYC processors (Milan) with SEV-SNP. With high-performance computing workloads such as WRF, NAMD, Ansys CFS, and Ansys LS_DYNA, they observed analogous findings, with only minor performance degradation (between 2% and 4%) compared to standard VMs. These outcomes are reflective of the performance that can be expected for compute-intensive workloads running with Constellation on GCP.
## I/O performance benchmarks
We evaluated the [I/O performance](io.md) of Constellation, utilizing a collection of synthetic benchmarks targeting networking and storage.
We further compared this performance to native managed Kubernetes offerings from various cloud providers, to better understand how Constellation stands in relation to standard practices.
## Application benchmarking
To gauge Constellation's applicability to well-known applications, we performed a [benchmark of HashiCorp Vault](application.md) running on Constellation.
The results were then compared to deployments on the managed Kubernetes offerings from different cloud providers, providing a tangible perspective on Constellation's performance in actual deployment scenarios.

View file

@ -0,0 +1,12 @@
# Product features
Constellation is a Kubernetes engine that aims to provide the best possible data security in combination with enterprise-grade scalability and reliability features---and a smooth user experience.
From a security perspective, Constellation implements the [Confidential Kubernetes](confidential-kubernetes.md) concept and corresponding security features, which shield your entire cluster from the underlying infrastructure.
From an operational perspective, Constellation provides the following key features:
* **Native support for different clouds**: Constellation works on Microsoft Azure, Google Cloud Platform (GCP), Amazon Web Services (AWS), and STACKIT. Support for OpenStack-based environments is coming with a future release. Constellation securely interfaces with the cloud infrastructure to provide [cluster autoscaling](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler), [dynamic persistent volumes](https://kubernetes.io/docs/concepts/storage/dynamic-provisioning/), and [service load balancing](https://kubernetes.io/docs/concepts/services-networking/service/#loadbalancer).
* **High availability**: Constellation uses a [multi-master architecture](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/) with a [stacked etcd topology](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/ha-topology/#stacked-etcd-topology) to ensure high availability.
* **Integrated Day-2 operations**: Constellation lets you securely [upgrade](../workflows/upgrade.md) your cluster to a new release. It also lets you securely [recover](../workflows/recovery.md) a failed cluster. Both with a single command.
* **Support for Terraform**: Constellation includes a [Terraform provider](../workflows/terraform-provider.md) that lets you manage the full lifecycle of your cluster via Terraform.

View file

@ -0,0 +1,22 @@
# Security benefits and threat model
Constellation implements the [Confidential Kubernetes](confidential-kubernetes.md) concept and shields entire Kubernetes deployments from the infrastructure. More concretely, Constellation decreases the size of the trusted computing base (TCB) of a Kubernetes deployment. The TCB is the totality of elements in a computing environment that must be trusted not to be compromised. A smaller TCB results in a smaller attack surface. The following diagram shows how Constellation removes the *cloud & datacenter infrastructure* and the *physical hosts*, including the hypervisor, the host OS, and other components, from the TCB (red). Inside the confidential context (green), Kubernetes remains part of the TCB, but its integrity is attested and can be [verified](../workflows/verify-cluster.md).
![TCB comparison](../_media/tcb.svg)
Given this background, the following describes the concrete threat classes that Constellation addresses.
## Insider access
Employees and third-party contractors of cloud service providers (CSPs) have access to different layers of the cloud infrastructure.
This opens up a large attack surface where workloads and data can be read, copied, or manipulated. With Constellation, Kubernetes deployments are shielded from the infrastructure and thus such accesses are prevented.
## Infrastructure-based attacks
Malicious cloud users ("hackers") may break out of their tenancy and access other tenants' data. Advanced attackers may even be able to establish a permanent foothold within the infrastructure and access data over a longer period. Analogously to the *insider access* scenario, Constellation also prevents access to a deployment's data in this scenario.
## Supply chain attacks
Supply chain security is receiving lots of attention recently due to an [increasing number of recorded attacks](https://www.enisa.europa.eu/news/enisa-news/understanding-the-increase-in-supply-chain-security-attacks). For instance, a malicious actor could attempt to tamper Constellation node images (including Kubernetes and other software) before they're loaded in the confidential VMs of a cluster. Constellation uses [remote attestation](../architecture/attestation.md) in conjunction with public [transparency logs](../workflows/verify-cli.md) to prevent this.
In the future, Constellation will extend this feature to customer workloads. This will enable cluster owners to create auditable policies that precisely define which containers can run in a given deployment.