docs: update benchmarks with v2.6.0
Co-authored-by: Thomas Tendyck <51411342+thomasten@users.noreply.github.com>
.github/actions/e2e_benchmark/README.md (vendored, 46 changes)
@@ -51,7 +51,7 @@ Follow the [Azure documentation](https://learn.microsoft.com/en-us/azure/aks/lea

For example:

```bash
az aks create -g moritz-constellation -n benchmark --node-count 2
az aks create -g moritz-constellation -n benchmark --node-count 2 -s Standard_DC4as_v5
az aks get-credentials -g moritz-constellation -n benchmark
```

@@ -71,10 +71,29 @@ curl -fsSLO https://github.com/kastenhq/kubestr/releases/download/v${KUBESTR_VER
tar -xzf kubestr_${KUBESTR_VER}_${HOSTOS}_${HOSTARCH}.tar.gz
install kubestr /usr/local/bin

# Clone Constellation
git clone https://github.com/edgelesssys/constellation.git

# Create storage class without cloud caching
cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: default-no-cache
allowVolumeExpansion: true
allowedTopologies: []
mountOptions: []
parameters:
  skuname: StandardSSD_LRS
  cachingMode: None
provisioner: disk.csi.azure.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
EOF

# Run kubestr
mkdir -p out
kubestr fio -e "out/fio-AKS.json" -o json -s default -z 400Gi
kubestr fio -e "out/fio-AKS.json" -o json -s default -z 400Gi -f constellation/.github/actions/e2e_benchmark/fio.ini

# Run knb
workers="$(kubectl get nodes | grep nodepool)"
@@ -86,7 +105,6 @@ knb -f "out/knb-AKS.json" -o json --server-node $server --client-node $client
# Benchmarks done, do processing.

# Parse
git clone https://github.com/edgelesssys/constellation.git
mkdir -p benchmarks
export BDIR=benchmarks
export CSP=azure
@@ -96,7 +114,7 @@ export BENCH_RESULTS=out/
python constellation/.github/actions/e2e_benchmark/evaluate/parse.py

# Upload result to S3
S3_PATH=s3://edgeless-artifact-store/constellation/benchmarks
S3_PATH=s3://edgeless-artifact-store/constellation/benchmarks/<version>
aws s3 cp benchmarks/AKS.json ${S3_PATH}/AKS.json
```

@@ -111,18 +129,6 @@ gcloud container clusters create benchmark \
  --machine-type n2d-standard-4 \
  --num-nodes 2
gcloud container clusters get-credentials benchmark --region europe-west3-b
# create storage class for pd-standard
cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: pd-standard
provisioner: pd.csi.storage.gke.io
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
parameters:
  type: pd-standard
EOF
```

Once the cluster is ready, set up managing access via `kubectl` and take the benchmark:
@@ -141,9 +147,12 @@ curl -fsSLO https://github.com/kastenhq/kubestr/releases/download/v${KUBESTR_VER
tar -xzf kubestr_${KUBESTR_VER}_${HOSTOS}_${HOSTARCH}.tar.gz
install kubestr /usr/local/bin

# Clone Constellation
git clone https://github.com/edgelesssys/constellation.git

# Run kubestr
mkdir -p out
kubestr fio -e "out/fio-GKE.json" -o json -s pd-standard -z 400Gi
kubestr fio -e "out/fio-GKE.json" -o json -s standard-rwo -z 400Gi -f constellation/.github/actions/e2e_benchmark/fio.ini

# Run knb
workers="$(kubectl get nodes | grep default-pool)"
@@ -153,7 +162,6 @@ knb -f "out/knb-GKE.json" -o json --server-node "$server" --client-node "$client


# Parse
git clone https://github.com/edgelesssys/constellation.git
mkdir -p benchmarks
export BDIR=benchmarks
export CSP=gcp
@@ -163,7 +171,7 @@ export BENCH_RESULTS=out/
python constellation/.github/actions/e2e_benchmark/evaluate/parse.py

# Upload result to S3
S3_PATH=s3://edgeless-artifact-store/constellation/benchmarks
S3_PATH=s3://edgeless-artifact-store/constellation/benchmarks/<version>
aws s3 cp benchmarks/GKE.json ${S3_PATH}/GKE.json
```
.github/actions/e2e_benchmark/action.yml (vendored, 41 changes)
@@ -46,13 +46,50 @@ runs:
        ref: 1698974913b7b18ad54cf5860838029c295c77b1
        path: k8s-bench-suite

    - name: Run FIO benchmark

    - name: Run FIO benchmark without caching in Azure
      if: inputs.cloudProvider == 'azure'
      shell: bash
      env:
        KUBECONFIG: ${{ inputs.kubeconfig }}
      run: |
        cat <<EOF | kubectl apply -f -
        apiVersion: storage.k8s.io/v1
        kind: StorageClass
        metadata:
          name: encrypted-rwo-no-cache
        allowVolumeExpansion: true
        allowedTopologies: []
        mountOptions: []
        parameters:
          skuname: StandardSSD_LRS
          cachingMode: None
        provisioner: azuredisk.csi.confidential.cloud
        reclaimPolicy: Delete
        volumeBindingMode: Immediate
        EOF
        mkdir -p out
        kubestr fio -e "out/fio-constellation-${{ inputs.cloudProvider }}.json" -o json -s encrypted-rwo -z 400Gi
        kubestr fio -e "out/fio-constellation-${{ inputs.cloudProvider }}.json" -o json -s encrypted-rwo-no-cache -z 400Gi -f .github/actions/e2e_benchmark/fio.ini

    - name: Run FIO benchmark
      if: inputs.cloudProvider == 'gcp'
      shell: bash
      env:
        KUBECONFIG: ${{ inputs.kubeconfig }}
      run: |
        cat <<EOF | kubectl apply -f -
        apiVersion: storage.k8s.io/v1
        kind: StorageClass
        metadata:
          name: encrypted-balanced-rwo
        provisioner: gcp.csi.confidential.cloud
        volumeBindingMode: Immediate
        allowVolumeExpansion: true
        parameters:
          type: pd-balanced
        EOF
        mkdir -p out
        kubestr fio -e "out/fio-constellation-${{ inputs.cloudProvider }}.json" -o json -s encrypted-balanced-rwo -z 400Gi -f .github/actions/e2e_benchmark/fio.ini

    - name: Upload raw FIO benchmark results
      if: (!env.ACT)
.github/actions/e2e_benchmark/evaluate/.gitignore (vendored, new file, 4 changes)

@@ -0,0 +1,4 @@
__pycache__
benchmarks/
results/
out/
.github/actions/e2e_benchmark/evaluate/graph.py (vendored, 48 changes)
@@ -29,10 +29,7 @@ BAR_COLORS = ['#90FF99', '#929292', '#8B04DD', '#000000']

FONT_URL = "https://github.com/google/fonts/raw/main/apache/roboto/static/Roboto-Regular.ttf"
FONT_NAME = "Roboto-Regular.ttf"

# Rotate bar labels by X degrees
LABEL_ROTATE_BY = 30
LABEL_FONTSIZE = 9
FONT_SIZE = 13

# Some lookup dictionaries for x axis
fio_iops_unit = 'IOPS'
@@ -58,7 +55,7 @@ def configure() -> str:
    return out_dir


def bar_chart(data, title='', unit='', x_label= ''):
def bar_chart(data, title='', unit='', x_label=''):
    # """Draws a bar chart with multiple bars per data point.

    # Args:
@@ -72,7 +69,7 @@ def bar_chart(data, title='', unit='', x_label= ''):

    # Create plot and set configs
    plt.rcdefaults()
    plt.rc('font', family=FONT_NAME)
    plt.rc('font', family=FONT_NAME, size=FONT_SIZE)
    fig, ax = plt.subplots(figsize=(10, 5))

    # Calculate y positions
@@ -105,12 +102,14 @@ def bar_chart(data, title='', unit='', x_label= ''):
    ax.set_xlabel(x_label, fontdict={"fontsize": 12})
    if unit != '':
        unit = f"({unit})"
    ax.set_title(f'{title} {unit}', fontdict={"fontsize": 20, 'weight': 'bold'})
    ax.set_title(f'{title} {unit}', fontdict={
                 "fontsize": 20, 'weight': 'bold'})

    plt.tight_layout()
    #plt.show()
    # plt.show()
    return fig


def main():
    """ Download and setup fonts"""
    path = Path(tempfile.mkdtemp())
@@ -136,91 +135,90 @@ def main():
            'Failed reading {subject} benchmark records: {e}'.format(subject=test, e=e))

    # Network charts
    ## P2P TCP
    # P2P TCP
    net_data = {}
    for s, l in zip(SUBJECTS, LEGEND_NAMES):
        net_data[l] = int(combined_results[s]['knb']['pod2pod']['tcp_bw_mbit'])
    bar_chart(data=net_data,
              title='K8S CNI Benchmark - Pod to Pod - TCP - Bandwidth',
              unit=net_unit,
              x_label = f" TCP Bandwidth in {net_unit} - Higher is better")
              x_label=f" TCP Bandwidth in {net_unit} - Higher is better")
    save_name = os.path.join(out_dir, 'benchmark_net_p2p_tcp.png')
    plt.savefig(save_name)

    ## P2P TCP
    # P2P TCP
    net_data = {}
    for s, l in zip(SUBJECTS, LEGEND_NAMES):
        net_data[l] = int(combined_results[s]['knb']['pod2pod']['udp_bw_mbit'])
    bar_chart(data=net_data,
              title='K8S CNI Benchmark - Pod to Pod - UDP - Bandwidth',
              unit=net_unit,
              x_label = f" UDP Bandwidth in {net_unit} - Higher is better")
              x_label=f" UDP Bandwidth in {net_unit} - Higher is better")
    save_name = os.path.join(out_dir, 'benchmark_net_p2p_udp.png')
    plt.savefig(save_name)


    ## P2SVC TCP
    # P2SVC TCP
    net_data = {}
    for s, l in zip(SUBJECTS, LEGEND_NAMES):
        net_data[l] = int(combined_results[s]['knb']['pod2svc']['tcp_bw_mbit'])
    bar_chart(data=net_data,
              title='K8S CNI Benchmark - Pod to Service - TCP - Bandwidth',
              unit=net_unit,
              x_label = f" TCP Bandwidth in {net_unit} - Higher is better")
              x_label=f" TCP Bandwidth in {net_unit} - Higher is better")
    save_name = os.path.join(out_dir, 'benchmark_net_p2svc_tcp.png')
    plt.savefig(save_name)

    ## P2SVC UDP
    # P2SVC UDP
    net_data = {}
    for s, l in zip(SUBJECTS, LEGEND_NAMES):
        net_data[l] = int(combined_results[s]['knb']['pod2svc']['udp_bw_mbit'])
    bar_chart(data=net_data,
              title='K8S CNI Benchmark - Pod to Service - UDP - Bandwidth',
              unit=net_unit,
              x_label = f" UDP Bandwidth in {net_unit} - Higher is better")
              x_label=f" UDP Bandwidth in {net_unit} - Higher is better")
    save_name = os.path.join(out_dir, 'benchmark_net_p2svc_udp.png')
    plt.savefig(save_name)

    # FIO chart
    ## Read IOPS
    # Read IOPS
    fio_data = {}
    for s, l in zip(SUBJECTS, LEGEND_NAMES):
        fio_data[l] = int(combined_results[s]['fio']['read_iops']['iops'])
    bar_chart(data=fio_data,
              title='FIO Benchmark - Read - IOPS',
              x_label = f" Read {fio_iops_unit} - Higher is better")
              x_label=f" Read {fio_iops_unit} - Higher is better")
    save_name = os.path.join(out_dir, 'benchmark_fio_read_iops.png')
    plt.savefig(save_name)

    ## Write IOPS
    # Write IOPS
    fio_data = {}
    for s, l in zip(SUBJECTS, LEGEND_NAMES):
        fio_data[l] = int(combined_results[s]['fio']['write_iops']['iops'])
    bar_chart(data=fio_data,
              title='FIO Benchmark - Write - IOPS',
              x_label = f" Write {fio_iops_unit} - Higher is better")
              x_label=f" Write {fio_iops_unit} - Higher is better")
    save_name = os.path.join(out_dir, 'benchmark_fio_write_iops.png')
    plt.savefig(save_name)

    ## Read Bandwidth
    # Read Bandwidth
    fio_data = {}
    for s, l in zip(SUBJECTS, LEGEND_NAMES):
        fio_data[l] = int(combined_results[s]['fio']['read_bw']['bw_kbytes'])
    bar_chart(data=fio_data,
              title='FIO Benchmark - Read - Bandwidth',
              unit=fio_bw_unit,
              x_label = f" Read Bandwidth in {fio_bw_unit} - Higher is better")
              x_label=f" Read Bandwidth in {fio_bw_unit} - Higher is better")
    save_name = os.path.join(out_dir, 'benchmark_fio_read_bw.png')
    plt.savefig(save_name)

    ## Write Bandwidth
    # Write Bandwidth
    fio_data = {}
    for s, l in zip(SUBJECTS, LEGEND_NAMES):
        fio_data[l] = int(combined_results[s]['fio']['write_bw']['bw_kbytes'])
    bar_chart(data=fio_data,
              title='FIO Benchmark - Write - Bandwidth',
              unit=fio_bw_unit,
              x_label = f" Write Bandwidth in {fio_bw_unit} - Higher is better")
              x_label=f" Write Bandwidth in {fio_bw_unit} - Higher is better")
    save_name = os.path.join(out_dir, 'benchmark_fio_write_bw.png')
    plt.savefig(save_name)
.github/actions/e2e_benchmark/fio.ini (vendored, new file, 36 changes)

@@ -0,0 +1,36 @@
[global]
direct=1
ioengine=libaio
runtime=60s
ramp_time=10s
size=10Gi
time_based=1
group_reporting
thread
cpus_allowed=1


[read_iops]
stonewall
readwrite=randread
bs=4k
iodepth=128

[write_iops]
stonewall
readwrite=randwrite
bs=4k
iodepth=128

[read_bw]
stonewall
readwrite=randread
bs=1024k
iodepth=128


[write_bw]
stonewall
readwrite=randwrite
bs=1024k
iodepth=128
(Image diff: 8 benchmark graph PNGs updated; 2 images removed.)
@@ -8,23 +8,22 @@ All nodes in a Constellation cluster run inside Confidential VMs (CVMs). Thus, C

AMD and Azure jointly released a [performance benchmark](https://community.amd.com/t5/business/microsoft-azure-confidential-computing-powered-by-3rd-gen-epyc/ba-p/497796) for CVMs based on 3rd Gen AMD EPYC processors (Milan) with SEV-SNP. With a range of mostly compute-intensive benchmarks like SPEC CPU 2017 and CoreMark, they found that CVMs only have a small (2%--8%) performance degradation compared to standard VMs. You can expect to see similar performance for compute-intensive workloads running with Constellation on Azure.

Similary, AMD and Google jointly released a [performance benchmark](https://www.amd.com/system/files/documents/3rd-gen-epyc-gcp-c2d-conf-compute-perf-brief.pdf) for CVMs based on 3rd Gen AMD EPYC processors (Milan) with SEV-SNP. With high performance computing workloads like WRF, NAMD, Ansys CFS, and Ansys LS_DYNA, they found similar results with only small (2%--4%) performance degradation compared to standard VMs. You can expect to see similar performance for compute-intensive workloads running with Constellation GCP.
Similarly, AMD and Google jointly released a [performance benchmark](https://www.amd.com/system/files/documents/3rd-gen-epyc-gcp-c2d-conf-compute-perf-brief.pdf) for CVMs based on 3rd Gen AMD EPYC processors (Milan) with SEV-SNP. With high-performance computing workloads like WRF, NAMD, Ansys CFS, and Ansys LS_DYNA, they found similar results with only small (2%--4%) performance degradation compared to standard VMs. You can expect to see similar performance for compute-intensive workloads running with Constellation on GCP.

## Performance analysis of I/O and network
## Performance impact from I/O and network

To assess the overall performance of Constellation, we benchmarked Constellation v2.6.0 in terms of storage I/O using [`fio`](https://fio.readthedocs.io/en/latest/fio_doc.html) and network performance using the [Kubernetes Network Benchmark](https://github.com/InfraBuilder/k8s-bench-suite#knb--kubernetes-network-be).

We tested Constellation on Azure and GCP and compared the results against the managed Kubernetes offerings AKS and GKE.

To assess the overall performance of Constellation, we benchmarked Constellation v2.6.0 in terms of storage I/O using [FIO via Kubestr](https://github.com/kastenhq/kubestr), and network performance using the [Kubernetes Network Benchmark](https://github.com/InfraBuilder/k8s-bench-suite#knb--kubernetes-network-be).
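For readers who want to reproduce the measurement setup, here is a minimal sketch based on the commands in this PR's README; the storage class name matches the Constellation default used elsewhere in this PR, and the node selection and output paths are illustrative:

```bash
# Storage I/O: run fio through Kubestr against a 400Gi volume of the given storage class
mkdir -p out
kubestr fio -e "out/fio.json" -o json -s encrypted-rwo -z 400Gi -f .github/actions/e2e_benchmark/fio.ini

# Network: run the Kubernetes Network Benchmark between two worker nodes
server="$(kubectl get nodes -o name | sed 's|node/||' | head -n 1)"
client="$(kubectl get nodes -o name | sed 's|node/||' | tail -n 1)"
knb -f "out/knb.json" -o json --server-node "$server" --client-node "$client"
```

On AKS or GKE, swap the storage class for the respective default class, as shown in the README above.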
### Configurations

We ran the benchmark with Constellation v2.6.0 and Kubernetes v1.25.7.
Cilium v1.12 was used for encrypted networking via eBPF and WireGuard.
For storage we utilized Constellation's [Azure Disk CSI driver with encryption](https://github.com/edgelesssys/constellation-azuredisk-csi-driver) v1.1.2 on Azure and Constellation's [GCP Persistent Disk CSI Driver with encryption](https://github.com/edgelesssys/constellation-gcp-compute-persistent-disk-csi-driver) v1.1.2 on GCP.
#### Constellation

We ran the benchmark on AKS with Kubernetes `v1.24.9` and nodes with version `AKSUbuntu-1804gen2containerd-2023.02.15`.
On GKE we used Kubernetes `v1.24.9` and nodes with version `1.24.9-gke.3200`.
The benchmark was conducted with Constellation v2.6.0, Kubernetes v1.25.7, and Cilium v1.12.
It ran on the following infrastructure configurations.

We used the following infrastructure configurations for the benchmarks.

#### Constellation Azure
Constellation on Azure:

- Nodes: 3 (1 Control-plane, 2 Worker)
- Machines: `DC4as_v5`: 3rd Generation AMD EPYC 7763v (Milan) processor with 4 Cores, 16 GiB memory
@@ -32,15 +31,20 @@ We used the following infrastructure configurations for the benchmarks.
- Region: `West US`
- Zone: `2`

#### Constellation and GKE on GCP
Constellation on GCP:

- Nodes: 3 (1 Control-plane, 2 Worker)
- Machines: `n2d-standard-4` 2nd Generation AMD EPYC (Rome) processor with 4 Cores, 16 GiB of memory
- Machines: `n2d-standard-4`: 2nd Generation AMD EPYC (Rome) processor with 4 Cores, 16 GiB of memory
- CVM: `true`
- Zone: `europe-west3-b`

#### AKS

On AKS, we ran the benchmark with Kubernetes `v1.24.9` and nodes with version `AKSUbuntu-1804gen2containerd-2023.02.15`.
The version we tested on AKS ran with the [`kubenet`](https://learn.microsoft.com/en-us/azure/aks/concepts-network#kubenet-basic-networking) CNI and the [default CSI driver](https://learn.microsoft.com/en-us/azure/aks/azure-disk-csi) for Azure Disk.

We used the following infrastructure configurations.

- Nodes: 2 (2 Worker)
- Machines: `D4as_v5`: 3rd Generation AMD EPYC 7763v (Milan) processor with 4 Cores, 16 GiB memory
- CVM: `false`
@@ -49,6 +53,11 @@ We used the following infrastructure configurations for the benchmarks.

#### GKE

On GKE, we used Kubernetes `v1.24.9` and nodes with version `1.24.9-gke.3200`.
The version we tested on GKE ran with the [`kubenet`](https://cloud.google.com/kubernetes-engine/docs/concepts/network-overview) CNI and the [default CSI driver](https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/gce-pd-csi-driver) for Compute Engine persistent disk.

We used the following infrastructure configurations.

- Nodes: 2 (2 Worker)
- Machines: `n2d-standard-4` 2nd Generation AMD EPYC (Rome) processor with 4 Cores, 16 GiB of memory
- CVM: `false`
@@ -59,25 +68,37 @@ We used the following infrastructure configurations for the benchmarks.

#### Network

We conducted a thorough analysis of the network performance of Constellation, specifically focusing on measuring the bandwidth of TCP and UDP over a 10Gbit/s network.
The benchmark measured the bandwidth of pod to pod as well as pod to service connections between two different nodes.
The tests use [`iperf`](https://iperf.fr/) to measure the bandwidth.
We performed a thorough analysis of the network performance of Constellation, specifically focusing on measuring TCP and UDP bandwidth.
The benchmark measured the bandwidth of pod-to-pod and pod-to-service connections between two different nodes using [`iperf`](https://iperf.fr/).

GKE and Constellation on GCP had a maximum network bandwidth of [10 Gbps](https://cloud.google.com/compute/docs/general-purpose-machines#n2d_machines).
AKS with `Standard_D4as_v5` machines has a maximum network bandwidth of [12.5 Gbps](https://learn.microsoft.com/en-us/azure/virtual-machines/dasv5-dadsv5-series#dasv5-series).
The Confidential VM equivalent `Standard_DC4as_v5` currently has a network bandwidth of [1.25 Gbps](https://learn.microsoft.com/en-us/azure/virtual-machines/dcasv5-dcadsv5-series#dcasv5-series-products).
Therefore, to make the test comparable, we ran both AKS and Constellation on Azure with `Standard_DC4as_v5` machines and 1.25 Gbps bandwidth.

Constellation on Azure and AKS used an MTU of 1500.
Constellation GCP and GKE used an MTU of 8896.
Constellation on GCP used an MTU of 8896. GKE used an MTU of 1450.


The difference can largely be attributed to two factos.
The difference in network bandwidth can largely be attributed to two factors.

1. Constellation's [network encryption](../architecture/networking.md) via Cilium and WireGuard that protects data in-transit.
2. [AMD SEV using SWIOTLB bounce buffers](https://lore.kernel.org/all/20200204193500.GA15564@ashkalra_ubuntu_server/T/) for all DMA including network I/O.
* Constellation's [network encryption](../architecture/networking.md) via Cilium and WireGuard, which protects data in-transit.
* [AMD SEV using SWIOTLB bounce buffers](https://lore.kernel.org/all/20200204193500.GA15564@ashkalra_ubuntu_server/T/) for all DMA including network I/O.
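To confirm the first factor on a running cluster, a rough check is sketched below; the DaemonSet name `cilium` in `kube-system` and the WireGuard interface name `cilium_wg0` are assumptions based on Cilium defaults, not part of the benchmark:

```bash
# Check that Cilium reports transparent encryption, then inspect the WireGuard interface and its MTU
kubectl -n kube-system exec ds/cilium -- cilium status | grep -i encryption
kubectl -n kube-system exec ds/cilium -- ip link show cilium_wg0
```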
##### Pod-to-Pod

In this scenario, the client Pod connects directly to the server pod in its IP address.

![Pod2Pod concept](../_media/benchmark_p2p_concept.webp)
In this scenario, the client Pod connects directly to the server pod via its IP address.

```mermaid
flowchart LR
    subgraph Node A
        Client[Client]
    end
    subgraph Node B
        Server[Server]
    end
    Client ==>|traffic| Server
```
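As an illustration of this path (outside of knb), one could start an iperf3 server pod and target its pod IP directly; the image and pod names below are placeholders and not part of the benchmark suite:

```bash
# Start a server pod, read its pod IP, and connect to it directly from a client pod
kubectl run iperf-server --image=networkstatic/iperf3 -- -s
kubectl wait --for=condition=Ready pod/iperf-server --timeout=120s
SERVER_IP="$(kubectl get pod iperf-server -o jsonpath='{.status.podIP}')"
kubectl run iperf-client --rm -it --restart=Never --image=networkstatic/iperf3 -- -c "$SERVER_IP"
```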
The results for "Pod-to-Pod" TCP are as follows:

@@ -89,7 +110,18 @@ The results for "Pod-to-Pod" UDP are as follows:

##### Pod-to-Service

In this section, the client Pod connects to the server Pod via a ClusterIP service. This is more relevant to real-world use cases.
In this scenario, the client Pod connects to the server Pod via a ClusterIP service. This is more relevant to real-world use cases.

```mermaid
flowchart LR
    subgraph Node A
        Client[Client] ==>|traffic| Service[Service]
    end
    subgraph Node B
        Server[Server]
    end
    Service ==>|traffic| Server
```
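Continuing the sketch above, the service path only differs in that the server pod is exposed through a ClusterIP service and the client targets the service name (again, names are illustrative):

```bash
# Expose the server pod via a ClusterIP service and connect through the service's DNS name
kubectl expose pod iperf-server --name=iperf-service --port=5201 --target-port=5201
kubectl run iperf-client --rm -it --restart=Never --image=networkstatic/iperf3 -- -c iperf-service -p 5201
```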
The results for “Pod-to-Service” TCP are as follows:

@@ -99,49 +131,78 @@ The results for “Pod-to-Service” UDP are as follows:

![Network Pod2SVC UDP benchmark graph](../_media/benchmark_net_p2svc_udp.png)

Comparing Constellation on GCP with GKE, Constellation has 58% less TCP bandwidth.
UDP bandwidth is slightly better with Constellation due to the higher MTU.
Constellation on Azure compared against AKS with CVMs achieves ~10% less TCP and ~40% less UDP bandwidth.
#### Storage I/O

Azure and GCP offer persistent storage for their Kubernetes services AKS and GKE via the Container Storage Interface (CSI). CSI storage in Kubernetes is available via `PersistentVolumes` (PV) and consumed via `PersistentVolumeClaims` (PVC).
Upon requesting persistent storage through a PVC, GKE and AKS will provision a PV as defined by a default [storage class](https://kubernetes.io/docs/concepts/storage/storage-classes/).
Constellation provides persistent storage on Azure and GCP [that's encrypted on the CSI layer](../architecture/encrypted-storage.md).
Similarly, Constellation will provision a PV via a default storage class upon a PVC request.
Similarly, upon a PVC request, Constellation will provision a PV via a default storage class.
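As a concrete illustration of that request path, here is a PVC against the `encrypted-rwo` storage class used elsewhere in this PR; the claim name and size are examples, not part of the benchmark:

```bash
# The CSI driver provisions an encrypted PV to back this claim once a pod consumes it
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-encrypted-volume
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: encrypted-rwo
  resources:
    requests:
      storage: 400Gi
EOF
```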
For Constellation on Azure and AKS we ran the benchmark with Azure Disk Storage [Standard SSD](https://learn.microsoft.com/en-us/azure/virtual-machines/disks-types#standard-ssds) of 400GB size.
With our DC4as machine type with 4 cores standard-ssd provides the following maximum performance:
For Constellation on Azure and AKS, we ran the benchmark with Azure Disk storage [Standard SSD](https://learn.microsoft.com/en-us/azure/virtual-machines/disks-types#standard-ssds) of 400 GiB size.
The [DC4as machine type](https://learn.microsoft.com/en-us/azure/virtual-machines/dasv5-dadsv5-series#dasv5-series) with four cores provides the following maximum performance:
- 6400 (20000 burst) IOPS
- 144 MB/s (600 MB/s burst) throughput

However, the performance is bound by the capabilities of the [512 GiB Standard SSD size](https://learn.microsoft.com/en-us/azure/virtual-machines/disks-types#standard-ssds) (the size we get when we allocate 400 GiB volumes):
- 500 (600 burst) IOPS
- 60 MB/s (150 MB/s burst) throughput

For Constellation on GCP and GKE we ran the benchmark with Google Persistent Disk Storage [pd-standard](https://cloud.google.com/compute/docs/disks) of 400GB size.
With our N2D machine type with 4 cores pd-standard provides the following [maximum performance](https://cloud.google.com/compute/docs/disks/performance#n2d_vms):
- 15,000 write IOPS
For Constellation on GCP and GKE, we ran the benchmark with Compute Engine Persistent Disk Storage [pd-balanced](https://cloud.google.com/compute/docs/disks) of 400 GiB size.
The N2D machine type with four cores and pd-balanced provides the following [maximum performance](https://cloud.google.com/compute/docs/disks/performance#n2d_vms):
- 3,000 read IOPS
- 240 MB/s write throughput
- 15,000 write IOPS
- 240 MB/s read throughput
- 240 MB/s write throughput

However, the performance is bound by the capabilities of a [`Zonal balanced PD`](https://cloud.google.com/compute/docs/disks/performance#zonal-persistent-disks) with 400 GiB size:
- 2400 read IOPS
- 2400 write IOPS
- 112 MB/s read throughput
- 112 MB/s write throughput

The [`fio`](https://fio.readthedocs.io/en/latest/fio_doc.html) benchmark consists of several tests.
We selected a tests that performs asynchronous access patterns because we believe they most accurately depict real-world I/O access for most applications.
We measured IOPS, read, and write bandwidth.
We used [`Kubestr`](https://github.com/kastenhq/kubestr) to run `fio` in Kubernetes.
The default test performs randomized access patterns that accurately depict worst-case I/O scenarios for most applications.

The results for "Async Read" IOPS are as follows:
The following `fio` settings were used:

- No Cloud caching
- No OS caching
- Single CPU
- 60 seconds runtime
- 10 seconds ramp-up time
- 10 GiB file
- IOPS: 4 KB blocks and 128 iodepth
- Bandwidth: 1024 KB blocks and 128 iodepth

For more details, see the [`fio` test configuration](../../../.github/actions/e2e_benchmark/fio.ini).
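For reference, the same job file can also be run with plain `fio` against any mounted path, outside of Kubestr; the mount point shown here is an example:

```bash
# fio lays out its test files in the working directory, so run from the volume under test
cd /mnt/benchmark
fio --output-format=json --output=fio.json "$HOME/constellation/.github/actions/e2e_benchmark/fio.ini"
```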
The results for "Rand Read" IOPS are as follows:
|
||||
|
||||
![I/O read IOPS benchmark graph](../_media/benchmark_fio_read_iops.png)
|
||||
|
||||
The results for "Async Write" IOPS are as follows:
|
||||
The results for "Rand Write" IOPS are as follows:
|
||||
|
||||
![I/O write IOPS benchmark graph](../_media/benchmark_fio_write_iops.png)
|
||||
|
||||
The results for "Async Read" bandwidth are as follows:
|
||||
The results for "Rand Read" bandwidth are as follows:
|
||||
|
||||
![I/O read bandwidth benchmark graph](../_media/benchmark_fio_read_bw.png)
|
||||
|
||||
The results for "Async Write" bandwidth are as follows:
|
||||
The results for "Rand Write" bandwidth are as follows:
|
||||
|
||||
![I/O write bandwidth benchmark graph](../_media/benchmark_fio_write_bw.png)
|
||||
|
||||
Comparing Constellation on GCP with GKE, you see that Constellation offers similar read/write speeds in all scenarios.
|
||||
On GCP, we can see that we exceed the maximum performance guarantees of the chosen disk type.
|
||||
There are two possible explanations for this. (1) There is some cloud caching in place we don't control. (2) The underlying provisioned disk size is larger than the requested on, which would yield higher performance boundaries.
|
||||
|
||||
Constellation on Azure and AKS, however, partially differ. In read-write mixes, Constellation on Azure outperforms AKS in terms of I/O. On full-write access, Constellation and AKS have the same speed.
|
||||
Comparing Constellation on GCP with GKE, Constellation has a similar bandwidth but ~10% less IOPS performance.
|
||||
Constellation on Azure has a similar IOPS performance compared to AKS, where both probably hit the maximum storage performance. Constellation has ~15% less read and write bandwidth.
|
||||
|
||||
## Conclusion
|
||||
|
||||
Despite providing substantial [security benefits](./security-benefits.md), Constellation overall only has a slight performance overhead over the managed Kubernetes offerings AKS and GKE. Constellation is on par in most benchmarks, but is slightly slower in certain scenarios due to network and storage encryption. When it comes to API latencies, Constellation even outperforms the less security-focused competition.
|
||||
Despite providing substantial [security benefits](./security-benefits.md), Constellation overall only has a slight performance overhead over the managed Kubernetes offerings AKS and GKE. Constellation is on par in most benchmarks but is slightly slower in certain scenarios due to network and storage encryption.
|
||||
|
@ -1,5 +1,6 @@
|
||||
Affero
|
||||
agent
|
||||
Ansys
|
||||
Asciinema
|
||||
auditable
|
||||
autoscaler
|
||||
@ -27,6 +28,7 @@ gcp
|
||||
Grype
|
||||
iam
|
||||
IAM
|
||||
iodepth
|
||||
initramfs
|
||||
[Kk]3s
|
||||
Kata
|
||||
|