ci: add kubestr and knb based e2e_benchmark action

Co-authored-by: Paul Meyer <49727155+katexochen@users.noreply.github.com>
This commit is contained in:
Moritz Eckert 2023-02-28 10:13:26 +01:00
parent 8ad04f7dbb
commit 0481c039f7
9 changed files with 609 additions and 5 deletions

.github/actions/e2e_benchmark/README.md (new file, 144 additions)

@@ -0,0 +1,144 @@
# Perf-Bench
## Continuous Benchmarking
The benchmark action runs performance benchmarks on Constellation clusters.
The benchmark suite records storage and network metrics.
After a test run, the action compares the results to the previous results of Constellation on the same cloud provider. That way, it is possible to track the performance progression throughout development.
The data of previous benchmarks is stored in the private S3 artifact store.
To support encrypted storage, the action deploys the [Azure CSI](https://github.com/edgelesssys/constellation-azuredisk-csi-driver) and [GCP CSI](https://github.com/edgelesssys/constellation-gcp-compute-persistent-disk-csi-driver) drivers.
The network benchmark uses the `knb` tool of the [k8s-bench-suite](https://github.com/InfraBuilder/k8s-bench-suite).
The storage benchmark uses [kubestr](https://github.com/kastenhq/kubestr) to run FIO tests.
### Displaying Performance Progression
The action creates a summary and attaches it to the workflow execution log.
The table compares the current benchmark results of Constellation on the selected cloud provider to the previously stored records for the same provider.
The hashes of the two compared commits are prepended to the table.
Example table:
<details>
- Commit of current benchmark: 8eb0a6803bc431bcebc2f6766ab2c6376500e106
- Commit of previous benchmark: 8f733daaf5c5509f024745260220d89ef8e6e440
| Benchmark suite | Metric | Current | Previous | Ratio |
|-|-|-|-|-|
| read_iops | iops (IOPS) | 213.6487 | 216.74684 | 0.986 ⬇️ |
| write_iops | iops (IOPS) | 24.412066 | 18.051243 | 1.352 ⬆️ |
| read_bw | bw_kbytes (KiB/s) | 28302.0 | 28530.0 | 0.992 ⬇️ |
| write_bw | bw_kbytes (KiB/s) | 4159.0 | 2584.0 | 1.61 ⬆️ |
| pod2pod | tcp_bw_mbit (Mbit/s) | 20450.0 | 929.0 | 22.013 ⬆️ |
| pod2pod | upd_bw_mbit (Mbit/s) | 1138.0 | 750.0 | 1.517 ⬆️ |
| pod2svc | tcp_bw_mbit (Mbit/s) | 21188.0 | 905.0 | 23.412 ⬆️ |
| pod2svc | upd_bw_mbit (Mbit/s) | 1137.0 | 746.0 | 1.524 ⬆️ |
</details>
### Drawing Performance Charts
The action also draws graphs as used in the [Constellation docs](https://docs.edgeless.systems/constellation/next/overview/performance). The graphs compare the performance of Constellation to the performance of managed Kubernetes clusters.
Graphs are created with every run of the benchmarking action. The action attaches them to the `benchmark` artifact of the workflow run.
## Updating Stored Records
### Managed Kubernetes
The stored benchmark records for managed Kubernetes have to be updated manually:
### AKS
Follow the [Azure documentation](https://learn.microsoft.com/en-us/azure/aks/learn/quick-kubernetes-deploy-portal?tabs=azure-cli) to create an AKS cluster with the desired benchmarking settings (region, instance types). If comparing against Constellation clusters with CVM instances, make sure to select a matching CVM instance type on Azure as well.
Once the cluster is ready, set up access via `kubectl` and run the benchmarks:
```bash
# Setup knb
git clone https://github.com/InfraBuilder/k8s-bench-suite.git
cd k8s-bench-suite
install knb /usr/local/bin
cd ..

# Setup kubestr
KUBESTR_VER="0.4.37"
HOSTOS="$(go env GOOS)"
HOSTARCH="$(go env GOARCH)"
curl -fsSLO "https://github.com/kastenhq/kubestr/releases/download/v${KUBESTR_VER}/kubestr_${KUBESTR_VER}_${HOSTOS}_${HOSTARCH}.tar.gz"
tar -xzf "kubestr_${KUBESTR_VER}_${HOSTOS}_${HOSTARCH}.tar.gz"
install kubestr /usr/local/bin

# Run kubestr
mkdir -p out
kubestr fio -e "out/fio-AKS.json" -o json -s encrypted-rwo -z 400Gi

# Run knb (all nodes listed by a managed cluster are worker nodes)
workers="$(kubectl get nodes -o name)"
server="$(echo "$workers" | head -1 | cut -d '/' -f2)"
client="$(echo "$workers" | tail -n +2 | head -1 | cut -d '/' -f2)"
knb -f "out/knb-AKS.json" -o json --server-node "$server" --client-node "$client"

# Parse the results
git clone https://github.com/edgelesssys/constellation.git
mkdir -p benchmarks
export BDIR=benchmarks
export EXT_NAME=AKS
export CSP=azure
export BENCH_RESULTS=out/
python constellation/.github/actions/e2e_benchmark/evaluate/parse.py

# Upload the result to S3
S3_PATH=s3://edgeless-artifact-store/constellation/benchmarks
aws s3 cp benchmarks/AKS.json "${S3_PATH}/AKS.json"
```
### GKE
Create a GKE cluster with the desired benchmarking settings (region, instance types). If comparing against Constellation clusters with CVM instances, make sure to select a matching CVM instance type on GCP and to enable **confidential** VMs as well.
Once the cluster is ready, set up access via `kubectl` and run the benchmarks:
```bash
# Setup knb
git clone https://github.com/InfraBuilder/k8s-bench-suite.git
cd k8s-bench-suite
install knb /usr/local/bin
cd ..

# Setup kubestr
KUBESTR_VER="0.4.37"
HOSTOS="$(go env GOOS)"
HOSTARCH="$(go env GOARCH)"
curl -fsSLO "https://github.com/kastenhq/kubestr/releases/download/v${KUBESTR_VER}/kubestr_${KUBESTR_VER}_${HOSTOS}_${HOSTARCH}.tar.gz"
tar -xzf "kubestr_${KUBESTR_VER}_${HOSTOS}_${HOSTARCH}.tar.gz"
install kubestr /usr/local/bin

# Run kubestr
mkdir -p out
kubestr fio -e "out/fio-GKE.json" -o json -s encrypted-rwo -z 400Gi

# Run knb (all nodes listed by a managed cluster are worker nodes)
workers="$(kubectl get nodes -o name)"
server="$(echo "$workers" | head -1 | cut -d '/' -f2)"
client="$(echo "$workers" | tail -n +2 | head -1 | cut -d '/' -f2)"
knb -f "out/knb-GKE.json" -o json --server-node "$server" --client-node "$client"

# Parse the results
git clone https://github.com/edgelesssys/constellation.git
mkdir -p benchmarks
export BDIR=benchmarks
export EXT_NAME=GKE
export CSP=gcp
export BENCH_RESULTS=out/
python constellation/.github/actions/e2e_benchmark/evaluate/parse.py

# Upload the result to S3
S3_PATH=s3://edgeless-artifact-store/constellation/benchmarks
aws s3 cp benchmarks/GKE.json "${S3_PATH}/GKE.json"
```
### Constellation
The action updates the stored Constellation records for the selected cloud provider when running on the main branch.
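
### Benchmark Record Format

For reference, `parse.py` writes one JSON record per provider and `compare.py` consumes two such records. The sketch below shows the expected shape as a Python literal; the keys mirror what `parse.py` produces, while the values are illustrative (the fio and knb numbers are taken from the example table above, the metadata is filled from the GitHub environment or `N/A` when run manually).

```python
# Illustrative benchmark record, e.g. benchmarks/constellation-azure.json or benchmarks/AKS.json.
example_record = {
    "metadata": {
        "github.sha": "8eb0a6803bc431bcebc2f6766ab2c6376500e106",
        "github.ref-name": "N/A",
        "github.actor": "N/A",
        "github.workflow": "N/A",
        "created": "2023-02-28 10:13:26",
    },
    "provider": "AKS",  # or "constellation-azure", "constellation-gcp", "GKE", ...
    "fio": {
        "read_iops": {"iops": 213.6487},
        "write_iops": {"iops": 24.412066},
        "read_bw": {"bw_kbytes": 28302.0},
        "write_bw": {"bw_kbytes": 4159.0},
    },
    "knb": {
        "pod2pod": {"tcp_bw_mbit": 20450.0, "upd_bw_mbit": 1138.0},
        "pod2svc": {"tcp_bw_mbit": 21188.0, "upd_bw_mbit": 1137.0},
    },
}
```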

.github/actions/e2e_benchmark/action.yml (new file, 155 additions)

@@ -0,0 +1,155 @@
name: benchmark
description: "Run benchmarks"
inputs:
  cloudProvider:
    description: "Which cloud provider to use."
    required: true
  kubeconfig:
    description: "The kubeconfig of the cluster to test."
    required: true
runs:
  using: "composite"
  steps:
    - name: Setup python
      uses: actions/setup-python@d27e3f3d7c64b4bbf8e4abfb9b63b83e846e0435 # v4.5.0
      with:
        python-version: "3.10"
    - name: Install kubestr
      shell: bash
      env:
        KUBESTR_VER: "0.4.37"
      run: |
        HOSTOS="$(go env GOOS)"
        HOSTARCH="$(go env GOARCH)"
        curl -fsSLO https://github.com/kastenhq/kubestr/releases/download/v${KUBESTR_VER}/kubestr_${KUBESTR_VER}_${HOSTOS}_${HOSTARCH}.tar.gz
        tar -xzf kubestr_${KUBESTR_VER}_${HOSTOS}_${HOSTARCH}.tar.gz
        install kubestr /usr/local/bin
    - name: Checkout k8s-bench-suite
      uses: actions/checkout@ac593985615ec2ede58e132d2e21d2b1cbd6127c # v3.3.0
      with:
        fetch-depth: 0
        repository: "InfraBuilder/k8s-bench-suite"
        ref: 1698974913b7b18ad54cf5860838029c295c77b1
        path: k8s-bench-suite
    - name: Run FIO benchmark
      shell: bash
      env:
        KUBECONFIG: ${{ inputs.kubeconfig }}
      run: |
        mkdir -p out
        kubestr fio -e "out/fio-constellation-${{ inputs.cloudProvider }}.json" -o json -s encrypted-rwo -z 400Gi
    - name: Upload raw FIO benchmark results
      if: (!env.ACT)
      uses: actions/upload-artifact@0b7f8abb1508181956e8e162db84b466c27e18ce # v3.1.2
      with:
        path: "out/fio-constellation-${{ inputs.cloudProvider }}.json"
        name: "fio-constellation-${{ inputs.cloudProvider }}.json"
    - name: Run knb benchmark
      shell: bash
      env:
        KUBECONFIG: ${{ inputs.kubeconfig }}
        TERM: xterm-256color
      run: |
        workers="$(kubectl get nodes -o name | grep worker)"
        echo -e "Found workers:\n$workers"
        server="$(echo "$workers" | tail +1 | head -1 | cut -d '/' -f2)"
        echo "Server: $server"
        client="$(echo "$workers" | tail +2 | head -1 | cut -d '/' -f2)"
        echo "Client: $client"
        k8s-bench-suite/knb -f "out/knb-constellation-${{ inputs.cloudProvider }}.json" -o json --server-node "$server" --client-node "$client"
    - name: Upload raw knb benchmark results
      if: (!env.ACT)
      uses: actions/upload-artifact@0b7f8abb1508181956e8e162db84b466c27e18ce # v3.1.2
      with:
        path: "out/knb-constellation-${{ inputs.cloudProvider }}.json"
        name: "knb-constellation-${{ inputs.cloudProvider }}.json"
    - name: Assume AWS role to retrieve and update benchmarks in S3
      uses: aws-actions/configure-aws-credentials@67fbcbb121271f7775d2e7715933280b06314838 # tag=v1.7.0
      with:
        role-to-assume: arn:aws:iam::795746500882:role/GithubActionUpdateBenchmarks
        aws-region: us-east-2
    - name: Set S3 artifact store
      shell: bash
      env:
        ARTIFACT_BUCKET_CONSTELLATION: "edgeless-artifact-store/constellation"
      run: echo S3_PATH=s3://${ARTIFACT_BUCKET_CONSTELLATION}/benchmarks >> $GITHUB_ENV
    - name: Get previous benchmark records from S3
      shell: bash
      env:
        CSP: ${{ inputs.cloudProvider }}
      run: |
        mkdir -p benchmarks
        aws s3 cp --recursive ${S3_PATH} benchmarks --no-progress
        if [[ -f benchmarks/constellation-${CSP}.json ]]; then
          mv benchmarks/constellation-${CSP}.json benchmarks/constellation-${CSP}-previous.json
        else
          echo "::warning::Couldn't retrieve previous benchmark records from s3"
        fi
    - name: Parse results, create diagrams and post the progression summary
      shell: bash
      env:
        # Original result directory
        BENCH_RESULTS: out/
        # Working directory containing the previous results as JSON and to contain the graphs
        BDIR: benchmarks
        # Paths to benchmark results as JSON of the previous run and the current run
        PREV_BENCH: benchmarks/constellation-${{ inputs.cloudProvider }}-previous.json
        CURR_BENCH: benchmarks/constellation-${{ inputs.cloudProvider }}.json
        CSP: ${{ inputs.cloudProvider }}
      run: |
        python .github/actions/e2e_benchmark/evaluate/parse.py
        export BENCHMARK_SUCCESS=true
        if [[ -f "$PREV_BENCH" ]]; then
          # compare.py exits with a non-zero status if a metric regressed beyond the limits defined in compare.py
          python .github/actions/e2e_benchmark/evaluate/compare.py >> $GITHUB_STEP_SUMMARY || BENCHMARK_SUCCESS=false
        fi
        echo BENCHMARK_SUCCESS=$BENCHMARK_SUCCESS >> $GITHUB_ENV
    - name: Upload benchmark results to action run
      if: (!env.ACT)
      uses: actions/upload-artifact@0b7f8abb1508181956e8e162db84b466c27e18ce # v3.1.2
      with:
        path: |
          benchmarks/constellation-${{ inputs.cloudProvider }}.json
        name: "benchmarks"
    - name: Update benchmark records in S3
      if: github.ref_name == 'main'
      shell: bash
      env:
        CSP: ${{ inputs.cloudProvider }}
      run: |
        aws s3 cp benchmarks/constellation-${CSP}.json ${S3_PATH}/constellation-${CSP}.json
    - name: Check performance comparison result
      shell: bash
      run: |
        if [[ $BENCHMARK_SUCCESS == true ]] ; then
          echo "Benchmark successful, all metrics in the expected range."
        else
          echo "::error::Benchmark failed, some metrics are outside of the expected range."
          exit 1
        fi

.github/actions/e2e_benchmark/evaluate/compare.py (new file, 152 additions)

@@ -0,0 +1,152 @@
"""Compare the current benchmark data against the previous."""
import os
import json
from typing import Tuple
# Progress indicator icons
PROGRESS = ['⬇️', '⬆️']
# List of benchmarks for which higher numbers are better
BIGGER_BETTER = [
'iops',
'bw_kbytes',
'tcp_bw_mbit',
'upd_bw_mbit',
]
# Lookup for test suite -> unit
UNIT_STR = {
'iops': 'IOPS',
'bw_kbytes': 'KiB/s',
'tcp_bw_mbit': 'Mbit/s',
'upd_bw_mbit': 'Mbit/s',
}
# API units are ms, so this is shorter than cluttering the dictionary:
API_UNIT_STR = "ms"
# List of allowed deviation
ALLOWED_RATIO_DELTA = {
'iops': 0.7,
'bw_kbytes': 0.7,
'tcp_bw_mbit': 0.7,
'upd_bw_mbit': 0.7,
}
def is_bigger_better(bench_suite: str) -> bool:
return bench_suite in BIGGER_BETTER
def get_paths() -> Tuple[str, str]:
"""Read the benchmark data paths.
Expects ENV vars (required):
- PREV_BENCH=/path/to/previous.json
- CURR_BENCH=/path/to/current.json
Raises TypeError if at least one of them is missing.
Returns: a tuple of (prev_bench_path, curr_bench_path).
"""
path_prev = os.environ.get('PREV_BENCH', None)
path_curr = os.environ.get('CURR_BENCH', None)
if not path_prev or not path_curr:
raise TypeError(
'Both ENV variables PREV_BENCH and CURR_BENCH are required.')
return path_prev, path_curr
def main() -> None:
"""Compare the current benchmark data against the previous.
Create a markdown table showing the benchmark progressions.
Print the result to stdout.
"""
path_prev, path_curr = get_paths()
try:
with open(path_prev) as f_prev:
bench_prev = json.load(f_prev)
with open(path_curr) as f_curr:
bench_curr = json.load(f_curr)
except OSError as e:
raise ValueError('Failed reading benchmark file: {e}'.format(e=e))
try:
name = bench_curr['provider']
except KeyError:
raise ValueError(
'Current benchmark record file does not contain provider.')
try:
prev_name = bench_prev['provider']
except KeyError:
raise ValueError(
'Previous benchmark record file does not contain provider.')
if name != prev_name:
raise ValueError(
'Cloud providers of previous and current benchmark data do not match.')
if 'fio' not in bench_prev.keys() or 'fio' not in bench_curr.keys():
raise ValueError('Benchmarks do not both contain fio records.')
if 'knb' not in bench_prev.keys() or 'knb' not in bench_curr.keys():
raise ValueError('Benchmarks do not both contain knb records.')
md_lines = [
'# {name}'.format(name=name),
'',
'<details>',
'',
'- Commit of current benchmark: [{ch}](https://github.com/edgelesssys/constellation/commit/{ch})'.format(ch=bench_curr['metadata']['github.sha']),
'- Commit of previous benchmark: [{ch}](https://github.com/edgelesssys/constellation/commit/{ch})'.format(ch=bench_prev['metadata']['github.sha']),
'',
'| Benchmark suite | Metric | Current | Previous | Ratio |',
'|-|-|-|-|-|',
]
# compare FIO results
for subtest, metrics in bench_prev['fio'].items():
for metric in metrics.keys():
md_lines.append(compare_test('fio', subtest, metric, bench_prev, bench_curr))
# compare knb results
for subtest, metrics in bench_prev['knb'].items():
for metric in metrics.keys():
md_lines.append(compare_test('knb', subtest, metric, bench_prev, bench_curr))
md_lines += ['', '</details>']
print('\n'.join(md_lines))
def compare_test(test, subtest, metric, bench_prev, bench_curr) -> str:
if subtest not in bench_curr[test]:
raise ValueError(
'Benchmark record from previous benchmark not in current.')
val_prev = bench_prev[test][subtest][metric]
val_curr = bench_curr[test][subtest][metric]
# get unit string or use default API unit string
unit = UNIT_STR.get(metric, API_UNIT_STR)
if val_curr == 0 or val_prev == 0:
ratio = 'N/A'
else:
if is_bigger_better(bench_suite=metric):
ratio_num = val_curr / val_prev
if ratio_num < ALLOWED_RATIO_DELTA.get(metric, 1):
set_failed()
else:
ratio_num = val_prev / val_curr
if ratio_num > ALLOWED_RATIO_DELTA.get(metric, 1):
set_failed()
ratio_num = round(ratio_num, 3)
emoji = PROGRESS[int(ratio_num >= 1)]
ratio = f'{ratio_num} {emoji}'
return f'| {subtest} | {metric} ({unit}) | {val_curr} | {val_prev} | {ratio} |'
def set_failed() -> None:
os.environ['COMPARISON_SUCCESS'] = str(False)
if __name__ == '__main__':
main()
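
To illustrate the comparison rule encoded in `ALLOWED_RATIO_DELTA`, here is a minimal standalone sketch using the `read_iops` numbers from the README's example table (the values and threshold are taken from above; this snippet is not part of the action itself):

```python
# Minimal sketch of the bigger-is-better comparison rule from compare.py.
# Numbers are the read_iops values from the example table in the README.
ALLOWED_RATIO_DELTA = {'iops': 0.7}

val_prev = 216.74684  # previous read_iops
val_curr = 213.6487   # current read_iops

ratio = val_curr / val_prev                    # ~0.986, slightly slower than before
failed = ratio < ALLOWED_RATIO_DELTA['iops']   # fails only below 0.7 (a >30% regression)

print(f'ratio={round(ratio, 3)}, failed={failed}')  # ratio=0.986, failed=False
```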

.github/actions/e2e_benchmark/evaluate/evaluators/fio.py (new file, 36 additions)

@@ -0,0 +1,36 @@
"""Parse the fio logs.
Extracts the bandwidth for I/O,
from a fio benchmark json output.
"""
from typing import Dict
import json
def evaluate(log_path) -> Dict[str, Dict[str, float]]:
with open(log_path) as f:
fio = json.load(f)
if not fio:
raise Exception(
f"Error: Empty fio log {log_path}?")
if len(fio) != 1:
raise Exception(
"Error: Unexpected fio log format"
)
tests = fio[0]['Raw']['result']['jobs']
result = {}
for test in tests:
if test['jobname'] == 'read_iops':
result[test['jobname']] = {'iops': float(test['read']['iops'])}
elif test['jobname'] == 'write_iops':
result[test['jobname']] = {'iops': float(test['write']['iops'])}
elif test['jobname'] == 'read_bw':
result[test['jobname']] = {'bw_kbytes': float(test['read']['bw'])}
elif test['jobname'] == 'write_bw':
result[test['jobname']] = {'bw_kbytes': float(test['write']['bw'])}
else:
raise Exception(
f"Error: Unexpected fio test: {test['jobname']}"
)
return result
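
For illustration, here is a minimal, made-up fio log that `evaluate` above would accept, together with a call through the same import that `parse.py` uses (run from the `evaluate` directory so the `evaluators` package is importable). Only the fields that `evaluate` reads are included; real kubestr output carries many more.

```python
# Minimal, made-up kubestr fio result containing only the fields read by evaluate().
import json
import tempfile

from evaluators import fio  # same import style as parse.py

fio_log = [{
    'Raw': {
        'result': {
            'jobs': [
                {'jobname': 'read_iops', 'read': {'iops': 213.6}},
                {'jobname': 'write_iops', 'write': {'iops': 24.4}},
                {'jobname': 'read_bw', 'read': {'bw': 28302}},
                {'jobname': 'write_bw', 'write': {'bw': 4159}},
            ]
        }
    }
}]

with tempfile.NamedTemporaryFile('w', suffix='.json', delete=False) as f:
    json.dump(fio_log, f)

print(fio.evaluate(f.name))
# {'read_iops': {'iops': 213.6}, 'write_iops': {'iops': 24.4},
#  'read_bw': {'bw_kbytes': 28302.0}, 'write_bw': {'bw_kbytes': 4159.0}}
```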

.github/actions/e2e_benchmark/evaluate/evaluators/knb.py (new file, 25 additions)

@@ -0,0 +1,25 @@
"""Parse the knb logs.
Extracts the bandwidth for sending and receiving,
from k8s-bench-suite network benchmarks.
"""
import json
from typing import Dict
def evaluate(log_path) -> Dict[str, Dict[str, float]]:
with open(log_path) as f:
knb = json.load(f)
if not knb:
raise Exception(
f"Error: Empty knb log {log_path}?"
)
data = knb['data']
result = {'pod2pod': {}, 'pod2svc': {}}
result['pod2pod']['tcp_bw_mbit'] = float(data['pod2pod']['tcp']['bandwidth'])
result['pod2pod']['upd_bw_mbit'] = float(data['pod2pod']['udp']['bandwidth'])
result['pod2svc']['tcp_bw_mbit'] = float(data['pod2svc']['tcp']['bandwidth'])
result['pod2svc']['upd_bw_mbit'] = float(data['pod2svc']['udp']['bandwidth'])
return result
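
Similarly, a minimal knb result accepted by `evaluate` above would look like the following sketch; only the keys the code reads are shown, and the values are made up.

```python
# Minimal, made-up knb result containing only the fields read by evaluate().
knb_log = {
    'data': {
        'pod2pod': {'tcp': {'bandwidth': 20450}, 'udp': {'bandwidth': 1138}},
        'pod2svc': {'tcp': {'bandwidth': 21188}, 'udp': {'bandwidth': 1137}},
    }
}
# Written to a JSON file and passed to evaluate(), this yields:
# {'pod2pod': {'tcp_bw_mbit': 20450.0, 'upd_bw_mbit': 1138.0},
#  'pod2svc': {'tcp_bw_mbit': 21188.0, 'upd_bw_mbit': 1137.0}}
```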

.github/actions/e2e_benchmark/evaluate/parse.py (new file, 92 additions)

@@ -0,0 +1,92 @@
"""Parse logs of K-Bench tests and generate performance graphs."""
import json
import os
from typing import Tuple
from datetime import datetime
from evaluators import fio, knb
def configure() -> Tuple[str, str, str, str | None, str, str, str, str]:
"""Read the benchmark data paths.
Expects ENV vars (required):
- BENCH_RESULTS=/path/to/bench/out
- CSP=azure
- BDIR=benchmarks
Optional:
- EXT_NAME=AKS # Overrides "constellation-$CSP" naming to parse results from managed Kubernetes
- GITHUB_SHA=ffac5... # Set by GitHub actions, stored in the result JSON.
Raises TypeError if at least one of them is missing.
Returns: a tuple of (base_path, csp, out_dir, ext_provider_name).
"""
base_path = os.environ.get('BENCH_RESULTS', None)
csp = os.environ.get('CSP', None)
out_dir = os.environ.get('BDIR', None)
if not base_path or not csp or not out_dir:
raise TypeError(
'ENV variables BENCH_RESULTS, CSP, BDIR are required.')
ext_provider_name = os.environ.get('EXT_NAME', None)
commit_hash = os.environ.get('GITHUB_SHA', 'N/A')
commit_ref = os.environ.get('GITHUB_REF_NAME', 'N/A')
actor = os.environ.get('GITHUB_ACTOR', 'N/A')
workflow = os.environ.get('GITHUB_WORKFLOW', 'N/A')
return base_path, csp, out_dir, ext_provider_name, commit_hash, commit_ref, actor, workflow
def main() -> None:
"""Read and parse the K-Bench tests.
Write results of the current environment to a JSON file.
"""
base_path, csp, out_dir, ext_provider_name, commit_hash, commit_ref, actor, workflow = configure()
if ext_provider_name is None:
# Constellation benchmark.
ext_provider_name = f'constellation-{csp}'
# Expect the results in directory:
fio_path = os.path.join(
base_path,
f'fio-{ext_provider_name}.json',
)
knb_path = os.path.join(
base_path,
f'knb-{ext_provider_name}.json',
)
out_file_name = f'{ext_provider_name}.json'
if not os.path.exists(fio_path) or not os.path.exists(knb_path):
raise ValueError(
f'Benchmarks do not exist at {fio_path} or {knb_path}.')
# Parse subtest
knb_results = knb.evaluate(knb_path)
fio_results = fio.evaluate(fio_path)
combined_results = {'metadata': {
'github.sha': commit_hash,
'github.ref-name': commit_ref,
'github.actor': actor,
'github.workflow': workflow,
'created': str(datetime.now()),
},
'provider': ext_provider_name,
'fio': {},
'knb': {}}
combined_results['knb'].update(knb_results)
combined_results['fio'].update(fio_results)
# Write the compact results.
save_path = os.path.join(out_dir, out_file_name)
with open(save_path, 'w+') as w:
json.dump(combined_results, fp=w, sort_keys=False, indent=2)
if __name__ == '__main__':
main()

@@ -72,19 +72,19 @@ runs:
  using: "composite"
  steps:
    - name: Check input
-      if: (!contains(fromJson('["sonobuoy full", "sonobuoy quick", "autoscaling", "k-bench", "verify", "lb", "recover", "nop", "iamcreate"]'), inputs.test))
+      if: (!contains(fromJson('["sonobuoy full", "sonobuoy quick", "autoscaling", "perf-bench", "verify", "lb", "recover", "nop", "iamcreate"]'), inputs.test))
      shell: bash
      run: |
        echo "Invalid input for test field: ${{ inputs.test }}"
        exit 1
-    # K-Bench's network benchmarks require at least two distinct worker nodes.
-    - name: Validate k-bench inputs
-      if: inputs.test == 'k-bench'
+    # Perf-bench's network benchmarks require at least two distinct worker nodes.
+    - name: Validate perf-bench inputs
+      if: inputs.test == 'perf-bench'
      shell: bash
      run: |
        if [[ "${{ inputs.workerNodesCount }}" -lt 2 ]]; then
-          echo "::error::Test K-Bench requires at least 2 worker nodes."
+          echo "::error::Test Perf-Bench requires at least 2 worker nodes."
          exit 1
        fi