Release docs for v2.1 (#222)

This commit is contained in:
Paul Meyer 2022-10-07 12:14:55 +02:00 committed by GitHub
parent 7f96b82288
commit fd63ca1251
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
53 changed files with 6173 additions and 0 deletions

View file

@ -0,0 +1,72 @@
# Create your cluster
Creating your cluster requires two steps:
1. Creating the necessary resources in your cloud environment
2. Bootstrapping the Constellation cluster and setting up a connection
See the [architecture](../architecture/orchestration.md) section for details on the inner workings of this process.
## The *create* step
This step creates the necessary resources for your cluster in your cloud environment.
### Configuration
Generate a configuration file for your cloud service provider (CSP):
<tabs groupId="csp">
<tabItem value="azure" label="Azure">
```bash
constellation config generate azure
```
</tabItem>
<tabItem value="gcp" label="GCP">
```bash
constellation config generate gcp
```
</tabItem>
</tabs>
This creates the file `constellation-conf.yaml` in the current directory. [Fill in your CSP-specific information](../getting-started/first-steps.md#create-a-cluster) before you continue.
Next, download the trusted measurements for your configured image.
```bash
constellation config fetch-measurements
```
For details, see the [verification section](../workflows/verify-cluster.md).
### Create
Choose the initial size of your cluster.
The following command creates a cluster with one control-plane and two worker nodes:
```bash
constellation create --control-plane-nodes 1 --worker-nodes 2
```
For details on the flags, consult the command help via `constellation create -h`.
*create* stores your cluster's configuration to a file named [`constellation-state.json`](../architecture/orchestration.md#installation-process) in your current directory.
## The *init* step
The following command initializes and bootstraps your cluster:
```bash
constellation init
```
Next, configure `kubectl` for your cluster:
```bash
export KUBECONFIG="$PWD/constellation-admin.conf"
```
🏁 That's it. You've successfully created a Constellation cluster.

View file

@ -0,0 +1 @@
# Expose services

View file

@ -0,0 +1,117 @@
# Recover your cluster
Recovery of a Constellation cluster means getting it back into a healthy state after too many concurrent node failures in the control plane.
Reasons for an unhealthy cluster can vary from a power outage, or planned reboot, to migration of nodes and regions.
Recovery events are rare, because Constellation is built for high availability and automatically and securely replaces failed nodes. When a node is replaced, Constellation's control plane first verifies the new node before it sends the node the cryptographic keys required to decrypt its [state disk](../architecture/images.md#state-disk).
Constellation provides a recovery mechanism for cases where the control plane has failed and is unable to replace nodes.
The `constellation recover` command securely connects to all nodes in need of recovery using [attested TLS](../architecture/attestation.md#attested-tls-atls) and provides them with the keys to decrypt their state disks and continue booting.
## Identify unhealthy clusters
The first step to recovery is identifying when a cluster becomes unhealthy.
Usually, this can be first observed when the Kubernetes API server becomes unresponsive.
You can check the health status of the nodes via the cloud service provider (CSP).
Constellation provides logging information on the boot process and status via [cloud logging](troubleshooting.md#cloud-logging).
In the following, you'll find detailed descriptions for identifying clusters stuck in recovery for each CSP.
<tabs groupId="csp">
<tabItem value="azure" label="Azure">
In the Azure portal, find the cluster's resource group.
Inside the resource group, open the control plane *Virtual machine scale set* `constellation-scale-set-controlplanes-<suffix>`.
On the left, go to **Settings** > **Instances** and check that enough members are in a *Running* state.
Second, check the boot logs of these *Instances*.
In the scale set's *Instances* view, open the details page of the desired instance.
On the left, go to **Support + troubleshooting** > **Serial console**.
In the serial console output, search for `Waiting for decryption key`.
Similar output to the following means your node was restarted and needs to decrypt the [state disk](../architecture/images.md#state-disk):
```json
{"level":"INFO","ts":"2022-09-08T09:56:41Z","caller":"cmd/main.go:55","msg":"Starting disk-mapper","version":"2.0.0","cloudProvider":"azure"}
{"level":"INFO","ts":"2022-09-08T09:56:43Z","logger":"setupManager","caller":"setup/setup.go:72","msg":"Preparing existing state disk"}
{"level":"INFO","ts":"2022-09-08T09:56:43Z","logger":"recoveryServer","caller":"recoveryserver/server.go:59","msg":"Starting RecoveryServer"}
{"level":"INFO","ts":"2022-09-08T09:56:43Z","logger":"rejoinClient","caller":"rejoinclient/client.go:65","msg":"Starting RejoinClient"}
```
The node will then try to connect to the [*JoinService*](../architecture/components.md#joinservice) and obtain the decryption key.
If this fails due to an unhealthy control plane, you will see log messages similar to the following:
```json
{"level":"INFO","ts":"2022-09-08T09:56:43Z","logger":"rejoinClient","caller":"rejoinclient/client.go:77","msg":"Received list with JoinService endpoints","endpoints":["10.9.0.5:30090","10.9.0.6:30090"]}
{"level":"INFO","ts":"2022-09-08T09:56:43Z","logger":"rejoinClient","caller":"rejoinclient/client.go:96","msg":"Requesting rejoin ticket","endpoint":"10.9.0.5:30090"}
{"level":"WARN","ts":"2022-09-08T09:57:03Z","logger":"rejoinClient","caller":"rejoinclient/client.go:101","msg":"Failed to rejoin on endpoint","error":"rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 10.9.0.5:30090: i/o timeout\"","endpoint":"10.9.0.5:30090"}
{"level":"INFO","ts":"2022-09-08T09:57:03Z","logger":"rejoinClient","caller":"rejoinclient/client.go:96","msg":"Requesting rejoin ticket","endpoint":"10.9.0.6:30090"}
{"level":"WARN","ts":"2022-09-08T09:57:23Z","logger":"rejoinClient","caller":"rejoinclient/client.go:101","msg":"Failed to rejoin on endpoint","error":"rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 10.9.0.6:30090: i/o timeout\"","endpoint":"10.9.0.6:30090"}
{"level":"ERROR","ts":"2022-09-08T09:57:23Z","logger":"rejoinClient","caller":"rejoinclient/client.go:110","msg":"Failed to rejoin on all endpoints"}
```
This means that you have to recover the node manually.
</tabItem>
<tabItem value="gcp" label="GCP">
First, check that the control plane *Instance Group* has enough members in a *Ready* state.
In the GCP Console, go to **Instance Groups** and check the group for the cluster's control plane `<cluster-name>-control-plane-<suffix>`.
Second, check the status of the *VM Instances*.
Go to **VM Instances** and open the details of the desired instance.
Check the serial console output of that instance by opening the **Logs** > **Serial port 1 (console)** page:
![GCP portal serial console link](../_media/recovery-gcp-serial-console-link.png)
In the serial console output, search for `Waiting for decryption key`.
Similar output to the following means your node was restarted and needs to decrypt the [state disk](../architecture/images.md#state-disk):
```json
{"level":"INFO","ts":"2022-09-08T10:21:53Z","caller":"cmd/main.go:55","msg":"Starting disk-mapper","version":"2.0.0","cloudProvider":"gcp"}
{"level":"INFO","ts":"2022-09-08T10:21:53Z","logger":"setupManager","caller":"setup/setup.go:72","msg":"Preparing existing state disk"}
{"level":"INFO","ts":"2022-09-08T10:21:53Z","logger":"rejoinClient","caller":"rejoinclient/client.go:65","msg":"Starting RejoinClient"}
{"level":"INFO","ts":"2022-09-08T10:21:53Z","logger":"recoveryServer","caller":"recoveryserver/server.go:59","msg":"Starting RecoveryServer"}
```
The node will then try to connect to the [*JoinService*](../architecture/components.md#joinservice) and obtain the decryption key.
If this fails due to an unhealthy control plane, you will see log messages similar to the following:
```json
{"level":"INFO","ts":"2022-09-08T10:21:53Z","logger":"rejoinClient","caller":"rejoinclient/client.go:77","msg":"Received list with JoinService endpoints","endpoints":["192.168.178.4:30090","192.168.178.2:30090"]}
{"level":"INFO","ts":"2022-09-08T10:21:53Z","logger":"rejoinClient","caller":"rejoinclient/client.go:96","msg":"Requesting rejoin ticket","endpoint":"192.168.178.4:30090"}
{"level":"WARN","ts":"2022-09-08T10:21:53Z","logger":"rejoinClient","caller":"rejoinclient/client.go:101","msg":"Failed to rejoin on endpoint","error":"rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 192.168.178.4:30090: connect: connection refused\"","endpoint":"192.168.178.4:30090"}
{"level":"INFO","ts":"2022-09-08T10:21:53Z","logger":"rejoinClient","caller":"rejoinclient/client.go:96","msg":"Requesting rejoin ticket","endpoint":"192.168.178.2:30090"}
{"level":"WARN","ts":"2022-09-08T10:22:13Z","logger":"rejoinClient","caller":"rejoinclient/client.go:101","msg":"Failed to rejoin on endpoint","error":"rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 192.168.178.2:30090: i/o timeout\"","endpoint":"192.168.178.2:30090"}
{"level":"ERROR","ts":"2022-09-08T10:22:13Z","logger":"rejoinClient","caller":"rejoinclient/client.go:110","msg":"Failed to rejoin on all endpoints"}
```
This means that you have to recover the node manually.
</tabItem>
</tabs>
## Recover a cluster
Recovering a cluster requires the following parameters:
* The `constellation-id.json` file in your working directory or the cluster's load balancer IP address
* The master secret of the cluster
A cluster can be recovered like this:
```bash
$ constellation recover --master-secret constellation-mastersecret.json
Pushed recovery key.
Pushed recovery key.
Pushed recovery key.
Recovered 3 control-plane nodes.
```
In the serial console output of the node you'll see a similar output to the following:
```json
{"level":"INFO","ts":"2022-09-08T10:26:59Z","logger":"recoveryServer","caller":"recoveryserver/server.go:93","msg":"Received recover call"}
{"level":"INFO","ts":"2022-09-08T10:26:59Z","logger":"recoveryServer","caller":"recoveryserver/server.go:125","msg":"Received state disk key and measurement secret, shutting down server"}
{"level":"INFO","ts":"2022-09-08T10:26:59Z","logger":"recoveryServer.gRPC","caller":"zap/server_interceptors.go:61","msg":"finished streaming call with code OK","grpc.start_time":"2022-09-08T10:26:59Z","system":"grpc","span.kind":"server","grpc.service":"recoverproto.API","grpc.method":"Recover","peer.address":"192.0.2.3:41752","grpc.code":"OK","grpc.time_ms":15.701}
{"level":"INFO","ts":"2022-09-08T10:27:13Z","logger":"rejoinClient","caller":"rejoinclient/client.go:87","msg":"RejoinClient stopped"}
```

View file

@ -0,0 +1,94 @@
# Scale your cluster
Constellation provides all features of a Kubernetes cluster including scaling and autoscaling.
## Worker node scaling
### Autoscaling
Constellation comes with autoscaling disabled by default. To enable autoscaling, find the scaling group of
worker nodes:
```bash
worker_group=$(kubectl get scalinggroups -o json | jq -r '.items[].metadata.name | select(contains("worker"))')
echo "The name of your worker scaling group is '$worker_group'"
```
Then, patch the `autoscaling` field of the scaling group resource to `true`:
```bash
kubectl patch scalinggroups $worker_group --patch '{"spec":{"autoscaling": true}}' --type='merge'
kubectl get scalinggroup $worker_group -o jsonpath='{.spec}' | jq
```
The cluster autoscaler now automatically provisions additional worker nodes so that all pods have a place to run.
You can configure the minimum and maximum number of worker nodes in the scaling group by patching the `min` or
`max` fields of the scaling group resource:
```bash
kubectl patch scalinggroups $worker_group --patch '{"spec":{"max": 5}}' --type='merge'
kubectl get scalinggroup $worker_group -o jsonpath='{.spec}' | jq
```
The cluster autoscaler will now never provision more than 5 worker nodes.
If you want to see the autoscaling in action, try to add a deployment with a lot of replicas, like the
following Nginx deployment. The number of replicas needed to trigger the autoscaling depends on the size of
and count of your worker nodes. Wait for the rollout of the deployment to finish and compare the number of
worker nodes before and after the deployment:
```bash
kubectl create deployment nginx --image=nginx --replicas 150
kubectl -n kube-system get nodes
kubectl rollout status deployment nginx
kubectl -n kube-system get nodes
```
### Manual scaling
Alternatively, you can manually scale your cluster up or down:
<tabs groupId="csp">
<tabItem value="azure" label="Azure">
1. Find your Constellation resource group.
2. Select the `scale-set-workers`.
3. Go to **settings** and **scaling**.
4. Set the new **instance count** and **save**.
</tabItem>
<tabItem value="gcp" label="GCP">
1. In Compute Engine go to [Instance Groups](https://console.cloud.google.com/compute/instanceGroups/).
2. **Edit** the **worker** instance group.
3. Set the new **number of instances** and **save**.
</tabItem>
</tabs>
## Control-plane node scaling
Control-plane nodes can **only be scaled manually and only scaled up**!
To increase the number of control-plane nodes, follow these steps:
<tabs groupId="csp">
<tabItem value="azure" label="Azure">
1. Find your Constellation resource group.
2. Select the `scale-set-controlplanes`.
3. Go to **settings** and **scaling**.
4. Set the new (increased) **instance count** and **save**.
</tabItem>
<tabItem value="gcp" label="GCP">
1. In Compute Engine go to [Instance Groups](https://console.cloud.google.com/compute/instanceGroups/).
2. **Edit** the **control-plane** instance group.
3. Set the new (increased) **number of instances** and **save**.
</tabItem>
</tabs>
If you scale down the number of control-planes nodes, the removed nodes won't be able to exit the `etcd` cluster correctly. This will endanger the quorum that's required to run a stable Kubernetes control plane.

View file

@ -0,0 +1,59 @@
# Manage SSH keys
Constellation allows you to create UNIX users that can connect to both control-plane and worker nodes over SSH. As the system partitions are read-only, users need to be re-created upon each restart of a node. This is automated by the *Access Manager*.
On cluster initialization, users defined in the `ssh-users` section of the Constellation configuration file are created and stored in the `ssh-users` ConfigMap in the `kube-system` namespace. For a running cluster, you can add or remove users by modifying the ConfigMap and restarting a node.
## Access Manager
The Access Manager supports all OpenSSH key types. These are RSA, ECDSA (using the `nistp256`, `nistp384`, `nistp521` curves) and Ed25519.
:::note
All users are automatically created with `sudo` capabilities.
:::
The Access Manager is deployed as a DaemonSet called `constellation-access-manager`, running as an `initContainer` and afterward running a `pause` container to avoid automatic restarts. While technically killing the Pod and letting it restart works for the (re-)creation of users, it doesn't automatically remove users. Thus, a node restart is required after making changes to the ConfigMap.
When a user is deleted from the ConfigMap, it won't be re-created after the next restart of a node. The home directories of the affected users will be moved to `/var/evicted`.
You can update the ConfigMap by:
```bash
kubectl edit configmap -n kube-system ssh-users
```
Or alternatively, by modifying and re-applying it with the definition listed in the examples.
## Examples
You can add a user `myuser` in `constellation-config.yaml` like this:
```yaml
# Create SSH users on Constellation nodes upon the first initialization of the cluster.
sshUsers:
myuser: "ssh-rsa AAAA...mgNJd9jc="
```
This user is then created upon the first initialization of the cluster, and translated into a ConfigMap as shown below:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: ssh-users
namespace: kube-system
data:
myuser: "ssh-rsa AAAA...mgNJd9jc="
```
You can add users by adding `data` entries:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: ssh-users
namespace: kube-system
data:
myuser: "ssh-rsa AAAA...mgNJd9jc="
anotheruser: "ssh-ed25519 AAAA...CldH"
```
Similarly, removing any entries causes users to be evicted upon the next restart of the node.

View file

@ -0,0 +1,271 @@
# Use persistent storage
Persistent storage in Kubernetes requires cloud-specific configuration.
For abstraction of container storage, Kubernetes offers [volumes](https://kubernetes.io/docs/concepts/storage/volumes/),
allowing users to mount storage solutions directly into containers.
The [Container Storage Interface (CSI)](https://kubernetes-csi.github.io/docs/) is the standard interface for exposing arbitrary block and file storage systems into containers in Kubernetes.
Cloud service providers (CSPs) offer their own CSI-based solutions for cloud storage.
### Confidential storage
Most cloud storage solutions support encryption, such as [GCE Persistent Disks (PD)](https://cloud.google.com/kubernetes-engine/docs/how-to/using-cmek).
Constellation supports the available CSI-based storage options for Kubernetes engines in Azure and GCP.
However, their encryption takes place in the storage backend and is managed by the CSP.
Thus, using the default CSI drivers for these storage types means trusting the CSP with your persistent data.
To address this, Constellation provides CSI drivers for Azure Disk and GCE PD, offering [encryption on the node level](../architecture/keys.md#storage-encryption). They enable transparent encryption for persistent volumes without needing to trust the cloud backend. Plaintext data never leaves the confidential VM context, offering you confidential storage.
For more details see [encrypted persistent storage](../architecture/encrypted-storage.md).
## CSI drivers
Constellation supports the following drivers, which offer node-level encryption and optional integrity protection.
<tabs groupId="csp">
<tabItem value="azure" label="Azure">
**Constellation CSI driver for Azure Disk**:
Mount Azure [Disk Storage](https://azure.microsoft.com/en-us/services/storage/disks/#overview) into your Constellation cluster. See the instructions on how to [install the Constellation CSI driver](#installation) or check out the [repository](https://github.com/edgelesssys/constellation-azuredisk-csi-driver) for more information. Since Azure Disks are mounted as ReadWriteOnce, they're only available to a single pod.
</tabItem>
<tabItem value="gcp" label="GCP">
**Constellation CSI driver for GCP Persistent Disk**:
Mount [Persistent Disk](https://cloud.google.com/persistent-disk) block storage into your Constellation cluster.
This includes support for [volume snapshots](https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/volume-snapshots), which let you create copies of your volume at a specific point in time.
You can use them to bring a volume back to a prior state or provision new volumes.
Follow the instructions on how to [install the Constellation CSI driver](#installation) or check out the [repository](https://github.com/edgelesssys/constellation-gcp-compute-persistent-disk-csi-driver) for information about the configuration.
</tabItem>
</tabs>
Note that in case the options above aren't a suitable solution for you, Constellation is compatible with all other CSI-based storage options. For example, you can use [Azure Files](https://docs.microsoft.com/en-us/azure/storage/files/storage-files-introduction) or [GCP Filestore](https://cloud.google.com/filestore) with Constellation out of the box. Constellation is just not providing transparent encryption on the node level for these storage types yet.
## Installation
The following installation guide gives an overview of how to securely use CSI-based cloud storage for persistent volumes in Constellation.
<tabs groupId="csp">
<tabItem value="azure" label="Azure">
1. Install the CSI driver:
```bash
helm install azuredisk-csi-driver https://raw.githubusercontent.com/edgelesssys/constellation-azuredisk-csi-driver/main/charts/edgeless/latest/azuredisk-csi-driver.tgz \
--namespace kube-system \
--set linux.distro=fedora \
--set controller.replicas=1
```
2. Create a [storage class](https://kubernetes.io/docs/concepts/storage/storage-classes/) for your driver
A storage class configures the driver responsible for provisioning storage for persistent volume claims.
A storage class only needs to be created once and can then be used by multiple volumes.
The following snippet creates a simple storage class using [Standard SSDs](https://docs.microsoft.com/en-us/azure/virtual-machines/disks-types#standard-ssds) as the backing storage device when the first Pod claiming the volume is created.
```bash
cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: encrypted-storage
annotations:
storageclass.kubernetes.io/is-default-class: "true"
provisioner: azuredisk.csi.confidential.cloud
parameters:
skuName: StandardSSD_LRS
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
EOF
```
</tabItem>
<tabItem value="gcp" label="GCP">
1. Install the CSI driver:
```bash
kubectl apply -k github.com/edgelesssys/constellation-gcp-compute-persistent-disk-csi-driver/deploy/kubernetes/overlays/edgeless/latest
```
2. Create a [storage class](https://kubernetes.io/docs/concepts/storage/storage-classes/) for your driver
A storage class configures the driver responsible for provisioning storage for persistent volume claims.
A storage class only needs to be created once and can then be used by multiple volumes.
The following snippet creates a simple storage class using [balanced persistent disks](https://cloud.google.com/compute/docs/disks#pdspecs) as the backing storage device when the first Pod claiming the volume is created.
```bash
cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: encrypted-storage
annotations:
storageclass.kubernetes.io/is-default-class: "true"
provisioner: gcp.csi.confidential.cloud
parameters:
type: pd-standard
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
EOF
```
</tabItem>
</tabs>
:::info
By default, integrity protection is disabled for performance reasons. If you want to enable integrity protection, add `csi.storage.k8s.io/fstype: ext4-integrity` to `parameters`. Alternatively, you can use another filesystem by specifying another file system type with the suffix `-integrity`. Note that volume expansion isn't supported for integrity-protected disks.
:::
3. Create a [persistent volume](https://kubernetes.io/docs/concepts/storage/persistent-volumes/)
A [persistent volume claim](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims) is a request for storage with certain properties.
It can refer to a storage class.
The following creates a persistent volume claim, requesting 20 GB of storage via the previously created storage class:
```bash
cat <<EOF | kubectl apply -f -
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: pvc-example
namespace: default
spec:
accessModes:
- ReadWriteOnce
storageClassName: encrypted-storage
resources:
requests:
storage: 20Gi
EOF
```
4. Create a Pod with persistent storage
You can assign a persistent volume claim to an application in need of persistent storage.
The mounted volume will persist restarts.
The following creates a pod that uses the previously created persistent volume claim:
```bash
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
name: web-server
namespace: default
spec:
containers:
- name: web-server
image: nginx
volumeMounts:
- mountPath: /var/lib/www/html
name: mypvc
volumes:
- name: mypvc
persistentVolumeClaim:
claimName: pvc-example
readOnly: false
EOF
```
### Set the default storage class
The examples above are defined to be automatically set as the default storage class. The default storage class is responsible for all persistent volume claims that don't explicitly request `storageClassName`. In case you need to change the default, follow the steps below:
<tabs groupId="csp">
<tabItem value="azure" label="Azure">
1. List the storage classes in your cluster:
```bash
kubectl get storageclass
```
The output is similar to this:
```shell-session
NAME PROVISIONER AGE
some-storage (default) disk.csi.azure.com 1d
encrypted-storage azuredisk.csi.confidential.cloud 1d
```
The default storage class is marked by `(default)`.
2. Mark old default storage class as non default
If you previously used another storage class as the default, you will have to remove that annotation:
```bash
kubectl patch storageclass <name-of-old-default> -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
```
3. Mark new class as the default
```bash
kubectl patch storageclass <name-of-new-default> -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
```
4. Verify that your chosen storage class is default:
```bash
kubectl get storageclass
```
The output is similar to this:
```shell-session
NAME PROVISIONER AGE
some-storage disk.csi.azure.com 1d
encrypted-storage (default) azuredisk.csi.confidential.cloud 1d
```
</tabItem>
<tabItem value="gcp" label="GCP">
1. List the storage classes in your cluster:
```bash
kubectl get storageclass
```
The output is similar to this:
```shell-session
NAME PROVISIONER AGE
some-storage (default) pd.csi.storage.gke.io 1d
encrypted-storage gcp.csi.confidential.cloud 1d
```
The default storage class is marked by `(default)`.
2. Mark old default storage class as non default
If you previously used another storage class as the default, you will have to remove that annotation:
```bash
kubectl patch storageclass <name-of-old-default> -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
```
3. Mark new class as the default
```bash
kubectl patch storageclass <name-of-new-default> -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
```
4. Verify that your chosen storage class is default:
```bash
kubectl get storageclass
```
The output is similar to this:
```shell-session
NAME PROVISIONER AGE
some-storage pd.csi.storage.gke.io 1d
encrypted-storage (default) gcp.csi.confidential.cloud 1d
```
</tabItem>
</tabs>

View file

@ -0,0 +1,25 @@
# Terminate your cluster
You can terminate your cluster using the CLI. For this, you need the state file of your running cluster named `constellation-state.json` in the current directory.
:::danger
All ephemeral storage and state of your cluster will be lost. Make sure any data is safely stored in persistent storage. Constellation can recreate your cluster and the associated encryption keys, but won't backup your application data automatically.
:::
Terminate the cluster by running:
```bash
constellation terminate
```
This deletes all resources created by Constellation in your cloud environment.
All local files created by the `create` and `init` commands are deleted as well, except for `constellation-mastersecret.json` and the configuration file.
:::caution
Termination can fail if additional resources have been created that depend on the ones managed by Constellation. In this case, you need to delete these additional
resources manually. Just run the `terminate` command again afterward to continue the termination process of the cluster.
:::

View file

@ -0,0 +1,39 @@
# Troubleshooting
This section aids you in finding problems when working with Constellation.
## Cloud logging
To provide information during early stages of the node's boot process, Constellation logs messages into the cloud providers' log systems. Since these offerings **aren't** confidential, only generic information without any sensitive values are stored. This provides administrators with a high level understanding of the current state of a node.
You can view these information in the follow places:
<tabs groupId="csp">
<tabItem value="azure" label="Azure">
1. In your Azure subscription find the Constellation resource group.
2. Inside the resource group find the Application Insights resource called `constellation-insights-*`.
3. On the left-hand side go to `Logs`, which is located in the section `Monitoring`.
+ Close the Queries page if it pops up.
5. In the query text field type in `traces`, and click `Run`.
To **find the disk UUIDs** use the following query: `traces | where message contains "Disk UUID"`
</tabItem>
<tabItem value="gcp" label="GCP">
1. Select the project that hosts Constellation.
2. Go to the `Compute Engine` service.
3. On the right-hand side of a VM entry select `More Actions` (a stacked ellipsis)
+ Select `View logs`
To **find the disk UUIDs** use the following query: `resource.type="gce_instance" text_payload=~"Disk UUID:.*\n" logName=~".*/constellation-boot-log"`
:::info
Constellation uses the default bucket to store logs. Its [default retention period is 30 days](https://cloud.google.com/logging/quotas#logs_retention_periods).
:::
</tabItem>
</tabs>

View file

@ -0,0 +1,53 @@
# Use Azure trusted launch VMs
Constellation also supports [trusted launch VMs](https://docs.microsoft.com/en-us/azure/virtual-machines/trusted-launch) on Microsoft Azure. Trusted launch VMs don't offer the same level of security as Confidential VMs, but are available in more regions and in larger quantities. The main difference between trusted launch VMs and normal VMs is that the former offer vTPM-based remote attestation. When used with trusted launch VMs, Constellation relies on vTPM-based remote attestation to verify nodes.
:::caution
Trusted launch VMs don't provide runtime encryption and don't keep the cloud service provider (CSP) out of your trusted computing base.
:::
Constellation supports trusted launch VMs with instance types `Standard_D*_v4` and `Standard_E*_v4`. Run `constellation config instance-types` for a list of all supported instance types.
## VM images
Azure currently doesn't support [community galleries for trusted launch VMs](https://docs.microsoft.com/en-us/azure/virtual-machines/share-gallery-community). Thus, you need to manually import the Constellation node image into your cloud subscription.
The latest image is available at [https://public-edgeless-constellation.s3.us-east-2.amazonaws.com/azure_image_exports/2.0.0](https://public-edgeless-constellation.s3.us-east-2.amazonaws.com/azure_image_exports/2.0.0). Simply adjust the last three digits to download a different version.
After you've downloaded the image, create a resource group `constellation-images` in your Azure subscription and import the image.
You can use a script to do this:
```bash
wget https://raw.githubusercontent.com/edgelesssys/constellation/main/hack/importAzure.sh
chmod +x importAzure.sh
AZURE_IMAGE_VERSION=2.0.0 AZURE_RESOURCE_GROUP_NAME=constellation-images AZURE_IMAGE_FILE=./2.0.0 ./importAzure.sh
```
The script creates the following resources:
1. A new image gallery with the default name `constellation-import`
2. A new image definition with the default name `constellation`
3. The actual image with the provided version. In this case `2.0.0`
Once the import is completed, use the `ID` of the image version in your `constellation-conf.yaml` for the `image` field. Set `confidentialVM` to `false`.
Fetch the image measurements:
```bash
IMAGE_VERSION=2.0.0
URL=https://public-edgeless-constellation.s3.us-east-2.amazonaws.com//CommunityGalleries/ConstellationCVM-b3782fa0-0df7-4f2f-963e-fc7fc42663df/Images/constellation/Versions/$IMAGE_VERSION/measurements.yaml
constellation config fetch-measurements -u$URL -s$URL.sig
```
:::info
The [constellation create](create.md) command will issue a warning because manually imported images aren't recognized as production grade images:
```shell-session
Configured image doesn't look like a released production image. Double check image before deploying to production.
```
Please ignore this warning.
:::

View file

@ -0,0 +1,35 @@
# Upgrade your cluster
Constellation provides an easy way to upgrade to the next release.
This involves choosing a new VM image to use for all nodes in the cluster and updating the cluster's expected measurements.
## Plan the upgrade
If you don't already know the image you want to upgrade to, use the `upgrade plan` command to pull a list of available updates.
```bash
constellation upgrade plan
```
The command lets you interactively choose from a list of available updates and prepares your Constellation config file for the next step.
To use the command in scripts, use the `--file` flag to compile the available options into a YAML file.
You can then set the chosen upgrade option in your Constellation config file.
:::caution
`constellation upgrade plan` only works for official Edgeless release images.
If your cluster is using a custom image, the Constellation CLI will fail to find compatible images.
However, you may still use the `upgrade execute` command by manually selecting a compatible image and setting it in your config file.
:::
## Execute the upgrade
Once your config file has been prepared with the new image and measurements, use the `upgrade execute` command to initiate the upgrade.
```bash
constellation upgrade execute
```
After the command has finished, the cluster will automatically replace old nodes using a rolling update strategy to ensure no downtime of the control or data plane.

View file

@ -0,0 +1,90 @@
# Verify the CLI
Edgeless Systems uses [sigstore](https://www.sigstore.dev/) to ensure supply-chain security for the Constellation CLI and node images ("artifacts"). sigstore consists of three components: [Cosign](https://docs.sigstore.dev/cosign/overview), [Rekor](https://docs.sigstore.dev/rekor/overview), and Fulcio. Edgeless Systems uses Cosign to sign artifacts. All signatures are uploaded to the public Rekor transparency log, which resides at https://rekor.sigstore.dev/.
:::note
The public key for Edgeless Systems' long-term code-signing key is:
```
-----BEGIN PUBLIC KEY-----
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEf8F1hpmwE+YCFXzjGtaQcrL6XZVT
JmEe5iSLvG1SyQSAew7WdMKF6o9t8e2TFuCkzlOhhlws2OHWbiFZnFWCFw==
-----END PUBLIC KEY-----
```
The public key is also available for download at https://edgeless.systems/es.pub and in the Twitter profile [@EdgelessSystems](https://twitter.com/EdgelessSystems).
:::
The Rekor transparency log is a public append-only ledger that verifies and records signatures and associated metadata. The Rekor transparency log enables everyone to observe the sequence of (software) signatures issued by Edgeless Systems and many other parties. The transparency log allows for the public identification of dubious or malicious signatures.
You should always ensure that (1) your CLI executable was signed with the private key corresponding to the above public key and that (2) there is a corresponding entry in the Rekor transparency log. Both can be done as described in the following.
:::info
You don't need to verify the Constellation node images. This is done automatically by your CLI and the rest of Constellation.
:::
## Verify the signature
First, [install the Cosign CLI](https://docs.sigstore.dev/cosign/installation). Next, [download](https://github.com/edgelesssys/constellation/releases) and verify the signature that accompanies your CLI executable, for example:
```shell-session
$ cosign verify-blob --key https://edgeless.systems/es.pub --signature constellation-linux-amd64.sig constellation-linux-amd64
Verified OK
```
The above performs an offline verification of the provided public key, signature, and executable. To also verify that a corresponding entry exists in the public Rekor transparency log, add the variable `COSIGN_EXPERIMENTAL=1`:
```shell-session
$ COSIGN_EXPERIMENTAL=1 cosign verify-blob --key https://edgeless.systems/es.pub --signature constellation-linux-amd64.sig constellation-linux-amd64
tlog entry verified with uuid: afaba7f6635b3e058888692841848e5514357315be9528474b23f5dcccb82b13 index: 3477047
Verified OK
```
🏁 You now know that your CLI executable was officially released and signed by Edgeless Systems.
## Optional: Manually inspect the transparency log
To further inspect the public Rekor transparency log, [install the Rekor CLI](https://docs.sigstore.dev/rekor/installation). A search for the CLI executable should give a single UUID. (Note that this UUID contains the UUID from the previous `cosign` command.)
```shell-session
$ rekor-cli search --artifact constellation-linux-amd64
Found matching entries (listed by UUID):
362f8ecba72f4326afaba7f6635b3e058888692841848e5514357315be9528474b23f5dcccb82b13
```
With this UUID you can get the full entry from the transparency log:
```shell-session
$ rekor-cli get --uuid=362f8ecba72f4326afaba7f6635b3e058888692841848e5514357315be9528474b23f5dcccb82b13
LogID: c0d23d6ad406973f9559f3ba2d1ca01f84147d8ffc5b8445c224f98b9591801d
Index: 3477047
IntegratedTime: 2022-09-12T22:28:16Z
UUID: afaba7f6635b3e058888692841848e5514357315be9528474b23f5dcccb82b13
Body: {
"HashedRekordObj": {
"data": {
"hash": {
"algorithm": "sha256",
"value": "40e137b9b9b8204d672642fd1e181c6d5ccb50cfc5cc7fcbb06a8c2c78f44aff"
}
},
"signature": {
"content": "MEUCIQCSER3mGj+j5Pr2kOXTlCIHQC3gT30I7qkLr9Awt6eUUQIgcLUKRIlY50UN8JGwVeNgkBZyYD8HMxwC/LFRWoMn180=",
"publicKey": {
"content": "LS0tLS1CRUdJTiBQVUJMSUMgS0VZLS0tLS0KTUZrd0V3WUhLb1pJemowQ0FRWUlLb1pJemowREFRY0RRZ0FFZjhGMWhwbXdFK1lDRlh6akd0YVFjckw2WFpWVApKbUVlNWlTTHZHMVN5UVNBZXc3V2RNS0Y2bzl0OGUyVEZ1Q2t6bE9oaGx3czJPSFdiaUZabkZXQ0Z3PT0KLS0tLS1FTkQgUFVCTElDIEtFWS0tLS0tCg=="
}
}
}
}
```
The field `publicKey` should contain Edgeless Systems' public key in Base64 encoding.
You can get an exhaustive list of artifact signatures issued by Edgeless Systems via the following command:
```bash
rekor-cli search --public-key https://edgeless.systems/es.pub --pki-format x509
```
Edgeless Systems monitors this list to detect potential unauthorized use of its private key.

View file

@ -0,0 +1,48 @@
# Verify your cluster
Constellation's [attestation feature](../architecture/attestation.md) allows you, or a third party, to verify the integrity and confidentiality of your Constellation cluster.
## Fetch measurements
To verify the integrity of Constellation you need trusted measurements to verify against. For each node image released by Edgeless Systems, there are signed measurements, which you can download using the CLI:
```bash
constellation config fetch-measurements
```
This command performs the following steps:
1. Download the signed measurements for the configured image. By default, this will use Edgeless Systems' public measurement registry.
2. Verify the signature of the measurements. This will use Edgeless Systems' [public key](https://edgeless.systems/es.pub).
3. Write measurements into configuration file.
## The *verify* command
:::note
The steps below are purely optional. They're automatically executed by `constellation init` when you initialize your cluster. The `constellation verify` command mostly has an illustrative purpose.
:::
The `verify` command obtains and verifies an attestation statement from a running Constellation cluster.
```bash
constellation verify [--cluster-id ...]
```
From the attestation statement, the command verifies the following properties:
* The cluster is using the correct Confidential VM (CVM) type.
* Inside the CVMs, the correct node images are running. The node images are identified through the measurements obtained in the previous step.
* The unique ID of the cluster matches the one from your `constellation-id.json` file or passed in via `--cluster-id`.
Once the above properties are verified, you know that you are talking to the right Constellation cluster and it's in a good and trustworthy shape.
### Custom arguments
The `verify` command also allows you to verify any Constellation deployment that you have network access to. For this you need the following:
* The IP address of a running Constellation cluster's [VerificationService](../architecture/components.md#verification-service). The `VerificationService` is exposed via a `NodePort` service using the external IP address of your cluster. Run `kubectl get nodes -o wide` and look for `EXTERNAL-IP`.
* The cluster's *clusterID*. See [cluster identity](../architecture/keys.md#cluster-identity) for more details.
For example:
```shell-session
constellation verify -e 192.0.2.1 --cluster-id Q29uc3RlbGxhdGlvbkRvY3VtZW50YXRpb25TZWNyZXQ=
```