Add docs to repo (#38)

parent 50d3f3ca7f
commit b95f3dbc91

180 changed files with 13401 additions and 67 deletions

docs/docs/workflows/create.md (new file)
@@ -0,0 +1,105 @@

# Create your cluster

Creating your cluster requires two steps:

1. Creating the necessary resources in your cloud environment
2. Bootstrapping the Constellation cluster and setting up a connection

See the [architecture](../architecture/orchestration.md) section for details on the inner workings of this process.

## The *create* step

This step creates the necessary resources for your cluster in your cloud environment.

### Prerequisites

Before creating your cluster, you need to decide on:

* the size of your cluster (the number of control-plane and worker nodes)
* the machine type of your nodes (depending on the availability in your cloud environment)
* whether to enable autoscaling for your cluster (automatically adding and removing nodes depending on resource demands)

You can find the currently supported machine types for your cloud environment in the [installation guide](../architecture/orchestration.md).

### Configuration

Constellation can generate a configuration file for your cloud provider:

<tabs>
<tabItem value="azure" label="Azure" default>

```bash
constellation config generate azure
```

</tabItem>
<tabItem value="gcp" label="GCP" default>

```bash
constellation config generate gcp
```

</tabItem>
</tabs>

This creates the file `constellation-conf.yaml` in the current directory. You must edit it before you can execute the next steps. See the [reference section](../reference/config.md) for details.
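
For orientation, here is a hypothetical excerpt of what such a file can look like. The key names below are purely illustrative assumptions; the authoritative schema is documented in the [reference section](../reference/config.md):

```yaml
# Illustrative sketch only; consult the config reference for the real key names.
version: v1
stateDiskSizeGB: 30     # size of each node's encrypted state disk
provider:
  azure:                # section for your chosen cloud provider
    location: westeurope
```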

Next, download the latest trusted measurements for your configured image.

```bash
constellation config fetch-measurements
```

For more details, see the [verification section](../workflows/verify.md).

### Create

The following command creates a cluster with one control-plane and two worker nodes:

<tabs>
<tabItem value="azure" label="Azure" default>

```bash
constellation create azure --control-plane-nodes 1 --worker-nodes 2 --instance-type Standard_D4a_v4 -y
```

</tabItem>
<tabItem value="gcp" label="GCP" default>

```bash
constellation create gcp --control-plane-nodes 1 --worker-nodes 2 --instance-type n2d-standard-2 -y
```

</tabItem>
</tabs>

For details on the flags and a list of supported instance types, consult the command help via `constellation create -h`.

*create* stores your cluster's configuration in a file named [`constellation-state.json`](../architecture/orchestration.md#installation-process) in your current directory.

## The *init* step

This step bootstraps your cluster and configures your Kubernetes client.

### Init

The following command initializes and bootstraps your cluster:

```bash
constellation init
```

To enable autoscaling in your cluster, add the `--autoscale` flag:

```bash
constellation init --autoscale
```

Next, configure kubectl for your Constellation cluster:

```bash
export KUBECONFIG="$PWD/constellation-admin.conf"
kubectl get nodes -o wide
```
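
If initialization succeeded, the node list reflects the topology you created: one control-plane and two worker nodes, all in *Ready* state. The following output is purely illustrative; names, ages, and versions will differ in your cluster:

```shell-session
NAME                                STATUS   ROLES           AGE   VERSION
constellation-control-plane-aaaaa   Ready    control-plane   10m   v1.23.x
constellation-worker-bbbbb          Ready    <none>          8m    v1.23.x
constellation-worker-ccccc          Ready    <none>          8m    v1.23.x
```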

That's it. You've successfully created a Constellation cluster.

docs/docs/workflows/lb.md (new file)
@@ -0,0 +1 @@

# Expose services

docs/docs/workflows/recovery.md (new file)
@@ -0,0 +1,147 @@

# Recovery

Recovery of a Constellation cluster means getting the cluster back into a healthy state after it became unhealthy due to problems with the underlying infrastructure.
Reasons for an unhealthy cluster range from power outages and planned reboots to the migration of nodes and regions.
Constellation keeps all stateful data protected and encrypted in a [stateful disk](../architecture/images.md#stateful-disk) attached to each node.
The stateful disk persists across reboots.
The data restored from that disk contains the entire Kubernetes state, including the application deployments, meaning that after a successful recovery the applications can continue operating without being redeployed from scratch.

Recovery events are rare because Constellation is built for high availability and contains mechanisms to automatically replace and join nodes to the cluster.
Once a node reboots, the [*Bootstrapper*](../architecture/components.md#bootstrapper) tries to authenticate to the cluster's [*JoinService*](../architecture/components.md#joinservice) using remote attestation.
If successful, the *JoinService* returns the encryption key for the stateful disk as part of the initialization response.
This process ensures that Constellation nodes can securely recover and rejoin a cluster autonomously.

In case of a disaster, where the control plane itself becomes unhealthy, Constellation provides a mechanism to recover that cluster and bring it back into a healthy state.
The `constellation recover` command connects to a node, establishes a secure connection using [attested TLS](../architecture/attestation.md#attested-tls-atls), and provides that node with the key to decrypt its stateful disk and continue booting.
This process has to be repeated until enough nodes are running again to establish a [member quorum for etcd](https://etcd.io/docs/v3.5/faq/#what-is-failure-tolerance), at which point the Kubernetes state can be recovered.

## Identify unhealthy clusters

The first step to recovery is identifying when a cluster becomes unhealthy.
Usually, that's first observed when the Kubernetes API server becomes unresponsive.
The causes can vary but are often related to issues in the underlying infrastructure.
Recovery in Constellation becomes necessary if not enough control-plane nodes are in a healthy state to keep the control plane operational.

The health status of the Constellation nodes can be checked and monitored via the cloud service provider.
Constellation provides logging information on the boot process and status via [cloud logging](troubleshooting.md#cloud-logging).
In the following, you'll find detailed descriptions for identifying clusters stuck in recovery for each cloud environment.
Once you've identified that your cluster is in an unhealthy state, you can use the [recovery](recovery.md#recover-your-cluster) command of the Constellation CLI to restore it.

<tabs>
<tabItem value="azure" label="Azure" default>

In the Azure cloud portal, find the cluster's resource group `<cluster-name>-<suffix>`.
Inside the resource group, check that the control-plane *Virtual machine scale set* `constellation-scale-set-controlplanes-<suffix>` has enough members in a *Running* state.
Open the scale set's details page; on the left, go to `Settings -> Instances` and check the *Status* field.

Second, check the boot logs of these *Instances*.
In the scale set's *Instances* view, open the details page of the desired instance.
Check the serial console output of that instance.
On the left, open the *"Support + troubleshooting" -> "Serial console"* page.

In the serial console output, search for `Waiting for decryption key`.
Output similar to the following means your node was restarted and needs to decrypt the [state disk](../architecture/images.md#state-disk):

```shell
{"level":"INFO","ts":"2022-08-01T08:02:20Z","caller":"cmd/main.go:46","msg":"Starting disk-mapper","version":"0.0.0","cloudProvider":"azure"}
{"level":"INFO","ts":"2022-08-01T08:02:20Z","logger":"setupManager","caller":"setup/setup.go:57","msg":"Preparing existing state disk"}
{"level":"INFO","ts":"2022-08-01T08:02:20Z","logger":"keyService","caller":"keyservice/keyservice.go:92","msg":"Waiting for decryption key. Listening on: [::]:9000"}
```

The node will then try to connect to the [*JoinService*](../architecture/components.md#joinservice) and obtain the decryption key.
If that fails because the control plane is unhealthy, you will see log messages similar to the following:

```shell
{"level":"INFO","ts":"2022-08-01T08:02:21Z","logger":"keyService","caller":"keyservice/keyservice.go:118","msg":"Received list with JoinService endpoints: [10.9.0.5:30090 10.9.0.6:30090 10.9.0.7:30090 10.9.0.8:30090 10.9.0.9:30090 10.9.0.10:30090 10.9.0.11:30090 10.9.0.12:30090 10.9.0.13:30090 10.9.0.14:30090 10.9.0.15:30090 10.9.0.16:30090 10.9.0.17:30090 10.9.0.18:30090 10.9.0.19:30090 10.9.0.20:30090 10.9.0.21:30090 10.9.0.22:30090 10.9.0.23:30090]"}
{"level":"INFO","ts":"2022-08-01T08:02:21Z","logger":"keyService","caller":"keyservice/keyservice.go:145","msg":"Requesting rejoin ticket","endpoint":"10.9.0.5:30090"}
{"level":"ERROR","ts":"2022-08-01T08:02:21Z","logger":"keyService","caller":"keyservice/keyservice.go:148","msg":"Failed to request rejoin ticket","error":"rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 10.9.0.5:30090: connect: connection refused\"","endpoint":"10.9.0.5:30090"}
```

That means you have to recover that node manually.
Before you continue with the [recovery process](#recover-your-cluster), you need to know the node's IP address and the state disk's UUID.
For the IP address, return to the instance's *Overview* page and find the *Private IP address*.
For the UUID, open the [cloud logging](troubleshooting.md#azure) explorer.
Type `traces | where message contains "Disk UUID"` and click `Run`.
Find the entry corresponding to that instance `{"instance-name":"<cluster-name>-control-plane-<suffix>"}` and take the UUID from the message field `Disk UUID: <UUID>`.

</tabItem>
<tabItem value="gcp" label="GCP" default>

First, check that the control-plane *Instance Group* has enough members in a *Ready* state.
Go to *Instance Groups* and check the group for the cluster's control plane `<cluster-name>-control-plane-<suffix>`.

Second, check the status of the *VM Instances*.
Go to *VM Instances* and open the details of the desired instance.
Check the serial console output of that instance by opening the *logs -> "Serial port 1 (console)"* page:



In the serial console output, search for `Waiting for decryption key`.
Output similar to the following means your node was restarted and needs to decrypt the [state disk](../architecture/images.md#state-disk):

```shell
{"level":"INFO","ts":"2022-07-29T09:45:55Z","caller":"cmd/main.go:46","msg":"Starting disk-mapper","version":"0.0.0","cloudProvider":"gcp"}
{"level":"INFO","ts":"2022-07-29T09:45:55Z","logger":"setupManager","caller":"setup/setup.go:57","msg":"Preparing existing state disk"}
{"level":"INFO","ts":"2022-07-29T09:45:55Z","logger":"keyService","caller":"keyservice/keyservice.go:92","msg":"Waiting for decryption key. Listening on: [::]:9000"}
```

The node will then try to connect to the [*JoinService*](../architecture/components.md#joinservice) and obtain the decryption key.
If that fails because the control plane is unhealthy, you will see log messages similar to the following:

```shell
{"level":"INFO","ts":"2022-07-29T09:46:15Z","logger":"keyService","caller":"keyservice/keyservice.go:118","msg":"Received list with JoinService endpoints: [192.168.178.2:30090]"}
{"level":"INFO","ts":"2022-07-29T09:46:15Z","logger":"keyService","caller":"keyservice/keyservice.go:145","msg":"Requesting rejoin ticket","endpoint":"192.168.178.2:30090"}
{"level":"ERROR","ts":"2022-07-29T09:46:15Z","logger":"keyService","caller":"keyservice/keyservice.go:148","msg":"Failed to request rejoin ticket","error":"rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 192.168.178.2:30090: connect: connection refused\"","endpoint":"192.168.178.2:30090"}
```

That means you have to recover that node manually.
Before you continue with the [recovery process](#recover-your-cluster), you need to know the node's IP address and the state disk's UUID.
For the IP address, go to the *"VM Instance" -> "network interfaces"* page and take the address from *"Primary internal IP address"*.
For the UUID, open the [cloud logging](troubleshooting.md#cloud-logging) explorer; you'll find the link right above the serial console link (see the picture above).
Search for `Disk UUID: <UUID>`.

</tabItem>
</tabs>

## Recover your cluster

Depending on the size of your cluster and the number of unhealthy control-plane nodes, the following process needs to be repeated until a [member quorum for etcd](https://etcd.io/docs/v3.5/faq/#what-is-failure-tolerance) is established.
For example, assume you have 5 control-plane nodes in your cluster and 4 of them have been rebooted due to a maintenance downtime in the cloud environment.
You have to run through the following process for 2 of these nodes and recover them manually to restore the quorum: together with the 1 node that stayed healthy, this brings back the 3 members required for a quorum of 5.
From there, your cluster will auto-heal the remaining 2 control-plane nodes and the rest of your cluster.

Recovering a node requires the following parameters:

* The node's IP address
* The node's state disk UUID
* Access to the master secret of the cluster

See [Identify unhealthy clusters](#identify-unhealthy-clusters) for how to obtain the node's IP address and state disk UUID.
Note that the recovery command needs to connect to the recovering nodes.
Nodes only have private IP addresses in the VPC of the cluster; hence, the command needs to be issued from within the VPC network of the cluster.
The easiest approach is to set up a jump host connected to the VPC network and perform the recovery from there.

Given these prerequisites, a node can be recovered like this:

```bash
$ constellation recover -e 34.107.89.208 --disk-uuid b27f817c-6799-4c0d-81d8-57abc8386b70 --master-secret constellation-mastersecret.json
Pushed recovery key.
```
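
If several control-plane nodes need recovery, you can script the repetition. A minimal sketch, assuming you collected the IP/UUID pairs as described above (the pairs below are placeholders):

```bash
# Recover multiple nodes in sequence; replace the IP/UUID pairs with your own.
while read -r ip uuid; do
  constellation recover -e "$ip" --disk-uuid "$uuid" --master-secret constellation-mastersecret.json
done <<'EOF'
10.9.0.5 b27f817c-6799-4c0d-81d8-57abc8386b70
10.9.0.6 00000000-0000-0000-0000-000000000000
EOF
```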

In the serial console output of the node, you'll see output similar to the following:

```shell
[ 3225.621753] EXT4-fs (dm-1): INFO: recovery required on readonly filesystem
[ 3225.628807] EXT4-fs (dm-1): write access will be enabled during recovery
[ 3226.295816] EXT4-fs (dm-1): recovery complete
[ 3226.301618] EXT4-fs (dm-1): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
[ 3226.338157] systemd[1]: run-state.mount: Deactivated successfully.
[ 3226.347833] systemd[1]: Finished Prepare encrypted state disk.
[ 3226.363705] systemd[1]: Starting OSTree Prepare OS/...
[ 3226.370625] ostree-prepare-root[939]: preparing sysroot at /sysroot
```

After enough control-plane nodes have been recovered and the Kubernetes cluster becomes healthy again, the rest of the cluster will start auto-healing using the mechanism described above.

docs/docs/workflows/scale.md (new file)
@@ -0,0 +1,56 @@

# Scale your cluster

Constellation provides all features of a Kubernetes cluster, including scaling and autoscaling.

## Worker node scaling

[During cluster initialization](create.md#init) you can choose to deploy the [cluster autoscaler](https://github.com/kubernetes/autoscaler). It automatically provisions additional worker nodes so that all pods have a place to run.

Alternatively, you can choose to manually scale your cluster:

<tabs>
<tabItem value="azure" label="Azure" default>

1. Find your Constellation resource group.
2. Select the `scale-set-workers` scale set.
3. Go to **settings** and **scaling**.
4. Set the new **instance count** and **save**.

</tabItem>
<tabItem value="gcp" label="GCP" default>

1. In Compute Engine, go to [Instance Groups](https://console.cloud.google.com/compute/instanceGroups/).
2. **Edit** the **worker** instance group.
3. Set the new **number of instances** and **save**.

</tabItem>
</tabs>

This works for scaling your worker nodes both up and down.
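
If you prefer the command line over the cloud console, the same operation can be performed with the providers' own CLIs. This is a sketch under the assumption that you substitute your own resource group, scale set, instance group, and zone names:

```bash
# Azure: resize the worker scale set (names are placeholders).
az vmss scale \
  --resource-group <constellation-resource-group> \
  --name <scale-set-workers> \
  --new-capacity 4

# GCP: resize the worker instance group (names are placeholders).
gcloud compute instance-groups managed resize <worker-instance-group> \
  --size 4 \
  --zone <zone>
```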

## Control-plane node scaling

Control-plane nodes can **only be scaled manually and only scaled up**!

To increase the number of control-plane nodes, follow these steps:

<tabs>
<tabItem value="azure" label="Azure" default>

1. Find your Constellation resource group.
2. Select the `scale-set-controlplanes` scale set.
3. Go to **settings** and **scaling**.
4. Set the new (increased) **instance count** and **save**.

</tabItem>
<tabItem value="gcp" label="GCP" default>

1. In Compute Engine, go to [Instance Groups](https://console.cloud.google.com/compute/instanceGroups/).
2. **Edit** the **control-plane** instance group.
3. Set the new (increased) **number of instances** and **save**.

</tabItem>
</tabs>

If you scale down the number of control-plane nodes, the removed nodes won't be able to exit the `etcd` cluster correctly. This will endanger the quorum that's required to run a stable Kubernetes control plane.

docs/docs/workflows/ssh.md (new file)
@@ -0,0 +1,59 @@

# Managing SSH Keys

Constellation gives you the capability to create UNIX users that can connect to the cluster nodes over SSH, allowing you to access both control-plane and worker nodes. While the data partition is persistent, the system partition is read-only, meaning that users need to be re-created upon each restart of a node. This is where the Access Manager comes into play, ensuring the automatic (re-)creation of all users whenever a node is restarted.

During the initial creation of the cluster, all users defined in the `ssh-users` section of the Constellation configuration (see the [reference section](../reference/config.md) for details) are created automatically as part of the initialization process.

For persistence, they're transferred into a ConfigMap called `ssh-users`, residing in the `kube-system` namespace. When no users are initially defined, the ConfigMap is still created, with no entries. After the initial definition in the Constellation configuration, users can be added and removed by modifying the entries of the ConfigMap and performing a restart of a node.

## Access Manager

The Access Manager doesn't restrict users to certain key formats: all formats the OpenSSH server supports are accepted. These are RSA, ECDSA (using the `nistp256`, `nistp384`, `nistp521` curves), and Ed25519. The Access Manager doesn't perform any validation either; keys are passed directly to the authorized key lists as defined.

Note that all users are automatically created with `sudo` capabilities, so make sure no one unintended has permissions to modify the `ssh-users` ConfigMap.

The Access Manager is deployed as a DaemonSet called `constellation-access-manager`, running as an `initContainer` and afterward running a `pause` container to avoid automatic restarts. While technically killing the Pod and letting it restart works for the (re-)creation of users, it doesn't automatically remove users. Therefore, a complete node restart is required to ensure the correct modification of users on the system, and needs to be executed manually after making changes to the ConfigMap.

When a user is deleted from the ConfigMap, it won't be re-created after the next restart of a node. The home directories of the affected users are moved to `/var/evicted`, with the ownership of each directory and its contents changed to `root`.

You can update the ConfigMap by running:

```bash
kubectl edit configmap -n kube-system ssh-users
```

Alternatively, you can modify and re-apply it with the definition listed in the examples.
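
For scripted updates, a merge patch avoids opening an editor. A minimal sketch, with the user name and key material as placeholders of your choosing:

```bash
# Add (or update) an entry in the ssh-users ConfigMap without an editor.
# "newuser" and the key are placeholders.
kubectl patch configmap ssh-users -n kube-system \
  --type merge \
  -p '{"data":{"newuser":"ssh-ed25519 AAAA...CldH"}}'
```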

## Examples

An example to create a user called `myuser` as part of the `constellation-conf.yaml` looks like this:

```yaml
# Create SSH users on Constellation nodes upon the first initialization of the cluster.
sshUsers:
  myuser: "ssh-rsa AAAA...mgNJd9jc="
```

This user is then created upon the first initialization of the cluster, and translated into a ConfigMap as shown below:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ssh-users
  namespace: kube-system
data:
  myuser: "ssh-rsa AAAA...mgNJd9jc="
```

Entries can be added simply by adding `data` entries:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ssh-users
  namespace: kube-system
data:
  myuser: "ssh-rsa AAAA...mgNJd9jc="
  anotheruser: "ssh-ed25519 AAAA...CldH"
```

Similarly, removing any entries causes users to be evicted upon the next restart of the node.

docs/docs/workflows/storage.md (new file)
@@ -0,0 +1,282 @@

# Use persistent storage

Persistent storage in Kubernetes requires configuration based on your cloud provider of choice.
For abstraction of container storage, Kubernetes offers [volumes](https://kubernetes.io/docs/concepts/storage/volumes/), allowing users to mount storage solutions directly into containers.
The [Container Storage Interface (CSI)](https://kubernetes-csi.github.io/docs/) is the standard interface for exposing arbitrary block and file storage systems into containers in Kubernetes.
Cloud providers offer their own CSI-based solutions for cloud storage.

### Confidential storage

Most cloud storage solutions support encryption, such as [GCE Persistent Disks (PD)](https://cloud.google.com/kubernetes-engine/docs/how-to/using-cmek).
Constellation supports the available CSI-based storage options for Kubernetes engines in Azure and GCP.
However, their encryption takes place in the storage backend and is managed by the cloud provider.
This mode of storage encryption doesn't provide confidential storage: using the default CSI drivers for these storage types means trusting the CSP with your persistent data.

Constellation provides CSI drivers for Azure Disk and GCE PD, offering [encryption on the node level](../architecture/keys.md#storage-encryption). They enable transparent encryption for persistent volumes without needing to trust the cloud backend. Plaintext data never leaves the confidential VM context, offering you confidential storage.

For more details, see [encrypted persistent storage](../architecture/encrypted-storage.md).

## CSI Drivers

Constellation can use the following drivers, which offer node-level encryption and optional integrity protection.

<tabs>
<tabItem value="azure" label="Azure" default>

1. [Azure Disk Storage](https://github.com/edgelesssys/constellation-azuredisk-csi-driver)

   Mount Azure [Disk Storage](https://azure.microsoft.com/en-us/services/storage/disks/#overview) into your Constellation cluster. See the example below on how to install the modified Azure Disk CSI driver, or check out the [repository](https://github.com/edgelesssys/constellation-azuredisk-csi-driver) for installation and more information about the Constellation-managed version of the driver. Since Azure Disks are mounted as ReadWriteOnce, they're only available to a single pod.

</tabItem>
<tabItem value="gcp" label="GCP" default>

1. [Persistent Disk](https://github.com/edgelesssys/constellation-gcp-compute-persistent-disk-csi-driver):

   Mount GCP [Persistent Disk](https://cloud.google.com/persistent-disk) block storage into your Constellation cluster.
   This includes support for [volume snapshots](https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/volume-snapshots), which let you create copies of your volume at a specific point in time.
   You can use them to bring a volume back to a prior state or provision new volumes.
   Follow the examples listed below to set up the modified GCP PD CSI driver, or check out the [repository](https://github.com/edgelesssys/constellation-gcp-compute-persistent-disk-csi-driver) for information about the configuration.

</tabItem>
</tabs>

Note that in case the options above aren't a suitable solution for you, Constellation is compatible with all other CSI-based storage options. For example, you can use [Azure Files](https://docs.microsoft.com/en-us/azure/storage/files/storage-files-introduction) or [GCP Filestore](https://cloud.google.com/filestore) with Constellation out of the box. Constellation just doesn't provide transparent encryption on the node level for these storage types yet.

## Installation

The following installation guide gives a brief overview of using CSI-based confidential cloud storage for persistent volumes in Constellation.

<tabs>
<tabItem value="azure" label="Azure" default>

1. Install the CSI driver:

   ```bash
   helm install azuredisk-csi-driver charts/edgeless/latest/azuredisk-csi-driver.tgz \
     --namespace kube-system \
     --set linux.distro=fedora \
     --set controller.replicas=1
   ```

2. Create a [storage class](https://kubernetes.io/docs/concepts/storage/storage-classes/) for your driver

   A storage class configures the driver responsible for provisioning storage for persistent volume claims.
   A storage class only needs to be created once and can then be used by multiple volumes.
   The following snippet creates a simple storage class using a [Standard SSD](https://docs.microsoft.com/en-us/azure/virtual-machines/disks-types#standard-ssds) as the backing storage device when the first Pod claiming the volume is created.

   ```bash
   cat <<EOF | kubectl apply -f -
   apiVersion: storage.k8s.io/v1
   kind: StorageClass
   metadata:
     name: encrypted-storage
     annotations:
       storageclass.kubernetes.io/is-default-class: "true"
   provisioner: azuredisk.csi.confidential.cloud
   parameters:
     skuName: StandardSSD_LRS
   reclaimPolicy: Delete
   volumeBindingMode: WaitForFirstConsumer
   EOF
   ```

   :::info

   By default, integrity protection is disabled for performance reasons. If you want to enable integrity protection, add `csi.storage.k8s.io/fstype: ext4-integrity` to `parameters`. Alternatively, you can use another filesystem by specifying another file system type with the suffix `-integrity`. Note that volume expansion isn't supported for integrity-protected disks.

   :::

</tabItem>
<tabItem value="gcp" label="GCP" default>

1. Install the CSI driver:

   ```bash
   git clone https://github.com/edgelesssys/constellation-gcp-compute-persistent-disk-csi-driver.git
   cd constellation-gcp-compute-persistent-disk-csi-driver
   kubectl apply -k ./deploy/kubernetes/overlays/edgeless/latest
   ```

2. Create a [storage class](https://kubernetes.io/docs/concepts/storage/storage-classes/) for your driver

   A storage class configures the driver responsible for provisioning storage for persistent volume claims.
   A storage class only needs to be created once and can then be used by multiple volumes.
   The following snippet creates a simple storage class for the GCE PD driver, utilizing [standard persistent disks](https://cloud.google.com/compute/docs/disks#pdspecs) as the storage backend device when the first Pod claiming the volume is created.

   ```bash
   cat <<EOF | kubectl apply -f -
   apiVersion: storage.k8s.io/v1
   kind: StorageClass
   metadata:
     name: encrypted-storage
     annotations:
       storageclass.kubernetes.io/is-default-class: "true"
   provisioner: gcp.csi.confidential.cloud
   parameters:
     type: pd-standard
   reclaimPolicy: Delete
   volumeBindingMode: WaitForFirstConsumer
   EOF
   ```

   :::info

   By default, integrity protection is disabled for performance reasons. If you want to enable integrity protection, add `csi.storage.k8s.io/fstype: ext4-integrity` to `parameters`. Alternatively, you can use another filesystem by specifying another file system type with the suffix `-integrity`. Note that volume expansion isn't supported for integrity-protected disks.

   :::

</tabItem>
</tabs>

3. Create a [persistent volume](https://kubernetes.io/docs/concepts/storage/persistent-volumes/)

   A [persistent volume claim](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims) is a request for storage with certain properties.
   It can refer to a storage class.
   The following creates a persistent volume claim, requesting 20 GiB of storage via the previously created storage class:

   ```bash
   cat <<EOF | kubectl apply -f -
   kind: PersistentVolumeClaim
   apiVersion: v1
   metadata:
     name: pvc-example
     namespace: default
   spec:
     accessModes:
       - ReadWriteOnce
     storageClassName: encrypted-storage
     resources:
       requests:
         storage: 20Gi
   EOF
   ```

4. Create a Pod with persistent storage

   You can assign a persistent volume claim to an application in need of persistent storage.
   The mounted volume will persist across restarts.
   The following creates a pod that uses the previously created persistent volume claim:

   ```bash
   cat <<EOF | kubectl apply -f -
   apiVersion: v1
   kind: Pod
   metadata:
     name: web-server
     namespace: default
   spec:
     containers:
     - name: web-server
       image: nginx
       volumeMounts:
       - mountPath: /var/lib/www/html
         name: mypvc
     volumes:
     - name: mypvc
       persistentVolumeClaim:
         claimName: pvc-example
         readOnly: false
   EOF
   ```
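
To check that provisioning worked, you can verify that the claim is bound and write a file through the mount. A quick sketch, using the pod and claim names created above:

```bash
# The claim should report STATUS "Bound" once the pod has been scheduled.
kubectl get pvc pvc-example -n default

# Write a file through the mounted volume to confirm it's writable.
kubectl exec -n default web-server -- sh -c 'echo ok > /var/lib/www/html/test && cat /var/lib/www/html/test'
```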

### Set the default storage class

The examples above are defined to be automatically set as the default storage class. The default storage class is responsible for all persistent volume claims that don't explicitly request a `storageClassName`. In case you need to change the default, follow the steps below:

<tabs>
<tabItem value="azure" label="Azure" default>

1. List the storage classes in your cluster:

   ```bash
   kubectl get storageclass
   ```

   The output is similar to this:

   ```shell-session
   NAME                     PROVISIONER                        AGE
   some-storage (default)   disk.csi.azure.com                 1d
   encrypted-storage        azuredisk.csi.confidential.cloud   1d
   ```

   The default storage class is marked by `(default)`.

2. Mark the old default storage class as non-default

   If you previously used another storage class as the default, you will have to unset its default annotation:

   ```bash
   kubectl patch storageclass <name-of-old-default> -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
   ```

3. Mark the new class as the default

   ```bash
   kubectl patch storageclass <name-of-new-default> -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
   ```

4. Verify that your chosen storage class is the default:

   ```bash
   kubectl get storageclass
   ```

   The output is similar to this:

   ```shell-session
   NAME                          PROVISIONER                        AGE
   some-storage                  disk.csi.azure.com                 1d
   encrypted-storage (default)   azuredisk.csi.confidential.cloud   1d
   ```

</tabItem>
<tabItem value="gcp" label="GCP" default>

1. List the storage classes in your cluster:

   ```bash
   kubectl get storageclass
   ```

   The output is similar to this:

   ```shell-session
   NAME                     PROVISIONER                  AGE
   some-storage (default)   pd.csi.storage.gke.io        1d
   encrypted-storage        gcp.csi.confidential.cloud   1d
   ```

   The default storage class is marked by `(default)`.

2. Mark the old default storage class as non-default

   If you previously used another storage class as the default, you will have to unset its default annotation:

   ```bash
   kubectl patch storageclass <name-of-old-default> -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
   ```

3. Mark the new class as the default

   ```bash
   kubectl patch storageclass <name-of-new-default> -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
   ```

4. Verify that your chosen storage class is the default:

   ```bash
   kubectl get storageclass
   ```

   The output is similar to this:

   ```shell-session
   NAME                          PROVISIONER                  AGE
   some-storage                  pd.csi.storage.gke.io        1d
   encrypted-storage (default)   gcp.csi.confidential.cloud   1d
   ```

</tabItem>
</tabs>

docs/docs/workflows/terminate.md (new file)
@@ -0,0 +1,26 @@

# Terminate your cluster

You can terminate your cluster using the CLI.
You need the state file of your running cluster named `constellation-state.json` in the current directory.

:::danger

All ephemeral storage and state of your cluster will be lost. Make sure any data is safely stored in persistent storage. Constellation can recreate your cluster and the associated encryption keys, but it won't back up your application data automatically.

:::

Terminate the cluster by running:

```bash
constellation terminate
```

This deletes all resources created by Constellation in your cloud environment.
All local files created by the `create` and `init` commands are deleted as well, except the *master secret* `constellation-mastersecret.json` and the configuration file.

:::caution

Termination can fail if additional resources have been created that depend on the ones managed by Constellation. In this case, you need to delete these additional resources manually. Just run the `terminate` command again afterward to continue the termination process of the cluster.

:::

docs/docs/workflows/troubleshooting.md (new file)
@@ -0,0 +1,39 @@

# Troubleshooting

This section aids you in finding problems when working with Constellation.

## Cloud logging

To provide information during the early stages of a node's boot process, Constellation logs messages to the cloud providers' log systems. Since these offerings **aren't** confidential, only generic information without any sensitive values is stored. This provides administrators with a high-level understanding of the current state of a node.

You can view this information in the following places:

<tabs>
<tabItem value="azure" label="Azure" default>

1. In your Azure subscription, find the Constellation resource group.
2. Inside the resource group, find the Application Insights resource called `constellation-insights-*`.
3. On the left-hand side, go to `Logs`, which is located in the section `Monitoring`.
4. Close the Queries page if it pops up.
5. In the query text field, type in `traces`, and click `Run`.

To **find the disk UUIDs** use the following query: `traces | where message contains "Disk UUID"`
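
The same query can also be issued from a terminal through the Azure CLI's Application Insights extension. A sketch, assuming the extension is installed and the resource names are substituted with your own:

```bash
# Query traces for disk UUIDs via the Application Insights extension.
az monitor app-insights query \
  --app constellation-insights-<suffix> \
  --resource-group <constellation-resource-group> \
  --analytics-query 'traces | where message contains "Disk UUID"'
```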

</tabItem>
<tabItem value="gcp" label="GCP" default>

1. Select the project that hosts Constellation.
2. Go to the `Compute Engine` service.
3. On the right-hand side of a VM entry, select `More Actions` (a stacked ellipsis).
4. Select `View logs`.

To **find the disk UUIDs** use the following query: `resource.type="gce_instance" text_payload=~"Disk UUID:.*\n" logName=~".*/constellation-boot-log"`
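
If you prefer the terminal, a similar query can be run with the `gcloud` CLI against the Logging API. A sketch, assuming the Constellation project is your active gcloud project:

```bash
# Read boot-log entries that mention disk UUIDs (filter syntax is a sketch).
gcloud logging read \
  'resource.type="gce_instance" textPayload:"Disk UUID:" logName:"constellation-boot-log"' \
  --limit 10
```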

:::info

Constellation uses the default bucket to store logs. Its [default retention period is 30 days](https://cloud.google.com/logging/quotas#logs_retention_periods).

:::

</tabItem>
</tabs>

docs/docs/workflows/upgrade.md (new file)
@@ -0,0 +1,35 @@

# Upgrade your cluster

Constellation provides an easy way to upgrade from one release to the next.
This involves choosing a new VM image to use for all nodes in the cluster and updating the cluster's expected measurements.

## Plan the upgrade

If you don't already know the image you want to upgrade to, use the `upgrade plan` command to pull in a list of available updates.

```bash
constellation upgrade plan
```

The command lets you interactively choose from a list of available updates and prepares your Constellation config file for the next step.

If you plan to use the command in scripts, use the `--file` flag to compile the available options into a YAML file.
You can then manually set the chosen upgrade option in your Constellation config file, as sketched below.
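
A minimal sketch of the scripted flow; the output file name is a placeholder of your choosing:

```bash
# Write the available upgrade options to a YAML file instead of prompting.
constellation upgrade plan --file upgrade-options.yaml

# Inspect the options, then set the chosen image and measurements
# in your Constellation config file before executing the upgrade.
cat upgrade-options.yaml
```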

:::caution

The `constellation upgrade plan` command will only work for official Edgeless release images.
If your cluster is using a custom image or a debug image, the Constellation CLI will fail to find compatible images.
However, you may still use the `upgrade execute` command by manually selecting a compatible image and setting it in your config file.

:::

## Execute the upgrade

Once your config file has been prepared with the new image and measurements, use the `upgrade execute` command to initiate the upgrade.

```bash
constellation upgrade execute
```

After the command has finished, the cluster will automatically replace old nodes using a rolling update strategy to ensure no downtime of the control or data plane.

docs/docs/workflows/verify.md (new file)
@@ -0,0 +1,78 @@

# Verify your cluster

Constellation's [attestation feature](../architecture/attestation.md) allows you, or a third party, to verify the confidentiality and integrity of your Constellation.

## Fetch measurements

To verify the integrity of Constellation, you need trusted measurements to verify against. For each of the released images there are signed measurements, which you can download using the CLI:

```bash
constellation config fetch-measurements
```

This command performs the following steps:

1. Look up the signed measurements for the configured image.
2. Download the measurements.
3. Verify the signature.
4. Write the measurements into the configuration file.

### Custom arguments

To comply with regulations and policies, you may need to generate the measurements yourself. You can either manually write these measurements to the configuration file or download them from a custom location using this command:

```bash
constellation config fetch-measurements -u http://my.storage/measurements.yaml -s http://my.storage/measurements.yaml.sig -p "$(cat cosign.pub)"
```

For more details, consult the [CLI reference](../reference/cli.md).

## The *verify* command

Once measurements are configured, this command verifies an attestation statement issued by a Constellation, thereby verifying the integrity and confidentiality of the whole cluster.

The following command performs attestation on the Constellation in your current workspace:

<tabs>
<tabItem value="azure" label="Azure" default>

```bash
constellation verify azure
```

</tabItem>
<tabItem value="gcp" label="GCP" default>

```bash
constellation verify gcp
```

</tabItem>
</tabs>

The command makes sure the value passed to `--cluster-id` matches the *clusterID* presented in the attestation statement.
This allows you to verify that you are connecting to a specific Constellation instance.
Additionally, the confidential computing capabilities, as well as the VM image, are verified to match the expected configurations.

### Custom arguments

You can provide additional arguments for `verify` to verify any Constellation you have network access to. This requires you to provide:

* The IP address of a running Constellation's [VerificationService](../architecture/components.md#verification-service). The *VerificationService* is exposed via a NodePort service using the external IP address of your cluster. Run `kubectl get nodes -o wide` and look for `EXTERNAL-IP`, as sketched after this list.
* The Constellation's *clusterID*. See [cluster identity](../architecture/keys.md#cluster-identity) for more details.
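
A quick way to grab a node's external IP from the command line, sketched with standard kubectl output formatting:

```bash
# Print the external IP of the first node, if one is assigned.
kubectl get nodes \
  -o jsonpath='{.items[0].status.addresses[?(@.type=="ExternalIP")].address}'
```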

<tabs>
<tabItem value="azure" label="Azure" default>

```bash
constellation verify azure -e 192.0.2.1 --cluster-id Q29uc3RlbGxhdGlvbkRvY3VtZW50YXRpb25TZWNyZXQ=
```

</tabItem>
<tabItem value="gcp" label="GCP" default>

```bash
constellation verify gcp -e 192.0.2.1 --cluster-id Q29uc3RlbGxhdGlvbkRvY3VtZW50YXRpb25TZWNyZXQ=
```

</tabItem>
</tabs>