diff --git a/rfc/XXX-trusted-kubernetes-images.md b/rfc/XXX-trusted-kubernetes-images.md
new file mode 100644
index 000000000..d5d77ede4
--- /dev/null
+++ b/rfc/XXX-trusted-kubernetes-images.md
@@ -0,0 +1,148 @@
# RFC XXX: Trusted Kubernetes Container Images

Kubernetes control plane images should be verified by the Constellation installation.

## The Problem

When we bootstrap the Constellation Kubernetes cluster, `kubeadm` places a set
of static pods for the control plane components into the filesystem. The
manifests refer to images in a registry beyond the users' control, and the
container image content is not reproducible.

This is obviously a trust issue, because the Kubernetes control plane is part
of Constellation's TCB, but it is also a problem when Constellation is set up
in a restricted environment where this image repository is not available.

## Requirements

1. In a default installation, Constellation must verify Kubernetes control plane images.

Out of scope:

- Customization of the image repository or image content.
  - This is orthogonal to image verification and will be the subject of a
    separate RFC.
- Reproducibility from github.com/kubernetes/kubernetes to registry.k8s.io and
  the associated chain of trust.
  - It is not clear whether Kubernetes images can be reproduced at all [1].
    Either way, the likely threat is a machine-in-the-middle attack between
    Constellation and registry.k8s.io. A desirable addition to this proposal
    could be verification of image signatures [2].
- Container registry trust & CA certificates.
  - This is also orthogonal to image verification.

[1]: https://github.com/kubernetes/kubernetes/blob/master/build/README.md#reproducibility
[2]: https://kubernetes.io/docs/tasks/administer-cluster/verify-signed-artifacts/

## Solution

Kubernetes control plane images are pinned by hash, and the hash is verified by
the CRI when the image is pulled. Image hashes are added to the Constellation
codebase when support for a new Kubernetes version is added. During
installation, the `kubeadm` configuration is modified so that images are
pinned.

### Image Hashes

We are concerned with the following control plane images (tags for v1.27.7):

- registry.k8s.io/kube-apiserver:v1.27.7
- registry.k8s.io/kube-controller-manager:v1.27.7
- registry.k8s.io/kube-scheduler:v1.27.7
- registry.k8s.io/coredns/coredns:v1.10.1
- registry.k8s.io/etcd:3.5.9-0

When a new Kubernetes version is added to `/internal/versions/versions.go`, we
generate the corresponding list of images with `kubeadm config images list` and
probe their hashes on registry.k8s.io. Generating the list of images this way
must happen offline, so that `kubeadm` lists exactly the requested version
instead of resolving a different one on its own. These hashes are added to
`versions.go` as a mapping from component to pinned image:

```golang
V1_27: {
	ClusterVersion: "v1.27.7",
	// ...
	KubernetesImages: {
		"kube-apiserver":          "registry.k8s.io/kube-apiserver:v1.27.7@sha256:<...>",
		"kube-controller-manager": "registry.k8s.io/kube-controller-manager:v1.27.7@sha256:<...>",
		"kube-scheduler":          "registry.k8s.io/kube-scheduler:v1.27.7@sha256:<...>",
		"etcd":                    "registry.k8s.io/etcd:3.5.9-0@sha256:<...>",
		"coredns":                 "registry.k8s.io/coredns/coredns:v1.10.1@sha256:<...>",
	},
}
```

### Cluster Init

During cluster initialization, we need to tell `kubeadm` to use the embedded
image references instead of the default ones. For that, we populate
[`InitConfiguration.Patches`](https://pkg.go.dev/k8s.io/kubernetes@v1.27.7/cmd/kubeadm/app/apis/kubeadm/v1beta3#InitConfiguration)
with a list of [JSON Patch](https://datatracker.ietf.org/doc/html/rfc6902)
files that replace the container image with the pinned alternative.

The patches need to be written to the stateful filesystem by the
bootstrapper. This is very similar to `components.Component`, which also
places Kubernetes-related data onto the filesystem:

```go
type Component struct {
	URL         string
	Hash        string
	InstallPath string
	Extract     bool
}
```

Components are handled by the installer, which by convention currently expects
HTTP URLs to download. We can extend this by allowing other URI schemes,
starting with
[data URLs](https://developer.mozilla.org/en-US/docs/web/http/basics_of_http/data_urls).
A patch definition expressed as a Component would look like this:

```go
patch := &components.Component{
	URL:         "data:application/json;base64,W3sib3AiOiJyZXBsYWNlIiwicGF0aCI6Ii9zcGVjL2NvbnRhaW5lcnMvMC9pbWFnZSIsInZhbHVlIjoicmVnaXN0cnkuazhzLmlvL215LWNvbnRyb2wtcGxhbmUtaW1hZ2U6djEuMjcuN0BzaGEyNTY6Li4uIn1dCg==",
	InstallPath: "/opt/kubernetes/patches/kube-apiserver+json.json",
	// Hash and Extract deliberately left empty.
}
```
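For readability: the base64 payload in the `URL` above is nothing more than a
one-element JSON Patch document containing a single `replace` operation on
`/spec/containers/0/image`. The sketch below shows how the bootstrapper could
assemble such a patch Component. It is illustrative only; the helper name, the
`main` function, and the locally repeated `Component` struct are assumptions
made so the example is self-contained, not Constellation's actual API.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// Component mirrors the struct shown above, repeated here so the sketch
// compiles on its own.
type Component struct {
	URL         string
	Hash        string
	InstallPath string
	Extract     bool
}

// pinnedImagePatch renders a JSON Patch that replaces the first container's
// image with the pinned reference and wraps it into a data: URL Component
// destined for the kubeadm patches directory.
func pinnedImagePatch(target, pinnedImage string) Component {
	patch := fmt.Sprintf(
		`[{"op":"replace","path":"/spec/containers/0/image","value":%q}]`,
		pinnedImage,
	)
	return Component{
		URL:         "data:application/json;base64," + base64.StdEncoding.EncodeToString([]byte(patch)),
		InstallPath: fmt.Sprintf("/opt/kubernetes/patches/%s+json.json", target),
		// Hash and Extract stay empty: the content is carried inline, so
		// there is nothing to download, verify or unpack.
	}
}

func main() {
	c := pinnedImagePatch("kube-apiserver", "registry.k8s.io/kube-apiserver:v1.27.7@sha256:<...>")
	fmt.Println(c.URL)
}
```

Leaving `Hash` empty is deliberate: the patch content travels inline in the
URL, so there is nothing to download and therefore nothing to verify against a
hash.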
This method does not work for CoreDNS, which can only be customized through the
[`ClusterConfiguration.DNS`](https://pkg.go.dev/k8s.io/kubernetes@v1.27.7/cmd/kubeadm/app/apis/kubeadm/v1beta3#ClusterConfiguration)
section. However, we can split the pinned reference into repository, image path
and tag plus digest, and pass these values via `DNS.ImageMeta`.

### Upgrade

The upgrade agent currently receives a kubeadm URI and hash, and internally
assembles them into a `Component`. We change the upgrade proto to accept a full
`components.Components`, which would then also include the new patches. The
components would be populated from the ConfigMap, as is already the case.

The CoreDNS config would need to be updated in the `kube-system/kubeadm-config`
ConfigMap.

## Alternatives Considered

### Exposing more of KubeadmConfig

We could allow users to supply their own patches to `KubeadmConfig` for finer
control over the installation. We don't want to do this because:

1. It does not solve the problem of image verification: we would still need to
   derive image hashes from somewhere.
2. It is easy to accidentally leave charted territory when applying config
   overrides, and responsibilities are unclear in that case: should users be
   allowed to configure networking, etcd, and so on?
3. The way Kubernetes exposes the configuration is an organically grown mess:
   registries are now in multiple structs, path names are hard-coded to some
   extent and versions come from somewhere else entirely (cf.
   kubernetes/kubernetes#102502).

### Ship the container images with the OS

We could bundle all control plane images in our OS image and configure kubeadm
to never pull images. This would make Constellation independent of external
image resources at the expense of flexibility: overriding the control plane
images in development setups would be harder, and we would no longer be able to
support user-provided images.