RFC 015: Trusted Kubernetes Container Images
Kubernetes control plane images should be verified by the Constellation installation.
The Problem
When we bootstrap the Constellation Kubernetes cluster, kubeadm places a set
of static pods for the control plane components into the filesystem. The
manifests refer to images in a registry beyond the users' control, and the
container image content is not reproducible.
This is obviously a trust issue, because the Kubernetes control plane is part of Constellation's TCB, but it is also a problem when Constellation is set up in a restricted environment where this registry is not reachable.
Requirements
- In a default installation, Constellation must verify Kubernetes control plane images.
Out of scope:
- Customization of the image repository or image content.
  - This is orthogonal to image verification and will be the subject of a separate RFC.
- Reproducibility from github.com/kubernetes/kubernetes to registry.k8s.io and the associated chain of trust.
- Container registry trust & CA certificates.
  - This is also orthogonal to image verification.
Solution
Kubernetes control plane images are going to be pinned by a hash, which is
verified by the CRI. Image hashes are added to the Constellation codebase when
support for a new version is added. During installation, the kubeadm
configuration is modified so that images are pinned.
Image Hashes
We are concerned with the following control plane images (tags for v1.27.7):
- registry.k8s.io/kube-apiserver:v1.27.7
- registry.k8s.io/kube-controller-manager:v1.27.7
- registry.k8s.io/kube-scheduler:v1.27.7
- registry.k8s.io/coredns/coredns:v1.10.1
- registry.k8s.io/etcd:3.5.9-0
When a new Kubernetes version is added to /internal/versions/versions.go, we
generate the corresponding list of images with kubeadm config images list and
probe their hashes on registry.k8s.io. Generating the list of images this way
must happen offline to prevent kubeadm from being clever. These hashes are
added to versions.go as a mapping from component to pinned image:
    V1_27: {
        ClusterVersion: "v1.27.7",
        // ...
        KubernetesImages: {
            "kube-apiserver":          "registry.k8s.io/kube-apiserver:v1.27.7@sha256:<...>",
            "kube-controller-manager": "registry.k8s.io/kube-controller-manager:v1.27.7@sha256:<...>",
            "kube-scheduler":          "registry.k8s.io/kube-scheduler:v1.27.7@sha256:<...>",
            "etcd":                    "registry.k8s.io/etcd:3.5.9-0@sha256:<...>",
            "coredns":                 "registry.k8s.io/coredns/coredns:v1.10.1@sha256:<...>",
        },
    }
Cluster Init
During cluster initialization, we need to tell kubeadm that we want to use
the embedded image references instead of the default ones. For that, we
populate the InitConfiguration.Patches with a list of JSON Patch files that
replace the container image with the pinned alternative.
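Such a patch file is small enough to generate from the pinned image reference. The sketch below is illustrative only: the type jsonPatchOp and the function imagePatch are hypothetical names, and the path /spec/containers/0/image assumes that the control plane container is the first container in the static pod manifest.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// jsonPatchOp is a single JSON Patch (RFC 6902) operation with a string value.
type jsonPatchOp struct {
	Op    string `json:"op"`
	Path  string `json:"path"`
	Value string `json:"value"`
}

// imagePatch (hypothetical helper) returns a JSON Patch document that replaces
// the image of the first container in a pod manifest with the given reference.
func imagePatch(image string) ([]byte, error) {
	ops := []jsonPatchOp{{
		Op:    "replace",
		Path:  "/spec/containers/0/image", // assumes a single-container static pod
		Value: image,
	}}
	return json.Marshal(ops)
}

func main() {
	// The digest is elided here, as in the rest of this RFC.
	patch, err := imagePatch("registry.k8s.io/kube-apiserver:v1.27.7@sha256:...")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(patch))
}
```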
The patches need to be written to the stateful filesystem by the
bootstrapper. This is very similar to components.Component, which also
places Kubernetes-related data onto the filesystem:
    type Component struct {
        URL         string
        Hash        string
        InstallPath string
        Extract     bool
    }
Components are handled by the installer, which by convention currently expects HTTP URLs to download. We can extend this by allowing one more URI scheme: data URLs. A patch definition as a Component would look like this:
    patch := &components.Component{
        URL:         "data:application/json;base64,W3sib3AiOiJyZXBsYWNlIiwicGF0aCI6Ii9zcGVjL2NvbnRhaW5lcnMvMC9pbWFnZSIsInZhbHVlIjoicmVnaXN0cnkuazhzLmlvL215LWNvbnRyb2wtcGxhbmUtaW1hZ2U6djEuMjcuN0BzaGEyNTY6Li4uIn1dCg==",
        InstallPath: "/opt/kubernetes/patches/kube-apiserver+json.json",
        // Hash and Extract deliberately left empty.
    }
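On the installer side, handling a data URL amounts to decoding the payload instead of downloading it. The following sketch shows one way this could look. The function names encodeDataURL and decodeDataURL are hypothetical, and only the base64 form of data URLs is handled; this is not the installer's actual implementation.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// encodeDataURL (hypothetical helper) wraps a JSON payload in a base64 data URL.
func encodeDataURL(payload []byte) string {
	return "data:application/json;base64," + base64.StdEncoding.EncodeToString(payload)
}

// decodeDataURL (hypothetical helper) extracts the payload of a base64 data URL.
// Data URLs without the ";base64" marker are rejected in this sketch.
func decodeDataURL(rawURL string) ([]byte, error) {
	if !strings.HasPrefix(rawURL, "data:") {
		return nil, errors.New("not a data URL")
	}
	meta, data, found := strings.Cut(strings.TrimPrefix(rawURL, "data:"), ",")
	if !found {
		return nil, errors.New("data URL has no payload")
	}
	if !strings.HasSuffix(meta, ";base64") {
		return nil, errors.New("only base64 data URLs are supported")
	}
	return base64.StdEncoding.DecodeString(data)
}

func main() {
	// The URL from the Component example in this RFC.
	url := "data:application/json;base64,W3sib3AiOiJyZXBsYWNlIiwicGF0aCI6Ii9zcGVjL2NvbnRhaW5lcnMvMC9pbWFnZSIsInZhbHVlIjoicmVnaXN0cnkuazhzLmlvL215LWNvbnRyb2wtcGxhbmUtaW1hZ2U6djEuMjcuN0BzaGEyNTY6Li4uIn1dCg=="
	payload, err := decodeDataURL(url)
	if err != nil {
		panic(err)
	}
	// Prints the embedded JSON Patch document.
	fmt.Print(string(payload))
}
```

Since the payload is embedded in the component definition itself, it is covered by whatever integrity protection the component list already has, which is consistent with leaving Hash empty.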
This method does not work for coredns, which can only be customized with the
ClusterConfiguration.DNS section. However, we can split the image into
repository, path and tag+hash and use these values as DNS.ImageMeta.
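The split described above can be sketched as follows. The function splitImage and the struct imageMeta are hypothetical: imageMeta is a local stand-in that mirrors the ImageRepository and ImageTag fields of kubeadm's ImageMeta, and whether kubeadm accepts a tag that carries a digest suffix is an assumption taken from this RFC, not something the sketch verifies.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// imageMeta is a local stand-in mirroring the fields of kubeadm's ImageMeta.
type imageMeta struct {
	ImageRepository string
	ImageTag        string
}

// splitImage (hypothetical helper) splits a pinned reference such as
// "registry.k8s.io/coredns/coredns:v1.10.1@sha256:..." into the repository
// ("registry.k8s.io/coredns"), the image name ("coredns") and the tag
// including the digest ("v1.10.1@sha256:...").
func splitImage(ref string) (string, string, string, error) {
	slash := strings.LastIndex(ref, "/")
	if slash < 0 {
		return "", "", "", errors.New("image reference has no repository")
	}
	// Cutting after the last slash keeps registry ports such as "localhost:5000" intact.
	name, tag, found := strings.Cut(ref[slash+1:], ":")
	if !found {
		return "", "", "", errors.New("image reference has no tag")
	}
	return ref[:slash], name, tag, nil
}

func main() {
	repo, name, tag, err := splitImage("registry.k8s.io/coredns/coredns:v1.10.1@sha256:...")
	if err != nil {
		panic(err)
	}
	meta := imageMeta{ImageRepository: repo, ImageTag: tag}
	fmt.Printf("name=%s meta=%+v\n", name, meta)
}
```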
Upgrade
The upgrade agent currently receives a kubeadm URI and hash, and internally
assembles these into a Component. We change the upgrade proto to accept
a full components.Components, which would then also include the new patches.
The components would be populated from the ConfigMap, as is already the case.
The CoreDNS config would need to be updated in the kube-system/kubeadm-config
ConfigMap.
Alternatives Considered
Exposing more of KubeadmConfig
We could allow users to supply their own patches to KubeadmConfig for finer
control over the installation. We don't want to do this because:
- It does not solve the problem of image verification - we'd still need to derive image hashes from somewhere.
- It's easy to accidentally leave charted territory when applying config overrides, and responsibilities are unclear in that case: should users be allowed to configure network, etcd, etc.?
- The way Kubernetes exposes the configuration is an organically grown mess: registries are now in multiple structs, path names are hard-coded to some extent and versions come from somewhere else entirely (cf. kubernetes/kubernetes#102502).
Ship the container images with the OS
We could bundle all control plane images in our OS image and configure kubeadm to never pull images. This would make Constellation independent of external image resources at the expense of flexibility: overriding the control plane images in development setups would be harder, and we would not be able to support user-provided images anymore.