Previously we errored out of the entire join if retrieval
of either LB IP or control plane public IP failed. This resulted
in the entire "use either IP" logic not working as intended. This now
makes it log a warning only if the IP retrievals fail, and only errors
out of the join if no IP can be found at all.
* Let JoinClient return fatal errors
* Mark disk for wiping if JoinClient or InitServer return errors
* Reboot system if bootstrapper detects an error
* Refactor joinClient start/stop implementation
* Fix joining nodes retrying kubeadm 3 times in all cases
* Write non-recoverable failures to syslog before rebooting
---------
Signed-off-by: Daniel Weiße <dw@edgeless.systems>
There used to be three definitions of a Component type, and conversion
routines between the three. Since the use case is always the same, and
the Component semantics are defined by versions.go and the installer, it
seems appropriate to define the Component type there and import it in
the necessary places.
This is the first step in our migration off of
konnectivity. Before node-to-node encryption
we used konnectivity to route some KubeAPI
to kubelet traffic over the pod network which then
would be encrypted.
Since we enabled node-to-node encryption this has no
security upsides anymore. Note that we still deploy
the konnectivity agents via helm and still have the
load balancer for konnectivity.
In the following releases we will remove both.
The token given out by control-planes contains the node IP
as an endpoint. Since during this stage the joining node is
not connected to the WireGuard network, we cannot
communicate node-to-node. Therefore, we need to hop over the
load balancer again to have a src IP outside of the strict
range.
In unit tests, NewCollector may be called on systems that do not have
"journalctl" in $PATH.
We can defer checking if the command can work by not checking cmd.Err in
the constructor.
* Fix incorrect use of masterSecret salt for clusterID generation
Signed-off-by: Daniel Weiße <dw@edgeless.systems>
---------
Signed-off-by: Daniel Weiße <dw@edgeless.systems>
Co-authored-by: Leonard Cohnen <lc@edgeless.systems>
* add current chart
add current helm chart
* disable service controller for aws ccm
* add new iam roles
* doc AWS internet LB + add to LB test
* pass clusterName to helm for AWS LB
* fix update-aws-lb chart to also include .helmignore
* move chart outside services
* working state
* add subnet tags for AWS subnet discovery
* fix .helmignore load rule with file in subdirectory
* upgrade iam profile
* revert new loader impl since cilium is not correctly loaded
* install chart if not already present during `upgrade apply`
* cleanup PR + fix build + add todos
cleanup PR + add todos
* shared helm pkg for cli install and bootstrapper
* add link to eks docs
* refactor iamMigrationCmd
* delete unused helm.symwallk
* move iammigrate to upgrade pkg
* fixup! delete unused helm.symwallk
* add to upgradecheck
* remove nodeSelector from go code (Otto)
* update iam docs and sort permission + remove duplicate roles
* fix bug in `upgrade check`
* better upgrade check output when svc version upgrade not possible
* pr feedback
* remove force flag in upgrade_test
* use upgrader.GetUpgradeID instead of extra type
* remove todos + fix check
* update doc lb (leo)
* remove bootstrapper helm package
* Update cli/internal/cmd/upgradecheck.go
Co-authored-by: Daniel Weiße <66256922+daniel-weisse@users.noreply.github.com>
* final nits
* add docs for e2e upgrade test setup
* Apply suggestions from code review
Co-authored-by: Daniel Weiße <66256922+daniel-weisse@users.noreply.github.com>
* Update cli/internal/helm/loader.go
Co-authored-by: Daniel Weiße <66256922+daniel-weisse@users.noreply.github.com>
* Update cli/internal/cmd/tfmigrationclient.go
Co-authored-by: Daniel Weiße <66256922+daniel-weisse@users.noreply.github.com>
* fix daniel review
* link to the iam permissions instead of manually updating them (agreed with leo)
* disable iam upgrade in upgrade apply
---------
Co-authored-by: Daniel Weiße <66256922+daniel-weisse@users.noreply.github.com>
Co-authored-by: Malte Poll
terraform: collect apiserver cert SANs and support custom endpoint
constants: add new constants for cluster configuration and custom endpoint
cloud: support apiserver cert sans and prepare for endpoint migration on AWS
config: add customEndpoint field
bootstrapper: use per-CSP apiserver cert SANs
cli: route customEndpoint to terraform and add migration for apiserver cert SANs
bootstrapper: change interface of GetLoadBalancerEndpoint to return host and port separately
* Add common backend for interacting with cryptsetup
* Use common cryptsetup backend in bootstrapper
* Use common cryptsetup backend in disk-mapper
* Use common cryptsetup backend in csi lib
---------
Signed-off-by: Daniel Weiße <dw@edgeless.systems>
* cli: add "--skip-helm-wait" flag
This flag can be used to disable the atomic and wait flags during helm install.
This is useful when debugging a failing constellation init, since the user gains access to
the cluster even when one of the deployments is never in a ready state.
* Delete helm chart on failure before retrying installation
* Add chart name to debug output
* Remove now unused wait flag from helm Release struct
---------
Signed-off-by: Daniel Weiße <dw@edgeless.systems>
* aws: stop using the imds api for tags
* aws: disable tags in imds api
* aws: only tag instances with non-lecagy tag
* bootstrapper: always let coredns run before cilium
* debugd: make debugd less noisy
* fixup fix aws imds test
* fixup unsued context
* move getting instance id to readInstanceTag
* variant: move into internal/attestation
* attesation: move aws attesation into subfolder nitrotpm
* config: add aws-sev-snp variant
* cli: add tf option to enable AWS SNP
For now the implementations in aws/nitrotpm and aws/snp
are identical. They both contain the aws/nitrotpm impl.
A separate commit will add the actual attestation logic.