* increase ASG timeout
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* make timeout dependent on SEV-SNP option
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
---------
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* perform upgrades in-place in terraform workspace
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* update buildfiles
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* add iam upgrade apply test
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* update buildfiles
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* fix linter
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* make config fetcher stubbable
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* change workspace restoring behaviour
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* allow overwriting existing Terraform files
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* allow overwrites of TF variables
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* fix iam upgrade apply
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* fix embed directive
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* make loader test less brittle
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* pass upgrade ID to user
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* naming nit
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* use upgradeDir
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* tidy
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
---------
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
Disabling SMT dynamically inside the image creates problems on AWS.
The problem should be fixed by disabling smt through the VMM.
By recommendation from AWS: add idle=poll.
This should improve our launch success rate while they investigate some
upstream issues.
* Move IAM migration client to cloudcmd package
* Move Terraform Cluster upgrade client to cloudcmd package
* Use hcl for creating Terraform IAM variables files
* Unify terraform upgrade code
* Rename some cloudcmd files for better clarity
---------
Signed-off-by: Daniel Weiße <dw@edgeless.systems>
* Clean up Terraform pkg
* Add note to Terraform migration functions expecting to be run on initialized workspace
---------
Signed-off-by: Daniel Weiße <dw@edgeless.systems>
* deps: limit Terraform version to FOSS releases
* fix: enforce upper version constraint
---------
Co-authored-by: Leonard Cohnen <lc@edgeless.systems>
* Remove `--config` and `--master-secret` falgs
* Add `--workspace` flag
* In CLI, only work on files with paths created from `cli/internal/cmd`
* Properly print values for GCP on IAM create when not directly updating the config
---------
Signed-off-by: Daniel Weiße <dw@edgeless.systems>
Also update Azure terraform:
ignore snp policy changes on resource
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: edgelessci <edgelessci@users.noreply.github.com>
Co-authored-by: Otto Bittner <cobittner@posteo.net>
* add current chart
add current helm chart
* disable service controller for aws ccm
* add new iam roles
* doc AWS internet LB + add to LB test
* pass clusterName to helm for AWS LB
* fix update-aws-lb chart to also include .helmignore
* move chart outside services
* working state
* add subnet tags for AWS subnet discovery
* fix .helmignore load rule with file in subdirectory
* upgrade iam profile
* revert new loader impl since cilium is not correctly loaded
* install chart if not already present during `upgrade apply`
* cleanup PR + fix build + add todos
cleanup PR + add todos
* shared helm pkg for cli install and bootstrapper
* add link to eks docs
* refactor iamMigrationCmd
* delete unused helm.symwallk
* move iammigrate to upgrade pkg
* fixup! delete unused helm.symwallk
* add to upgradecheck
* remove nodeSelector from go code (Otto)
* update iam docs and sort permission + remove duplicate roles
* fix bug in `upgrade check`
* better upgrade check output when svc version upgrade not possible
* pr feedback
* remove force flag in upgrade_test
* use upgrader.GetUpgradeID instead of extra type
* remove todos + fix check
* update doc lb (leo)
* remove bootstrapper helm package
* Update cli/internal/cmd/upgradecheck.go
Co-authored-by: Daniel Weiße <66256922+daniel-weisse@users.noreply.github.com>
* final nits
* add docs for e2e upgrade test setup
* Apply suggestions from code review
Co-authored-by: Daniel Weiße <66256922+daniel-weisse@users.noreply.github.com>
* Update cli/internal/helm/loader.go
Co-authored-by: Daniel Weiße <66256922+daniel-weisse@users.noreply.github.com>
* Update cli/internal/cmd/tfmigrationclient.go
Co-authored-by: Daniel Weiße <66256922+daniel-weisse@users.noreply.github.com>
* fix daniel review
* link to the iam permissions instead of manually updating them (agreed with leo)
* disable iam upgrade in upgrade apply
---------
Co-authored-by: Daniel Weiße <66256922+daniel-weisse@users.noreply.github.com>
Co-authored-by: Malte Poll
terraform: collect apiserver cert SANs and support custom endpoint
constants: add new constants for cluster configuration and custom endpoint
cloud: support apiserver cert sans and prepare for endpoint migration on AWS
config: add customEndpoint field
bootstrapper: use per-CSP apiserver cert SANs
cli: route customEndpoint to terraform and add migration for apiserver cert SANs
bootstrapper: change interface of GetLoadBalancerEndpoint to return host and port separately
This change is required to ensure we have not tls handshake errors when connecting to the kubernetes api.
Currently, the certificates used by kube-apiserver pods contain a SAN field with the (single) public ip of the loadbalancer.
If we would allow multiple loadbalancer frontend ips, we could encounter cases where the certificate is only valid for one public ip,
while we try to connect to a different ip.
To prevent this, we consciously disable support for the multi-zone loadbalancer frontend on AWS for now.
This will be re-enabled in the future.
Normalize naming for the "instance_count" / "initial_count" int terraform to always use "initial_count".
This is required, since there is a naming confusion on AWS.
"initial_count" is more precise, since it reflects the fact that this value is ignored when applying the terraform template
after the scaling groups already exist.
This commit is designed to be reverted in the future (AB#3248).
Terraform does not implement moved blocks with dynamic targets: https://github.com/hashicorp/terraform/issues/31335 so we have to migrate the terraform state ourselves.