* increase ASG timeout
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* make timeout dependent on SEV-SNP option
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
---------
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* perform upgrades in-place in terraform workspace
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* update buildfiles
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* add iam upgrade apply test
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* update buildfiles
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* fix linter
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* make config fetcher stubbable
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* change workspace restoring behaviour
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* allow overwriting existing Terraform files
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* allow overwrites of TF variables
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* fix iam upgrade apply
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* fix embed directive
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* make loader test less brittle
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* pass upgrade ID to user
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* naming nit
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* use upgradeDir
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
* tidy
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
---------
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
Disabling SMT dynamically inside the image creates problems on AWS.
The problem should be fixed by disabling smt through the VMM.
By recommendation from AWS: add idle=poll.
This should improve our launch success rate while they investigate some
upstream issues.
* Move IAM migration client to cloudcmd package
* Move Terraform Cluster upgrade client to cloudcmd package
* Use hcl for creating Terraform IAM variables files
* Unify terraform upgrade code
* Rename some cloudcmd files for better clarity
---------
Signed-off-by: Daniel Weiße <dw@edgeless.systems>
* Clean up Terraform pkg
* Add note to Terraform migration functions expecting to be run on initialized workspace
---------
Signed-off-by: Daniel Weiße <dw@edgeless.systems>
* deps: limit Terraform version to FOSS releases
* fix: enforce upper version constraint
---------
Co-authored-by: Leonard Cohnen <lc@edgeless.systems>
* Remove `--config` and `--master-secret` falgs
* Add `--workspace` flag
* In CLI, only work on files with paths created from `cli/internal/cmd`
* Properly print values for GCP on IAM create when not directly updating the config
---------
Signed-off-by: Daniel Weiße <dw@edgeless.systems>
Also update Azure terraform:
ignore snp policy changes on resource
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: edgelessci <edgelessci@users.noreply.github.com>
Co-authored-by: Otto Bittner <cobittner@posteo.net>
* add current chart
add current helm chart
* disable service controller for aws ccm
* add new iam roles
* doc AWS internet LB + add to LB test
* pass clusterName to helm for AWS LB
* fix update-aws-lb chart to also include .helmignore
* move chart outside services
* working state
* add subnet tags for AWS subnet discovery
* fix .helmignore load rule with file in subdirectory
* upgrade iam profile
* revert new loader impl since cilium is not correctly loaded
* install chart if not already present during `upgrade apply`
* cleanup PR + fix build + add todos
cleanup PR + add todos
* shared helm pkg for cli install and bootstrapper
* add link to eks docs
* refactor iamMigrationCmd
* delete unused helm.symwallk
* move iammigrate to upgrade pkg
* fixup! delete unused helm.symwallk
* add to upgradecheck
* remove nodeSelector from go code (Otto)
* update iam docs and sort permission + remove duplicate roles
* fix bug in `upgrade check`
* better upgrade check output when svc version upgrade not possible
* pr feedback
* remove force flag in upgrade_test
* use upgrader.GetUpgradeID instead of extra type
* remove todos + fix check
* update doc lb (leo)
* remove bootstrapper helm package
* Update cli/internal/cmd/upgradecheck.go
Co-authored-by: Daniel Weiße <66256922+daniel-weisse@users.noreply.github.com>
* final nits
* add docs for e2e upgrade test setup
* Apply suggestions from code review
Co-authored-by: Daniel Weiße <66256922+daniel-weisse@users.noreply.github.com>
* Update cli/internal/helm/loader.go
Co-authored-by: Daniel Weiße <66256922+daniel-weisse@users.noreply.github.com>
* Update cli/internal/cmd/tfmigrationclient.go
Co-authored-by: Daniel Weiße <66256922+daniel-weisse@users.noreply.github.com>
* fix daniel review
* link to the iam permissions instead of manually updating them (agreed with leo)
* disable iam upgrade in upgrade apply
---------
Co-authored-by: Daniel Weiße <66256922+daniel-weisse@users.noreply.github.com>
Co-authored-by: Malte Poll
terraform: collect apiserver cert SANs and support custom endpoint
constants: add new constants for cluster configuration and custom endpoint
cloud: support apiserver cert sans and prepare for endpoint migration on AWS
config: add customEndpoint field
bootstrapper: use per-CSP apiserver cert SANs
cli: route customEndpoint to terraform and add migration for apiserver cert SANs
bootstrapper: change interface of GetLoadBalancerEndpoint to return host and port separately
This change is required to ensure we have not tls handshake errors when connecting to the kubernetes api.
Currently, the certificates used by kube-apiserver pods contain a SAN field with the (single) public ip of the loadbalancer.
If we would allow multiple loadbalancer frontend ips, we could encounter cases where the certificate is only valid for one public ip,
while we try to connect to a different ip.
To prevent this, we consciously disable support for the multi-zone loadbalancer frontend on AWS for now.
This will be re-enabled in the future.
Normalize naming for the "instance_count" / "initial_count" int terraform to always use "initial_count".
This is required, since there is a naming confusion on AWS.
"initial_count" is more precise, since it reflects the fact that this value is ignored when applying the terraform template
after the scaling groups already exist.
This commit is designed to be reverted in the future (AB#3248).
Terraform does not implement moved blocks with dynamic targets: https://github.com/hashicorp/terraform/issues/31335 so we have to migrate the terraform state ourselves.
* init
add variables
add amount to instance_group again
fix tf validate
rollback old names
make fields optional
fix image ref mini
daniel comments
use latest
* Update cli/internal/terraform/terraform/qemu/main.tf
Co-authored-by: Malte Poll <1780588+malt3@users.noreply.github.com>
* add uid to resource name
* make machine a global variable again
* fix tf
---------
Co-authored-by: Malte Poll <1780588+malt3@users.noreply.github.com>
* init
* migration working
* make tf variables with default value optional in go through ptr type
* fix CI build
* pr feedback
* add azure targets tf
* skip migration for empty targets
* make instance_count optional
* change role naming to dashed + add validation
* make node_group.zones optional
* Update cli/internal/terraform/terraform/azure/main.tf
Co-authored-by: Malte Poll <1780588+malt3@users.noreply.github.com>
* malte feedback
---------
Co-authored-by: Malte Poll <1780588+malt3@users.noreply.github.com>
* terraform: GCP node groups
* cli: marshal GCP node groups to terraform variables
This does not have any side effects for users.
We still strictly create one control-plane and one worker group.
This is a preparation for enabling customizable node groups in the future.
* deps: update Terraform google to v4.69.1
* deps: tidy all modules
* add delay for service account
* deps: tidy all modules
* add delay for service account
---------
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: edgelessci <edgelessci@users.noreply.github.com>
Co-authored-by: Adrian Stobbe <stobbe.adrian@gmail.com>