mirror of
https://github.com/edgelesssys/constellation.git
synced 2024-10-01 01:36:09 -04:00
rfc: node access (#3051)
This commit is contained in:
parent
d64aba7636
commit
68035cd90b
188
rfc/016-node-access.md
Normal file
188
rfc/016-node-access.md
Normal file
@ -0,0 +1,188 @@
|
||||
---
|
||||
status: approved, not implemented
|
||||
---
|
||||
|
||||
# RFC 016: Node Access
|
||||
|
||||
## Background
|
||||
|
||||
A production Constellation cluster is currently configured not to allow any kind of remote administrative access.
|
||||
This choice is deliberate: any mechanism for remote accesss can potentially be exploited, or may leak sensitive data.
|
||||
|
||||
However, some operations on a Kubernetes cluster require some form of access to the nodes.
|
||||
A good class of examples are etcd cluster maintenance tasks, like backup and recovery, or emergency operations like removing a permanently failed member.
|
||||
Some kubeadm operations, like certificate rotation, also require some form of cluster access.
|
||||
|
||||
While some tasks can be accomplished by DaemonSets, CronJobs and the like, relying on Kubernetes objects is insufficient.
|
||||
Executing commands in a Kubernetes pod may fail because Kubernetes is not healthy, etcd is bricked or the network is down.
|
||||
Administrative access to the nodes through a side channel would greatly help remediate, or at least debug, those situations.
|
||||
|
||||
## Requirements
|
||||
|
||||
Constellation admins can log into Constellation nodes for maintenance, subject to the following restrictions:
|
||||
|
||||
* Access must be encrypted end-to-end to protect from CSP snooping.
|
||||
* Access must be possible even if the Kubernetes API server is down.
|
||||
|
||||
Nice-to-have:
|
||||
|
||||
* The method of access should not require long-term storage of a second secret.
|
||||
* The method of access should be time-limited.
|
||||
|
||||
## Proposed Design
|
||||
|
||||
Core to the proposal is [certificate-based authentication for OpenSSH](https://en.wikibooks.org/wiki/OpenSSH/Cookbook/Certificate-based_Authentication).
|
||||
We can derive a valid SSH key from the Constellation master secret.
|
||||
An OpenSSH server on the node accepts certificates issued by this CA key.
|
||||
Admins can derive the CA key from the master secret on demand, and issue certificates for arbitrary public keys.
|
||||
An example program is in the [Appendix](#appendix).
|
||||
|
||||
### Key Details
|
||||
|
||||
We use an HKDF to derive an ed25519 private key from the master secret.
|
||||
This private key acts as an SSH certificate authority, whose signed certs allow access to cluster nodes.
|
||||
Since the master secret is available to both the cluster owner and the nodes, no communication with the cluster is needed to mint valid certificates.
|
||||
The choice of curve allows to directly use the derived secret bytes as key.
|
||||
This makes the implementation deterministic, and thus the CA key recoverable.
|
||||
|
||||
### Server-side Details
|
||||
|
||||
An OpenSSH server is added to the node image software stack.
|
||||
It's configured with a `TrustedUserCAKeys` file and a `RevokedKeys` file, both being empty on startup.
|
||||
All other means of authentication are disabled.
|
||||
|
||||
After initialization, the bootstrapper fills the `TrustedUserCAKeys` file with the derived CA's public key.
|
||||
Joining nodes send their public host key as part of the `IssueJoinTokenRequest` and receive the CA certificate and an indefinitely valid certificate as response.
|
||||
|
||||
The `RevokedKeys` KRL is an option for the cluster administrator to revoke keys, but it's not managed by Constellation.
|
||||
|
||||
### Client-side Details
|
||||
|
||||
A new `ssh` subcommand is added to the CLI.
|
||||
The exact name is TBD, but it should fit in with other key-related activity, like generating volume keys.
|
||||
It takes the master secret file and an SSH pub key file as arguments, and writes a certificate to stdout.
|
||||
Optional arguments may include principals or vailidity period.
|
||||
The implementation could roughly follow the PoC in the [Appendix](#appendix).
|
||||
|
||||
As an extension, the subcommand could allow generating a key pair and a matching certificate in a temp dir, and `exec` the ssh program directly.
|
||||
This would encourage use of very short-lived certificates.
|
||||
|
||||
## Security Considerations
|
||||
|
||||
Exposing an additional service to the outside world increases the attack surface.
|
||||
We propose the following mitigations:
|
||||
|
||||
1. The SSH port is only exposed to the VPC.
|
||||
This restricts the attackers to malicious co-tenants and the CSP.
|
||||
In an emergency, admins need to add a load balancer to be able to reach the nodes.
|
||||
2. A hardened OpenSSH config only allows the options strictly necessary for the scheme proposed here.
|
||||
Authorized keys and passwords must be disabled.
|
||||
Cipher suites should be restricted. etc.
|
||||
|
||||
## Alternatives Considered
|
||||
|
||||
### Enable Serial Console
|
||||
|
||||
Serial consoles for cloud VMs are tunnelled through the CSP in the clear.
|
||||
To make this solution secure, an encrypted channel would need to be established on top of the serial connection.
|
||||
The author is not aware of any software providing such a channel.
|
||||
|
||||
### SSH with Authorized Keys
|
||||
|
||||
We could ask users to add a public key to their `constellation-conf.yaml` and add that to `/root/.ssh/authorized_keys` after joining.
|
||||
This would require the cluster owner to permanently manage a second secret, and there would be no built-in way to revoke access.
|
||||
|
||||
### Debug Pod
|
||||
|
||||
Some node administration tasks can be performed with a [debug pod].
|
||||
If privileged access is required, it's usually necessary to schedule a custom pod.
|
||||
This only works if the Kubernetes API server is still processing requests, pods can be scheduled on the target node and the network allows connecting to it.
|
||||
|
||||
[debug pod]: https://kubernetes.io/docs/tasks/debug/debug-cluster/kubectl-node-debug/
|
||||
|
||||
### Host an Admin API Server
|
||||
|
||||
There are alternatives to SSH that allow fine-grained authorization of node operations.
|
||||
An example would be [SansShell], which verifies node access requests with a policy.
|
||||
Setting up such a tool requires a detailed understanding of the use cases, of which some might be hard to foresee.
|
||||
This may be better suited as an extension of the low-level emergency access mechanisms.
|
||||
|
||||
[SansShell]: https://github.com/Snowflake-Labs/sansshell
|
||||
|
||||
## Appendix
|
||||
|
||||
A proof-of-concept implementation of the certificate generation.
|
||||
Constellation nodes would stop after deriving the CA public key.
|
||||
|
||||
```golang
|
||||
package main
|
||||
|
||||
import (
|
||||
"crypto/ed25519"
|
||||
"crypto/rand"
|
||||
"crypto/sha256"
|
||||
"encoding/json"
|
||||
"flag"
|
||||
"fmt"
|
||||
"log"
|
||||
"os"
|
||||
"time"
|
||||
|
||||
"golang.org/x/crypto/hkdf"
|
||||
"golang.org/x/crypto/ssh"
|
||||
)
|
||||
|
||||
type secret struct {
|
||||
Key []byte `json:"key,omitempty"`
|
||||
Salt []byte `json:"salt,omitempty"`
|
||||
}
|
||||
|
||||
var permissions = ssh.Permissions{
|
||||
Extensions: map[string]string{
|
||||
"permit-port-forwarding": "yes",
|
||||
"permit-pty": "yes",
|
||||
},
|
||||
}
|
||||
|
||||
func main() {
|
||||
masterSecret := flag.String("secret", "", "")
|
||||
flag.Parse()
|
||||
|
||||
secretJSON, err := os.ReadFile(*masterSecret)
|
||||
must(err)
|
||||
var secret secret
|
||||
must(json.Unmarshal(secretJSON, &secret))
|
||||
|
||||
hkdf := hkdf.New(sha256.New, secret.Key, secret.Salt, []byte("ssh-ca"))
|
||||
|
||||
_, priv, err := ed25519.GenerateKey(hkdf)
|
||||
must(err)
|
||||
|
||||
ca, err := ssh.NewSignerFromSigner(priv)
|
||||
must(err)
|
||||
|
||||
log.Printf("CA KEY: %s", string(ssh.MarshalAuthorizedKey(ca.PublicKey())))
|
||||
|
||||
buf, err := os.ReadFile(flag.Arg(0))
|
||||
must(err)
|
||||
pub, _, _, _, err := ssh.ParseAuthorizedKey(buf)
|
||||
must(err)
|
||||
certificate := ssh.Certificate{
|
||||
Key: pub,
|
||||
CertType: ssh.UserCert,
|
||||
ValidAfter: uint64(time.Now().Unix()),
|
||||
ValidBefore: uint64(time.Now().Add(24 * time.Hour).Unix()),
|
||||
ValidPrincipals: []string{"root"},
|
||||
Permissions: permissions,
|
||||
}
|
||||
must(certificate.SignCert(rand.Reader, ca))
|
||||
|
||||
fmt.Printf("%s\n", string(ssh.MarshalAuthorizedKey(&certificate)))
|
||||
}
|
||||
|
||||
func must(err error) {
|
||||
if err != nil {
|
||||
log.Fatal(err)
|
||||
}
|
||||
}
|
||||
```
|
Loading…
Reference in New Issue
Block a user