Operations grimoire/Eglide/Vault
Vault on the shellserver role is installed from the HashiCorp repository package.
The states are located in the rOPS: roles/shellserver/vault unit. This unit is needed because Eglide isn't connected to our private network and so can't reach Complector directly.
Configuration
Configuration is stored in /etc/vault.hcl, as it's a Debian machine, not in /usr/local/etc.
The package generates self-signed keys and an out-of-the-box configuration in /etc/vault.d, which our unit needs to remove. If the package is reinstalled, there is a risk those files are created again.
The package also installs a systemd service that reads /etc/vault.d/vault.hcl, so it needs to be replaced by our own unit, or adjusted through overrides.
In any case, run salt-call --local state.apply roles/shellserver/vault/config
to restore a correct configuration state.
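To spot that situation before re-running the state, a minimal sketch of a drift check follows. It only tests the two facts stated above (/etc/vault.d should be absent, /etc/vault.hcl should exist). The function name and the optional root prefix argument are hypothetical; the prefix exists only so the check can be tried against a scratch directory.

```shell
#!/bin/sh
# Sketch: report whether the packaged defaults have crept back.
# vault_config_drift [root prefix]   (the prefix is a hypothetical test aid)
vault_config_drift() {
    root="${1:-}"
    drift=0
    # The package's default configuration directory should not exist.
    if [ -e "$root/etc/vault.d" ]; then
        echo "unexpected: $root/etc/vault.d exists"
        drift=1
    fi
    # Our own configuration file should be present.
    if [ ! -f "$root/etc/vault.hcl" ]; then
        echo "missing: $root/etc/vault.hcl"
        drift=1
    fi
    return "$drift"
}
```

It could be combined with the state, for instance: vault_config_drift || sudo salt-call --local state.apply roles/shellserver/vault/config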
Certificates
Vault certificates should be generated in /etc/certificates/vault
If we use the Nasqueron Vault CA for this, Vault clients should use the certificate from /usr/local/share/ca-certificates/nasqueron-vault-ca.crt like on any other server. The certificates_update_store state in rOPS: roles/core/certificates includes that certificate in /etc/ssl/certs as debian:nasqueron-vault-ca.pem.
Vault server wants two files to do TLS termination:
- /etc/certificates/vault/private.key
- /etc/certificates/vault/fullchain.pem
Following Operations grimoire/Vault, we can generate those files from the Complector Vault (working on Complector or WindRiver).
The certificate common name MUST be a subdomain of *.nasqueron.drake, so we use <machine name>.eglide.nasqueron.drake:
vault write -format=json pki_vault/issue/nasqueron-drake common_name=zonegrey.eglide.nasqueron.drake ttl=2160h ip_sans=127.0.0.1,10.197.126.53
The output needs to be dispatched in several files:
- .data.certificate to certificate.pem
- .data.issuing_ca to ca.pem (that will be used for the fullchain)
- .data.private_key to private.key (be careful how you replace the \n; if you use the Python REPL, do it on Complector and clear the history with import readline ; readline.clear_history()), then chmod 400
- the fullchain bundle can be created with
cat certificate.pem ca.pem > fullchain.pem
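As an alternative to handling the \n escapes by hand, the dispatch can be sketched with jq, whose -r flag prints JSON strings raw and so turns the escaped newlines into real ones. This assumes jq is installed on the machine where you work, and that the output of the vault write command was saved to a file; the function name and the issue.json file name are hypothetical.

```shell
#!/bin/sh
# Sketch: split the JSON printed by
#   vault write -format=json pki_vault/issue/nasqueron-drake ...
# into the files described above. Assumes jq is available.
# split_issue_output <json file> <target directory>
split_issue_output() (
    json="$1"
    dir="$2"
    # Runs in a subshell, so the restrictive umask does not leak out.
    umask 077
    jq -r '.data.certificate' "$json" > "$dir/certificate.pem"
    jq -r '.data.issuing_ca' "$json" > "$dir/ca.pem"
    # Remove any previous read-only key before writing the new one.
    rm -f "$dir/private.key"
    jq -r '.data.private_key' "$json" > "$dir/private.key"
    chmod 400 "$dir/private.key"
    # Server certificate first, then the issuing CA.
    cat "$dir/certificate.pem" "$dir/ca.pem" > "$dir/fullchain.pem"
)
```

Usage would look like split_issue_output issue.json /etc/certificates/vault; remember that issue.json still contains the private key and should be deleted afterwards.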
You can then restart Vault with systemctl restart vault.
Secrets K/V store
On Eglide, key-value secrets are stored in a kv2 engine mounted under the default path kv/.
Secrets can be organised by service or user account:
- kv/service/ contains secrets sorted by service, for services deployed through Salt or otherwise automated
- kv/user/ contains secrets sorted by user account, for general purpose use and unsalted services
For example, Odderon services live in kv/service/odderon
Basic operations are:
- Store a secret:
vault kv put kv/service/acme/nickserv password=$(openssl rand -base64 18)
- Read a secret:
vault kv get kv/service/acme/nickserv
Secrets can be read:
- for kv/user/<account>/* by a token issued for that user, on request
- for kv/service/* by Salt to deploy services, by default
- for kv/service/<service>/* by a specific service with Vault support, through AppRole authentication, on request
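To illustrate the per-user case, here is a minimal sketch of a helper printing a read-only policy for one account. With a kv2 engine, policy paths target kv/data/... rather than kv/... directly. The function name, the account name alice and the policy name user-alice are hypothetical, not existing conventions on Eglide.

```shell
#!/bin/sh
# Sketch: print a read-only policy for the secrets of one user account.
# kv2 engines expose secret contents under <mount>/data/<path>.
# user_read_policy <account>
user_read_policy() {
    account="$1"
    cat <<EOF
path "kv/data/user/${account}/*" {
  capabilities = ["read"]
}
EOF
}
```

The policy could then be loaded from standard input and attached to a token, for instance: user_read_policy alice | vault policy write user-alice - followed by vault token create -policy=user-alice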
Troubleshoot
DRP :: Reconfigure Salt
Prepare a policy for Salt:
vault policy write salt-node /srv/salt/roles/shellserver/vault/files/salt.hcl
Salt needs an AppRole auth method, with a role bound to the salt-node policy:
vault auth enable approle
vault write auth/approle/role/salt-node secret_id_ttl=10m \
    token_policies="salt-node" \
    token_num_uses=10 \
    token_ttl=20m \
    token_max_ttl=30m \
    secret_id_num_uses=0
vault read auth/approle/role/salt-node/role-id
vault write -f auth/approle/role/salt-node/secret-id
The role_id and secret_id values go in /etc/salt/master.d/vault.conf
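To fetch only those two values instead of the full tables, the -field flag of the Vault CLI can be used. The following is a sketch; the function name is hypothetical and the output layout is only meant for copying by hand, not the exact format of vault.conf. Each call generates a new secret_id.

```shell
#!/bin/sh
# Sketch: print the AppRole credentials of a role as two lines.
# approle_credentials [role]   (defaults to salt-node)
approle_credentials() {
    role="${1:-salt-node}"
    # -field prints only the requested value, without the table around it.
    printf 'role_id: %s\n' \
        "$(vault read -field=role_id "auth/approle/role/$role/role-id")"
    # -f allows a write without data; this issues a fresh secret_id.
    printf 'secret_id: %s\n' \
        "$(vault write -f -field=secret_id "auth/approle/role/$role/secret-id")"
}
```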
Tip: you can test the connection with sudo _tests/config/salt-primary/vault.py if you replace /usr/local/etc with /etc in it. You should then get the following output:
Can connect to Vault:
metadata: {'role_name': 'salt-node'}
policies: ['default', 'salt-node']
token_policies: ['default', 'salt-node']
token_type: service
Certificates
The certificate MUST have a 127.0.0.1 IP SAN. Otherwise, the Vault client fails with:
Error checking seal status: Get "https://127.0.0.1:8200/v1/sys/seal-status": tls: failed to verify certificate: x509: cannot validate certificate for 127.0.0.1 because it doesn't contain any IP SANs
That could simply be a symptom that Vault uses the self-signed certificate from /etc/vault.d. Delete that directory by running the Salt states again (sudo salt-call --local state.apply roles/shellserver/vault/config).
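To confirm which SANs a certificate file actually carries, a sketch based on openssl x509 follows. It assumes openssl is installed and that the text output lists IP SANs as "IP Address:<ip>" entries separated by commas, which is the usual layout; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: succeed if the certificate lists the given IP as a SAN.
# has_ip_san <certificate file> <ip>
has_ip_san() {
    cert="$1"
    ip="$2"
    # Put each SAN entry on its own line, trim spaces, then match exactly
    # so that 127.0.0.1 does not match 127.0.0.10.
    openssl x509 -in "$cert" -noout -text \
        | tr ',' '\n' \
        | sed 's/^ *//; s/ *$//' \
        | grep -Fxq "IP Address:$ip"
}
```

For example, has_ip_san /etc/certificates/vault/fullchain.pem 127.0.0.1 checks the first certificate of the bundle, which is the server certificate.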