Operations grimoire/Eglide/Vault

From Nasqueron Agora

📕📁📜 Old technical information :: content warning

⌛ This Nasqueron Operations Grimoire page hasn't been updated for a long time.

☣ As our infrastructure evolves quickly, there is a good chance this information is outdated or now inaccurate. Be careful and consider updating it.

➡️ To check whether the information is still up to date, you can review the history of the relevant role in our Operations repository.

Vault on the shellserver role is installed from the HashiCorp repository package.

States are located in the rOPS: roles/shellserver/vault unit. This unit is needed because Eglide isn't connected to our private network and so can't reach Complector directly.

Configuration

Configuration is stored in /etc/vault.hcl, not in /usr/local/etc, as it's a Debian machine.

The package generates self-signed keys and an out-of-the-box configuration in /etc/vault.d, which our unit needs to remove. If the package is reinstalled, there is a risk those files are created again.

The package also installs a systemd service that reads /etc/vault.d/vault.hcl, so that service needs to be replaced by our own, or adjusted through overrides.

In any case, run salt-call --local state.apply roles/shellserver/vault/config to restore a correct configuration state.
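To notice a respawned package configuration after an upgrade, a check along these lines could be used. This is only a minimal sketch: the function names are hypothetical and not part of rOPS, and the check looks at nothing beyond the /etc/vault.d path named above.

```shell
# Return 0 (true) when the packaged default configuration directory exists.
# The directory argument defaults to /etc/vault.d.
vault_default_config_present() {
    dir="${1:-/etc/vault.d}"
    [ -d "$dir" ]
}

# Read a systemd unit definition on stdin and return 0 (true) when it still
# references the packaged configuration path.
unit_references_packaged_config() {
    grep -q '/etc/vault\.d'
}

# Possible usage on the shell server:
#   if vault_default_config_present || systemctl cat vault | unit_references_packaged_config; then
#       sudo salt-call --local state.apply roles/shellserver/vault/config
#   fi
```

Both functions only inspect the system, so they are safe to run at any time; the repair itself stays in the Salt state.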

Certificates

Vault certificates should be generated in /etc/certificates/vault

If we use the Nasqueron Vault CA for this, the Vault client should use the certificate from /usr/local/share/ca-certificates/nasqueron-vault-ca.crt, as on any other server. The certificates_update_store state in rOPS: roles/core/certificates includes that certificate in /etc/ssl/certs as debian:nasqueron-vault-ca.pem.

The Vault server needs two files for TLS termination:

  • /etc/certificates/vault/private.key
  • /etc/certificates/vault/fullchain.pem
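Before restarting Vault, it can help to confirm that the two files belong together. The sketch below compares the public key of the first certificate in the chain with the public key derived from the private key. It assumes the openssl command line tool is installed, and the function name is hypothetical.

```shell
# Return 0 (true) when the private key matches the first certificate of the
# PEM file. For fullchain.pem, the first certificate is the server one.
# Usage: tls_pair_matches <certificate or fullchain file> <private key file>
tls_pair_matches() {
    cert_pub="$(openssl x509 -in "$1" -noout -pubkey 2>/dev/null)" || return 1
    key_pub="$(openssl pkey -in "$2" -pubout 2>/dev/null)" || return 1
    # An empty result means a file could not be parsed: treat it as a mismatch.
    [ -n "$cert_pub" ] && [ "$cert_pub" = "$key_pub" ]
}

# Possible usage:
#   sudo sh -c '. ./check.sh; tls_pair_matches \
#       /etc/certificates/vault/fullchain.pem \
#       /etc/certificates/vault/private.key && echo "pair OK"'
```

The check prints no key material, so it can be run in a shared terminal session.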

Following Operations grimoire/Vault, we can generate those elements from the Complector Vault (working on Complector or WindRiver).

The certificate common name MUST be a subdomain of *.nasqueron.drake, so we use <machine name>.eglide.nasqueron.drake:

   vault write -format=json pki_vault/issue/nasqueron-drake common_name=zonegrey.eglide.nasqueron.drake ttl=2160h ip_sans=127.0.0.1,10.197.126.53

The output needs to be dispatched in several files:

  • .data.certificate to certificate.pem
  • .data.issuing_ca to ca.pem (that will be used for the fullchain)
  • .data.private_key to private.key, then chmod 400 (be careful how you replace the \n; if you use a Python REPL, do it on Complector and clear the history with import readline ; readline.clear_history())
  • the fullchain bundle can be created with cat certificate.pem ca.pem > fullchain.pem
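As an alternative to a Python REPL, the dispatch can be sketched with jq, whose -r option decodes the \n escapes by itself. This assumes jq is installed and that the JSON output of the issue command was saved to a file; the function name and the issued.json file name are hypothetical.

```shell
# Split the saved JSON output of the certificate issue command into the
# files listed above, then build the fullchain bundle.
# Usage: dispatch_issued_certificate <json file> <destination directory>
dispatch_issued_certificate() {
    json="$1"
    dest="$2"
    (
        umask 077   # create every file readable by its owner only
        jq -r '.data.certificate' "$json" > "$dest/certificate.pem" &&
        jq -r '.data.issuing_ca'  "$json" > "$dest/ca.pem" &&
        jq -r '.data.private_key' "$json" > "$dest/private.key" &&
        cat "$dest/certificate.pem" "$dest/ca.pem" > "$dest/fullchain.pem"
    ) && chmod 400 "$dest/private.key"
}

# Possible usage:
#   dispatch_issued_certificate issued.json /etc/certificates/vault
#   rm issued.json    # the JSON file contains the private key too
```

The JSON file holds the private key in clear text, so it should be deleted once the files are in place.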

You can then restart Vault with systemctl restart vault.

Secrets K/V store

On Eglide, key/value secrets are stored in a kv2 engine mounted under the default path kv/.

Secrets can be organised by service or user account:

  • kv/service/ contains secrets sorted by services deployed through Salt or automated
  • kv/user/ contains secrets by user accounts, for general purpose and unsalted services

For example, Odderon services live in kv/service/odderon

Basic operations are:

  • Store a secret: vault kv put kv/service/acme/nickserv password=$(openssl rand -base64 18)
  • Read a secret: vault kv get kv/service/acme/nickserv

Secrets can be read:

  • for kv/user/<account>/* by a token issued for that user, on request
  • for kv/service/* by Salt to deploy services, by default
  • for kv/service/<service>/* by a specific service with Vault support through AppRole authentication, on request
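When a user requests a token, the matching policy has to account for the kv2 layout: the engine exposes secret data under kv/data/ and key listings under kv/metadata/, so policy paths differ from the paths typed in vault kv commands. A minimal sketch of a read-only policy generator follows. The function name, the alice account and the user-alice policy name are hypothetical, and the policies actually used on Eglide may differ.

```shell
# Print a read-only policy for one account's subtree of the kv/ mount.
# Usage: user_kv_policy <account>
user_kv_policy() {
    account="$1"
    cat <<EOF
path "kv/data/user/${account}/*" {
  capabilities = ["read"]
}

path "kv/metadata/user/${account}/*" {
  capabilities = ["list", "read"]
}
EOF
}

# Possible usage:
#   user_kv_policy alice | vault policy write user-alice -
#   vault token create -policy=user-alice
```

Printing the policy to standard output keeps the function free of side effects, so the result can be reviewed before it is written to Vault.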

Troubleshooting

DRP :: Reconfigure Salt

Prepare a policy for Salt:

   vault policy write salt-node /srv/salt/roles/shellserver/vault/files/salt.hcl

Salt needs an approle auth method.

   vault auth enable approle
   vault write auth/approle/role/salt-node secret_id_ttl=10m \
       token_policies="salt-node" \
       token_num_uses=10 \
       token_ttl=20m \
       token_max_ttl=30m \
       secret_id_num_uses=0
   vault read auth/approle/role/salt-node/role-id
   vault write -f auth/approle/role/salt-node/secret-id

The role_id and secret_id values belong in /etc/salt/minion.d/vault.conf.

Tip: you can test the connection with sudo _tests/config/salt-primary/vault.py if you replace /usr/local/etc with /etc in it. You should then get the following output:
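The file can be sketched as follows. This assumes the legacy layout of Salt's Vault configuration (a vault key holding url and auth) and the local listener address https://127.0.0.1:8200; newer Salt releases use a different layout, and the file deployed from rOPS may contain more settings. The function name is hypothetical.

```shell
# Print a Salt minion configuration fragment for AppRole authentication.
# Usage: render_salt_vault_conf <role_id> <secret_id> [vault url]
render_salt_vault_conf() {
    role_id="$1"
    secret_id="$2"
    url="${3:-https://127.0.0.1:8200}"   # assumed local listener address
    cat <<EOF
vault:
  url: ${url}
  auth:
    method: approle
    role_id: ${role_id}
    secret_id: ${secret_id}
EOF
}

# Possible usage, reading both values directly with the -field option:
#   role_id="$(vault read -field=role_id auth/approle/role/salt-node/role-id)"
#   secret_id="$(vault write -f -field=secret_id auth/approle/role/salt-node/secret-id)"
#   render_salt_vault_conf "$role_id" "$secret_id" | sudo tee /etc/salt/minion.d/vault.conf >/dev/null
```

Using -field avoids copying the values by hand from the tabular output.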

   Can connect to Vault:
       metadata: {'role_name': 'salt-node'}
       policies: ['default', 'salt-node']
       token_policies: ['default', 'salt-node']
       token_type: service

Certificates

The certificate MUST have a 127.0.0.1 IP SAN. Otherwise, Vault commands fail with:

   Error checking seal status: Get "https://127.0.0.1:8200/v1/sys/seal-status": tls: failed to verify certificate: x509: cannot validate certificate for 127.0.0.1 because it doesn't contain any IP SANs

That could simply be a symptom that Vault is using the self-signed certificate in /etc/vault.d. Delete that directory by running the Salt states again (sudo salt-call --local state.apply roles/shellserver/vault/config).
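To tell the two causes apart, the IP SANs of a certificate file and of the certificate actually served can be inspected with openssl. The sketch below assumes the openssl command line tool is available; both function names are hypothetical.

```shell
# Return 0 (true) when the certificate file lists the given IP as a SAN.
# Usage: cert_has_ip_san <certificate file> <ip address>
cert_has_ip_san() {
    # Escape the dots so they only match literal dots in the regular expression.
    ip_re="$(printf '%s' "$2" | sed 's/\./\\./g')"
    openssl x509 -in "$1" -noout -text 2>/dev/null |
        grep -Eq "IP Address:${ip_re}(,|[[:space:]]*\$)"
}

# Print the details of the certificate a TLS server presents.
# Usage: served_certificate_text 127.0.0.1:8200
served_certificate_text() {
    openssl s_client -connect "$1" </dev/null 2>/dev/null |
        openssl x509 -noout -text
}

# Possible usage:
#   cert_has_ip_san /etc/certificates/vault/certificate.pem 127.0.0.1 && echo "SAN present"
#   served_certificate_text 127.0.0.1:8200 | grep -A1 'Subject Alternative Name'
```

If the file has the SAN but the served certificate does not, Vault is most likely still reading the packaged configuration.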