Operations grimoire/TLS certificates

From Nasqueron Agora
Revision as of 20:34, 5 November 2024 by DorianWinty (talk | contribs) (→‎Deploying of the certificates)

TLS certificates should be used for every service we provide, so we can encrypt the communications.

Let's Encrypt

Certbot commands

acme-v02 migration. If you get a complaint that acme-v01.api. isn't available, add --server https://acme-v02.api.letsencrypt.org/directory.

T1505 alignment. We no longer run Certbot as a Docker container on the paas-docker role. As a result, all paths are now unified. Symlinks have been added at the old /srv paths to ease the transition.

Generate a certificate

Where nginx is managed through service (for example on FreeBSD):

certbot certonly -a webroot --webroot-path=/var/letsencrypt-auto --deploy-hook "service nginx reload" -d foo.nasqueron.org

On systemd hosts:

certbot certonly -a webroot --webroot-path=/var/letsencrypt-auto --deploy-hook "systemctl reload nginx" -d foo.nasqueron.org

Generate a certificate for several sites

-d foo.nasqueron.org -d bar.nasqueron.org

If a certificate for foo already exists, Certbot will offer to extend it with the new alternative name, which is probably a good idea.

Generate a certificate through DNS

DNS can be used to generate certificates for domains. For example, the Openfire XMPP certificate is generated like this:

   certbot certonly --server https://acme-v02.api.letsencrypt.org/directory --manual --manual-auth-hook /etc/letsencrypt/acme-dns-auth --preferred-challenges dns --debug-challenges -d xmpp.nasqueron.org -d nasqueron.org -d conference.nasqueron.org

Since December 2023, you can use this command on FreeBSD servers where the T1505 Let's Encrypt standard installation has been deployed:

   certbot certonly --manual --manual-auth-hook /usr/local/etc/letsencrypt/acme-dns-auth --preferred-challenges dns --debug-challenges -d subdomain.nasqueron.org

This uses a specialized DNS server deployed on our Docker PaaS to serve dynamic TXT records under .acme.nasqueron.org; see rOPS: roles/paas-docker/containers/acme_dns.sls.

Support files were previously deployed only to the paas-docker role, through rOPS: roles/paas-docker/letsencrypt/files.

Renew all certificates

certbot renew

Installation on nginx

Allow Let's Encrypt validation
   include includes/letsencrypt;

This will use rOPS: roles/webserver-core/nginx/files/includes/letsencrypt nginx configuration.

Serve TLS certificate
   include includes/tls;
   ssl_certificate /srv/letsencrypt/etc/live/xmpp.nasqueron.org/fullchain.pem;
   ssl_certificate_key /srv/letsencrypt/etc/live/xmpp.nasqueron.org/privkey.pem;

This will configure a compromise between security and compatibility, based on Intermediate Mozilla SSL config[1]. The current configuration is served by rOPS: roles/webserver-core/nginx/files/includes/tls.

If you prefer to restrict the resource to TLS 1.3 and accept blocking legacy clients, you can instead use the include provided by D3251:

   include includes/tls-modern-only;
   ssl_certificate /srv/letsencrypt/etc/live/xmpp.nasqueron.org/fullchain.pem;
   ssl_certificate_key /srv/letsencrypt/etc/live/xmpp.nasqueron.org/privkey.pem;

Edit renewal hook

In /etc/letsencrypt/renewal or /usr/local/etc/letsencrypt/renewal you can edit this section to customize the command to run after renewal:

   [renewalparams]
   renew_hook = systemctl reload nginx

When no hook is set yet, it can be added in batch from the renewal directory:

   gsed -i "s/\[\[webroot_map]]/renew_hook = systemctl reload nginx\n\n[[webroot_map]]/g" *.conf
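
The batch edit above touches every file containing a [[webroot_map]] section, whether or not a hook is already present. A minimal sketch of a more defensive variant follows. The function name add_renew_hook and its arguments are hypothetical, and it assumes each renewal file has a line that is exactly [[webroot_map]]; files without such a line are left as they are.

```shell
#!/bin/sh
# Hypothetical helper, not part of the operations repository.
# Adds "renew_hook = <command>" before the [[webroot_map]] section of each
# renewal file in a directory, skipping files that already define a hook.
add_renew_hook() {
    dir="$1"
    hook="${2:-systemctl reload nginx}"

    for conf in "$dir"/*.conf; do
        # An unmatched glob stays literal; ignore it.
        [ -f "$conf" ] || continue

        # Leave files with an existing hook untouched.
        if grep -q '^renew_hook' "$conf"; then
            continue
        fi

        tmp="$(mktemp)" || return 1
        # Print the hook and a blank line just before the section header.
        awk -v hook="$hook" '
            $0 == "[[webroot_map]]" { print "renew_hook = " hook; print "" }
            { print }
        ' "$conf" > "$tmp" && cat "$tmp" > "$conf"
        rm -f "$tmp"
    done
}

# Usage, with one of the directories named above:
#   add_renew_hook /usr/local/etc/letsencrypt/renewal
```

Running it twice is harmless, since the second pass finds the hook and skips the file.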

Install certbot on CentOS 10 stream

As of 2024-08-04, CentOS 10 stream uses Python 3.12. Matching packages can be fetched from the Fedora 40 repository.

Installation on Dwellers was done like this:

   mkdir /tmp/certbot && cd /tmp/certbot
   wget https://dl.fedoraproject.org/pub/fedora/linux/releases/40/Everything/x86_64/os/Packages/c/certbot-2.9.0-1.fc40.noarch.rpm
   wget https://dl.fedoraproject.org/pub/fedora/linux/releases/40/Everything/x86_64/os/Packages/p/python3-acme-2.9.0-1.fc40.noarch.rpm
   wget https://dl.fedoraproject.org/pub/fedora/linux/releases/40/Everything/x86_64/os/Packages/p/python3-certbot-2.9.0-1.fc40.noarch.rpm
   wget https://dl.fedoraproject.org/pub/fedora/linux/releases/40/Everything/x86_64/os/Packages/p/python3-configargparse-1.7-3.fc40.noarch.rpm
   wget https://dl.fedoraproject.org/pub/fedora/linux/releases/40/Everything/x86_64/os/Packages/p/python3-josepy-1.13.0-8.fc40.noarch.rpm
   wget https://dl.fedoraproject.org/pub/fedora/linux/releases/40/Everything/x86_64/os/Packages/p/python3-parsedatetime-2.6-12.fc40.noarch.rpm
   wget https://dl.fedoraproject.org/pub/fedora/linux/releases/40/Everything/x86_64/os/Packages/p/python3-pyOpenSSL-23.2.0-3.fc40.noarch.rpm
   wget https://dl.fedoraproject.org/pub/fedora/linux/releases/40/Everything/x86_64/os/Packages/p/python3-pyrfc3339-1.1-18.fc40.noarch.rpm
   wget https://dl.fedoraproject.org/pub/fedora/linux/releases/40/Everything/x86_64/os/Packages/p/python3-pytz-2024.1-1.fc40.noarch.rpm
   dnf install ./*.rpm

acme.sh

acme.sh should be run as the user called "acme". Switch to it with:

   sudo su - acme

Generate a certificate

Generate a certificate for several sites

Generate a certificate through DNS

For a DNS-validated certificate, start the certificate generation, edit the DNS records to add the verification record, then finish the certificate generation.

   export ACMEDNS_BASE_URL="https://acme.nasqueron.org"
   acme.sh --issue --dns dns_acmedns -d example.com --server letsencrypt

Regenerate a certificate previously issued with certbot

   export ACMEDNS_BASE_URL="https://acme.nasqueron.org"
   export ACMEDNS_USERNAME="<username>"
   export ACMEDNS_PASSWORD="<password>"
   export ACMEDNS_SUBDOMAIN="<subdomain>"
   acme.sh --issue --dns dns_acmedns -d example.com --server letsencrypt

Renew all certificates

Nothing is needed for certificate renewal: the crontab runs every day and tries to renew each certificate "x" days before its expiry.
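
To check by hand whether a deployed certificate is close to expiry, a small wrapper around openssl can help. This is only a sketch: the function name expires_within is hypothetical, the threshold is whatever you pass in, and the path in the usage comment is the one used in the deployment section of this page.

```shell
#!/bin/sh
# Hypothetical helper: exits 0 when the certificate expires within the given
# number of days, non-zero otherwise.
expires_within() {
    cert="$1"
    days="$2"
    # "openssl x509 -checkend N" exits 0 when the certificate is still valid
    # N seconds from now, so the result is inverted here.
    # Caveat: an unreadable file is also reported as expiring.
    ! openssl x509 -checkend "$((days * 86400))" -noout -in "$cert" >/dev/null 2>&1
}

# Usage:
#   if expires_within /var/certificates/exemple.nasqueron.org/cert.pem 30; then
#       echo "certificate expires in less than 30 days"
#   fi
```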

Deploying of the certificates

 acme.sh --install-cert -d exemple.nasqueron.org \
 --cert-file /var/certificates/exemple.nasqueron.org/cert.pem \
 --key-file /var/certificates/exemple.nasqueron.org/key.pem \
 --fullchain-file /var/certificates/exemple.nasqueron.org/fullchain.pem


Improved command

 sudo su - acme
 # Domain to deploy; create its target directory first
 domain="exemple.nasqueron.org"
 mkdir -p "/var/certificates/$domain"
 acme.sh --install-cert -d "$domain" \
 --cert-file "/var/certificates/$domain/cert.pem" \
 --key-file "/var/certificates/$domain/key.pem" \
 --fullchain-file "/var/certificates/$domain/fullchain.pem"
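
The deployment steps can also be wrapped in a function that passes a reload command through acme.sh's --reloadcmd option, which acme.sh records and runs again after each renewal. The sketch below makes several assumptions: the function name install_cert is hypothetical, ACME_SH and CERT_BASE are illustrative overrides that allow a dry run, and the source does not say whether the acme user is allowed to reload nginx.

```shell
#!/bin/sh
# Hypothetical wrapper around the deployment steps above.
# ACME_SH and CERT_BASE are assumptions, made overridable so the sketch can
# be dry-run (for example with ACME_SH=echo).
install_cert() {
    domain="$1"
    reload_cmd="$2"
    acme_sh="${ACME_SH:-acme.sh}"
    target="${CERT_BASE:-/var/certificates}/$domain"

    # Create the target directory if it does not exist yet.
    mkdir -p "$target" || return 1

    # Copy the certificate files and register the reload command.
    "$acme_sh" --install-cert -d "$domain" \
        --cert-file "$target/cert.pem" \
        --key-file "$target/key.pem" \
        --fullchain-file "$target/fullchain.pem" \
        --reloadcmd "$reload_cmd"
}

# Usage (the reload command is an assumption, adapt it to the host):
#   install_cert exemple.nasqueron.org "service nginx reload"
```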

on nginx

Nasqueron PKI

Principles

Internal PKI material is stored in Vault; see Operations grimoire/Vault for how to generate it, renew it, etc.

In the operations repository, the rOPS: roles/core/certificates unit is responsible for deploying nasqueron-vault-ca.crt, used to authenticate *.nasqueron.drake or IP services. For example, Vault clients require a certificate they can validate with a chain from that intermediate CA. Note that, as we don't currently use the root CA, a fullchain is probably not needed for those services.

Rollover a new certificate

If nasqueron-vault-ca.crt is updated, or if we want to provision the root certificate instead, the following repositories need to be updated and the services redeployed:

Where to update certificate?

Repository | Path | Purpose | Deployment instructions
operations | rOPS: roles/core/certificates/files/ | Trust internal resources on every server | Salt: salt '*' state.apply roles/core/certificates
docker-airflow | files/ | Connect to Vault from Airflow Python code | Docker: build and deploy new image nasqueron/airflow on Dwellers

Deployment instructions are up to date when first added to that table. Afterwards, they should be checked against current procedures.

Special considerations

New server

The Let's Encrypt client is available on Ysul (natively) and Dwellers (as a wrapper script for a Docker container).

File a task in the Servers component and subscribe Sandlayth and Dereckson to have it deployed on a new server.

A Salt state would be nice for this purpose. Work is in progress on that matter through D3248.

Internationalized domain names

Punycode conversion

For both the web server configuration and the certificate authority, the name must be converted to Punycode (RFC 3492), for example with https://www.punycoder.com/
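
The conversion can also be done locally. The sketch below is one possible way, not the procedure used at Nasqueron: the function name to_punycode is hypothetical, and it assumes python3 is installed. It relies on Python's built-in idna codec, which implements the older IDNA 2003 rules.

```shell
#!/bin/sh
# Hypothetical helper: print the ASCII (Punycode) form of a domain name.
# Assumes python3 is available; PYTHONUTF8=1 forces UTF-8 decoding of the
# argument whatever the current locale is.
to_punycode() {
    PYTHONUTF8=1 python3 -c \
        'import sys; print(sys.argv[1].encode("idna").decode("ascii"))' "$1"
}

# Usage:
#   to_punycode "dægrefn.nasqueron.org"
```

Labels that are already plain ASCII are returned unchanged, so the function can be applied to any hostname.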

Let's Encrypt support

Let's Encrypt has supported IDN since 2016[2]. We use it for the dægrefn.nasqueron.org certificate.

Previously, they were concerned that attackers could register a domain with a Cyrillic character that looks like a character of a real domain. As some people consider it the responsibility of the CA to mitigate such risks, the feature was postponed several times.

StartSSL

StartSSL is no longer in operation. It was used at Nasqueron when Let's Encrypt didn't support IDN.

Troubleshoot

Acme DNS

Can't reach ACME DNS API / HTTP Error 403: Forbidden

AcmeDNS API access is restricted to the IP addresses set in rOPS: roles/webserver-core/nginx/files/includes/geo_nasqueron.

If you get a 403 when running acme.sh (only visible in --debug mode):

  • If the IP address is missing from geo_nasqueron, add it
  • If the IP address is already there, check on the server whether /etc/nginx/includes/geo_nasqueron matches
    • Yes -> nginx must be reloaded (nginx -t && nginx -s reload)
    • No -> Deploy with salt docker-002 state.sls_id /etc/nginx/includes/geo_nasqueron roles/webserver-core/nginx

Another solution is a specific route to reach the API. For Hervil, add a route through the router-001 gateway: route add 51.255.124.8/30 172.27.27.1

Troubleshoot records in ACME DNS database

To check the acme.nasqueron.org subdomain matching your _acme-challenge subdomain: nslookup -type=TXT _acme-challenge.<subdomain>.nasqueron.org

The DNS server storing the acme.nasqueron.org records uses a SQLite 3 database. To connect and check your parameters:

   $ ssh docker-002
   $ sqlite3 /srv/acme/lib/acme-dns.db
   SELECT Username, Password FROM records WHERE Subdomain = '<DNS answer>';

Credentials lost for an ACME DNS subdomain

Passwords are hashed with bcrypt, so it's not possible to recover them.

In that case, there are two options:

  • If you prefer to edit the DNS record, delete the acme.sh information for the domain renewal, then issue a new certificate; that will register a new account
  • If you prefer to edit the ACME DNS database, hash a new password with bcrypt and update it with sqlite

Certificate requests are rate-limited by Let's Encrypt; updating the password avoids hitting that quota.

You can generate a UUID and pass it to bcrypt, or register a new password (and update DNS):

   $ ssh docker-002
   $ export SQLITE_HISTORY=/dev/null
   $ sqlite3 /srv/acme/lib/acme-dns.db
   UPDATE records SET Password = '<new bcrypt hash>' WHERE Subdomain = '<DNS answer>';

You can generate an arbitrary password with any random generator. The client code seems to use UUIDv4; if you want one, use uuidgen -r.

To get the bcrypt hash of a password, you can use psysh on WindRiver:

  $ psysh
  password_hash("alpha", PASSWORD_BCRYPT)
  $2y$10$N17lnt09YFIoVY4025AmCuGrlmA.ZiThm.RDufcakGq9.3IK4xVcW

Replace $2y$ with $2a$: the x and y variants are specific to PHP, while the rest of the world uses $2a$.
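
The prefix replacement can be scripted rather than edited by hand. A minimal sketch, with a hypothetical function name, that rewrites only the leading $2x$ or $2y$ marker and leaves the cost, salt and digest untouched:

```shell
#!/bin/sh
# Hypothetical helper: rewrite a PHP-style bcrypt prefix ($2x$ or $2y$)
# to $2a$. Hashes with any other prefix are printed unchanged.
bcrypt_to_2a() {
    printf '%s\n' "$1" | sed 's/^\$2[xy]\$/$2a$/'
}

# Usage (single quotes keep the shell from expanding the $ signs):
#   bcrypt_to_2a '$2y$10$N17lnt09YFIoVY4025AmCuGrlmA.ZiThm.RDufcakGq9.3IK4xVcW'
```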

Notes & references