Operations grimoire/TLS certificates
Latest revision as of 22:25, 20 October 2024
TLS certificates should be used for every service we provide, so we can encrypt the communications.
Let's Encrypt
Certbot commands
acme-v02 migration. If you get a complaint that acme-v01.api. isn't available, add --server https://acme-v02.api.letsencrypt.org/directory.
T1505 alignment. We no longer use Certbot as a Docker container on the paas-docker role. As such, all paths are now unified. Symlinks have been added at the old /srv paths to ease the transition.
Generate a certificate
certbot certonly -a webroot --webroot-path=/var/letsencrypt-auto --deploy-hook "service nginx reload" -d foo.nasqueron.org
certbot certonly -a webroot --webroot-path=/var/letsencrypt-auto --deploy-hook "systemctl reload nginx" -d foo.nasqueron.org
Generate a certificate for several sites
-d foo.nasqueron.org -d bar.nasqueron.org
If a certificate for foo already exists, Certbot will offer to extend it with the new alternative name, which is probably a good idea.
Generate a certificate through DNS
DNS can be used to generate certificates for domains. For example, the Openfire XMPP certificate is generated like this:
certbot certonly --server https://acme-v02.api.letsencrypt.org/directory --manual --manual-auth-hook /etc/letsencrypt/acme-dns-auth --preferred-challenges dns --debug-challenges -d xmpp.nasqueron.org -d nasqueron.org -d conference.nasqueron.org
Since December 2023, you can use this command on FreeBSD servers where the T1505 Let's Encrypt standard installation has been deployed:
certbot certonly --manual --manual-auth-hook /usr/local/etc/letsencrypt/acme-dns-auth --preferred-challenges dns --debug-challenges -d subdomain.nasqueron.org
This uses a specialized DNS server deployed on our Docker PaaS to serve dynamic TLS records under .acme.nasqueron.org, rOPS: roles/paas-docker/containers/acme_dns.sls.
Support files were previously only deployed to paas-docker role through rOPS: roles/paas-docker/letsencrypt/files.
Renew all certificates
certbot renew
Installation on nginx
Allow Let's Encrypt validation
include includes/letsencrypt;
This will use rOPS: roles/webserver-core/nginx/files/includes/letsencrypt nginx configuration.
Serve TLS certificate
include includes/tls;
ssl_certificate /srv/letsencrypt/etc/live/xmpp.nasqueron.org/fullchain.pem;
ssl_certificate_key /srv/letsencrypt/etc/live/xmpp.nasqueron.org/privkey.pem;
This configures a compromise between security and compatibility, based on the Mozilla Intermediate SSL configuration[1]. The current configuration is served by rOPS: roles/webserver-core/nginx/files/includes/tls.
If you prefer to restrict the resource to TLS 1.3 and accept blocking legacy clients, you can instead use the following include, introduced with D3251:
include includes/tls-modern-only;
ssl_certificate /srv/letsencrypt/etc/live/xmpp.nasqueron.org/fullchain.pem;
ssl_certificate_key /srv/letsencrypt/etc/live/xmpp.nasqueron.org/privkey.pem;
Edit renewal hook
In /etc/letsencrypt/renewal or /usr/local/etc/letsencrypt/renewal you can edit this section to customize the command to run after renewal:
[renewalparams]
renew_hook = systemctl reload nginx
When no hook is set yet, this can be applied in batch: gsed -i "s/\[\[webroot_map]]/renew_hook = systemctl reload nginx\n\n[[webroot_map]]/g" *.conf
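The batch edit can also be sketched in Python, which makes it easy to skip files where a hook is already set. This is a minimal sketch, not the procedure used on our servers: the function name add_renew_hook is hypothetical, and it assumes each renewal file contains a single [[webroot_map]] section as the gsed command does.

```python
HOOK_LINE = "renew_hook = systemctl reload nginx"
MARKER = "[[webroot_map]]"


def add_renew_hook(conf_text: str, hook_line: str = HOOK_LINE) -> str:
    """Insert hook_line before [[webroot_map]] unless a renew_hook is already set.

    Hypothetical helper: it only transforms text, it does not touch any file.
    """
    lines = conf_text.splitlines()
    if any(line.strip().startswith("renew_hook") for line in lines):
        return conf_text  # a hook is already configured, leave the content alone
    if MARKER not in conf_text:
        return conf_text  # unexpected layout, do not guess where to insert
    # Same substitution as the gsed command, applied to the first marker only.
    return conf_text.replace(MARKER, f"{hook_line}\n\n{MARKER}", 1)
```

To apply it, read each *.conf file in the renewal directory, pass its content through add_renew_hook, and write the result back only when it differs.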
Install certbot on CentOS 10 stream
CentOS 10 Stream uses Python 3.12 as of 2024-08-04; matching packages can be fetched from the Fedora 40 repository.
Installation on Dwellers was done like this:
mkdir /tmp/certbot && cd /tmp/certbot
wget https://dl.fedoraproject.org/pub/fedora/linux/releases/40/Everything/x86_64/os/Packages/c/certbot-2.9.0-1.fc40.noarch.rpm
wget https://dl.fedoraproject.org/pub/fedora/linux/releases/40/Everything/x86_64/os/Packages/p/python3-acme-2.9.0-1.fc40.noarch.rpm
wget https://dl.fedoraproject.org/pub/fedora/linux/releases/40/Everything/x86_64/os/Packages/p/python3-certbot-2.9.0-1.fc40.noarch.rpm
wget https://dl.fedoraproject.org/pub/fedora/linux/releases/40/Everything/x86_64/os/Packages/p/python3-configargparse-1.7-3.fc40.noarch.rpm
wget https://dl.fedoraproject.org/pub/fedora/linux/releases/40/Everything/x86_64/os/Packages/p/python3-josepy-1.13.0-8.fc40.noarch.rpm
wget https://dl.fedoraproject.org/pub/fedora/linux/releases/40/Everything/x86_64/os/Packages/p/python3-parsedatetime-2.6-12.fc40.noarch.rpm
wget https://dl.fedoraproject.org/pub/fedora/linux/releases/40/Everything/x86_64/os/Packages/p/python3-pyOpenSSL-23.2.0-3.fc40.noarch.rpm
wget https://dl.fedoraproject.org/pub/fedora/linux/releases/40/Everything/x86_64/os/Packages/p/python3-pyrfc3339-1.1-18.fc40.noarch.rpm
wget https://dl.fedoraproject.org/pub/fedora/linux/releases/40/Everything/x86_64/os/Packages/p/python3-pytz-2024.1-1.fc40.noarch.rpm
dnf install ./*.rpm
Nasqueron PKI
Principles
Internal PKI material is stored in Vault, see Operations grimoire/Vault for information about how to generate it, renew it, etc.
In the operations repository, the rOPS: roles/core/certificates unit is responsible for deploying nasqueron-vault-ca.crt, used to authenticate *.nasqueron.drake or IP services. For example, Vault clients require a certificate they can validate with a chain from that intermediate CA. Note that as we don't currently use the root CA, a full chain is probably not needed for those services.
Rollover a new certificate
If nasqueron-vault-ca.crt is updated, or if we want to provision the root certificate instead, the following repositories need to be updated and the services redeployed:
Repository | Path | Purpose | Deployment instructions |
---|---|---|---|
operations | rOPS: roles/core/certificates/files/ | Trust internal resources on every server | Salt: salt '*' state.apply roles/core/certificates |
docker-airflow | files/ | Connect to Vault from Airflow Python code | Docker: build and deploy new image nasqueron/airflow on Dwellers |
Deployment instructions are up to date at the time they are added to this table. Afterwards, they should be checked against current procedures.
Special considerations
New server
The Let's Encrypt client is available on Ysul (natively) and Dwellers (as a wrapper script for a Docker container).
To deploy it on a new server, file a task in the Servers component and subscribe Sandlayth and Dereckson.
A Salt state would be nice for this purpose. Work is in progress on that matter through D3248.
Internationalized domain names
Punycode conversion
For both the web server configuration and the certificate authority, the name must be converted to Punycode (RFC 3492): https://www.punycoder.com/
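As a local alternative to the online converter, Python's built-in idna codec can do the conversion. This is a minimal sketch: the helper names to_punycode and from_punycode are ours, and the codec follows the older IDNA 2003 rules, so double-check the result for unusual characters.

```python
def to_punycode(hostname: str) -> str:
    """Convert an internationalized hostname to its ASCII (Punycode) form."""
    # The built-in "idna" codec encodes each label separately
    # and adds the xn-- prefix where needed.
    return hostname.encode("idna").decode("ascii")


def from_punycode(hostname: str) -> str:
    """Convert an ASCII (Punycode) hostname back to its Unicode form."""
    return hostname.encode("ascii").decode("idna")


if __name__ == "__main__":
    # Name to use in the nginx configuration and in the -d argument.
    print(to_punycode("dægrefn.nasqueron.org"))
```

Labels that are already plain ASCII, such as foo.nasqueron.org, are returned unchanged.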
Let's Encrypt support
Let's Encrypt has supported IDN since 2016[2]. We use it for the dægrefn.nasqueron.org certificate.
Previously, they feared that attackers could register a domain using a Cyrillic character to mimic a real domain. As some people consider it the responsibility of the CA to mitigate such risks, the feature was postponed several times.
StartSSL
StartSSL is no longer in operation. It was used at Nasqueron when Let's Encrypt didn't support IDN.
Troubleshoot
Acme DNS
Can't reach ACME DNS API / HTTP Error 403: Forbidden
AcmeDNS API access is restricted to the IP addresses listed in rOPS: roles/webserver-core/nginx/files/includes/geo_nasqueron.
If you get a 403 when running acme.sh (only visible in --debug mode):
- If the IP address is missing from geo_nasqueron, add it
- If the IP address is already there, check on the server whether /etc/nginx/includes/geo_nasqueron matches
  - Yes -> nginx must be reloaded (nginx -t && nginx -s reload)
  - No -> Deploy with salt docker-002 state.sls_id /etc/nginx/includes/geo_nasqueron roles/webserver-core/nginx
Another solution is a specific route to reach the API. For Hervil, add a route through the router-001 gateway:
route add 51.255.124.8/30 172.27.27.1
Troubleshoot records in ACME DNS database
To find the acme.nasqueron.org subdomain matching your _acme-challenge subdomain: nslookup -type=TXT _acme-challenge.<subdomain>.nasqueron.org
The DNS server storing the acme.nasqueron.org records uses a sqlite3 database. To connect and check your parameters:
$ ssh docker-002
$ sqlite3 /srv/acme/lib/acme-dns.db
SELECT Username, Password FROM records WHERE Subdomain = '<DNS answer>';
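The same lookup can be scripted with Python's sqlite3 module, binding the subdomain as a parameter instead of quoting it by hand. This is a minimal sketch to run on docker-002: the function name find_credentials is hypothetical, and the table and column names are the ones from the query above.

```python
import sqlite3


def find_credentials(conn: sqlite3.Connection, subdomain: str):
    """Return the (Username, Password) rows for the given acme-dns subdomain."""
    cursor = conn.execute(
        "SELECT Username, Password FROM records WHERE Subdomain = ?",
        (subdomain,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    # Database path taken from the commands above.
    conn = sqlite3.connect("/srv/acme/lib/acme-dns.db")
    try:
        # Replace the placeholder with the subdomain returned by nslookup.
        for username, password_hash in find_credentials(conn, "<DNS answer>"):
            print(username, password_hash)
    finally:
        conn.close()
```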
Credential lost for an ACME DNS subdomain
Passwords are hashed with bcrypt, so it's not possible to recover them.
In that case, there are two options:
- if you prefer to edit the DNS record, delete the acme.sh renewal information for the domain, then issue a new certificate; that will register a new account
- if you prefer to edit the ACME DNS database, hash a new password with bcrypt and update it with sqlite
Certificate requests are rate-limited by Let's Encrypt; updating the password avoids reaching that quota.
You can generate a UUID and hash it with bcrypt, or register a new password (and update DNS):
$ ssh docker-002
$ sqlite3 /srv/acme/lib/acme-dns.db
UPDATE records SET Password = '<new bcrypt hash>' WHERE Subdomain = '<DNS answer>';
To get a password hash with bcrypt, you can use psysh on WindRiver:
$ psysh
password_hash("alpha", PASSWORD_BCRYPT)
$2y$10$N17lnt09YFIoVY4025AmCuGrlmA.ZiThm.RDufcakGq9.3IK4xVcW
Replace $2y$ with $2a$: the x and y variants are specific to PHP, the rest of the world uses $2a$.
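The password generation and the prefix replacement can be sketched with the Python standard library alone. This is a minimal sketch: the helper names new_password and to_2a_prefix are hypothetical, and the bcrypt hashing itself is still done with psysh as shown above.

```python
import uuid


def new_password() -> str:
    """Generate a random password as a UUID string, to be hashed with bcrypt."""
    return str(uuid.uuid4())


def to_2a_prefix(bcrypt_hash: str) -> str:
    """Rewrite the PHP-specific $2y$ bcrypt prefix to $2a$."""
    php_prefix = "$2y$"
    if bcrypt_hash.startswith(php_prefix):
        return "$2a$" + bcrypt_hash[len(php_prefix):]
    # Anything else is returned untouched rather than guessed at.
    return bcrypt_hash


if __name__ == "__main__":
    print(new_password())
    print(to_2a_prefix("$2y$10$N17lnt09YFIoVY4025AmCuGrlmA.ZiThm.RDufcakGq9.3IK4xVcW"))
```

Only the prefix changes; the cost and the salt-and-hash part are kept as is.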