Operations grimoire/PostgreSQL: Difference between revisions
DorianWinty (talk | contribs) No edit summary |
|||
(One intermediate revision by the same user not shown) | |||
Line 102: | Line 102: | ||
| A || Airflow > airflow_scheduler || On Dwellers, docker restart airflow_scheduler | | A || Airflow > airflow_scheduler || On Dwellers, docker restart airflow_scheduler | ||
|} | |} | ||
== Upgrade == | |||
When upgrading PostgreSQL the following checks must be done: | |||
* Do we have the postgres -contrib port installed too for the pg_trgm module for FANTOIR? | |||
* Do we have XML support for Orbeon? | |||
There are several methods to upgrade a server: | |||
* the pg_dumpall method has been successfully tested on WindRiver for 15 -> 16 | |||
== Clusters == | == Clusters == |
Latest revision as of 12:20, 17 August 2024
PostgreSQL is available as a standalone role and can also be enabled on devserver.
Applications on the Docker PaaS can use our PostgreSQL infrastructure (Airflow, datasources) or have their own container (Sentry, with custom wal2json extension). This resource documents our own db- servers.
Howto
Open a console
Use peer authentication from the postgres user: sudo -u postgres psql
Run queries against database
The important things are:
- prepare the operation with a .sql file
- put that SQL file to a task
- avoid data destruction and allow rollback
- if a table needs to be dropped, we can rename it to _unused
- if an application doesn't use a column anymore, we can rename it from foo to foo_unused so we can rollback by renaming back
- actual deletion can be done some weeks later
Recommend procedure:
- Create a task on DevCentral or reuse an existing deployment task
- Upload the .sql file on DevCentral or to a repository:
- One-shot maintenance can go to https://devcentral.nasqueron.org/paste/ or sent with
cat foo.sql | arc paste --title foo.sql --
- Regular queries can go to the reports repository
- One-shot maintenance can go to https://devcentral.nasqueron.org/paste/ or sent with
- Link it in the task
- Run it
To run foo.sql against acme database:
- upload the foo.sql file to the relevant server (for example db-A-001 for db-A cluster)
sudo -u postgres psql acme < foo.sql
Create a new database or user
It's a simple two steps process:
- Edit the relevant pillar file, for example pillar/dbserver/cluster-A.sls in rOPS
- Deploy the change:
salt db-A-001 state.apply roles/dbserver-pgsql
You generally need:
- an user block in dbserver_postgresql.users
- block title is the username
- a password needs to be set in Flow, key is under ops/secrets/
- a database block
- a connection triplet db, user, ips (use /32 for an unique IP)
By default, external connections are NOT enabled for any user and database, an entry for pg_hba.conf MUST be added in connections part.
Servers don't have a public ICANN IP, so you can only connect from other Nasqueron servers.
Grant privileges to an user
Some applications don't set privileges, even if they own the database. That's for example the case of Orbeon Forms DDL.
To determine if acme has already the rights on the database:
SELECT grantor, grantee, table_schema, table_name, privilege_type FROM information_schema.table_privileges WHERE grantee = 'acme';
If not:
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO acme;
This set of privileges can be handled directly by Salt in cluster privileges definition:
privileges: - database: <database name> scope: table schema: public tables: - ALL privileges: - ALL
See D3252 or D3253 for an example.
As a known issue, this query will ALWAYS be executed by Salt.
- Sequences privileges
Sequences aren't included in that GRANT ALL PRIVILEGES. We can generate GRANT queries for them:
SELECT 'GRANT USAGE ON SEQUENCE ' || sequence_schema || '.' || sequence_name || ' TO "acme";' FROM information_schema.sequences WHERE sequence_schema NOT LIKE 'pg_%' AND sequence_schema != 'information_schema';
What to do after an update of PostgreSQL servers?
The following services need their connection to explicitly be restarted:
Cluster | Service | Procedure |
---|---|---|
A | Airflow > airflow_triggerer | On Dwellers, docker restart airflow_triggerer |
A | Airflow > airflow_scheduler | On Dwellers, docker restart airflow_scheduler |
Upgrade
When upgrading PostgreSQL the following checks must be done:
- Do we have the postgres -contrib port installed too for the pg_trgm module for FANTOIR?
- Do we have XML support for Orbeon?
There are several methods to upgrade a server:
- the pg_dumpall method has been successfully tested on WindRiver for 15 -> 16
Clusters
Letters can be discontinuous: for example, B will be a MySQL cluster.
Cluster A
A is the general cluster, for Nasqueron services. It currently has one server, db-A-001.
Database | Managed by | Description |
---|---|---|
airflow | Nasqueron Ops | Workflows runner, used by datasources |
fantoir | Datasources | PostgreSQL version of FANTOIR for geocoding |
Troubleshoot
An application stops to work correctly with PostgreSQL
Issues with queries are logged to /var/log/messages:
$ tail -n100 -f /var/log/messages | grep postgres
For example, T1943 issue was found with this line:
Jan 14 23:13:58 db-A-001 postgres[35351]: [8-1] 2024-01-14 23:13:58.081 UTC [35351] ERROR: unsupported XML feature
Unsupported XML feature
For Orbeon, XML support is required. That's not by default the case on FreeBSD, the port needs to be built manually. That's the case when provisioning a new server through rOPS: roles/dbserver-pgsql/server/build.sls but upgrade can break this port.
If so, rebuild the package. For that, you can follow the instructions of Operations grimoire/FreeBSD, section PostgreSQL.
Some postgres_privileges.present are always changed by Salt
Some ON ALL TABLE privilege queries are always run by the Salt PostgreSQL execution module. For example:
ID: dbserver_pgsql_user_orbeon_privilege_2_ALL Function: postgres_privileges.present Name: orbeon Result: True Comment: The privilege(s): ALL have been granted to orbeon Started: 21:54:53.656120 Duration: 284.394 ms Changes: ---------- orbeon: Present
Those statements aren't idempotent. This is a known issue, and doesn't affect deployment.