Operations grimoire/PostgreSQL: Difference between revisions

From Nasqueron Agora
No edit summary
(Upgrade notes)
Line 102: Line 102:
| A || Airflow > airflow_scheduler || On Dwellers, docker restart airflow_scheduler
| A || Airflow > airflow_scheduler || On Dwellers, docker restart airflow_scheduler
|}
|}
== Upgrade ==
When upgrading PostgreSQL the following checks must be done:
* Do we have the postgres -contrib port installed too for the pg_trgm module?
* Do we have XML support for FANTOIR?
There are several methods to upgrade a server:
* the pg_dumpall method has been successfully tested on WindRiver for 15 -> 16


== Clusters ==
== Clusters ==

Revision as of 12:18, 17 August 2024

PostgreSQL is available as a standalone role and can also be enabled on devserver.

Applications on the Docker PaaS can use our PostgreSQL infrastructure (Airflow, datasources) or have their own container (Sentry, with custom wal2json extension). This resource documents our own db- servers.

Howto

Open a console

Use peer authentication from the postgres user: sudo -u postgres psql

Run queries against database

The important things are:

  1. prepare the operation with a .sql file
  2. put that SQL file to a task
  3. avoid data destruction and allow rollback
    1. if a table needs to be dropped, we can rename it to _unused
    2. if an application doesn't use a column anymore, we can rename it from foo to foo_unused so we can rollback by renaming back
    3. actual deletion can be done some weeks later

Recommend procedure:

  1. Create a task on DevCentral or reuse an existing deployment task
  2. Upload the .sql file on DevCentral or to a repository:
    1. One-shot maintenance can go to https://devcentral.nasqueron.org/paste/ or sent with cat foo.sql | arc paste --title foo.sql --
    2. Regular queries can go to the reports repository
  3. Link it in the task
  4. Run it

To run foo.sql against acme database:

  1. upload the foo.sql file to the relevant server (for example db-A-001 for db-A cluster)
  2. sudo -u postgres psql acme < foo.sql

Create a new database or user

It's a simple two steps process:

  1. Edit the relevant pillar file, for example pillar/dbserver/cluster-A.sls in rOPS
  2. Deploy the change: salt db-A-001 state.apply roles/dbserver-pgsql

You generally need:

  • an user block in dbserver_postgresql.users
    • block title is the username
    • a password needs to be set in Flow, key is under ops/secrets/
  • a database block
  • a connection triplet db, user, ips (use /32 for an unique IP)

By default, external connections are NOT enabled for any user and database, an entry for pg_hba.conf MUST be added in connections part.

Servers don't have a public ICANN IP, so you can only connect from other Nasqueron servers.

Grant privileges to an user

Some applications don't set privileges, even if they own the database. That's for example the case of Orbeon Forms DDL.

To determine if acme has already the rights on the database:

   SELECT grantor, grantee, table_schema, table_name, privilege_type
   FROM information_schema.table_privileges
   WHERE grantee = 'acme';

If not:

   GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO acme;

This set of privileges can be handled directly by Salt in cluster privileges definition:

     privileges:
       - database: <database name>
         scope: table
         schema: public
         tables:
           - ALL
         privileges:
           - ALL

See D3252 or D3253 for an example.

As a known issue, this query will ALWAYS be executed by Salt.

Sequences privileges

Sequences aren't included in that GRANT ALL PRIVILEGES. We can generate GRANT queries for them:

   SELECT 'GRANT USAGE ON SEQUENCE ' || sequence_schema || '.' || sequence_name || ' TO "acme";'
   FROM information_schema.sequences
   WHERE sequence_schema NOT LIKE 'pg_%'
   AND sequence_schema != 'information_schema';

What to do after an update of PostgreSQL servers?

The following services need their connection to explicitly be restarted:

Services to tweak
Cluster Service Procedure
A Airflow > airflow_triggerer On Dwellers, docker restart airflow_triggerer
A Airflow > airflow_scheduler On Dwellers, docker restart airflow_scheduler

Upgrade

When upgrading PostgreSQL the following checks must be done:

  • Do we have the postgres -contrib port installed too for the pg_trgm module?
  • Do we have XML support for FANTOIR?

There are several methods to upgrade a server:

  • the pg_dumpall method has been successfully tested on WindRiver for 15 -> 16

Clusters

Letters can be discontinuous: for example, B will be a MySQL cluster.

Cluster A

A is the general cluster, for Nasqueron services. It currently has one server, db-A-001.

Databases
Database Managed by Description
airflow Nasqueron Ops Workflows runner, used by datasources
fantoir Datasources PostgreSQL version of FANTOIR for geocoding

Troubleshoot

An application stops to work correctly with PostgreSQL

Issues with queries are logged to /var/log/messages:

   $ tail -n100 -f /var/log/messages | grep postgres

For example, T1943 issue was found with this line:

   Jan 14 23:13:58 db-A-001 postgres[35351]: [8-1] 2024-01-14 23:13:58.081 UTC [35351] ERROR:  unsupported XML feature

Unsupported XML feature

For Orbeon, XML support is required. That's not by default the case on FreeBSD, the port needs to be built manually. That's the case when provisioning a new server through rOPS: roles/dbserver-pgsql/server/build.sls but upgrade can break this port.

If so, rebuild the package. For that, you can follow the instructions of Operations grimoire/FreeBSD, section PostgreSQL.

Some postgres_privileges.present are always changed by Salt

Some ON ALL TABLE privilege queries are always run by the Salt PostgreSQL execution module. For example:

         ID: dbserver_pgsql_user_orbeon_privilege_2_ALL
   Function: postgres_privileges.present
       Name: orbeon
     Result: True
    Comment: The privilege(s): ALL have been granted to orbeon
    Started: 21:54:53.656120
   Duration: 284.394 ms
    Changes:   
             ----------
             orbeon:
                 Present

Those statements aren't idempotent. This is a known issue, and doesn't affect deployment.