Operations grimoire/Incidents/2025-11-25-MariaDB
From Nasqueron Agora
Latest revision as of 11:19, 25 November 2025
MariaDB extended downtime for InnoDB tables.
Incident timeline
- 2025-11-25
- 10:22 - Deployment of D3890 - ZFS volumes change for MariaDB to fix T2074
- 10:22 - MariaDB is restarted
- 10:22 - Services using MyISAM tables are OK, but services using InnoDB, like the wikis, are down
- 10:24 - Quick investigation shows an engine error for InnoDB tables
- 10:25 - Unmount and remount ZFS volumes for InnoDB data and logs
- 10:25 - Restart MariaDB server, databases are served correctly again
Timestamps are in UTC and are estimates, except 10:22 for the MariaDB restart, which is accurate (taken from Salt).
Analysis
When we noticed the wikis were down, we checked the following logs:
- /var/log/www/nasqueron.org/wikis-error.log (not relevant, only confirms PHP error 500)
- /var/log/mediawiki/error.log (relevant)
Before that, we had confirmed directly that MediaWiki could connect to the database, using sudo -u mw agora shell.
MediaWiki reported that the objectcache table doesn't exist in the current engine.
We then used the MySQL client to check databases and tables: SHOW DATABASES, \u nasqueron_wiki, SHOW CREATE TABLE objectcache.
Fix
$ /usr/local/etc/rc.d/mysql-server stop
$ umount -f arcology/mysql-innodb-data
$ umount -f arcology/mysql-innodb-logs
$ zfs mount arcology/mysql-innodb-data
$ zfs mount arcology/mysql-innodb-logs
$ /usr/local/etc/rc.d/mysql-server start
Actionables
- Always unmount and remount ZFS volumes when they have been touched, even if only the parent dataset has been updated. Don't trust ZFS automount blindly.
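
As a complement to the remount, the following is a minimal sketch of a sanity check that could be run before starting MariaDB. It reads the tab-separated output of zfs get -H -o name,value mounted and reports every dataset whose mounted property is not "yes". The function name check_mounted is hypothetical. This only checks the mounted property, so it is an assumption that it would have caught this incident: a dataset mounted in a stale state may still report "yes".

```shell
#!/bin/sh
# Sketch only: check_mounted is a hypothetical helper, not an existing tool.
# Reads "name<TAB>value" lines on stdin, as printed by:
#   zfs get -H -o name,value mounted <dataset>...
# Prints each dataset that is not mounted; returns 1 if any was found.
check_mounted() {
    rc=0
    while IFS="$(printf '\t')" read -r name value; do
        if [ "$value" != "yes" ]; then
            echo "NOT MOUNTED: $name"
            rc=1
        fi
    done
    return "$rc"
}
```

Possible usage with the two datasets from the fix above: zfs get -H -o name,value mounted arcology/mysql-innodb-data arcology/mysql-innodb-logs | check_mounted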
