Operations grimoire/Checklist router post-restart

From Nasqueron Agora

Latest revision as of 20:36, 26 October 2024

A router-001 restart severs GRE tunnels and their existing TCP connections.

As such, this checklist notes the known issues that arise when the private network is disrupted.

To partially prevent this, blue/green deployments or immutable artifacts should be encouraged. For the next non-urgent upgrade, it is recommended to create router-002 and move the GRE connections to it. This checklist should still be followed.

As most services are currently on hyper-001, they don't need the router to communicate with each other, so many things will keep working.

IRC eggdrops

They require a GRE tunnel between the server with the viperserv role and the router to connect to both complector (Vault) and a db-B server (MariaDB).
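To confirm that the tunnel is carrying traffic again before touching the bots, a quick TCP reachability check from the viperserv host can help. The following is a minimal sketch, not an official tool: the host names are hypothetical placeholders, and the ports are only the Vault and MariaDB defaults (8200 and 3306), which may differ from the actual deployment.

```python
import socket

# Hypothetical endpoints: replace with the real private addresses.
# 8200 and 3306 are the default Vault and MariaDB ports; the actual
# deployment may use different ones.
ENDPOINTS = {
    "complector (Vault)": ("complector.example.internal", 8200),
    "db-B (MariaDB)": ("db-b.example.internal", 3306),
}


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_endpoints(endpoints, timeout=3.0):
    """Map each endpoint label to its reachability status."""
    return {
        name: is_reachable(host, port, timeout)
        for name, (host, port) in endpoints.items()
    }
```

Calling check_endpoints(ENDPOINTS) returns a dictionary mapping each label to True or False. A False value for both entries suggests the GRE tunnel itself is still down, rather than a problem in the bots.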

On Libera:

  • Dæghrefn: the MySQL connection is normally re-established at periodic intervals, but you can trigger this manually with .tcl sqlrehash
  • Wearg: MySQL does NOT reconnect, and the failure will probably make the timer exit
    • .tcl sqlrehash to reconnect to MySQL
    • .tcl broker::on_tick to reconnect to the broker and restart the timer that fetches messages
    • .tcl utimers to check that the timer is running; you should see a :broker::on_tick timer
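The last check can be done by eye, but if the partyline output is long, a small helper can scan it. This is a rough sketch that assumes the output of .tcl utimers is pasted as text and looks roughly like a Tcl list such as {3 ::broker::on_tick timer42}; the exact format depends on the eggdrop version, and the sample strings below are made up for illustration.

```python
import re


def timer_commands(utimers_output):
    """Extract candidate command names from pasted `.tcl utimers` output.

    Splits on whitespace and Tcl braces, then strips leading colons so that
    `::broker::on_tick` and `:broker::on_tick` both become `broker::on_tick`.
    """
    tokens = re.split(r"[\s{}]+", utimers_output)
    return {token.lstrip(":") for token in tokens if token}


def has_timer(utimers_output, proc_name="broker::on_tick"):
    """Return True if a timer calling proc_name appears in the output."""
    return proc_name.lstrip(":") in timer_commands(utimers_output)


# Made-up sample output, for illustration only.
sample = "Tcl: {3 ::broker::on_tick timer42}"
print(has_timer(sample))
```

If has_timer returns False for the real output, running .tcl broker::on_tick again, as described above, should recreate the timer.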

T1862 (https://devcentral.nasqueron.org/T1862) proposes to detect MySQL issues and reconnect.