Operations grimoire/Incidents/2017-03-01-Eglide
From Nasqueron Agora
Revision as of 13:06, 6 March 2017
Tracked at https://devcentral.nasqueron.org/T1162.
Incident timeline
- 03:18:56 amj's WeeChat client times out on Freenode
- 03:19:05 Odderon times out too
- 04:22:47 tomjerr asks if it's down
- 19:52:41 Sandlayth rebooted the server
After the incident, it was noticed that Odderon didn't reconnect automatically:
- 21:08:05 Odderon joins channel
Analysis
The root cause of the outage isn't known: the logs don't contain any relevant information.
A simple reboot was enough to restore service, but it took 16 hours to reach Sandlayth, the only person with the credentials to do it.
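The durations behind this analysis can be checked from the timeline. The following is a minimal sketch, assuming all four timestamps fall on the same day and share one timezone (the report does not state either); the dictionary keys and the `elapsed` helper are illustrative names, not part of any Nasqueron tooling.

```python
from datetime import datetime, timedelta

# Timestamps copied from the incident timeline.
# Assumption: same calendar day, same timezone.
TIMELINE = {
    "first_timeout": "03:18:56",
    "odderon_timeout": "03:19:05",
    "server_rebooted": "19:52:41",
    "odderon_rejoined": "21:08:05",
}


def elapsed(start: str, end: str) -> timedelta:
    """Return the time between two HH:MM:SS stamps on the same day."""
    fmt = "%H:%M:%S"
    return datetime.strptime(end, fmt) - datetime.strptime(start, fmt)


# Time from the first observed timeout until the server was rebooted.
time_to_reboot = elapsed(TIMELINE["first_timeout"], TIMELINE["server_rebooted"])

# Time Odderon was absent, from its timeout until it rejoined the channel.
odderon_downtime = elapsed(TIMELINE["odderon_timeout"], TIMELINE["odderon_rejoined"])

print("Time to reboot:", time_to_reboot)
print("Odderon downtime:", odderon_downtime)
```

Under these assumptions, the reboot came a little over 16 and a half hours after the first timeout, and Odderon stayed offline for more than an hour after the reboot, which is consistent with it not reconnecting automatically.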
Actionables
- Get operations squad contact information on file (T1164)[Note 1]
- Ensure Scaleway account is accessible (T1165)[Note 1]
- [DONE] Enable Odderon service (T1163)
[Note 1] This task has been restricted to ops for privacy or security concerns.