
GKE Incident

  • 2018-11-09 6:00 AM CET: Unusual number of alerts
  • 2018-11-09 6:08 AM CET: Pingdom services reporting sites down
  • 2018-11-09 6:08 AM CET: On-call engineer confirms that the GKE update process is ongoing and waits a few minutes for the situation to stabilize
  • 2018-11-09 6:10 AM CET: After a few minutes, the on-call engineer notices an unusual spike of containers restarting and crash-looping
  • 2018-11-09 6:30 AM CET: The problem is escalated to the SRE team
  • 2018-11-09 7:30 AM CET: First strategy is to try to fix the current cluster
  • 2018-11-09 8:00 AM CET: Called Google support engineers
  • 2018-11-09 8:45 AM CET: Started failover process to a new cluster
  • 2018-11-09 11:27 AM CET: Services back online