Resolved

We have identified the problem.

Today we rolled out Canada LRN. One of our replica servers did not accept the update and became lagged. Because of this threshold was breached which caused traffic to fail over cross continent. This effectively caused a brown-out due to query amplification.

We have will be changing our failure strategy to blackout allowing the requests to route directly to a fail over zone and minimize amplification.

We will continue monitoring, however we believe that this incident is now resolved.

Avatar for

We have stabilized the network, however we are still seeing hot spot loading against our databases. We will leave this ticket open whilst we investigate further.

Avatar for
Ongoing

We have observed a number of issues on the network including call failures, We are currently investigating.

Avatar for
Began at:

Affected components
  • SIP Servers
  • Database Servers