Server Canyon - Reimagining Web Infrastructure

Service Interruption Report (8/5/2016)

by judahnator

What happened:

At approximately 5:00AM PST, ServerCanyon was alerted to severe performance degradation. The cause is still unknown, but in the interest of transparency, here is a timeline of events:

  • 5:30AM: We rebooted our primary server.
  • 5:45AM: Once it was clear that a severe problem was preventing the server from booting, the failover server kicked in and began downloading client backups.
  • 6:45AM: 80% of affected clients had been activated on the backup server.
  • 7:30AM: We got on the phone with our domain registrar, as the changes to our name servers' “A” record were about 45 minutes overdue.
  • 7:40AM: Data recovery began on the old server.
  • 8:00AM: All accounts had been brought back online, and changes made since the last offsite backup began to sync.

What we are doing to prevent this from happening in the future:

  • We have increased our in-datacenter backup provision by 50%.
  • All accounts will now be backed up daily (5-day retention) and weekly (4-week retention).
  • Over the next several weeks, we will be moving clients to a pool of IP addresses that can be rerouted to new servers.
  • We are moving to a more reliable domain registrar.
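
For the curious, the new retention schedule can be sketched roughly as follows. This is only an illustration, not our actual backup tooling: the function name `backups_to_keep` is made up for this example, and it assumes the weekly backup is the one taken on Sunday, which the schedule above does not specify.

```python
from datetime import date, timedelta

DAILY_RETENTION_DAYS = 5    # from the post: daily backups, 5-day retention
WEEKLY_RETENTION_WEEKS = 4  # from the post: weekly backups, 4-week retention
WEEKLY_BACKUP_WEEKDAY = 6   # assumption: the weekly backup is Sunday's (Monday == 0)


def backups_to_keep(backup_dates, today):
    """Return the sorted backup dates that the retention schedule keeps."""
    keep = set()
    for d in backup_dates:
        age = (today - d).days
        if age < 0:
            continue  # ignore anything dated in the future
        if age < DAILY_RETENTION_DAYS:
            keep.add(d)  # still inside the daily retention window
        elif d.weekday() == WEEKLY_BACKUP_WEEKDAY and age < WEEKLY_RETENTION_WEEKS * 7:
            keep.add(d)  # a weekly backup inside the weekly retention window
    return sorted(keep)


if __name__ == "__main__":
    today = date(2016, 8, 5)
    # Pretend a backup was taken every day for the last 40 days.
    history = [today - timedelta(days=n) for n in range(40)]
    for kept in backups_to_keep(history, today):
        print(kept.isoformat())
```

Anything not returned by the function would be eligible for deletion. Under these assumptions, an account never has more than five daily and four weekly copies on hand at once.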

If any customer discovers any issues with their website, please contact support immediately!
