Website outage

The WarLight.net website is experiencing an outage (EDIT: It’s back up!). I’m working on getting it back up as fast as possible. More details will be posted soon. Sorry for the inconvenience.

UPDATE 10pm GMT-8: The main WarLight database has suffered a hardware failure. Rackspace engineers are currently working to recover the data onto a new server. As a backup plan, I am also restoring the latest database backup onto a fresh server, just in case Rackspace is unable to restore the old server. The latest backup I have was taken about 4 hours before the outage, so I’m hoping it does not need to be used.

UPDATE 5am GMT-8: I’m really mad at Rackspace. For the last ten hours all they’ve given me is “We can’t give you an ETA” over and over. Finally, they tell me it’s 95% done being restored and will be up within 2 hours.

UPDATE 6:49am GMT-8: The website is now functional again. Unfortunately, the database was rolled back by a few hours. Hopefully players who happened to commit turns within that period will be respectful and try to commit the same orders they played the first time. I’m going to be taking steps to ensure this can’t happen again.

Additionally, for the next several hours, booting in all games will be disabled (you’ll get an error message if you try to boot). This will allow players who couldn’t take their turn before the boot timer expired to get it in before getting booted.

WarLight’s database has had hardware failures before; however, Rackspace typically will automatically move your server to a fresh machine with only a few minutes of downtime. This has happened several times, actually, and it’s never been a problem before now. This time it didn’t work for some reason, and to make matters worse, their support personnel were very unhelpful. I had gotten complacent in relying on Rackspace’s automated failover. Unfortunately, when the failover finally did happen after several hours, the database was corrupt beyond repair and I had to go with my own backup anyway.

To ensure this can’t happen again, I am going to set up a continuously streaming backup system that will back up the database several times per minute. This way, if there’s another failure, I don’t have to wait for Rackspace’s failover and I can restore from my own backup without significant data loss. Previously, the backup was only made every few hours, so this is a significant jump in reliability.

I realize the problems that outages cause, and I’m probably more disappointed by this than any of you. I take this very seriously as it’s my full-time job and I am going to take several steps to ensure this can’t happen again.

UPDATE: As mentioned above, booting is temporarily not allowed in any games. This gives a chance for players who are over the boot time in their games and couldn’t play due to the outage to take their turns. I realize this makes it difficult to play new real-time games, but please bear with me as we get everything back to working normally.

UPDATE: Booting is now re-enabled.

14 thoughts on “Website outage”

    1. There shouldn’t be, although there may be strange things happening as the backup is restored, e.g. finished games may magically be open again etc.

  1. We know you are doing your best and I am sure it will work out just fine. Going Warlight cold turkey is worse than I thought. Merry Christmas

  2. Hey,

    GMT 14.47pm
    Warlight back online, but unable to access single player/multiplayer games at all. Blank page bar the bar across the top when I try to click on them.

    Can access profile, settings, etc… And maps as well. Just not games.

    Hope you can fix it soon!

    Sitrix

  3. going to Iowa to visit the family, i’ll check back when i get there. Good luck with all the repairs. Happy Holidays everyone sincerely, The Magician -poof-

  4. There is a certain player going by the nickname “Pink Floyd” who has been constantly joining games only to then abandon them and make no move indefinitely. He is by his own admission doing this on purpose as can be easily verified from chat logs, taking advantage of the booting immunity to disrupt the community’s gaming experience. It would be pleasant if he was dealt with appropriately.

    1. If you click on his name in the lower-right corner, you’ll find a button labeled “Report”. That’s the proper way to report bad behavior – posting comments on the blog isn’t going to get it looked at, reporting it will.
