On April 24th we had a brief interruption on one of our backbone connections that made it appear as if Winhost had dropped off the map.
That interruption, outage, glitch, or whatever you want to call it, raised a lot of questions, and I’d like to use this opportunity to answer them.
1) How could this happen?
Every data center is connected to the Internet through high-capacity connections called backbone connections. The “backbone” of the Internet is a group of high-capacity providers called tier 1 providers.
Tier 1 providers are pretty reliable; they have to be, or the Internet wouldn’t work. But they still have problems from time to time. A cut fiber on a construction site, a natural disaster or power outage, someone flipping the wrong switch – any of these things can cause an outage on a backbone connection.
2) Why don’t you have a backup in place?
We do. We have two backbone connections to our servers, provided by different companies. Normally the traffic in and out of the servers is balanced between those two connections using a number of network analysis tools and a lot of routers and switches.
So if one connection is dropped, everyone whose traffic has been routed through that connection is cut off. The other half of the traffic, coming in on the other backbone connection, doesn’t experience a problem. That’s what happened on the 24th.
If there were an extended outage on one of the connections, we could switch all traffic to the working connection. Making that switch (and then switching back when the problem is solved) is not a trivial matter, so we wouldn’t do it unless we anticipated a long outage on the connection that was down.
Long outages on a backbone connection are rare, so rerouting all the traffic is usually unnecessary.
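For the curious, here is a toy Python sketch of why a failure on one of two balanced connections cuts off only part of the traffic, and what rerouting everything to the working connection would change. It is only an illustration: the link names, the `assign_link` and `cut_off_clients` helpers, and the hash-based split are all assumptions made for the example. Real routers decide paths very differently, and this is not a description of our actual network.

```python
import zlib

# Hypothetical names for the two backbone connections.
LINKS = ("backbone_a", "backbone_b")


def assign_link(client_id, forced_link=None):
    """Pick the link a client's traffic uses.

    A stable hash stands in for real routing decisions (an assumption
    for this sketch). If forced_link is set, all traffic uses that link,
    which models switching everything to the working connection.
    """
    if forced_link is not None:
        return forced_link
    return LINKS[zlib.crc32(client_id.encode("utf-8")) % len(LINKS)]


def cut_off_clients(clients, down_link, forced_link=None):
    """Return the clients whose traffic is routed over the failed link."""
    return [c for c in clients if assign_link(c, forced_link) == down_link]


clients = [f"client-{i}" for i in range(1000)]

# Balanced traffic: only clients routed over backbone_a are affected.
affected = cut_off_clients(clients, down_link="backbone_a")

# After moving all traffic to backbone_b, nobody uses the failed link.
after_failover = cut_off_clients(
    clients, down_link="backbone_a", forced_link="backbone_b"
)

print(f"cut off while balanced: {len(affected)} of {len(clients)}")
print(f"cut off after failover: {len(after_failover)} of {len(clients)}")
```

In this sketch the switch is a single argument. In a real network it means reconfiguring routing and then undoing it later, which is why it only makes sense for a long outage.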
3) Why don’t you post the outage on your site or in the forum?
Anyone affected by the outage wouldn’t have been able to see our site or the forum, since they couldn’t reach anything on our network.
We reacted and responded on Google Plus, Twitter and Facebook, which is probably more effective than an outage post somewhere on our site or on a status site somewhere (that no one knows how to get to).
Things like this are part and parcel of life on the Internet. Any provider who tells you they can host your site and there will never be an outage of any kind isn’t telling you the truth. All of these things (even the mighty, mystical cloud) run on hardware. And hardware is just machines, and machines don’t run perpetually without problems.
When they invent machines that do run forever without problems, we’ll be first in line to buy them. I can guarantee that. 😉
Until then, we’ll continue to provide the best service your money can buy, and be open and honest about actual and potential problems.