Single Point Of Failure: The Internet

Mike Fratto

May 31, 2011


"If you call something by a single name, it is a single point of failure," George Reese, CTO of enStratus, has said. I think of that statement as a mnemonic for how IT should think about redundancy. No matter how well you architect for redundancy and availability, there will certainly be single points of failure (SPOF) that you can't account for.

The SPOF may be some feature that is overlooked, a system outside of your control or a more complex issue, like the one Amazon found when its Elastic Block Store service melted down, causing a number of service outages. The SPOF, in all of its forms, can make application mobility via virtual desktop infrastructure (VDI), cloud services and netbooks such as Google's Chromebook less attractive computing options when compared with fat clients, fat servers and boring but reliable storage. While services can fail, let's not forget that the most frequent SPOF we deal with is the access networks at the Internet edge.

I am often shocked when a SPOF rears its ugly head. When I started writing this blog, my Verizon FiOS was down because of a fiber cut affecting Syracuse and Binghamton. [[Correction: Verizon told me on June 3rd that the fiber was part of a ring that stretched from New York City through Broome County. The other side of the ring was taken out the day before by a manhole fire and was under repair. No Verizon services were affected. Once the other side was cut on Thursday by stormy weather, Verizon and First Energy, from which Verizon leased the fiber, had to wait for the all clear from the utility and emergency services to repair the break.]] If the outage affected just Internet access, that wouldn't be horrible, because I still have fat clients running on fat computers. But phone service was affected, too. Ironically, I couldn't call Verizon tech support on my home phone because the service was down, nor could I call 911. Granted, in the eight years I have had Verizon services and the six years with FiOS, we have experienced only one other service outage, so uptime has been good. However, a SPOF that takes out all services for a couple of hundred thousand customers [[Correction: Verizon told me about 24,000 customers were affected]], including emergency services, should never happen.

The FiOS outage had a number of other ill effects. I couldn't synchronize my critical files with cloud services like Dropbox or Microsoft's SkyDrive. I wasn't able to access Google Apps or Microsoft's Office Live. I couldn't connect to any wiki that my company uses internally or externally. I was pretty much reduced to doing tasks using whatever tools and information I had at hand.

I am lucky that I have had only two outages in eight years. At least one co-worker a month announces that he or she has lost Internet access and therefore will be offline for an extended period. At least I have an Android phone and can get email and limp along with web access when needed. I couldn't have tethered my laptop to my phone since I don't pay for tethering and I am unwilling to violate Verizon's terms of service.

There are lots of ways to implement system redundancy. We can do it at the appliance level, at the network level, at the software level, at the metro level or the geographic level, but the hardest component to make redundant and available is the access edge. I chuckle when I hear usually reasonable people say that with the proliferation of 802.11 Wi-Fi, mobile users will have less dependency on 3G/4G carrier data services.

What better indicator than this disclaimer by Google on its Chromebook page: "Obviously, you're going to need a wireless network, be willing to use it subject to the provider's terms and conditions, and be ready to put up with its real life limitations including, for example, its speed and availability. When you do not have network access, functionality that depends on it will not be available."

I don't know about you, but unless I am in an urban area, the chances of finding an open wireless access point are slim. If I do find one, the performance is so poor that using it is more frustrating than accepting the fact that I can't get online. Even in an urban area, I am better off ponying up for a Wi-Fi service so I can avoid sitting in a cramped coffee shop with my kit precariously balanced on a small, wobbly table. By the way, I consider myself a typical business user with relatively modest needs. I know others have faced the same set of issues.

There are some cool and useful cloud services coming from big software vendors like Google and Microsoft, as well as a bunch of smaller services like Zoho that provide enterprise-grade, cloud-hosted software packages. Their services are redundant. Their network connections are redundant. The Internet is redundant. But I can point to the access edge and call it by a single name. That, my friends, is a single point of failure.
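
To put rough numbers on that argument, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption made for illustration: the 99.9 percent figures are invented rather than measured from any provider, the function names are hypothetical, and the math assumes components fail independently, which real outages often don't. Redundant pieces are combined in parallel, and the whole path from user to service is combined in series.

```python
HOURS_PER_YEAR = 24 * 365


def parallel(*availabilities):
    """Redundant components: the group is down only if every member is down."""
    p_all_down = 1.0
    for a in availabilities:
        p_all_down *= (1.0 - a)
    return 1.0 - p_all_down


def series(*availabilities):
    """A chain of dependencies: the path is up only if every link is up."""
    total = 1.0
    for a in availabilities:
        total *= a
    return total


def downtime_hours(availability):
    """Expected hours of downtime per year for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


# Illustrative, made-up availabilities (assumptions, not measurements).
cloud_service = parallel(0.999, 0.999)  # two redundant service instances
internet_core = parallel(0.999, 0.999)  # two redundant backbone paths
access_edge = 0.999                     # one access link with no backup

end_to_end = series(cloud_service, internet_core, access_edge)

# Fraction of the expected downtime that is due to the access link alone.
edge_share = downtime_hours(access_edge) / downtime_hours(end_to_end)

print(f"end-to-end downtime: {downtime_hours(end_to_end):.2f} hours/year")
print(f"share caused by the access edge: {edge_share:.1%}")
```

Under these assumed numbers, nearly all of the expected downtime comes from the one component that has no backup. Making the service or the core more redundant barely changes the total, while the single access link sets the ceiling.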

About the Author(s)

Mike Fratto

Former Network Computing Editor
