OpenFlow is starting to gain some buzz in the industry, with a demonstration at the upcoming Interop show in Las Vegas and vendors starting to adopt the protocol. However, as others begin to learn about OpenFlow and controller-based networking, complaints about single points of failure and single targets of attack get fired off in an almost knee-jerk reaction. Let's stop and take a breath. Single points of failure and single points of attack are common issues in networking and, frankly, have been dealt with in many ways. These objections are non-issues.
Whenever you stand up a new service or project, you or someone you work with has to address availability and security issues. And, guess what? The principles are the same whether you are talking about an application service such as Microsoft Exchange, a network service like a firewall, or a network management command-and-control system. If you can, avoid single points of failure through clustering, hot standby or some other method. You may be able to do that in the product itself or by using external products. If you must have a single point of failure or attack, take steps to reduce the likelihood of failure or attack, and take steps to reduce recovery time.
I am being vague, I know, but how you do it is based on context, the severity of a service disruption, the likelihood of a service disruption, your management practices, product features, and a bunch of other things.
Look at your network today. Is every host dual-homed to different switches? Are those switches powered by separate power distribution systems with backup? Is each tier of the network fully redundant? I bet the answer to most of these questions is either "no" or "in some places." It probably doesn't make economic sense to have all of your access switches fully redundant. Having a spare switch that can be swapped in and running in 15 minutes or less is good enough. In your data center, where a disruption has a larger impact, you are more likely to build in more capacity and redundancy. What you do is all about context.
What about your servers? You know how to run mission-critical services. You do it every day. Sometimes that means running application clusters that can distribute load and fail over statefully if a cluster member fails. In other cases, you can put a load balancer in front of a set of servers and, if a server fails, you only lose those users that are connected to that failed server. Sometimes, you keep a standby server ready to take over the load after it has been provisioned and configured.
Mike Fratto is a principal analyst at Current Analysis, covering the Enterprise Networking and Data Center Technology markets. Prior to that, Mike was with UBM Tech for 15 years, and served as editor of Network Computing. He was also lead analyst for InformationWeek Analytics ...