How To Avoid Network Outages: Go Back To Basics
February 10, 2014
While there's a lot of hype about hacking and DDoS attacks, the reality is most network outages are caused by an organization’s own people. A recent Gartner study projected that through 2015, 80% of outages impacting mission-critical services will be caused by people and process issues, and more than 50% of those outages will be caused by change/configuration/release integration and hand-off issues.
In fact, both Xbox LIVE and Facebook suffered network outages from configuration errors during routine maintenance. And while China blamed its recent outage on hackers, some independent observers believe it was actually due to an internal configuration error in the firewall.
At the same time, however, external threats can't be discounted. While hackers used to target primarily big-name organizations, attacks are now being carried out on organizations of all sizes. A recent survey of UK SMBs revealed that many had been targeted by attacks, but were not taking the basic steps to protect themselves.
Taking some basic steps and implementing best practices can help protect your organization from unplanned downtime caused by external threats as well as internal ones. Here are several low-tech, low-budget ways to mitigate network outages caused by internal errors:
1. Checks and balances. Code reviews in software development have become a best practice that's increased code quality and significantly reduced errors; IT teams should adopt a similar review practice for network changes.
2. Monitor, monitor, monitor. Ensure systems are monitored properly before any changes are made and configure alerts so that IT teams can respond quickly if the health, availability or performance of a system is impacted negatively following a change.
3. Keep things simple. A series of changes affecting multiple parts of the IT infrastructure can make it difficult to isolate and remediate errors. Break down massive changes into smaller, more manageable chunks that can be reverted atomically.
4. Build in room for error. IT teams often go full steam ahead in rolling out changes without thinking about how they will revert to the previous state. These teams should assume errors will happen and have an action plan ready for addressing them when they do.
5. Communication. Any application or system owner impacted by changes should be notified before those changes are implemented. That notice serves as a precaution, prompting owners to be vigilant for abnormal application or system behavior.
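As a rough illustration of points 2 through 4, the sketch below applies a series of small changes one at a time, checks system health after each, and reverts everything if a check fails. It is only a minimal sketch: the Change class, the roll_out function, and the healthy callback are hypothetical names invented for this example, and a real rollout would call whatever configuration and monitoring tools the team already uses.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Change:
    """One small, independently revertible change (hypothetical structure)."""
    name: str
    apply: Callable[[], None]
    revert: Callable[[], None]


def roll_out(changes: List[Change], healthy: Callable[[], bool]) -> List[str]:
    """Apply changes in order, checking health after each one.

    If a change raises or the health check fails, every change attempted
    so far is reverted in reverse order and an empty list is returned.
    Otherwise the names of the applied changes are returned.
    """
    attempted: List[Change] = []
    for change in changes:
        # Record the change first so a partially applied change is reverted too.
        # This assumes revert() is safe to call even if apply() did not finish.
        attempted.append(change)
        try:
            change.apply()
            ok = healthy()
        except Exception:
            ok = False
        if not ok:
            for done in reversed(attempted):
                done.revert()
            return []
    return [change.name for change in attempted]
```

Keeping each change small and pairing it with its own revert step is what makes the rollback path cheap to write before the rollout starts rather than during an outage.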
[Read why poor planning, bad design, and lack of communication are the biggest risks companies face in "The Banality of IT Failure: Overlooking Mundane Insider Threats."]
Addressing malicious external threats requires more sophistication, but following tried-and-true best practices is still important in protecting networks.
1. Strengthen your shields. The first level of defense is ensuring firewalls are configured properly and systems are patched with the latest security updates. Will this prevent a successful attack? No, but ignoring these basic steps leaves organizations vulnerable.
2. Remain vigilant. Appropriately monitor firewalls and key systems in your network to detect abnormal events, including high connection counts and high CPU and bandwidth utilization. These monitoring systems should be capable of alerting IT staff to abnormal network behaviors and events.
3. Use appropriate technology. Using deep-packet inspection or flow-based technology to monitor network behavior provides a live picture of network traffic, minimizing the time it takes to detect abnormal behavior.
4. Assign responsibility. Ownership empowers and confers accountability. It is extremely important to designate someone in the IT organization to be responsible for the security posture of the company. That individual should be involved in security assessments and analyses, stay abreast of the security threat landscape, be consulted anytime there is suspicion of a security-related attack, and educate the rest of the organization. This does not relieve the rest of the IT team of its security responsibilities, but it puts someone in charge of the effort.
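To make the "remain vigilant" point concrete, here is a minimal sketch of flagging abnormal readings such as a sudden jump in connection count. It assumes a simple rule that is not prescribed above: a reading counts as abnormal when it exceeds the historical mean by more than three standard deviations. The function name, metric names, and threshold are all illustrative assumptions, and commercial monitoring tools use more sophisticated baselines.

```python
from statistics import mean, stdev
from typing import Dict, List


def abnormal_metrics(
    history: Dict[str, List[float]],
    latest: Dict[str, float],
    sigmas: float = 3.0,
) -> List[str]:
    """Return the names of metrics whose latest reading is unusually high.

    A metric is flagged when its latest value exceeds the mean of its
    history by more than `sigmas` standard deviations (an assumed rule).
    Metrics with fewer than two historical samples are skipped.
    """
    flagged = []
    for name, samples in history.items():
        if name not in latest or len(samples) < 2:
            continue
        threshold = mean(samples) + sigmas * stdev(samples)
        if latest[name] > threshold:
            flagged.append(name)
    return sorted(flagged)
```

A scheduled job could feed this function recent samples from firewalls and key systems and notify the designated security owner whenever the returned list is not empty.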
SMBs often minimize the security risks associated with their infrastructures and are thus underprepared to face threats to their network, whether malicious or benign. By following a few basic best practices, organizations of all sizes can start to better defend their networks against unplanned outages.
Joel Dolisy is a senior vice president and CTO/CIO at SolarWinds.