Automation Not The Solution For Human Error

We've all had our share of misfortunes with IT devices and services that have failed to perform as expected in an increasingly information-centric world. But as much as we may want to fault the technology, it appears that we are to blame in the majority of cases, at least as far as data-center outages are concerned. The solution is not to replace humans with lights-out automation, but to provide better training, processes and procedures, says Julian Kudritzki, vice president of the Uptime Institute. "It's the same things over and over causing the failures, either the lack of processes, procedures and training, or the procedures are not followed."

The institute recently published the Operational Sustainability standard to address the human factor. According to a recent survey from the Ponemon Institute, 95 percent of U.S. data centers have had an unplanned outage.

Respondents averaged 2.48 complete data center shutdowns over the two-year period the survey covered, with an average duration of 107 minutes. Complete shutdowns are frequent, but partial outages are more common still: row- or rack-based outages occurred an average of 6.8 times, with an average duration of 152 minutes, and rack- and server-based downtime occurred an average of 11.2 times over the same two years, with an average duration of 153 minutes. While not the biggest factor, accidental EPO (emergency power off)/human error accounted for 51 percent of the outages.

Kudritzki says human error is in fact a bigger problem than the survey suggests, accounting for up to 70 percent of data-center outages. Since 1994 the institute has been gathering data, in the form of Abnormal Incident Reports, from more than 100 of the largest, most critical sites globally. With just under 5,000 reports in, including 500 on full data-center shutdowns, over 73 percent of events were attributed to human factors.

The problem of human error also seems to be worsening, he adds. "When looked at over the last one-and-a-half to two years, we've actually seen a slight uptick in process-related failures. There's a lot of work we need to do as an industry to address this."
