The institute recently published the Operational Sustainability standard to address the human factor. According to a recent survey from the Ponemon Institute, 95 percent of U.S. data centers have had an unplanned outage in the past two years.
Respondents averaged 2.48 complete data center shutdowns over the two-year period, with an average duration of 107 minutes. While complete shutdowns are frequent, partial outages are more frequent still: row- or rack-based outages occurred an average of 6.8 times, with an average duration of 152 minutes, and rack- and server-based downtime occurred an average of 11.2 times during the two-year timeframe, with an average duration of 153 minutes. While not the biggest factor, accidental EPO (emergency power off)/human error accounted for 51 percent of the outages.
Kudritzki says human error is in fact a bigger problem, accounting for up to 70 percent of data-center outages. Since 1994, the institute has been gathering Abnormal Incident Reports from more than 100 of the largest, most critical sites globally. Of the just under 5,000 reports received so far, including 500 on full data-center shutdowns, over 73 percent of events were attributed to human factors.
The problem of human error also seems to be worsening, he adds. "When looked at over the last one-and-a-half to two years, we've actually seen a slight uptick in process-related failures. There's a lot of work we need to do as an industry to address this."