NETWORKING

  • 01/28/2016
    6:30 AM

The High Price Of IT Downtime

Outages cost enterprises $700 billion a year, according to IHS study.

What if I told you that every network slowdown and every outright outage that your company suffers costs, on average, over a million dollars?

For large enterprises, these are the findings of a recent study of the cost of server, application and network downtime conducted by IHS in 2015 and released this week.

For this report, IHS surveyed 400 of its mid- to large-size clients throughout North America about the information and communication technology (ICT) downtime woes their decision-makers face. For IHS purposes, midsize companies average approximately $100 million in revenue and 500 employees (within a range of 100 to 1,000 employees); large enterprises average 13,000 employees and nearly $2 billion in revenue.

The result: Those surveyed reported an average of five downtime events each month, and the combined cost is expensive indeed: from $1 million a year for a typical midsize company to more than $60 million a year for a large enterprise.

Extrapolated out, that's a cost to North American companies of $700 billion a year for ICT outages. This includes lost employee productivity (78%), lost revenue (17%), and actual costs to fix the downtime issues (5%).
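To put those percentages in dollar terms, here is a minimal Python sketch that applies the three shares to the $700 billion total. The function and variable names are illustrative, and the dollar amounts are simple arithmetic on the article's figures rather than numbers reported by IHS.

```python
# Figures taken from the article: a $700 billion annual total and three shares.
TOTAL_COST_BILLIONS = 700
SHARES_PERCENT = {
    "lost employee productivity": 78,
    "lost revenue": 17,
    "cost to fix the downtime issues": 5,
}


def cost_breakdown(total_billions, shares_percent):
    """Return each category's share of the total, in billions of dollars."""
    return {
        name: total_billions * pct / 100
        for name, pct in shares_percent.items()
    }


if __name__ == "__main__":
    for name, amount in cost_breakdown(TOTAL_COST_BILLIONS, SHARES_PERCENT).items():
        print(f"{name}: ${amount:.0f} billion")
```

By this arithmetic, lost productivity alone would come to roughly $546 billion, compared with about $119 billion in lost revenue and $35 billion in repair costs.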

IT downtime causes

IHS reports that network interruptions are the biggest factor contributing to downtime, but what causes the actual network interruptions? Where does all of this costly downtime actually come from?

In an interview, Matthias Machowinski, research director for enterprise networks and video at IHS, told me that equipment is the major source of overall downtime, as measured in hours.

Indeed, according to Machowinski, equipment failures and other equipment problems account for close to 40% of all reported downtime. Service provider problems and internal human errors each make up nearly 25% of downtime. Trailing these bêtes noires are system attacks, which, despite all of the cybersecurity hype these days, contribute only about 10% of all downtime.

How to reduce IT downtime

Machowinski said the most popular technique for mitigating downtime is network-monitoring implementations. Sixty-four percent of respondents in the IHS study indicated that they are pursuing this strategy, and Machowinski thinks it's a good one.
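As a rough illustration of what the simplest form of network monitoring looks like, the Python sketch below probes a list of services over TCP and reports the ones that do not answer. It is only an assumption about one possible approach, not a description of what the surveyed companies run; the function names and the example target are hypothetical, and real monitoring products check far more than reachability.

```python
import socket


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_targets(targets, timeout=2.0):
    """Probe each (name, host, port) tuple and return the names that failed."""
    return [
        name
        for name, host, port in targets
        if not is_reachable(host, port, timeout)
    ]


# Example usage with a hypothetical target list:
#   down = check_targets([("intranet web", "intranet.example.com", 443)])
#   if down:
#       print("Unreachable:", ", ".join(down))
```

Run on a schedule, a check like this lets IT staff learn about an outage from the probe rather than from a help desk call.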

"You don't want the user to be your early-warning system," he said. 


The second-most popular downtime-mitigation technique (at about 57%) -- one that Machowinski especially recommends -- is incorporating more redundancy into networks. Added redundancy can solve both on-premises equipment failures and problems with hosted service providers, he said.

"Equipment is going to fail at some time or another," Machowinski said. With redundant networks, there's still some connectivity, even if backups aren't running at 100%, he explained.

To this end, he recommends "keeping spares on hand…or at least being able to source a spare relatively quickly – preferably same date, maybe within hours." He further suggests that IT departments take advantage of premium support offerings from vendors to source replacement equipment more efficiently and keep downtime to a minimum.

Machowinski points out that, beyond these techniques, enterprise organizations can take and are taking a number of other steps to cut downtime, including better training, improved hiring processes, and greater reliance on backup processes that are independent of ICT-based systems (e.g., good old-fashioned pen and paper).

"You really need to have a multi-pronged strategy [and] understand the effects of downtime and the importance of creating equipment that minimizes downtime within ICT infrastructure," Machowinski said. "[Downtime] is a serious issue that companies really need to take a hard look at, and I think it's a good example of where a small investment can have such tremendous benefits to a company."


Comments

more than just the network

I agree that network equipment issues are important, but a more holistic view certainly matters too: one that incorporates configuration (human) errors for HW and SW and, as you referenced, forced downtime (due to a security breach). What I find interesting is that the study attributes 40% of downtime to hardware errors, but I was led to believe, based on earlier studies, that the share was traditionally lower. I think this is worth revisiting if hardware is a major culprit, since those failures are straightforward to address with redundancy or architectural changes.

Re: more than just the network

That's a good point, Dan. It seems that most of the high-profile outages (e.g., AWS) are caused by human configuration errors, not hardware problems.

Re: more than just the network

Downtime that is created in areas within the organization's control is manageable. However, downtime that is created through external events can be difficult to contain.
For instance, if power backup has been provisioned to provide 24 hours of backup power and freezing rain has caused a power outage that lasts for 48 hours, it is just a matter of time before operations come to a complete standstill. It would still be a good idea to reconfigure the power management system to shut down nonessential equipment, stretching the backup time to 30 hours.
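A back-of-the-envelope Python sketch shows how much load would have to be shed to get from 24 to 30 hours. It assumes a fixed amount of stored energy and a runtime inversely proportional to load, which is a simplification (real batteries and generators do not scale this cleanly), and the function name is hypothetical.

```python
def load_fraction_for_runtime(rated_hours, target_hours):
    """Fraction of the original load that can stay on to reach target_hours.

    Assumes fixed stored energy and runtime inversely proportional to load.
    """
    if rated_hours <= 0 or target_hours <= 0:
        raise ValueError("hours must be positive")
    # Never report more than 100% of the original load.
    return min(1.0, rated_hours / target_hours)


if __name__ == "__main__":
    for target in (30, 48):
        keep = load_fraction_for_runtime(24, target)
        print(f"{target} hours: keep {keep:.0%} of load, shed {1 - keep:.0%}")
```

Under that assumption, stretching 24 hours of backup to 30 means shedding about 20% of the load, while riding out the full 48-hour outage would mean shedding half of it.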

Re: more than just the network

There are other variables the study has not covered.

The Cost of Down Time Takes On Many Forms

The estimated cost of downtime doesn’t surprise me, yet I wonder why a study of how much time employees waste when the network is up has never made the headlines.

Very few employees are productive when the network is functioning fine; these employees would rather surf the net and keep up with friends on Facebook.

Oh, the sky is falling when the network goes down, but when it comes back up, we still have over half a workforce wasting time.

Re: The Cost of Down Time Takes On Many Forms

Around the year 2013, a report was published that stated social media was costing the economy $650 billion. There are lots of hypotheses that assume social media is beneficial for employees and their productivity in the long term, and an equal number of alternative hypotheses.
At the end of the day, it all depends on the employee and the level of professionalism they are aiming for. If entrepreneurs can set up entire businesses by allocating their time resources, employees must also allocate their resources to complete the task at hand.

Re: The Cost of Down Time Takes On Many Forms

@Brian.Dean Thanks for the figure. Now that you mention it, that number does look familiar. It is really frustrating for tech professionals to constantly hear how much money is being lost when they are in the midst of trying to solve what are sometimes very complex issues.

And once all the complaining has subsided and the network is back, most go back to what they were doing before it went down: nothing.

But you are right, it depends upon the professionalism and maturity of the individual; some can handle it, some can't.

Regardless, the amount of money wasted is astounding and probably even higher today.

Re: The Cost of Down Time Takes On Many Forms

ClassC,

I have seen and heard of so many Organizations (primarily in the Public Sector) where employees are hired without any work profile, just for the sake of keeping the population busy/occupied.

How are we to judge whether this wastage amount involves only such underemployed folks or also involves folks who are Gainfully employed but are using Social Media as a break/Stress-buster tool?

Re: The Cost of Down Time Takes On Many Forms

"...How are we to judge whether this wastage amount involves only such underemployed folks or also involves folks who are Gainfully employed but are using Social Media as a break/Stress-buster tool? "

@Ashu001 Great question. I think we have no choice but to rely on the honor system for the most part, but companies need to vet prospective applicants for their penchant for surfing.

I can understand the need to use it as a stress relief tool (I am guilty of this myself). I think the point here is responsible use.

Some can do this and some can’t. The problem is that those who can’t are significant in number. Maybe this is an issue that cannot be solved, yet we have to acknowledge it and mitigate it to the extent a company can. No easy task for sure.

Re: The Cost of Down Time Takes On Many Forms

ClassC,

I am sure you are aware that most Corporations today do employ HR Teams whose principal job is to review Social Media Networks to get a handle on the person they are interviewing for their vacant position.

A lot of folks (especially Privacy advocates) might claim this amounts to infringement of their Privacy, but I for one am not so sure.

If you don't want somebody to know something about you, just don't post it online.

That's my mantra.

Planned outages

One cause of outages we deal with, and one that costs a lot of productivity, is planned outages. These can be divided several ways, but I'd go with Facility, Hardware, and Software. Facility outages are things like the need to perform power maintenance. The data center power infrastructure, such as UPSs, PDUs, switchgear, and generators, needs preventative maintenance. For the Software and Hardware areas, outages are needed to perform upgrades and maintenance. I know these are all valid things and are often done overnight or on weekends. However, they still cause lost time. I wish organizations would realize how the constant flood of upgrades, changes, etc. interrupts each worker's rhythm. There is some value to keeping things the way they are.

Re: Planned outages

Thanks for chiming in, Scott. That's a difficult situation. Like you say, the maintenance is required, but it invariably impacts employees. Is there anything you think might help?

Re: Planned outages

Scott,

Good and critical points raised by you here.

Let's put it another way: Would you rather have Planned Downtime/Outages or Unplanned Outages caused by Infrastructure that eventually breaks down?

While I personally prefer the former, I am not surprised that there are folks who work on a totally different rhythm from mine and would rather not see any downtime whatsoever.

Can we reach a middle-Ground somewhere by polling all Employees with some Options for when they would prefer planned Downtime to happen and then going with what the Majority feels works best for them?

I know that would be a really-really good feature to have.

The Poll would look something like this:

"We plan to take down the XYZ part of IT Infrastructure for routine upgrades and Maintenance sometime in the Month of February. Please select which among the 4 options below is most suitable and most convenient for you."

Such a poll would ensure that folks don't feel disenfranchised from the whole process and would readily work with most folks.

Your two cents?

Re: Planned outages

@Ashu001 I think that is a great idea, the best solution to this interruption of routine. My argument is that most workers take their routine for granted. As was mentioned, most of this type of work is carried out during the night and on weekends.

Which explains the stress that comes with these types of roles. I can appreciate this angle; I just refuse to believe that employees are inconvenienced by routine maintenance.

Re: Planned outages

ClassC,

I for one am always for more Transparency and giving folks at least some semblance of Power & Responsibility.

This is why I mentioned the polling scenario.

At least this way, the majority of the folks will be on board with the Organization on routine Maintenance issues.

Sure, there will always be outliers in every organization. I have a developer who is able to focus and work with 100% Concentration ONLY on Saturdays and Sundays (because his Wife is home to take care of the kids then); he gripes the most when we take down the System on Weekends.
We try our best to keep everyone happy, but we really can't do much in such special cases. Can we?

LOL!