Oregon Storage Debacle Highlights Need To Plan For Failure

Every so often, a public IT failure provides a reminder of why it’s important that IT folks pay attention to the little things. Just such an incident landed the state of Oregon in the news recently, when a storage upgrade went awry, causing what a state spokesman reportedly called a “catastrophic failure” that cut off the state’s storage area network from the agencies it serves.

The impact of the failure was felt throughout the state. Child support payments and unemployment checks for new recipients were delayed. Employees couldn’t access email. The forestry service lost access to maps, the state’s job search portal crashed, and overnight computing processes were interrupted. In other words, it really was a catastrophic failure--the kind of nightmare scenario that IT folks dread.

Naturally, blame has to be assigned, and early indications are that the state is pointing the finger at its storage vendor, Hitachi.

But Greg Schulz, a senior advisory consultant with research firm StorageIO, said there’s plenty of blame to go around in such situations, and that the state may also want to take a look in the mirror.

“Ultimately, you--the deployer--are responsible,” Schulz said. “Did everything get outsourced to Hitachi, or did they have oversight? They have to do a post-mortem, address how it happened, how it could have been prevented, and look at what their options are.”

Schulz also said there’s an opportunity to learn from the storage fiasco--both for the state of Oregon and for IT departments everywhere. One of those lessons, he said, is to always have contingency plans in place.

“Fundamental IT 101 is that all technology will fail, despite what the vendors tell you,” Schulz said. And the most likely time for technology to fail, he noted, is when people are involved--doing configurations, making changes or updates, or performing upgrades.


The prospect of such failures should motivate organizations to perform more due diligence when buying and deploying storage and other infrastructure technology so that they can minimize potential damage. Specifically, there are three steps organizations should take, according to Schulz:

• When making buying decisions, companies should think hard about how they’re going to use new tools. Businesses that jump into a technology purchase without thinking through use scenarios may run into problems down the road.

• Vendors might say they can address any issues online, but Schulz suggests asking them to put you on the phone with another customer under a non-disclosure agreement before making a purchase so you can candidly ask what to expect when things don’t go quite right.

• Be clear about the availability you require from each of your applications, and make sure you replicate the ones with high-availability requirements in a parallel system to protect them from inevitable failures.

Schulz suspects that Oregon could have minimized the damage from its recent incident if it had ensured higher availability for its services in the event of failure. Rather than throw its vendor under the bus, he said, the state should focus on answering a fundamental question as it troubleshoots its storage area network: How could this have been prevented?

The simple act of asking such questions could mean that Oregon reduces a future large-scale failure to a relative blip.

“You need to isolate and contain faults to prevent them from becoming a disaster,” Schulz said. “If anything can happen, it will. If there is that chance that it can happen, you mitigate it.”
