Hadoop And Enterprise Storage?

Commentary | Storage | 04:00 PM
Howard Marks
Both NetApp and EMC have announced that they're turning the turrets of their marketing battleships toward the Apache Hadoop marketplace that provides the back end to many Web 2.0 implementations. While I understand how Hadoop is attractive to these storage vendors--after all, a typical Hadoop cluster will have hundreds of gigabytes of data--I'm not sure I buy that Hadoop users need enterprise-class storage.
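For context on the Hadoop side of the comparison: HDFS gets its durability from software replication across commodity nodes rather than from hardware redundancy. The replication factor is an ordinary configuration knob (dfs.replication, whose stock default is 3), so a minimal hdfs-site.xml fragment--illustrative values, not a tuned production config--looks like:

```xml
<!-- hdfs-site.xml: illustrative fragment, not a complete configuration -->
<configuration>
  <!-- Each HDFS block is stored on this many DataNodes; 3 is the default. -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
```

Losing a disk, a node, or even a rack just triggers re-replication of the affected blocks; the reliability lives in the software, not the hardware.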

Enterprise storage, on the other hand, is built around a "failure is not an option" model rather than a fault-tolerant one. Controllers, drives, and even drive enclosures are designed for long mean times between failures. That reliability costs money, of course, so you pay more per gigabyte for a VPLEX (or even a CLARiiON) than Google does for its MicroATX motherboards with SATA drives all but duct-taped to them.

To understand the very different enterprise and Web 2.0 models, think for a moment of an engineering school egg drop contest. The rules of the contest state that teams must get a dozen eggs unbroken from the roof of the engineering building to a team cooking omelets on the quad. Teams will be judged on cost, speed and originality.

The enterprise team builds a dumbwaiter to gently lower the eggs in a supermarket package down to the quad. The Hadoop team buys three dozen eggs and a roll of bubble wrap, wraps each egg in the bubble wrap, and throws the eggs off the roof. As long as one-third of the Hadoop team's eggs arrive unbroken, it has solved the problem and spent $30 to $40 (compared with the hundreds of dollars the enterprise team needed for dumbwaiter parts).
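The analogy's math is just a binomial tail: if each bubble-wrapped egg survives the fall independently, the chance that at least a dozen of 36 arrive intact is overwhelming even at modest per-egg odds. The 70% survival rate below is an assumption for illustration, not a number from the contest:

```python
from math import comb

def prob_at_least(n: int, k: int, p: float) -> float:
    """Probability that at least k of n independent trials succeed,
    each with success probability p (a binomial tail sum)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical 70% per-egg survival rate; 36 eggs thrown, 12 needed.
p_dozen = prob_at_least(36, 12, 0.70)
```

With those assumed odds the expected number of survivors is about 25, so coming up short of a dozen is a multiple-sigma event. That is the Web 2.0 bet in miniature: cheap, individually unreliable components, made reliable in aggregate.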

I can just see some application group deciding that Hadoop will help it process the deluge of data in its data center. The proposal eventually reaches the storage group, which looks at the low-cost and--to the storage guy's eye--low-reliability storage in the proposal and says, "This should go on our SAN so we can provide the five-nines reliability enterprise applications require." The project goes ahead with storage on the Symmetrix, and, while it works fine, the organization doesn't see the cost savings it expected because it's spending several times as much for storage as it needs.
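The cost gap that scenario hides is easy to sketch. Every price below is an assumption for illustration--not a quote from this article or any vendor--but it shows why triple-replicated commodity disk can still undercut SAN storage per usable terabyte:

```python
def usable_cost_per_tb(raw_cost_per_tb: float, copies: int) -> float:
    """Cost per usable TB when every block is stored `copies` times."""
    return raw_cost_per_tb * copies

# Hypothetical prices, purely illustrative:
commodity = usable_cost_per_tb(raw_cost_per_tb=50.0, copies=3)     # HDFS-style 3x replication
enterprise = usable_cost_per_tb(raw_cost_per_tb=1500.0, copies=1)  # SAN, redundancy priced in
ratio = enterprise / commodity  # how many times more the SAN approach costs
```

Even after paying the 3x replication tax, the commodity approach in this sketch comes out an order of magnitude cheaper per usable terabyte, which is exactly the savings the hypothetical project forfeits by landing on the array.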

Howard Marks is founder and chief scientist at Deepstorage LLC, a storage consultancy and independent test lab based in Santa Fe, N.M., concentrating on storage and data center networking. In more than 25 years of consulting, Marks has designed and implemented storage ...