Enterprise storage, on the other hand, is based more on a "failure is not an option" model than on a fault-tolerant model. Controllers, drives and even drive enclosures are designed to have long mean times between failures. This reliability, of course, costs money, so you pay more per gigabyte for a VPLEX (or even a CLARiiON) than Google does for its MicroATX motherboards with SATA drives all but duct-taped to them.
To understand the very different enterprise and Web 2.0 models, think for a moment of an engineering school egg drop contest. The rules of the contest state that teams must get a dozen eggs unbroken from the roof of the engineering building to a team cooking omelets on the quad. Teams will be judged on cost, speed and originality.
The enterprise team builds a dumbwaiter to gently lower the eggs in a supermarket package down to the quad. The Hadoop team buys three dozen eggs and a roll of bubble wrap, wraps each egg in the bubble wrap, and throws the eggs off the roof. As long as one-third of the Hadoop team's eggs arrive unbroken, it has solved the problem and spent $30 to $40 (compared with the hundreds of dollars the enterprise team needed for dumbwaiter parts).
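The bubble-wrap strategy is just redundancy by another name: HDFS stores three copies of each block by default, so data is lost only if every copy fails. A minimal sketch of that arithmetic, assuming failures are independent (a simplification real clusters don't always honor):

```python
# Redundancy math behind the egg-drop analogy: keep several cheap
# copies and survive as long as at least one makes it. HDFS's default
# replication factor is 3. Independent failures are assumed here,
# which is an idealization.

def survival_probability(p_loss: float, replicas: int = 3) -> float:
    """Chance that at least one of `replicas` copies survives, given
    each copy is independently lost with probability p_loss."""
    return 1.0 - p_loss ** replicas

# Even a 10% chance of losing any single copy leaves three replicas
# with roughly a 99.9% chance that the block survives.
print(round(survival_probability(0.10), 3))
```

The point of the model is that unreliable parts plus replication can beat reliable parts on cost, which is exactly the trade the Hadoop team makes with its $30 of extra eggs.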
I can just see some application group deciding that Hadoop will help them process the deluge of data in their data center. The proposal finally comes to the storage group, which looks at the low-cost and--to the storage guy's eye--low-reliability storage in the proposal. They say, "This should go on our SAN so we can provide the five-nines reliability enterprise applications require." The project goes ahead with storage on the Symmetrix, and, while it works fine, the organization doesn't see the cost savings it expected because it's spending several times as much on storage as it needed to.