Storage

10:43 PM
Howard Marks
Commentary

What Comes After RAID? Erasure Codes

As I mentioned a few blog entries ago, the basic math behind parity-based RAID (Redundant Array of Inexpensive Disks) solutions is starting to break down. While I think it's important for those of us who spend our days thinking about these things to raise the alarm, it's more important to think and write about the technologies that can take us past parity RAID. One major contender is Reed-Solomon erasure codes, which vendors are starting to use as an efficient alternative to parity or mirroring.

As we've previously discussed, the problem with parity RAID is that disk drives have been getting bigger much faster than they've been getting faster or more reliable. In just a few years, we'll be buying 10TB (8x10^13 bits) drives that will take 10 hours or more to read end to end, pushing a RAID rebuild into a several-day event with a high probability of failure due to a read error on one of the other drives.
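To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 280 MB/s sustained throughput, the error rate of one unrecoverable read error per 10^14 bits, and the seven surviving drives in the RAID group are all assumptions chosen for illustration, not figures from any particular drive or array.

```python
import math

def full_read_hours(capacity_bytes, throughput_bytes_per_s):
    """Hours needed to read a drive end to end at a constant rate."""
    return capacity_bytes / throughput_bytes_per_s / 3600

def p_read_error(bits_read, errors_per_bit):
    """Probability of at least one unrecoverable error while reading bits_read bits."""
    # 1 - (1 - p)^n, computed with log1p/expm1 so it stays accurate for tiny p
    return -math.expm1(bits_read * math.log1p(-errors_per_bit))

CAPACITY_BYTES = 10 * 10**12        # a 10TB drive
BITS_PER_DRIVE = CAPACITY_BYTES * 8  # 8x10^13 bits
ASSUMED_THROUGHPUT = 280e6           # assumption: 280 MB/s sustained
ASSUMED_ERROR_RATE = 1e-14           # assumption: 1 error per 10^14 bits read
ASSUMED_SURVIVORS = 7                # assumption: a 7+1 RAID-5 group

hours = full_read_hours(CAPACITY_BYTES, ASSUMED_THROUGHPUT)
p_one_drive = p_read_error(BITS_PER_DRIVE, ASSUMED_ERROR_RATE)
p_rebuild = p_read_error(ASSUMED_SURVIVORS * BITS_PER_DRIVE, ASSUMED_ERROR_RATE)

print(f"full read: {hours:.1f} hours")
print(f"read error on one full drive read: {p_one_drive:.0%}")
print(f"read error somewhere during rebuild: {p_rebuild:.1%}")
```

Under these assumptions a single end-to-end read already has better than even odds of hitting an error, and a rebuild that must read every surviving drive is close to certain to hit one.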

Reed-Solomon erasure codes (which got their name from their original use as a forward error correction method for sending data over an unreliable channel that may fail to transfer, or erase, some data) can extend the data protection model from RAID-5's simplistic n+1 to substantially higher levels of protection. Rather than separating the data from error correction or check data as parity and CRCs do, erasure codes expand the data, adding redundancy so that even if a portion of the data is mangled or lost, the original data can be retrieved from the remaining portion.
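The idea of recovering the original from any sufficiently large subset can be shown with a toy Reed-Solomon-style code. This is a minimal sketch, not how any vendor implements it: it works over the prime field of integers mod 257 for readability (production codes generally use GF(2^8) and optimized matrix math), and the function names are made up for this example. The k data bytes are treated as values of a polynomial at x = 0..k-1, and the n fragments are that polynomial's values at x = 0..n-1, so any k fragments pin down the polynomial and with it the data.

```python
from itertools import combinations

P = 257  # prime just above 255, so every byte value is a field element

def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through points, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(data, n):
    """Expand k data symbols into n fragments, each an (x, value) pair. Requires n <= P."""
    points = list(enumerate(data))
    return [(x, lagrange_eval(points, x)) for x in range(n)]

def decode(fragments, k):
    """Rebuild the k data symbols from any k distinct fragments."""
    points = fragments[:k]
    return [lagrange_eval(points, x) for x in range(k)]

data = [72, 105, 33]          # three data bytes, so k = 3
fragments = encode(data, 5)   # n = 5: any two fragments may be lost

# Try every possible set of three surviving fragments.
recovered = [decode(list(subset), 3) for subset in combinations(fragments, 3)]
print(all(r == data for r in recovered))
```

Because the first k fragments equal the data itself, this particular sketch is a systematic code; the recovery property is the same either way.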
 
Erasure codes have been around for decades, in applications ranging from data transmissions from deep space probes, where several minutes of latency make a TCP-style timeout and retransmit impractical, to CDs and DVDs, which use erasure codes to handle dust, scratches and other impairments of the vulnerable disc. They've even made it quietly into enterprise storage, as vendors use Reed-Solomon math to calculate the ECC (Error Correcting Codes) for their n+2 RAID-6 implementations.

Erasure codes get really interesting, however, when we up the ante beyond n+2, as several vendors have. NEC's HYDRAstor deduplicating grid system uses erasure codes to spread each data chunk across twelve disk drives in the grid. With a protection level of 9 of 12, the original data can be reconstructed from any nine of the twelve fragments. HYDRAstor users can set the protection level as high as 6 of 12, which would have the same 50% overhead as mirroring but be able to deliver the data after six drive failures.
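The trade-off between protection level and capacity is simple arithmetic. The sketch below, with a hypothetical helper named scheme, treats a "k of n" code as needing any k of n fragments and treats mirroring as the 1 of 2 case; it reports how many drive failures each setting survives and what share of raw capacity goes to redundancy.

```python
from fractions import Fraction

def scheme(k, n):
    """Summarize a 'k of n' code: any k of the n fragments rebuild the data."""
    return {
        "failures_tolerated": n - k,
        # share of the raw capacity that holds redundancy rather than data
        "raw_fraction_redundant": Fraction(n - k, n),
        # extra storage relative to the size of the data itself
        "extra_over_data": Fraction(n - k, k),
    }

for label, k, n in [("mirroring", 1, 2), ("9 of 12", 9, 12), ("6 of 12", 6, 12)]:
    s = scheme(k, n)
    print(f"{label}: survives {s['failures_tolerated']} failures, "
          f"{float(s['raw_fraction_redundant']):.0%} of raw capacity is redundancy")
```

Mirroring and 6 of 12 both spend half the raw capacity on redundancy, but mirroring survives one failure per pair while 6 of 12 survives any six.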

Cleversafe has extended erasure coding to add location information, creating what they call dispersal coding. This lets them ensure that blocks are stored not just on different disk drives or different nodes of their RAIN cluster, but even in different data centers. Using their default coding, which requires 10 of the 16 chunks created for each data stripe in order to reconstruct the original data, you can tell their system to disperse the data across three data centers and still be able to read the data if one data center goes offline, with less than 40% of the raw capacity (6 of every 16 chunks) devoted to redundancy. A typical replicated solution in three data centers would have 200% overhead: two extra copies for every copy of the data, or about 67% of raw capacity.
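Whether a 10 of 16 code survives the loss of a data center depends on how the chunks are placed. The sketch below assumes a simple round-robin placement, which is a stand-in policy for illustration and not a description of Cleversafe's actual algorithm; the function names are likewise hypothetical.

```python
def spread(n_chunks, n_sites):
    """Chunks per site under an assumed round-robin placement."""
    counts = [0] * n_sites
    for chunk in range(n_chunks):
        counts[chunk % n_sites] += 1
    return counts

def survives_site_loss(k, n, n_sites):
    """True if at least k chunks remain no matter which single site goes offline."""
    return all(n - lost >= k for lost in spread(n, n_sites))

placement = spread(16, 3)
print(placement)                       # chunks held by each of the three sites
print(survives_site_loss(10, 16, 3))   # the 10 of 16 case from the text
print(survives_site_loss(11, 16, 3))   # a stricter code with the same 16 chunks
print(f"{(16 - 10) / 16:.1%} of raw capacity is redundancy")
```

With 16 chunks spread 6, 5 and 5 across the sites, losing the largest site leaves exactly the 10 chunks needed, so the 10 of 16 setting is the tightest threshold that still tolerates a site outage under this placement.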
