Howard Marks

Network Computing Blogger

Upcoming Events

Where the Cloud Touches Down: Simplifying Data Center Infrastructure Management

Thursday, July 25, 2013
10:00 AM PT/1:00 PM ET

In most data centers, DCIM rests on a shaky foundation of manual record keeping and scattered documentation. OpManager replaces data center documentation with a single repository for data, QRCodes for asset tracking, accurate 3D mapping of asset locations, and a configuration management database (CMDB). In this webcast, sponsored by ManageEngine, you will see how a real-world datacenter mapping stored in racktables gets imported into OpManager, which then provides a 3D visualization of where assets actually are. You'll also see how the QR Code generator helps you make the link between real assets and the monitoring world, and how the layered CMDB provides a single point of view for all your configuration data.

Register Now!

A Network Computing Webinar:
SDN First Steps

Thursday, August 8, 2013
11:00 AM PT / 2:00 PM ET

This webinar will help attendees understand the overall concept of SDN and its benefits, describe the different conceptual approaches to SDN, and examine the various technologies, both proprietary and open source, that are emerging. It will also help users decide whether SDN makes sense in their environment, and outline the first steps IT can take for testing SDN technologies.

Register Now!

More Events »

Subscribe to Newsletter

  • Keep up with all of the latest news and analysis on the fast-moving IT industry with Network Computing newsletters.
Sign Up

See more from this blogger

What Comes After RAID? Erasure Codes

As I mentioned a few blog entries ago, the basic math behind parity based RAID (Redundant Array of Inexpensive Drives) solutions is starting to break down. While I think it's important for those of us that spend our days thinking about these things to raise the alarm, it's more important to think and write about the technologies that can take us past parity RAID. One major contender is Reed-Solomon erasure codes, which vendors are starting to use as an efficient alternative to parity or mirroring.

As we've previously discussed, the problem with parity RAID is that disk drives have been getting bigger much faster than they've been getting faster or more reliable. In just a few years, we'll be buying 10TB (1x10^13 bits) drives that will take 10 hours or more to read end to end, pushing a RAID rebuild into a several day event with a high probability of failure due to a read error on the other drives.

Reed-Solomon erasure codes (which got their name from their original use as a forward correction method for sending data over an unreliable channel which may fail to transfer, or erase, some data) can extend the data protection model from RAID-5's simplistic n+1 to substantially higher levels of protection. Rather than separating the data from error correction or check data as parity and CRCs do, erasure codes expand the data, adding redundancy so even if a portion of the data is mangled or lost,  the original data can be retrieved from the remaining portion.
Erasure codes have been around for decades, for applications like data transmissions from deep space probes, where the several minutes of latency makes a TCP style timeout and retransmit impractical, to CDs and DVDs which use erasure codes to handle dust, scratches and other impairments of the vulnerable disk.  They've even made it quietly into enterprise storage as vendors use Reed-Solomon math to calculate the ECC (Error Correcting Codes) for their n+2 RAID-6 implementations.

Erasure codes get really interesting, however, when we up the ante beyond n+2 as several vendors have.  NEC's HydraStor deduplicating grid system uses erasure codes to spread each data chunk across twelve disk drives in the grid.  With a protection level of 9 of 12, the original data can be reconstructed from any nine of the twelve data chunks. Hydrastore users can set the protection level as high as 6 of 12 which would have the same 50% overhead as mirroring, but be able to deliver the data after six drive failures.

Cleversafe has extended erasure coding to add location information, creating what they call dispersal coding. This lets them insure that blocks are stored, not just on different disk drives or different nodes of their RAIN cluster, but even in different data centers. Using their default coding, which requires 10 of the 16 chunks created for each data stripe in order to reconstruct the original data, you can tell their system to disperse the data across three data centers and be able to read the data if one data center went off line with less than 40% overhead. A typical replicated solution in three datacenters would have 200% overhead.

Page:  1 | 2  | Next Page »

Related Reading

More Insights

Network Computing encourages readers to engage in spirited, healthy debate, including taking us to task. However, Network Computing moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing/SPAM. Network Computing further reserves the right to disable the profile of any commenter participating in said activities.

Disqus Tips To upload an avatar photo, first complete your Disqus profile. | Please read our commenting policy.
Vendor Comparisons
Network Computing’s Vendor Comparisons provide extensive details on products and services, including downloadable feature matrices. Our categories include:

Data Deduplication Reports

Research and Reports

Network Computing: April 2013

TechWeb Careers