After the flood of storage news generated at or around VMworld, it's time to return to our discussion of storage performance. In the first two installments, we looked at basic storage performance metrics and how Input/Output Operations Per Second (IOPS) and storage system latency have a greater impact on the performance of applications like databases than simple system throughput. In this final installment, we're going to take a look at how RAID affects the performance of your applications.
In general, RAID increases the reliability and availability of your storage system. The redundancy that it provides comes at a cost, however--not just in additional disk space consumed, but also in the increased amount of work that your disk drives, spinning or solid state, have to do when you write data to a RAID set. The good news is that this write amplification--or, as some call it, write penalty--can be offset by the boost in read performance that comes from reading from multiple drives in parallel. Let's look at how a theoretical RAID controller behaves when reading and writing to some common RAID configurations so we can see the impact of RAID on performance.
In a mirrored configuration like RAID 1 or RAID 10, the controller duplicates all writes to a pair of drives in the RAID set, so each write request from your application becomes two I/Os to the back-end disks. As a result, the number of write IOPS a mirrored RAID set can deliver is half the sum of the IOPS the drives in the set can deliver. A smart RAID controller can distribute read requests across the drives in the mirrored pair, a process Novell dubbed disk duplexing, so that a mirrored RAID set can deliver almost twice as many read IOPS as write IOPS.
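The mirrored-set arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope model of the theoretical controller described here; the function name, the eight-drive set and the figure of 150 IOPS per drive are illustrative assumptions, not measurements.

```python
def mirrored_iops(drive_count: int, iops_per_drive: int) -> tuple[int, int]:
    """Return (read_iops, write_iops) ceilings for a RAID 1 / RAID 10 set.

    Hypothetical helper: assumes an idealized controller that spreads
    reads evenly across both halves of every mirrored pair.
    """
    raw_iops = drive_count * iops_per_drive
    read_iops = raw_iops        # either drive in a pair can serve a read
    write_iops = raw_iops // 2  # every write lands on both drives in a pair
    return read_iops, write_iops


# Assumed example: eight drives at 150 IOPS each.
reads, writes = mirrored_iops(drive_count=8, iops_per_drive=150)
print(reads, writes)  # prints: 1200 600
```

Real controllers fall a little short of the read ceiling, which is why the ratio is better described as "almost" twice than exactly twice.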
Things get a little more complicated with parity-based RAID schemes like RAID 5 and RAID 6, as the amount of work the disks have to do varies depending on the size of the write request. If the write request is smaller than the RAID stripe size (as is common in database applications where the database engine writes data in 4 or 8KB pages to a RAID set that stripes data across its drives writing 64KB to each drive), a storage system running parity RAID will have to perform several IO operations to satisfy a single write request.
To write a small change to a parity RAID set, the controller must read the data currently in the RAID set to memory, insert the new data, calculate the new value for the parity stripe(s) and then write the new data and parity to the back-end disks. While the number of actual I/O operations depends on how many drives are in the RAID set, the process of reading or writing the data occurs in parallel across all the data drives. The net effect for an optimized RAID controller is that each small write causes roughly four times the IOPS and latency that a write to a single drive would. Of course, in the real world many RAID controllers don't have the bandwidth or CPU horsepower to achieve total parallelization, so a 14+1 RAID 5 set will do small I/Os significantly slower than a 5+1 RAID set. Similarly, many RAID controllers calculate the two sets of parity data for a RAID 6 set sequentially, so their RAID 6 performance is substantially less than their RAID 5 performance.
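To make the small-write sequence concrete, here is a minimal sketch of a read-modify-write on a single-parity (RAID 5 style) stripe, assuming simple XOR parity. The function names and the tiny two-byte chunks are hypothetical, and the count of four back-end I/Os reflects the usual read-data, read-parity, write-data, write-parity sequence rather than any particular vendor's controller.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> tuple[bytes, int]:
    """Return (new_parity, backend_ios) for a small write to one chunk.

    Assumed model: the controller reads the old data and old parity
    (2 I/Os), then writes the new data and new parity (2 I/Os).
    """
    # Remove the old data's contribution from parity, then add the new data's.
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    backend_ios = 4
    return new_parity, backend_ios


# Illustrative 3+1 stripe with two-byte chunks.
chunks = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = reduce(xor_bytes, chunks)

new_chunk = b"\xaa\xbb"
new_parity, ios = small_write_parity(chunks[1], parity, new_chunk)

# The updated parity matches a full recalculation across the stripe.
assert new_parity == reduce(xor_bytes, [chunks[0], new_chunk, chunks[2]])
```

The point of the sketch is that the controller cannot compute the new parity without first reading something back from disk, and that extra read phase is where the small-write penalty comes from.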
Parity RAID is better suited to environments like file servers and streaming media, where the write I/O sizes are larger than the stripe size. If you're writing more than 512KB at a time to an 8+1 RAID 5 or 8+2 RAID 6 set with a 64KB stripe size, the RAID controller doesn't need to read the existing data--it can just calculate parity and slam the new data and parity to the disk drives.
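The 512KB figure is simply the number of data drives multiplied by the amount written to each drive. A quick sketch of that check follows; the function name is hypothetical, and it assumes the write starts on a stripe boundary, which a real controller would also have to verify.

```python
KB = 1024


def is_full_stripe_write(write_bytes: int, data_drives: int, chunk_bytes: int) -> bool:
    """True if a stripe-aligned write covers at least one full stripe.

    Hypothetical helper: parity drives are not counted, so an 8+1 RAID 5
    set and an 8+2 RAID 6 set both have data_drives=8.
    """
    full_stripe_bytes = data_drives * chunk_bytes
    return write_bytes >= full_stripe_bytes


# 8 data drives x 64KB per drive = 512KB per full stripe.
print(is_full_stripe_write(512 * KB, data_drives=8, chunk_bytes=64 * KB))  # prints: True
print(is_full_stripe_write(8 * KB, data_drives=8, chunk_bytes=64 * KB))    # prints: False
```

An 8KB database page falls far below the threshold, which is why it takes the read-modify-write path, while a large sequential write clears it easily.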
You, dear reader, should note that vendors have tricks up their sleeves that can make more sophisticated array controllers perform better than the simple RAID controller described here would. A nonvolatile memory cache that allows an array to acknowledge writes before writing data to its back-end storage would give that array lower write latency, as long as the traffic was bursty enough to let the controller flush its cache before another burst of data arrived. Similarly, log-based data structures allow the controller to always write full RAID stripes to free space, reducing the write amplification of small writes to a parity-based RAID set.
While an oversimplification--as all rules of thumb are--the storage admin's rule of thumb to use RAID 10 for random I/O workloads and parity RAID for sequential workloads does hold water. Of course, sequential workloads, other than backups, are becoming rare in today's virtualized data center.
I'd like to thank Duncan Epping for his post on this subject at the Yellow Bricks blog and all the folks who contributed to the long series of comments to that post. That discussion helped me focus my thoughts on the subject of RAID and storage performance.