RAID is Born
You can trace the beginnings of RAID to the University of California at Berkeley, where a 1987 paper proposed combining large numbers of disks to increase capacity, improve I/O performance and provide data redundancy.
The initial RAID levels (1 through 5) were defined to support different combinations of data redundancy and I/O performance; level 0, striping with no redundancy, was added later. RAID has gained additional levels over the years, and its usefulness has continued to expand. Arrays are used not only for large data storage needs but also in the growing SAN (storage area network) field. SANs allow the stored data to be shared by multiple users as if the drives were local instead of attached to a file server.
RAID arrays are created from multiple drives connected to a controller. Ordinary SCSI or Fibre Channel controllers can serve a RAID array when software decides which disk each block of data goes to, but Adaptec, American Megatrends and other companies produce controllers specifically designed for RAID arrays. You could use a cheaper software-only RAID solution, but it most likely will support only RAID levels 0 and 1. If building your own array is not an option, look to companies, like Medea, that build arrays for just about every need and computer. Hewlett-Packard Co., Sun Microsystems and other large computer manufacturers also sell RAID products that work with their systems.
RAID depends on three basic technologies to achieve its goals: striping, mirroring and parity. Striping creates large volumes or increases I/O performance; mirroring and data parity store redundant information from which lost data can be re-created if necessary.
Painting Those Stripes
The idea of creating larger volumes by combining disk space is not new. Databases were growing to sizes that couldn't be handled by the disk technology available at the time, so striping was used to combine groups of disks to create one volume or directory that was not contained within a single physical drive.
With striping, data that didn't fit on one disk could be spread across several disks. Early attempts grouped disks together one after another. When the space on one disk was full, the system moved on to the next disk. This was fine for creating larger volumes but did nothing for performance.
In contrast, RAID systems stripe disks together in an interleaved fashion, writing a little bit on the first disk, then some on the next and so on. This increases performance because while the first disk is still writing the information, the computer can move on to send the next set of data to the next device in the stripe. Performance on today's systems can be further improved by separating the disks onto separate controllers.
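The difference between the two layouts comes down to how a logical block number is mapped to a disk. The short Python sketch below illustrates that mapping only; the function names, the one-block stripe unit and the disk sizes are assumptions chosen for the example, not part of any RAID specification or product.

```python
def concat_location(block, blocks_per_disk):
    """Early approach: fill one disk completely, then move on to the next.

    Returns (disk index, block offset on that disk).
    """
    return block // blocks_per_disk, block % blocks_per_disk


def striped_location(block, num_disks):
    """Interleaved approach: consecutive blocks rotate across the disks.

    Returns (disk index, block offset on that disk).
    Assumes a stripe unit of exactly one block (an illustrative choice).
    """
    return block % num_disks, block // num_disks


if __name__ == "__main__":
    # Compare where the first six logical blocks land under each scheme,
    # assuming three disks of 100 blocks each.
    for block in range(6):
        print(block, concat_location(block, 100), striped_location(block, 3))
```

With concatenation, the first six blocks all land on disk 0, so one drive does all the work. With interleaving, they rotate across disks 0, 1 and 2, which is what lets the computer hand the next block to another drive while the first is still writing.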
Mirror Images
While most enterprise customers had been happy to back up all essential data on a daily basis, or just several times a week, such backups could not provide up-to-the-second data availability should a problem occur. Mirroring was designed to keep that up-to-the-second backup by duplicating everything written to one drive on another. In case of a drive failure, or more likely a bad drive sector, the data can be retrieved from the mirror drive. This eliminates the headache of pulling out the backup tapes and trying to restore the needed information. Data can be mirrored onto multiple disks, creating additional redundancy.
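As a rough illustration of the mirroring idea, the Python sketch below duplicates every write onto each drive in a set and falls back to another copy when a sector cannot be read. The class `MirrorSet`, its methods and the use of in-memory dictionaries as stand-ins for drives are all hypothetical; a real controller works at a much lower level.

```python
class MirrorSet:
    """Toy model of a mirror: each 'drive' is a dict of sector -> data."""

    def __init__(self, copies=2):
        # Two copies is a plain mirror; more copies add further redundancy.
        self.drives = [{} for _ in range(copies)]

    def write(self, sector, data):
        # Everything written to one drive is duplicated on the others.
        for drive in self.drives:
            drive[sector] = data

    def fail_sector(self, drive_index, sector):
        # Simulate a bad sector by making the data unreadable on one drive.
        self.drives[drive_index].pop(sector, None)

    def read(self, sector):
        # Return the first readable copy; only fail if every copy is gone.
        for drive in self.drives:
            if sector in drive:
                return drive[sector]
        raise IOError(f"sector {sector} is unreadable on every drive")


if __name__ == "__main__":
    mirror = MirrorSet()
    mirror.write(7, b"payroll record")
    mirror.fail_sector(0, 7)   # the primary drive develops a bad sector
    print(mirror.read(7))      # the data is still served from the mirror
```

The read path is the point of the example: the caller never reaches for a backup tape, because the second copy is already online.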
Parity has been used in the computer industry almost from day one, with most systems storing 9 bits per byte in RAM: 8 data bits plus 1 parity bit. Parity identifies data that is read or written incorrectly, verifying its integrity as it moves from disk to memory or vice versa.
Parity schemes vary depending upon the use but work in the same fashion. A group of bits is added together; if the sum is even, parity is assigned a "0"; if the sum is odd, parity becomes a "1." (This is even parity; odd parity reverses the assignment.) Data is then verified by adding the bits again and checking whether the total is even or odd against the value stored in parity. If they agree, most likely the data is correct, although an even number of flipped bits goes undetected. If they disagree, there is at least one incorrect bit. In a worst-case scenario, the parity bit itself is incorrect.
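A single parity bit can be sketched in a few lines of Python. The example below assumes even parity, where the parity bit is the sum of the data bits modulo 2; the function names and the sample byte are made up for illustration.

```python
def parity_bit(bits):
    """Even parity: 0 if the number of 1 bits is even, 1 if it is odd."""
    return sum(bits) % 2


def verify(bits, stored_parity):
    """Recompute parity and compare it with the value stored alongside the data."""
    return parity_bit(bits) == stored_parity


if __name__ == "__main__":
    data = [1, 0, 1, 1, 0, 0, 1, 0]      # eight data bits, four of them set
    stored = parity_bit(data)            # the ninth bit kept with the byte

    one_error = data.copy()
    one_error[2] ^= 1                    # flip a single bit
    two_errors = one_error.copy()
    two_errors[5] ^= 1                   # flip a second bit

    print(verify(data, stored))          # unchanged data passes the check
    print(verify(one_error, stored))     # a single flipped bit is caught
    print(verify(two_errors, stored))    # two flipped bits cancel out and pass
```

The last case shows why agreement means the data is only "most likely" correct: a single parity bit catches an odd number of flipped bits but cannot see an even number of them.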