Just a few years ago, the typical high-end RAID array box looked a bit like a mainframe, with heavy metal components that supported the notion that it was the pillar of the IT department.
Pricing was mainframe class, too, and all that gold-anodized metal reinforced the user’s belief that multi-path Fibre Channel arrays were worth their half-million-dollar cost, including the $5,000 per drive that was standard.
Today, however, storage arrays are very compact, very dense, and a lot cheaper than the “big iron” that used to rule the enterprise space. Let's take a look at the evolution of the storage array.
A number of factors led to a rethinking of values in the array space. First, SAS replaced Fibre Channel as the drive interface, and 15K RPM drives virtually disappeared from product lines. Both changes, and especially the lower spin speeds, simplified the superstructure needed to mount the drives.
At the same time, 2.5-inch drives offered a better packing density and more packaging options, while being less sensitive to vibration issues. This led to really compact 24-drive arrays taking only a 2U slot in the rack.
Not to be left behind, the 3.5-inch array makers responded with 12-drive server structures that could be turned into JBODs and arrays, then with 48- to 60-drive boxes that solved drive replacement much more elegantly than the old-fashioned drive caddy did.
With 60-drive boxes down to 4U in size, and capable of holding one or two controllers, the coup de grace came with the arrival of SSDs and the rapid capacity increases in 3.5-inch hard drives. With a new tiering approach using ultra-fast SSD as primary-tier storage and low-cost, huge-capacity SATA drives for slow data as a secondary tier, the stage was set for the array to really shrink, even while holding much more data.
SSD performance accelerated this transition. With SSDs having as much as 1,000 times the random I/O performance compared with HDDs, no controller could meet the requirements of more than a dozen SSDs. This led to hybrid arrays with a mix of SSDs and HDDs, and for these, the 48- to 60-drive boxes fit the bill.
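To put that ratio in perspective, here is a back-of-the-envelope sketch in Python. It uses only the figures quoted above (the "as much as 1,000 times" ratio and "a dozen SSDs"); the function names are made up for illustration, and the ratio is an upper bound, not a measured value.

```python
# Rough arithmetic only: the 1,000x figure is the upper bound quoted in the
# article, not a benchmark result. Function names are hypothetical.
SSD_TO_HDD_RANDOM_IO_RATIO = 1000


def hdd_equivalents(ssd_count: int, ratio: int = SSD_TO_HDD_RANDOM_IO_RATIO) -> int:
    """Number of hard drives whose combined random I/O matches ssd_count SSDs."""
    return ssd_count * ratio


def ssds_to_match_hdd_box(hdd_count: int, ratio: int = SSD_TO_HDD_RANDOM_IO_RATIO) -> float:
    """Fraction of one SSD needed to match the random I/O of hdd_count hard drives."""
    return hdd_count / ratio


# A dozen SSDs at the quoted ratio versus a fully populated 60-drive HDD box.
print(hdd_equivalents(12))
print(ssds_to_match_hdd_box(60))
```

At the quoted ratio, a dozen SSDs can generate the random I/O load of up to 12,000 hard drives, while an entire 60-drive box of hard drives amounts to a small fraction of a single SSD. A controller sized for spinning disks simply was not built for that.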
The search for a way to get data off the SSD led to the all-flash array approach, where a highly customized solution feeds a bunch of ultra-fast interfaces. These units can deliver 1 million to 2 million random IOPS, but the compact nature of flash (its “IOPS density”) means that a typical unit is just 3U or 4U in size.
With HGST’s recent announcement of 10 TB drives, we are looking at 6 petabytes of bulk storage in a single rack, coupled with a few million IOPS from all-flash arrays mounted in the racks with the servers to take advantage of direct connections to the head-of-rack switches.
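The 6-petabyte figure can be reproduced with simple arithmetic. The sketch below assumes the 60-drive, 4U boxes described earlier and, as a further assumption not stated in the article, 40U of rack space given over to storage (a standard 42U rack with a little room left for switching). The function name is hypothetical.

```python
def rack_capacity_tb(drive_tb: int, drives_per_box: int,
                     box_height_u: int, usable_rack_u: int) -> int:
    """Raw capacity in TB of a rack filled with identical drive boxes."""
    boxes = usable_rack_u // box_height_u  # whole boxes that fit in the space
    return boxes * drives_per_box * drive_tb


# 10 TB drives, 60 drives per 4U box, 40U of usable space (assumed).
capacity_tb = rack_capacity_tb(drive_tb=10, drives_per_box=60,
                               box_height_u=4, usable_rack_u=40)
print(capacity_tb, "TB raw, or", capacity_tb / 1000, "PB")
```

Ten boxes of 60 drives at 10 TB each comes to 6,000 TB of raw capacity, which matches the 6 petabytes quoted above. Usable capacity will be lower once data protection is accounted for.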
The evolution is likely to continue, with all-flash arrays vying for survival against in-server persistent storage that can be networked via RDMA. The bulk storage appliances also are evolving, with one likely change being loss of the ability to replace drives, which will fit the zero-repair maintenance approach that new data integrity systems make possible.
All-flash arrays will get faster networking and more software features such as compression and deduplication, and will tie into the operating systems much more closely. Violin is heading there with its Windows Flash Array, and Red Hat is well positioned to tie products to Linux.
Some industry watchers believe that arrays are going out of style altogether, to be replaced by software-defined storage. The need for performance probably precludes general-purpose virtualized servers from handling actual transfers for all-flash arrays, but the concept of distributed storage computing will further impact array configurations. Judging by the number of terms being tossed around in connection with software-defined storage, such as converged and hyper-converged, the approach is getting a lot of interest, even if some of it is just hype.
Looking a bit further out, the transition to an object storage model is gaining speed, due to the ability to scale out capacity and manage data more easily, especially when unstructured big data is involved. The smaller box sizes are good matches for object storage software, and the major software packages now support both replication and erasure coding for data integrity, which enables simpler designs and lower maintenance.
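The difference between the two protection schemes shows up clearly in raw-capacity overhead. The following sketch compares them; the three-copy replication and the 10+4 erasure-coding layout are illustrative assumptions, not configurations named by any particular package, and the function names are hypothetical.

```python
def replication_overhead(copies: int) -> float:
    """Raw bytes stored per usable byte when keeping `copies` full replicas."""
    return float(copies)


def erasure_overhead(data_fragments: int, parity_fragments: int) -> float:
    """Raw bytes stored per usable byte for a k+m erasure code."""
    return (data_fragments + parity_fragments) / data_fragments


def usable_tb(raw_tb: float, overhead: float) -> float:
    """Usable capacity left after paying the protection overhead."""
    return raw_tb / overhead


RAW_TB = 6000  # the single-rack figure discussed earlier in the article

# Three full copies survive the loss of any two copies.
print(usable_tb(RAW_TB, replication_overhead(3)))
# A 10+4 code (assumed layout) survives the loss of any four fragments.
print(round(usable_tb(RAW_TB, erasure_overhead(10, 4)), 1))
```

Under these assumptions, triple replication leaves 2,000 TB usable out of 6,000 TB raw, while the 10+4 code leaves roughly 4,286 TB and tolerates more simultaneous failures. The trade-off is that erasure coding costs more compute and network traffic on writes and rebuilds.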
As for all those expensive “mainframe” arrays, EMC’s push of XtremIO all-flash appliances may relegate them to secondary storage!