Today's flash SSDs are more than the latest in a long series of storage systems that provide performance at any cost. RAMdisks, head-per-track disks, short-stroked drives and the like all boosted performance, but only for data sets small enough to fit in the limited space they provided. Vendors are now promising that we can use the performance of SSDs for more mainstream applications by moving the "hottest" data onto SSDs and leaving the rest behind on capacity-oriented drives. Now we just have to agree on what to call it. Caching, tiering, potato, potahto...
Most of the tiering buzz so far has been about tiering inside the array. Compellent and 3Par are delivering sub-LUN tiering now, and while the current version of EMC's FAST can only relocate whole LUNs, EMC has been promising sub-LUN tiering for delivery later this year. It's no surprise that the upstart vendors whose architectures wide-stripe data blocks across all disks were the first to do automated tiering. They had a head start, as their architecture already built LUNs from almost randomly assigned blocks. They just had to make the system spread those blocks across storage with different performance characteristics. Engineers starting with architectures where a LUN is a series of contiguous blocks in a RAID set had more work to do.
The first step to sub-LUN tiering is to start collecting access-frequency metadata on disk blocks. This lets a policy engine periodically identify the blocks that are being accessed most frequently and move them to a faster tier of storage, while moving cooler blocks from the fast SSD tier down to a spinning disk tier. The simplest policy would be to migrate the blocks with the highest access rates to faster tiers on a nightly basis.
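To make that simple nightly policy concrete, here is a minimal Python sketch. It is not any vendor's implementation: the names `AccessTracker` and `plan_nightly_migration`, the use of integer block IDs, and the idea of measuring SSD capacity as a block count are all assumptions made for illustration.

```python
from collections import Counter


class AccessTracker:
    """Hypothetical per-block access counter (the 'access-frequency metadata')."""

    def __init__(self):
        self.counts = Counter()

    def record(self, block_id):
        # Called once per I/O that touches the block.
        self.counts[block_id] += 1


def plan_nightly_migration(counts, ssd_blocks, ssd_capacity):
    """Return (promote, demote) sets so the hottest blocks end up on SSD.

    counts       -- mapping of block id -> accesses since the last run
    ssd_blocks   -- set of block ids currently on the SSD tier
    ssd_capacity -- number of blocks the SSD tier can hold (assumed unit)
    """
    # Consider every block we have seen plus everything already on SSD.
    candidates = set(counts) | set(ssd_blocks)
    # Hottest first; ties are broken by block id so the plan is deterministic.
    ranked = sorted(candidates, key=lambda b: (-counts.get(b, 0), b))
    target = set(ranked[:ssd_capacity])
    promote = target - set(ssd_blocks)   # move up to SSD
    demote = set(ssd_blocks) - target    # move down to spinning disk
    return promote, demote


tracker = AccessTracker()
for block in [1, 1, 1, 2, 2, 3]:
    tracker.record(block)

promote, demote = plan_nightly_migration(tracker.counts, ssd_blocks={3, 4}, ssd_capacity=2)
```

With two SSD slots, blocks 1 and 2 are promoted and blocks 3 and 4 are demoted. A real array would also have to weigh the cost of the migration I/O itself, which this sketch ignores.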
The problem with this simple policy is that, as the mutual fund ads say, "past performance doesn't guarantee future results." Just because a block was busy yesterday, when we ran the database defrag, doesn't mean it will be busy later today, when we run the end-of-month close. To get the best bang for the buck, vendors will have to keep access metadata over time, so we can write a policy that says: move the blocks that were hot the last time we did a data warehouse load to SSD tonight, so we can build a new cube tomorrow.
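A history-aware policy of that kind might be sketched as follows. This assumes, purely for illustration, that access counts can be tagged with a workload label such as "warehouse_load"; how an array would actually learn which workload is running is left open, and `AccessHistory` and its methods are hypothetical names.

```python
from collections import Counter, defaultdict


class AccessHistory:
    """Hypothetical access metadata kept per workload rather than per night."""

    def __init__(self):
        # workload label -> Counter of block id -> accesses seen under that workload
        self.by_workload = defaultdict(Counter)

    def record(self, workload, block_id):
        self.by_workload[workload][block_id] += 1

    def hottest(self, workload, n):
        """Return up to n block ids that were busiest under the given workload."""
        counts = self.by_workload.get(workload, Counter())
        return [block for block, _ in counts.most_common(n)]


history = AccessHistory()
for block in [20, 20, 21]:
    history.record("db_defrag", block)            # yesterday's noise
for block in [10, 10, 10, 11, 11, 12]:
    history.record("warehouse_load", block)       # the workload we care about

# Tonight: stage the blocks that mattered during the last warehouse load.
to_promote = history.hottest("warehouse_load", 2)
```

The point of the design is that yesterday's defrag traffic never enters the decision: the policy asks what was hot under the upcoming workload, not what was hot most recently.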
As array vendors were ramping up their tiering story, another group (Gear6, DataRAM, Avere, StorSpeed and, most recently, FalconStor) decided that they could accelerate the move to SSD plus capacity-oriented drives, now called Flash and Trash, by implementing a huge cache in SSDs. Just as modern CPUs have several tiers of cache (64KB, 256KB and 8MB for a Xeon 5500), they combine RAM, SSD and, in the case of Avere, 15K RPM drives to cache data in a standalone appliance.
Howard Marks is founder and chief scientist at Deepstorage LLC, a storage consultancy and independent test lab based in Santa Fe, N.M. and concentrating on storage and data center networking. In more than 25 years of consulting, Marks has designed and implemented storage ...