Enterprise Hybrid Drives: Who Manages The Flash?

Solid-state hybrid drives designed for the enterprise seem like an attractive option for boosting performance, but they come with management challenges.

Howard Marks

November 13, 2014

At the end of a recent seminar, an audience member asked how I felt about enterprise hybrid disk drives. Could vendors, and even users, boost the performance of their disk arrays by replacing their 600 GB HDDs with 600 GB solid-state hybrid drives (SSHDs)?

Before answering the question, let's look back at the evolution of the SSHD. Like a hybrid storage system, a hybrid disk drive combines multiple storage media to deliver a price/performance proposition better than flash or spinning disks alone.

The first successful hybrid drives, like Seagate's Momentus XT, targeted the laptop market. Back in 2011, they seemed like a good solution for people who wanted to carry around more data than would fit on an SSD they could afford.

When I used the Momentus XT, it felt more like a fast hard drive than a solid-state drive. Just as you noticed that the system was stuttering to read from the disk, it would finish; with a big enough SSD, the system wouldn't stutter at all. My working set was just enough larger than the 8 GB of flash on the drive for me to notice when there was a cache miss. When I bought my last laptop, an SSD big enough to fit my needs was cheap enough -- about $100 for 240 GB. That's why I'm now running all flash on the road.

Seagate's enterprise SSHDs marry 32 GB of flash to its fastest 15K RPM spinning disks. That works out to a cache of roughly 10% of capacity for the 300 GB model or 5% for the 600 GB version. Since many of today's hybrid storage systems default to a cache 10% the size of their disk layer, those proportions make a lot more sense than the Momentus XT's 8 GB cache for 1 TB of capacity.
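
As a quick check on those proportions, here is a minimal Python sketch of the arithmetic. The function name is made up for illustration, and it assumes decimal units (1 TB = 1,000 GB), using only the capacities quoted above.

```python
def cache_ratio(flash_gb: float, capacity_gb: float) -> float:
    """Return the flash cache size as a percentage of disk capacity."""
    return 100.0 * flash_gb / capacity_gb


# Enterprise SSHDs: 32 GB of flash on 300 GB and 600 GB drives
for capacity_gb in (300, 600):
    print(f"{capacity_gb} GB SSHD: {cache_ratio(32, capacity_gb):.1f}% cache")

# Momentus XT figures as quoted in the text: 8 GB of flash per 1 TB
print(f"Momentus XT: {cache_ratio(8, 1000):.1f}% cache")
```

The enterprise drives land at about 10.7% and 5.3%, against less than 1% for the laptop drive.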

My problem with enterprise SSHDs is that, for any given amount of flash, one big pool is a lot easier to manage well than a whole bunch of little pools in the individual disk drives. If we have a storage system with 32 SSHDs, the 32 GB of flash on each of the drives will cache the hottest 5% or 10% of the data stored on that disk drive.

Since the storage controller will decide which drive to use to store any given piece of data based on its data protection and volume management schemes, the hottest 10% of some drives will likely be significantly colder than the top 10% of others.
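
To make that concrete, the following is a toy simulation, not a model of any real array. It assumes each block has a single "heat" number (its access frequency), that some drives are busier than others, and that heat within a drive follows a Pareto distribution. All of those are illustrative assumptions, as is the function name. It compares how much access heat the same amount of flash captures as 32 private caches and as one shared pool.

```python
import random


def captured_heat(drives, slots_per_drive):
    """Return (per_drive, pooled) heat captured with the same total flash.

    drives: list of lists, one access-frequency ("heat") value per block.
    per_drive: each drive caches only its own hottest blocks.
    pooled: one shared cache holds the hottest blocks system-wide.
    """
    per_drive = sum(
        sum(sorted(blocks, reverse=True)[:slots_per_drive]) for blocks in drives
    )
    all_blocks = sorted((h for blocks in drives for h in blocks), reverse=True)
    pooled = sum(all_blocks[: slots_per_drive * len(drives)])
    return per_drive, pooled


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so runs are repeatable
    drives = []
    for _ in range(32):
        # Assumption: data placement ignores heat, so drives differ in busyness
        busyness = rng.uniform(0.1, 1.0)
        drives.append([busyness * rng.paretovariate(1.5) for _ in range(100)])

    per_drive, pooled = captured_heat(drives, slots_per_drive=10)
    total = sum(sum(blocks) for blocks in drives)
    print(f"per-drive caches capture {per_drive / total:.0%} of accesses")
    print(f"pooled cache captures   {pooled / total:.0%} of accesses")
```

The pooled cache can never capture less than the per-drive caches, because it is free to pick the hottest blocks wherever they live. How large the gap is depends entirely on how uneven the drives are.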

If instead we used a pair of 480 GB SSDs as a centralized cache, the array controller could cache the hottest 960 GB of data. The centralized cache would also be more efficient because the distributed cache would have to duplicate data or parity information for data protection. A centralized cache could write new data to both SSDs and then overwrite one copy when the data block is written to the backend data store.
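
The write handling described here can be sketched as a small toy model. The class and method names are hypothetical and the "SSDs" are just Python sets of block IDs. The point is only to show that a block occupies two cache slots while it is dirty and one after it has been destaged.

```python
class CentralCache:
    """Toy two-SSD centralized cache (illustrative only, not a real controller)."""

    def __init__(self):
        self.ssd = [set(), set()]  # block IDs held on each SSD
        self.dirty = set()         # blocks not yet written to the backend disks

    def write(self, block):
        # New data exists nowhere else yet, so keep a copy on both SSDs.
        for copies in self.ssd:
            copies.add(block)
        self.dirty.add(block)

    def destage(self, block):
        # Once the block is safely on the backend, one cached copy is enough.
        if block not in self.dirty:
            return
        self.dirty.discard(block)
        # Free the copy on the fuller SSD to keep the pair balanced.
        max(self.ssd, key=len).discard(block)

    def slots_used(self):
        return len(self.ssd[0]) + len(self.ssd[1])


cache = CentralCache()
cache.write("a")
cache.write("b")
print(cache.slots_used())  # two dirty blocks, two copies each
cache.destage("a")
cache.destage("b")
print(cache.slots_used())  # two clean blocks, one copy each
```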

Since each SSHD is independent, if the SSHDs are configured as RAID-1, new data written to the system will be cached on both drives in the mirrored pair. If the controller distributes reads across both drives -- as even Windows' built-in volume manager does -- data blocks will be equally hot on both drives that hold them, and therefore any data blocks hot enough to be in cache at all will likely be cached on both drives. Since the central cache can store just one copy, it will have room for more warm data.
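
A back-of-the-envelope comparison, using the 32-drive example and the pair of 480 GB SSDs from above, looks like this. It assumes the worst case for the SSHDs (every cached block is cached on both drives of its mirrored pair) and treats dirty data in the central cache as occupying two copies. The function names are hypothetical.

```python
SSHD_COUNT = 32
FLASH_PER_SSHD_GB = 32
SSD_COUNT = 2
SSD_SIZE_GB = 480


def sshd_unique_cache_gb(mirrored: bool) -> int:
    """Unique data the SSHD flash can hold, in GB.

    Worst-case assumption: with RAID-1, every cached block is held
    on both drives of its mirrored pair, halving the unique capacity.
    """
    raw = SSHD_COUNT * FLASH_PER_SSHD_GB
    return raw // 2 if mirrored else raw


def central_unique_cache_gb(dirty_gb: int = 0) -> int:
    """Unique data the central cache can hold, in GB.

    Clean data is stored once; each GB of dirty data costs one extra GB
    for its second copy until it is destaged.
    """
    return SSD_COUNT * SSD_SIZE_GB - dirty_gb


print(sshd_unique_cache_gb(mirrored=False))  # raw flash across all SSHDs
print(sshd_unique_cache_gb(mirrored=True))   # unique data under RAID-1
print(central_unique_cache_gb())             # central cache, nothing dirty
print(central_unique_cache_gb(dirty_gb=60))  # central cache, 60 GB dirty
```

Under these assumptions the SSHDs start with slightly more raw flash (1,024 GB against 960 GB) but hold only about 512 GB of unique data when mirrored.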

Hybrid controllers can also combine their caching algorithms with their data layout. Systems like Nimble Storage's CASL can accumulate data in the cache and write large sequential stripes of data to their backend disks. Unless hybrid drives get really smart and implement their own log-based data layouts, destaging data from a drive's cache to disk will require more head motion.
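
The difference in head motion can be illustrated with a deliberately simple model. It does not represent CASL or any drive's firmware. It just counts a "seek" whenever a write does not land directly after the previous one. The block addresses and the log position are invented for the example.

```python
def seeks(lbas):
    """Count head repositionings: any write not adjacent to the previous one."""
    count = 0
    prev = None
    for lba in lbas:
        if prev is None or lba != prev + 1:
            count += 1
        prev = lba
    return count


# Hypothetical dirty blocks sitting in cache, identified by their home addresses
dirty = [900, 15, 480, 72, 660, 301]

# In-place destage: each block goes back to its own address,
# even if the drive sorts the writes first.
in_place = seeks(sorted(dirty))

# Log-based layout: the same blocks are appended as one sequential stripe
# at the current log position (assumed here to be address 1000).
log_head = 1000
log_style = seeks(range(log_head, log_head + len(dirty)))

print(f"in-place destage: {in_place} seeks")
print(f"log-based stripe: {log_style} seek")
```

Six scattered blocks cost six seeks when written in place, but only one when they are written as a single stripe.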

I think enterprise hybrid drives, like laptop hybrids, will remain a niche product. They can provide a performance boost when installed in servers as DAS and on basic array controllers in applications with small working sets or modest IOPS. In shared storage systems or server SANs, a more centralized cache should be significantly more efficient and perform better.

About the Author(s)

Howard Marks

Network Computing Blogger

Howard Marks is founder and chief scientist at DeepStorage LLC, a storage consultancy and independent test lab based in Santa Fe, N.M. and concentrating on storage and data center networking. In more than 25 years of consulting, Marks has designed and implemented storage systems, networks, management systems and Internet strategies at organizations including American Express, J.P. Morgan, Borden Foods, U.S. Tobacco, BBDO Worldwide, Foxwoods Resort Casino and the State University of New York at Purchase. The testing at DeepStorage Labs is informed by that real-world experience.

He has been a frequent contributor to Network Computing and InformationWeek since 1999 and a speaker at industry conferences including Comnet, PC Expo, Interop and Microsoft's TechEd since 1990. He is the author of Networking Windows and co-author of Windows NT Unleashed (Sams).

He is co-host, with Ray Lucchesi, of the monthly Greybeards on Storage podcast, where the voices of experience discuss the latest issues in the storage world with industry leaders. You can find the podcast at: http://www.deepstorage.net/NEW/GBoS
