Storage

11:29 AM
Howard Marks
Commentary

Server SSD Isn't Necessarily PCIe

SSDs aren't just for laptops and desktops--there is room for server SSDs, too. Learn why, and why people are gun-shy about server-side SSD use.

As I talk with both users and vendors while researching an upcoming report on server-side SSD use, I've discovered that many folks think a server SSD has to be the kind of ultra-high-performance SLC PCIe flash card made by vendors like Fusion-IO, Micron, Virident and LSI. While people are perfectly aware that the majority of SSDs sold are 2 1/2-inch form-factor devices with SAS or SATA interfaces, they seem to think these devices are suitable only for laptops and enthusiasts' desktops.

Sure, most of the SATA SSDs on the market aren't up to the rigors of data center use, but there are several steps between the kind of low-cost SSDs they sell at Akbar and Jeff's Computer Hut and the PCIe cards the go-fast-any-cost guys produce. Most folks are just terrified that they'll put an SSD in their disk arrays or servers, and at some point they'll exhaust its endurance and lose all their data.

I think part of the problem is it's been a long time since we've had devices in the data center that actually wear out. Today's sealed hard drives don't wear out in any predictable way--they just fail at random--and most storage systems treat disk drives as binary devices that are either working properly or not working at all. So when a drive has a head pre-amplifier failure, the RAID controller it's connected to declares the drive bad and stops using it.

SSD write exhaustion isn't an unpredictable failure. It's a relatively well-understood process whereby the flash in an SSD has been programmed and erased enough times that the error rate from that flash begins to exceed the ability of the flash controller to correct the errors. This exhaustion doesn't happen to the whole SSD all at once but flash page by flash page; as pages wear out, the SSD has less working flash to use for housekeeping, wear leveling and the like. Eventually there's no spare flash at all, and the SSD can no longer accept writes.

Most flash controllers keep careful track of how often they've overwritten each page of flash that they manage and can report back to the storage system, via extensions to the SMART diagnostic system, how much of their write endurance has been consumed. If I can know weeks or months in advance that flash exhaustion is going to cause me problems, it should be a simple enough matter to replace an SSD when it's reached 80% or 90% of its life.
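To make the threshold idea concrete, here is a minimal Python sketch. It assumes the storage system already has a "percent of rated endurance used" figure for each SSD; how that number is read from the drive is left out, and the names WearReading, needs_replacement and days_until_threshold are made up for illustration. The projection is a simple straight line between two readings, which is only a rough guide if the write load changes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WearReading:
    """One wear sample for an SSD (hypothetical structure, not a real SMART API)."""
    day: int              # days since the SSD went into service
    percent_used: float   # share of rated write endurance consumed, 0-100


def needs_replacement(reading: WearReading, threshold: float = 80.0) -> bool:
    """True once the SSD has consumed at least `threshold` percent of its endurance."""
    return reading.percent_used >= threshold


def days_until_threshold(older: WearReading, newer: WearReading,
                         threshold: float = 80.0) -> Optional[float]:
    """Project, from two readings, how many days remain before the threshold.

    Assumes wear keeps accumulating at the average rate seen between the two
    readings. Returns None when no wear trend can be measured.
    """
    if newer.percent_used >= threshold:
        return 0.0
    elapsed_days = newer.day - older.day
    consumed = newer.percent_used - older.percent_used
    if elapsed_days <= 0 or consumed <= 0:
        return None
    remaining = threshold - newer.percent_used
    return remaining * elapsed_days / consumed


if __name__ == "__main__":
    first = WearReading(day=0, percent_used=10.0)
    latest = WearReading(day=100, percent_used=30.0)
    print(needs_replacement(latest))
    print(days_until_threshold(first, latest))
```

With readings like these, an admin would have months of notice rather than a surprise failure, which is the whole point of treating wear as a gauge instead of a binary state.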

RAID controllers could send a message that SSD 14 has reached its endurance threshold, identify a new or spare SSD and rebuild the RAID set to the new storage. If the SSD was being used as a read cache, it wouldn't contain unique data and the replacement could be even easier.
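A rough Python sketch of that decision might look like the following. It is not any real RAID controller's logic or API; Role, plan_replacement and the step wording are all hypothetical, and the 80% default simply echoes the threshold mentioned above.

```python
from enum import Enum
from typing import List


class Role(Enum):
    """How the SSD is being used (hypothetical categories)."""
    RAID_MEMBER = "raid_member"   # holds unique data that must be rebuilt
    READ_CACHE = "read_cache"     # holds only copies of data stored elsewhere


def plan_replacement(slot: int, role: Role, percent_used: float,
                     threshold: float = 80.0,
                     spare_available: bool = True) -> List[str]:
    """Return the ordered steps for retiring a worn SSD, or [] if it is still healthy."""
    if percent_used < threshold:
        return []

    steps = [f"alert: SSD {slot} has reached its endurance threshold"]
    if role is Role.READ_CACHE:
        # A read cache contains no unique data, so nothing needs rebuilding.
        steps.append(f"discard cache contents on SSD {slot} and swap the device")
    elif spare_available:
        steps.append(f"rebuild the RAID set from SSD {slot} onto a spare SSD")
        steps.append(f"retire SSD {slot}")
    else:
        steps.append(f"order a replacement for SSD {slot} before rebuilding")
    return steps


if __name__ == "__main__":
    for step in plan_replacement(14, Role.RAID_MEMBER, percent_used=85.0):
        print(step)
```

The useful property is that the rebuild starts while the old SSD is still readable, so the array never has to run degraded.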

Since MLC SSDs (which may have write endurance of only 5,000 program/erase cycles) typically cost about a tenth as much as eMLC or SLC SSDs, it may make sense to treat MLC SSDs as disposable devices for the data center. While we might not be able to predict whether an MLC SSD will last 18 months or five years, it would still be cheaper to buy three MLC SSDs and replace them as needed than to buy one substantially more expensive SLC SSD.
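The arithmetic is easy to check with a few lines of Python. The prices below are placeholders chosen only to reflect the rough one-tenth ratio; they are not quotes for any real product, and the function names are made up for this sketch.

```python
def mlc_spend_fraction(mlc_drives_bought: int, mlc_price: int, slc_price: int) -> float:
    """Total spent on MLC SSDs as a fraction of the price of one SLC SSD."""
    return (mlc_drives_bought * mlc_price) / slc_price


def break_even_drive_count(mlc_price: int, slc_price: int) -> int:
    """How many MLC SSDs can be bought before spending as much as one SLC SSD."""
    return slc_price // mlc_price


if __name__ == "__main__":
    # Hypothetical prices: the MLC drive costs about a tenth of the SLC drive.
    MLC_PRICE = 100
    SLC_PRICE = 1000

    print(mlc_spend_fraction(3, MLC_PRICE, SLC_PRICE))
    print(break_even_drive_count(MLC_PRICE, SLC_PRICE))
```

Under that assumed ratio, three MLC SSDs cost under a third of one SLC SSD, and the disposable approach only stops paying off once you've burned through something like ten of them in the same slot.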

Of course, this is just not the way people buy equipment for the data center--we buy gear based on some projection of future peak need, multiplied by whatever factor we feel may be necessary to prevent a run on the storage bank. Personally, I like to multiply by pi. So if a project may need 10 Tbytes of storage, we make sure the budget for that project includes 30 Tbytes of storage just so we don't get caught short at some point in the future.

Sure, introducing the concept of disposable devices to the data center may make more work for the poor guy left to swap out a dying server SSD. But until we have the miracle nonvolatile memory of the future, it may be a way to get performance without a huge capital expenditure.

Of course, some of you may be concerned about people recovering sensitive corporate data from your discarded SSDs. While that's a real concern, it can be easily addressed by simply running your old worn-out SSDs through a Blendtec blender. If it can reduce an iPad to dust, it can do the same to a Micron P400 RealSSD.

Comments
SGHill-NWCmod
7/6/2012 | 6:43:40 AM
re: Server SSD Isn't Necessarily PCIe
Of course, the key reason that high-performance SSDs like Fusion IO are more commonly used for server acceleration is that they read and write directly to the PCIe x16 bus and can push over 1.3 million (512-byte) IOPS and up to 6.3 GB/s read performance, which is still way faster than any array.

That being said, you may be on to something there with the thought of looking at SSD as a disposable device. But I doubt storage vendors will agree to authorize MLC and lesser-grade drives for use in a high-performance array. Unfortunately, SSDs of the quality necessary to be validated by the top array vendors will be the most costly storage available, so it will be interesting to see how well the enterprise takes to the idea of that level of planned obsolescence.

Of course, it always comes down to the actual application in mind, but assuming (for example) that you're using SSD for a cache array, then there may be no problem with using drives until they cook themselves out and letting the array deal with migration. But it still brings up the issue of rebuild times, which usually result in reduced performance on the entire array. Though the rebuild process would likely be sped up two- to threefold (due to the increased throughput of the drives themselves), I would still guess that the performance hit would ultimately depend on the performance of the storage controller itself as well as the total size of the array being rebuilt.

What makes me most curious about the potential of SSD for data center applications is how well the newest generation of SSD devices will perform under the rigors of the production environment. Spinning disk has a well-proven track record and the only way to find out how SSD REALLY compares will be to run them hard and "let the chips fall where they may". Yeah, I said that.

Steven Hill - NWC Moderator