Software-defined networking is beginning to take off, but what’s happening with software-defined storage? We are well into the hype phase, with everything from backup managers to disk drives being described as “software-defined,” and we are perhaps just beginning to see the first real SDS products emerge. That’s a long way from mainstream -- or is it?
Despite all of the hype, startups have been developing new solutions, and SDS may be closer to becoming a reality than you think. Let’s look at why that is. Mr. Gillette would recognize today's storage business in an instant. Razors and razor blades, or appliances and drives -- they're essentially the same business model. The major vendors have built a business where commodity drives are marked up enormously, while ensuring that cheap drives can’t be used in their arrays by having unique identifiers added to the drive firmware.
But the cloud and other trends are bursting the bubble and paving the way for software-defined storage. Cloud providers like Google don’t buy specialized drives; everything is COTS, with the result that the mega-CSPs enjoy $30-per-terabyte hard drives while many businesses are locked into $300+ drives.
Looking at some numbers, we see a $190 list price 3 TB SAS drive marked up to $4,215 by EMC, $1,856 by NetApp and “only” $532 by Dell. But that’s only part of the story. Google uses many cheap SATA drives, with solid-state drives for fast work; a fast terabyte SSD/flash card likely costs Google around $500. List price for an 800 GB SAS SSD is $739. EMC sells that for $14,435 -- a 20X markup!
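The multiples behind those prices are easy to check. The short Python sketch below uses only the figures quoted above and divides each vendor price by the corresponding list price; the names DRIVES and markup_multiples are made up for illustration.

```python
# List prices and vendor prices as quoted in the article (USD).
DRIVES = {
    "3 TB SAS HDD": {
        "list": 190,
        "vendors": {"EMC": 4215, "NetApp": 1856, "Dell": 532},
    },
    "800 GB SAS SSD": {
        "list": 739,
        "vendors": {"EMC": 14435},
    },
}


def markup_multiples(drives):
    """Return {(drive, vendor): vendor_price / list_price}."""
    return {
        (drive, vendor): price / info["list"]
        for drive, info in drives.items()
        for vendor, price in info["vendors"].items()
    }


if __name__ == "__main__":
    for (drive, vendor), multiple in markup_multiples(DRIVES).items():
        print(f"{drive:15s} {vendor:7s} {multiple:5.1f}x")
```

Run as a script, it puts the hard drive at roughly 22X, 10X and 3X list price at EMC, NetApp and Dell respectively, and the SSD at just under 20X at EMC.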
So what does all of this have to do with software-defined storage? We now realize that there are cheaper alternatives that can contain the cost of the expected explosion in capacity requirements. The problem has been getting to them. Hardware isn’t enough on its own; we need good software, and this is where SDS becomes important.
To get commodity prices on drives, the appliance has to be free of any proprietary lock-in. That rules out the traditional vendors and means that alternative sources for appliances are needed. These can be COTS units from the same companies that supply AWS, Google and Azure: ODMs and server makers such as Supermicro, Lenovo and Quanta. Such units are high quality -- the CSPs ensure that by buying millions of units -- and very inexpensive compared with the traditional storage array or appliance.
The next, and maybe most important, issue is finding software to run the appliances. Some software vendors such as Caringo and DataCore sell software that runs on COTS servers. Even better, open-source efforts such as Ceph and OpenStack Swift and Cinder are creating strong, viable solutions for point appliances.
These software tools make deployment of a low-cost, COTS-based storage farm feasible and attractive, but are they SDS? The concept behind SDS is deceptively simple: Take the complicated data services that sit on top of storage and move them from the appliances to virtual machines sitting in servers. This allows right-sizing of the storage services for workload variation and also, incidentally, makes services compete with each other for market share, bringing prices down.
That’s the theory. Ceph is on the edge of SDS-compatibility. It is Lego-like and could be reconstructed to allow service abstraction. This would benefit the object/file/block universal storage software tremendously, since missing features such as encryption, compression and deduplication could be integrated into the dataflow. With rewrites planned for the OSD storage node software in Ceph, this would be a great time to consider its SDS credentials more closely.
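To make the idea of services in the dataflow concrete, here is a minimal Python sketch. It is not Ceph code: the class names are hypothetical, the backend is just an in-memory dictionary, and encryption is left out because Python's standard library does not include a cipher. It only shows compression and deduplication as stackable layers that could be added or removed without touching the storage node.

```python
import hashlib
import zlib


class DedupStore:
    """Toy backend: keeps each unique chunk once, keyed by its SHA-256 digest."""

    def __init__(self):
        self.chunks = {}

    def put(self, chunk: bytes) -> str:
        key = hashlib.sha256(chunk).hexdigest()
        self.chunks.setdefault(key, chunk)  # a repeated chunk is not stored again
        return key

    def get(self, key: str) -> bytes:
        return self.chunks[key]


class CompressionService:
    """Toy data service: compresses on the way in, decompresses on the way out."""

    def __init__(self, backend):
        self.backend = backend

    def put(self, chunk: bytes) -> str:
        return self.backend.put(zlib.compress(chunk))

    def get(self, key: str) -> bytes:
        return zlib.decompress(self.backend.get(key))


if __name__ == "__main__":
    store = DedupStore()
    service = CompressionService(store)
    data = b"software-defined storage " * 200
    first = service.put(data)
    second = service.put(data)  # same content, so the same key comes back
    print(first == second, len(store.chunks), len(data), len(store.get(first)))
```

Because each service exposes the same put/get interface as the backend, a different compressor or an extra layer can be swapped in without changing the caller, which is the kind of service abstraction the paragraph above has in mind.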
DataCore and FalconStor have software products that meet the definition of SDS and provide an inexpensive way to add features to commodity boxes. These still move data through the service instance, which is a weakness shared with the current Ceph approach. Primary Data’s DataSphere takes an approach that is more like asynchronous pooling, where the producer of data talks to the service to organize metadata and chunk addressing, and then communicates directly with a set of storage devices to read or write data. In another development, Nutanix is considering selling its software as a subscription service without a hardware appliance, while partnering with Dell, Lenovo and Supermicro to put that code on their platforms.
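The out-of-band pattern described above, in which a metadata service plans placement and the client moves the data itself, can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not Primary Data's implementation: every class and function name is hypothetical, devices are in-memory dictionaries, and placement is simple round-robin.

```python
class StorageDevice:
    """Toy storage device: holds chunks in memory."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def write(self, chunk_id, data):
        self.blocks[chunk_id] = data

    def read(self, chunk_id):
        return self.blocks[chunk_id]


class MetadataService:
    """Control path only: decides where chunks live and never sees the data."""

    def __init__(self, n_devices, chunk_size=4):
        self.n_devices = n_devices
        self.chunk_size = chunk_size
        self.layouts = {}  # object name -> list of (device index, chunk id)

    def plan_write(self, name, length):
        n_chunks = -(-length // self.chunk_size)  # ceiling division
        layout = [(i % self.n_devices, f"{name}:{i}") for i in range(n_chunks)]
        self.layouts[name] = layout
        return layout

    def lookup(self, name):
        return self.layouts[name]


def client_write(meta, devices, name, data):
    """Ask the service for a layout, then write chunks straight to the devices."""
    size = meta.chunk_size
    for i, (dev, chunk_id) in enumerate(meta.plan_write(name, len(data))):
        devices[dev].write(chunk_id, data[i * size:(i + 1) * size])


def client_read(meta, devices, name):
    """Look up the layout, then read chunks straight from the devices."""
    return b"".join(devices[dev].read(cid) for dev, cid in meta.lookup(name))


if __name__ == "__main__":
    devices = [StorageDevice("a"), StorageDevice("b")]
    meta = MetadataService(n_devices=len(devices))
    client_write(meta, devices, "obj", b"hello world")
    print(client_read(meta, devices, "obj"))
```

The point of the split is that the metadata service handles only names, lengths and addresses, so it does not become a bottleneck for the data itself, in contrast to designs that pass every byte through the service instance.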
We can expect the major storage vendors to react to the threat of SDS by introducing their own software products. Whether these are really SDS and whether they free the buyer from vendor lock-in on drives remains to be seen.
SDS is still in its early stages, but the signs of aggressive growth seem evident. Interest is high and some estimate that more than 70% of companies will try the approach, if not deploy it, in 2016. With intense pressure on IT budgets and a need to grow capacity dramatically looming, SDS may be the answer.