Buzzwords come and go in this industry, and one particular bête noire of mine has been the term “software-defined storage” or SDS. It’s not that I’m particularly against the idea of storage delivered from software, but rather the way the term has been misused over recent years.
Taking a step back, we can see that since the first tape and disk drives were introduced in the 1950s, persistent storage has been focused on hardware. The integrated cache disk arrays of the 1990s provided a better level of availability and resiliency by introducing the logical LUN/volume and providing automated RAID data protection features. However, even these systems, with their level of abstraction, are essentially hardware devices: if I want more IOPS, I have to add more disks; if I want lower latency, I need to add faster disks and/or flash storage.
The term software-defined storage followed from the way networking functions are evolving to be delivered on commodity hardware with centralized networking policy management, so-called software-defined networking. Never wanting to miss an opportunity, marketing teams at a number of storage startups and incumbents rushed to brand their products as “software-defined,” although there is no definition or consensus as to what the term means. The Wikipedia page for SDS refers to common characteristics including virtualization, automated policy management, and the use of commodity hardware. To my mind, none of these characteristics is strong enough to explain what SDS should really be about.
If we think for a moment about what the term software-defined means, we can see that it implies that software, rather than hardware, determines the features of storage. To a certain extent this has always been true, as software exists at every level (in the drive controller, in the array microcode) in order to store and retrieve data. However, in this instance we’re referring to the ability of storage to implement higher-level functions that were once the domain of custom FPGAs and ASICs, such as data deduplication and compression.
I like to look at the definition of SDS in a more detailed way by thinking about the characteristics I need to apply to my data, namely performance, availability and resiliency. These features are complemented by management/automation, all of which should operate in an abstracted way and not be dependent on the amount of hardware used to store the data. To give an example, performance is delivered through quality of service; if I apply a QoS setting of 300 IOPS to a logical volume, I don’t expect this figure to change if I double the disk capacity of an array and redistribute the data across both the old and new disks.
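To make the QoS example concrete, here is a minimal sketch in Python. It is not modeled on any vendor's product; the names `ServicePolicy`, `Pool` and `Volume` are hypothetical and exist only to illustrate the idea that the service level belongs to the logical volume, while the disks behind it can change freely.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ServicePolicy:
    """Service level attached to a logical volume (hypothetical model)."""
    max_iops: int
    replicas: int


@dataclass
class Pool:
    """Physical capacity behind the abstraction; free to grow or be replaced."""
    disks: list = field(default_factory=list)

    def add_disks(self, count, prefix="disk"):
        start = len(self.disks)
        self.disks.extend(f"{prefix}{i}" for i in range(start, start + count))


@dataclass
class Volume:
    name: str
    policy: ServicePolicy
    pool: Pool

    def placement(self):
        """Data is spread over whatever disks the pool currently holds."""
        return list(self.pool.disks)

    def effective_iops_limit(self):
        """The limit comes from the policy alone, never from the disk count."""
        return self.policy.max_iops


pool = Pool()
pool.add_disks(4)
vol = Volume("vol01", ServicePolicy(max_iops=300, replicas=2), pool)

before = (len(vol.placement()), vol.effective_iops_limit())
pool.add_disks(4)  # double the capacity and redistribute across old and new disks
after = (len(vol.placement()), vol.effective_iops_limit())

print(before, after)
```

The placement changes from four disks to eight, but the 300 IOPS figure stays where it was set, because it is a property of the policy rather than of the hardware.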
So true SDS provides abstraction, policy-based management, and service-level metrics based on performance, availability and resiliency. I still need hardware, but only to the point that it supports my SDS service requirements.
Now that we understand SDS, why should we use it? Well, in a public/private/hybrid cloud world, we want to assign resources to applications based on their requirements and, of course, based on what the customer has paid for. We want to eliminate crosstalk (like the noisy neighbor) and be able to scale environments in a controlled manner rather than through trial and error, as has been the case with shared storage solutions that weren’t able to manage the specific characteristics of the data at a granular level. We also want to replace hardware without impacting the service levels offered to the customer.
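One common way to keep a noisy neighbor in check is a per-tenant rate limit. The sketch below uses a simple token bucket, which is an assumption on my part rather than a description of how any particular product works; `TokenBucket`, `simulate` and the tenant names are hypothetical, and time is passed in explicitly so the behavior is easy to follow.

```python
class TokenBucket:
    """Per-tenant IOPS limiter: tokens refill at `rate_iops` per second."""

    def __init__(self, rate_iops):
        self.rate = rate_iops
        self.capacity = rate_iops       # allow at most one second of burst
        self.tokens = float(rate_iops)  # start with a full bucket
        self.last = 0.0

    def allow(self, now):
        """Return True if one I/O may proceed at time `now` (in seconds)."""
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def simulate(limits, demand_per_second, seconds):
    """Count the I/Os admitted for each tenant over the given period."""
    buckets = {tenant: TokenBucket(rate) for tenant, rate in limits.items()}
    admitted = {tenant: 0 for tenant in limits}
    for second in range(1, seconds + 1):
        for tenant, demand in demand_per_second.items():
            for _ in range(demand):
                if buckets[tenant].allow(float(second)):
                    admitted[tenant] += 1
    return admitted


admitted = simulate(
    limits={"noisy": 300, "quiet": 100},             # IOPS each tenant paid for
    demand_per_second={"noisy": 1000, "quiet": 50},  # IOPS each tenant attempts
    seconds=10,
)
print(admitted)
```

Over the ten simulated seconds the noisy tenant is held to its 300 IOPS (3,000 I/Os in total) however hard it pushes, while the quiet tenant gets all 500 of the I/Os it asked for.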
VMworld 2015 was a great opportunity to see how software-defined storage has evolved, with many storage vendors exhibiting products. Companies such as Atlantis Computing, Springpath and Primary Data are pure software plays focused heavily on abstraction, even down to the level of implementing virtual disks that sit on their physical equivalent.
At the array/appliance level, SolidFire and HP with 3PAR have built systems that deliver on the requirements of SDS. Both vendors offer QoS as a key part of their infrastructures, and it’s not surprising to see that both have a heritage of being implemented by cloud service providers.
All of these companies are worth watching over the coming months and years as they will show the direction for future storage and the way we access our data. Remember, true SDS is about delivering the characteristics of storage in software, not simply about the move to commodity infrastructure.