There was a time when buying a storage array meant comparing products from the big four or five vendors and deciding whether to go “enterprise” with something like Symmetrix or “mid-range” with CLARiiON. Today our choices are wider than ever, and storage seems to be spawning more startups than any other part of the IT industry. Let's take a look at how the storage industry has evolved away from the paradigm of storage area networking and what storage administrators might expect in the coming years.
From the perspective of the customer, there is more choice in storage hardware and software solutions than ever before. We have all-flash, hybrid, scale-out SAN and NAS, object stores, traditional NAS, commodity, open source, and application-aware products. More recently, we’ve also seen the introduction of scale-out backup as a platform and there’s always the option to push data into the public cloud.
Storage requirements now include not only processing growing volumes of data quickly, but doing something useful with that data. Scale-out solutions and object stores provide the capability to do complex analytics, while all-flash and some hybrid storage arrays resolve latency issues that can’t be solved with legacy hard-drive-based arrays. Application-aware storage attempts to use resources more efficiently, ensuring data is stored on the most appropriate medium at the time it's needed.
The overall effect is that the storage industry is fragmenting into application- and problem-specific technologies, moving away from the legacy “one size fits all” products. Some of the technology that we’ve relied on for many years simply doesn’t work at scale, and new technologies are being used to plug the gaps.
For example, take RAID, which has been used for protection against failing drives for more than 25 years. At scale (large drives and many drives), RAID becomes less efficient: rebuilds take longer as drive capacities grow, which leaves the array more exposed to data loss through multiple drive failures. Erasure coding provides a more flexible solution. Data is split into fragments and extended with algorithmically generated redundant fragments, and only a set number of the total are needed to recreate the original data.
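The fragment idea described above can be illustrated with the simplest possible scheme: split the data into k fragments and add a single XOR parity fragment, so that any k of the k+1 fragments are enough to rebuild the original. This is only a sketch. Production erasure codes typically use schemes such as Reed-Solomon that tolerate several lost fragments, and the function names `encode` and `decode` here are hypothetical, not taken from any product.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal fragments and append one XOR parity fragment."""
    frag_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(frag_len * k, b"\0")  # pad so all fragments match in size
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def decode(fragments: List[Optional[bytes]], original_len: int) -> bytes:
    """Rebuild the data from k+1 fragments, at most one of which is None."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("single parity can only recover one lost fragment")
    repaired = list(fragments)
    if missing:
        # The XOR of all surviving fragments equals the missing one.
        survivors = [f for f in fragments if f is not None]
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    # Drop the parity fragment and strip the padding.
    return b"".join(repaired[:-1])[:original_len]
```

Losing any one fragment, data or parity, still lets `decode` return the original bytes; losing two raises an error, which is the limitation that multi-parity erasure codes are designed to remove.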
Erasure coding allows geo-protection (dispersal and protection of data over multiple locations) without creating entire replicas of data, but it can suffer performance penalties, depending on object size. As a result, this new data protection technique can’t simply be slotted into existing hardware, and so we see new platforms emerging to meet the needs of dispersed data protection.
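To see why dispersal can be cheaper than full replicas, a rough back-of-the-envelope comparison helps. The sketch below assumes an erasure code with k data fragments and m redundant fragments spread as evenly as possible across sites; the function names and the 6+3 layout are illustrative assumptions, not figures from any particular platform.

```python
import math


def raw_capacity_factor(k: int, m: int) -> float:
    """Raw storage consumed per unit of user data for a k+m erasure code."""
    return (k + m) / k


def survives_site_loss(k: int, m: int, sites: int) -> bool:
    """True if losing the most heavily loaded site still leaves k fragments.

    Assumes the k+m fragments are spread as evenly as possible over the sites.
    """
    worst_case_lost = math.ceil((k + m) / sites)
    return worst_case_lost <= m


# Illustrative comparison: a 6+3 code over three sites versus two full copies.
ec_factor = raw_capacity_factor(6, 3)        # 1.5x raw capacity
replica_factor = 2.0                         # two complete replicas
ec_site_safe = survives_site_loss(6, 3, 3)   # three fragments lost, three spare
```

Under these assumptions the 6+3 layout survives the loss of a whole site while consuming 1.5 times the user data in raw capacity, compared with 2 times for a pair of full replicas. The trade-off is the extra computation and network traffic needed to reassemble data, which is where the performance concerns come in.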
Open source and integrated storage
While there's an ever-growing number of storage products and storage vendors to choose from, the storage industry has also gone open source with platforms like Ceph and GlusterFS. Even storage giants like EMC are playing with the open source model: EMC released its unsuccessful ViPR platform as CoprHD in the hope that end users will help develop the code base in a way that EMC can’t. It’s perfectly possible that EMC will open source other platforms too, such as ScaleIO, in an attempt to generate traction for what is a more complex sale than a typical hardware appliance.
Outside of the pure storage market, the hyperconvergence vendors are gaining market share. These companies, including Nutanix, Scale Computing, and SimpliVity, offer hardware products that dispense with the need for dedicated storage and all of the management overhead that goes with it. The IT generalist can install and run a hyperconverged solution without needing to be an expert in technologies like Fibre Channel and SAN.
From a software perspective, hyperconverged vendors include VMware (with VSAN), Maxta, and Atlantis Computing. These are more examples of the fragmentation of the storage landscape and the move away from having everything in one place.
There are definitely plenty of choices available today for storing and manipulating data in a wide variety of formats. As the storage administrator's traditional role of managing hardware starts to disappear, the focus will be squarely on the data rather than the hardware platform. The future evolution of the storage industry will be about understanding and managing the content and choosing the software solution to match. Picking the right product certainly won’t get any easier and will be one of the more challenging tasks for the next generation of data managers.