As we wrote in this InformationWeek State of Storage report, "2013 looks like the year the phrase 'software defined' displaces 'cloud' as the all-purpose modifier synonymous with everything in IT that is innovative and salubrious." ViPR embodies both the design philosophy and terminology of SDN by separating what EMC calls the control plane of storage--provisioning, management and data migration--from the data plane--blocks, files, LUNs and devices.
It also uses the same controller-device archetype made familiar by SDN and OpenFlow. ViPR is the software controller, analogous to an OpenFlow controller like Floodlight in the networking world, while storage arrays, both big-iron hardware like VMAX and VNX and scale-out, cloud-enabled devices like Isilon or Atmos, handle the data plane, analogous to routers and switches in SDN.
Digging into the details, it's clear EMC is taking this whole control-data segmentation paradigm seriously, as ViPR supports a heterogeneous mix of hardware devices with widely varying characteristics. For example, a pool could include high-performance SSD or SSD-HDD autotiered capacity in a VMAX or VNX array alongside commodity, high-capacity HDDs from a scale-out Isilon system.
The latter point is particularly interesting because Isilon devices have traditionally been used for very large, unstructured file systems. However, once incorporated into the ViPR Borg, they inherit all the storage service features of the software controller.
Indeed, EMC is taking heterogeneity seriously: ViPR will also initially support some NetApp systems (as yet unspecified), and EMC will publish a southbound API (again, borrowing heavily from established SDN terminology) that will allow other vendors to integrate their storage hardware into ViPR-controlled pools. And ViPR pools are Olympic-sized, potentially scaling to hundreds of physical arrays and petabytes of capacity, according to Chris Ratcliffe, VP of Marketing at EMC's Advanced Software Division.
ViPR handles both the creation of storage pools and provisioning of specific storage resources, be they raw blocks, traditional file systems, object stores or even big data distributed (HDFS) file systems. It then leaves the actual data handling and processing to the underlying arrays. In this sense, ViPR embodies more of a hybrid 'software-defined' approach. The ViPR controller handles the northbound application and administration functions, while data processing, such as deduplication, compression and data movement, is offloaded to the arrays.
One obvious conundrum is pools composed of hardware with vastly different performance characteristics. ViPR deals with this by building a hardware inventory and profiling the performance parameters of the associated hardware, like reliability/availability, speed/IOPS, latency and available capacity, into an asset database. When provisioning storage for a new application, the system admin or an orchestration system, like vCloud or OpenStack, requests storage meeting the application's requirements, such as 1 TB at five-nines availability and 5,000 IOPS. The ViPR system automatically creates the resource on available hardware that meets those requirements.
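The matching step described above can be illustrated with a minimal sketch. This is not ViPR's actual API or placement algorithm, which EMC has not detailed here; the `Array` and `Request` types, the `provision` function, the best-fit rule and all of the numbers are hypothetical, chosen only to show how an asset database of profiled hardware could be filtered against a request such as 1 TB at five-nines and 5,000 IOPS.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Array:
    """One profiled entry in a hypothetical asset database."""
    name: str
    availability: float  # fraction of uptime, e.g. 0.99999 for five-nines
    max_iops: int
    free_tb: float


@dataclass
class Request:
    """What the admin or orchestration system asks for."""
    capacity_tb: float
    min_availability: float
    min_iops: int


def provision(inventory: List[Array], req: Request) -> Optional[Array]:
    """Return the array chosen for the request, or None if nothing qualifies."""
    candidates = [
        a for a in inventory
        if a.free_tb >= req.capacity_tb
        and a.availability >= req.min_availability
        and a.max_iops >= req.min_iops
    ]
    if not candidates:
        return None
    # Assumed policy: best fit, i.e. the qualifying array with the least
    # IOPS headroom, so faster hardware stays free for more demanding requests.
    chosen = min(candidates, key=lambda a: a.max_iops)
    chosen.free_tb -= req.capacity_tb  # reserve the capacity
    return chosen


# Illustrative inventory; the figures are invented, not vendor specifications.
inventory = [
    Array("isilon-1", availability=0.9999, max_iops=2_000, free_tb=500.0),
    Array("vnx-1", availability=0.99999, max_iops=8_000, free_tb=20.0),
    Array("vmax-1", availability=0.999999, max_iops=50_000, free_tb=10.0),
]

placed = provision(inventory, Request(capacity_tb=1.0, min_availability=0.99999, min_iops=5_000))
print(placed.name if placed else "no hardware meets the request")
```

The point of the sketch is that the requester never names an array; it states requirements, and the controller resolves them against whatever hardware happens to be in the pool.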
ViPR also has a northbound API for creating what EMC calls "storage services." These allow programmatic extension of the ViPR ecosystem to support new storage modalities, formats and applications.
For example, out of the box, EMC will support object storage and Hadoop using a ViPR-based software overlay. The ViPR Object Data Service exposes REST APIs for Atmos (EMC's object storage appliance), Amazon S3 and Swift (the native OpenStack object store service), meaning pools can potentially use both cloud services and local VNX and Isilon arrays masquerading as object stores. In essence, ViPR tricks applications into seeing a familiar S3 or Swift object store even though the back end may be a traditional file or block storage device. Indeed, this prestidigitation allows data written as objects by a cloud application to be accessed as files by legacy apps.
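To make the object-and-file duality concrete, here is a toy, in-memory sketch of the idea. It is not the ViPR Object Data Service or any real S3 or Swift client; the `UnifiedStore` class, its method names and the bucket-to-directory mapping are all assumptions made for illustration. It shows only the principle that one stored copy of the data can be reached through an object-style interface and a file-style interface.

```python
from typing import Dict, List


class UnifiedStore:
    """Hypothetical overlay: one copy of the data, two access styles."""

    def __init__(self) -> None:
        # Canonical path -> bytes. Assumed mapping: bucket = top-level directory.
        self._blobs: Dict[str, bytes] = {}

    # Object-style interface, as a cloud application might use it.
    def put_object(self, bucket: str, key: str, data: bytes) -> None:
        self._blobs[f"/{bucket}/{key}"] = data

    def get_object(self, bucket: str, key: str) -> bytes:
        return self._blobs[f"/{bucket}/{key}"]

    # File-style interface, as a legacy application might use it.
    def read_file(self, path: str) -> bytes:
        return self._blobs[path]

    def list_dir(self, directory: str) -> List[str]:
        """Flat listing of every path stored under the directory."""
        prefix = directory.rstrip("/") + "/"
        return sorted(p[len(prefix):] for p in self._blobs if p.startswith(prefix))


store = UnifiedStore()
store.put_object("logs", "2013/may.txt", b"written as an object")

# The same bytes are visible through the file-style view, with no copy made.
print(store.read_file("/logs/2013/may.txt"))
print(store.list_dir("/logs"))
```

A real overlay would also have to reconcile metadata, permissions and consistency between the two views, which this sketch ignores.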
Similar to the way it provides object support, ViPR can also provision pools as a Hadoop file system (HDFS). This is significant because it means data stored in a traditional block storage VMAX database can be exposed to big data Hadoop applications without moving it to a separate file repository. Theoretically, this could allow the same set of physical data to serve as a traditional transactional database while simultaneously being incorporated into a big data analytics system, in place. "You can run analytics across entire heterogeneous storage infrastructure," said Ratcliffe.
There are over a dozen sessions devoted to software-defined storage and data centers at EMC World, and it's clear ViPR is EMC's contribution to the storage component of that vision. It's hard to overstate the significance of this move, as EMC is at risk of being undercut by less expensive rivals in a rapidly commodifying storage market, and as software becomes more important than hardware. That said, while ViPR looks good on paper and in demos, its ultimate success depends on EMC quickly making good on all the promises.