In a rarely seen demonstration of self-reflection, several virtualization-technology vendors, including Compaq Computer Corp., DataCore Software, FalconStor Software, StoreAge Networking Technologies, StorageTek and Veritas Software, have concluded that IT decision makers simply do not understand the value of storage virtualization--and that their own infighting might just be to blame.
Our discussions with several IT managers and systems administrators at recent trade shows suggest that the vendors are partly right--not about the education problem, but about the market infighting. Prospective consumers said repeatedly that virtualization is simple to grasp: "Virtualization means the mapping of physical disk drives to logical volumes--something that has been done for years on large-scale, server-attached storage arrays," a systems administrator at a major Idaho bank said, clearly "getting it." "The problem is that all the vendor marketing hype is muddying this definition. It seems like virtualization means different things to different vendors."
In fact, many prospective buyers probably understand virtualization too well to feel comfortable using it in their shops. "I am not convinced that virtualizing 120 TB of storage is a good thing to do," an enterprise storage manager for a Detroit-based insurance company said. He added that vendors still need to work out many issues, such as the performance impact of queuing many application I/O requests behind a virtualization server. "I have to wonder how much storage traffic it takes to saturate one of these virtualization engines. I hear different things from different vendors, and it worries me," he said.
The bottom line is that while vendors are going to great lengths "educating" consumers to create a market--investing time, money and effort to participate in a ceaseless flood of informational forums and storage conferences over the coming months--IT managers argue that education isn't the issue. Until a clear winner emerges in the virtualization wars, and until additional software has been created to take advantage of virtualization and deliver business value, no forum, conference or junket will convince users to buy in.
Defining Terms
Storage virtualization is not new. The concept of taking many physical disk drives and representing them with a volume name or drive letter started with the introduction of RAID arrays in the late 1980s. The virtualization discussion has been resurrected in connection with burgeoning architectures for networked storage, such as SANs (storage-area networks).
In the case of storage virtualization, virtual volumes represent potentially great numbers of physical disk drives or other disk-based storage constructs, such as array partitions, which are themselves virtual volumes created from a subset of disks in the array. In a storage network, just about anything represented by a LUN (logical unit number) can be aggregated into a virtual volume. Many industry insiders even describe the software that creates virtual volumes in such derogatory terms as "LUN splicers" or "LUN carvers."
For several decades, disk virtualization has been commonly provided by code executed on the controllers of large disk arrays. Software on the controller has enabled the aggregation of multiple disk drives in the array and their presentation to server operating systems as virtual volumes. As storage arrays are increasingly linked together into Fibre Channel fabrics (or into true storage networks based on TCP/IP protocols), vendors are seeking to provide the virtualization functionality of array controllers as a SAN software layer--a virtualization engine.
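The mapping itself is simple enough to sketch in a few lines. The Python below is a minimal illustration, not any vendor's implementation: it concatenates runs of blocks from several LUNs into one linear virtual volume and translates a virtual block number into a LUN and physical block. The names Extent, VirtualVolume and resolve, and the LUN labels, are hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """A contiguous run of blocks on one backing LUN (hypothetical model)."""
    lun: str          # illustrative LUN label, e.g. "lun0"
    start_block: int  # first block of the run on that LUN
    block_count: int  # length of the run in blocks


class VirtualVolume:
    """Concatenates extents into one linear virtual block address space."""

    def __init__(self, extents):
        self.extents = list(extents)
        # offsets[i] is the virtual block number at which extent i begins.
        self.offsets = []
        total = 0
        for ext in self.extents:
            self.offsets.append(total)
            total += ext.block_count
        self.size = total

    def resolve(self, virtual_block):
        """Translate a virtual block number to (lun, physical_block)."""
        if not 0 <= virtual_block < self.size:
            raise IndexError("block outside virtual volume")
        # Find the last extent whose starting offset is <= virtual_block.
        i = bisect_right(self.offsets, virtual_block) - 1
        ext = self.extents[i]
        return ext.lun, ext.start_block + (virtual_block - self.offsets[i])


# Two LUN slices presented to the server as one 3,000-block volume.
volume = VirtualVolume([
    Extent("lun0", 0, 1000),
    Extent("lun1", 500, 2000),
])
```

Whether this table lives in an array controller, on the host or in a SAN appliance is precisely what the vendors are arguing about; the translation step is the same in each case.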
Vendors boast that these virtualization engines offer various cost advantages--depending on the marketing brochure. StoreAge, for example, claims its virtualization appliance enables more effective storage-capacity usage and delivers high availability to storage organized in a SAN. DataCore Software says virtualization is required for storage to scale nondisruptively and to deliver "just enough storage, just in time." FalconStor claims virtualization will make data more secure in a SAN and will provide server-free backup. Veritas says virtualization delivers storage consolidation and intelligent provisioning of storage to applications.
Truth is, virtualization by itself delivers none of these benefits. It is simply an enabling technology--one that other applications can leverage to reduce costs and create other operational efficiencies, if and when the industry can decide on a common method to deliver them. Given virtualization's disk-array pedigree, most IT decision makers understand this point all too well.
Where to Virtualize
Today, much of the industry debate about virtualization revolves around how and where to provide virtualization services. EMC and other large-scale array vendors say they see little reason to deviate from the tried-and-true approach of virtualizing storage at the array, using controllers specifically engineered for the job. However, SAN virtualization vendors have advanced out-of-the-box approaches that can be implemented via application host-based software, appliances in the data path or virtualization servers outside the data path.
Veritas has supported the position that virtualization is best performed by software on each server host. The software becomes part of how applications relate to the storage infrastructure: Applications do not "see" storage devices--they see only volumes created by the virtualization engine.
Such a strategy is derivative of tape-drive virtualization. In an environment in which servers can access different tape devices from different vendors, virtualizing the devices so the backup application "believes" it is working with a specific make and model tape drive is beneficial. The tape-drive virtualization software ensures that data is sent and received using the semantics and command set employed by individual tape devices.
Critics of host-based virtualization complain that the strategy is expensive and fraught with hassles. Software must be maintained on each server and is subject to a broad range of errors and failures. Moreover, host-based software imposes a drain on server operating system resources, consuming server cycles that could be better spent on application processing.
These criticisms have given rise to in-band and out-of-band, or symmetrical and asymmetrical, approaches.
In-band virtualization. The in-band, or symmetrical, method uses a virtualization appliance or server placed in the data stream between the application host and the SAN switch. In effect, the appliance virtualizes the storage connected to its back end (the storage components connected to the SAN switch) and presents virtual volumes to the application servers connected to its front end.
Such a strategy moves the virtualization function off the application host and establishes another "server tier" that performs a function like that of an array controller. It further provides a centralized location where additional storage-management applications can be hosted.
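A toy model makes the in-band data path concrete. In this hedged Python sketch, every class and method name is hypothetical and the "LUNs" are just in-memory lists; the point is only that hosts address virtual volumes and every read and write passes through the appliance on its way to the back-end storage.

```python
class BackendLun:
    """Stand-in for a storage device behind the SAN switch (hypothetical)."""

    def __init__(self, name, block_count):
        self.name = name
        self.blocks = [b""] * block_count

    def read(self, block):
        return self.blocks[block]

    def write(self, block, data):
        self.blocks[block] = data


class InBandAppliance:
    """Sits in the data path: hosts see virtual volumes, never the LUNs."""

    def __init__(self):
        self.volumes = {}       # volume name -> list of (BackendLun, block)
        self.ios_forwarded = 0  # every host I/O is counted here

    def create_volume(self, name, luns):
        # Concatenate every block of every LUN into one virtual address space.
        self.volumes[name] = [
            (lun, block) for lun in luns for block in range(len(lun.blocks))
        ]

    def write(self, volume, virtual_block, data):
        lun, block = self.volumes[volume][virtual_block]
        self.ios_forwarded += 1
        lun.write(block, data)

    def read(self, volume, virtual_block):
        lun, block = self.volumes[volume][virtual_block]
        self.ios_forwarded += 1
        return lun.read(block)


lun_a = BackendLun("a", 4)
lun_b = BackendLun("b", 4)
appliance = InBandAppliance()
appliance.create_volume("vol0", [lun_a, lun_b])
appliance.write("vol0", 5, b"payload")   # lands on block 1 of LUN "b"
data = appliance.read("vol0", 5)
```

The ios_forwarded counter is the crux of the design: it is a natural place to host caching or management functions, and it is also the place where all traffic converges.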
Critics argue that in-band appliances, such as those offered by DataCore, FalconStor and StorageTek, may become choke points as the storage infrastructure grows and the volume of traffic traversing the virtualization servers increases. Those vendors respond with performance test data and clustering/failover strategies to show how storage traffic can pass readily through the virtualization layer. DataCore and FalconStor, for example, claim they have actually decreased response times and improved throughput in small file transfers through the use of caching technologies. Still, those claims have yet to be tested at scale. As The Evaluator Group's Kerns points out, few virtualization solutions have been deployed in enterprise data-center environments, where hundreds of terabytes of storage may be involved. Since most in-band appliances are based on PCI bus commodity-server platforms, whether they can scale to meet extremely large I/O requirements without becoming choke points in the data path remains a valid question.
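A back-of-the-envelope calculation shows why the choke-point question is hard to dismiss. The Python sketch below rests on loudly stated assumptions: it uses theoretical peak PCI bandwidth (bus width times clock rate), assumes each in-band I/O crosses a single shared bus twice (in from the host side, out to the storage side), ignores caching and protocol overhead, and picks an arbitrary 20 MB per second of sustained load per host. None of these figures comes from a vendor; they are illustrative only.

```python
def bus_peak_mb_per_s(width_bits, clock_mhz):
    """Theoretical peak bus bandwidth in MB/s: bytes per cycle times MHz."""
    return (width_bits / 8) * clock_mhz


def hosts_supported(width_bits, clock_mhz, per_host_mb_per_s, bus_crossings=2):
    """Rough count of hosts one appliance bus could carry.

    Assumption: each I/O crosses the same bus `bus_crossings` times,
    so usable bandwidth is the peak divided by that factor.
    """
    usable = bus_peak_mb_per_s(width_bits, clock_mhz) / bus_crossings
    return int(usable // per_host_mb_per_s)


# Assumed per-host sustained load of 20 MB/s (illustrative, not measured).
narrow_bus_hosts = hosts_supported(32, 33.33, per_host_mb_per_s=20)
wide_bus_hosts = hosts_supported(64, 66.66, per_host_mb_per_s=20)
```

Real appliances would fare better or worse depending on cache hit rates, the number of buses and the I/O mix, which is exactly why buyers want measured numbers rather than brochure claims.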
Out-of-band virtualization. An alternative to the symmetrical approach is the asymmetrical strategy, promulgated by Compaq and StoreAge. Compaq originally conceived of an out-of-band virtualization engine, which it branded as VersaStor, to fulfill the storage management and control functionality requirements described in its Enterprise Network Storage Architecture (ENSA) white paper. This white paper is considered a fundamental document describing contemporary SAN architecture. In effect, the VersaStor server sits on the SAN sidelines, evaluating application requirements and capacity utilization, then checking available storage resources connected to the SAN infrastructure. It virtualizes the SAN storage and writes virtual volume descriptions to a proprietary chip on the host bus adapter installed in the application server.
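The asymmetrical division of labor can be sketched the same way. This Python model is an assumption-laden illustration, not a description of VersaStor or any shipping product: a hypothetical MapServer only defines and publishes volume maps, while a hypothetical Host holds its copy of the map (standing in for the adapter chip or host-resident description) and performs I/O directly against the SAN storage, here modeled as plain dictionaries.

```python
class MapServer:
    """Out-of-band engine: defines and publishes volume maps only."""

    def __init__(self):
        self.volume_maps = {}

    def define_volume(self, name, extents):
        # extents: list of (lun_name, start_block, block_count) tuples
        self.volume_maps[name] = list(extents)

    def publish(self, name, host):
        # Push a copy of the map to the host; no data flows through here.
        host.volume_maps[name] = list(self.volume_maps[name])


class Host:
    """Application server that resolves addresses locally, then does I/O."""

    def __init__(self, san_luns):
        self.san_luns = san_luns  # lun_name -> {block: data}, direct access
        self.volume_maps = {}

    def _resolve(self, volume, virtual_block):
        if virtual_block < 0:
            raise IndexError("block outside virtual volume")
        offset = virtual_block
        for lun_name, start, count in self.volume_maps[volume]:
            if offset < count:
                return lun_name, start + offset
            offset -= count
        raise IndexError("block outside virtual volume")

    def write(self, volume, virtual_block, data):
        lun_name, block = self._resolve(volume, virtual_block)
        self.san_luns[lun_name][block] = data

    def read(self, volume, virtual_block):
        lun_name, block = self._resolve(volume, virtual_block)
        return self.san_luns[lun_name].get(block)


san = {"lun0": {}, "lun1": {}}
server = MapServer()
server.define_volume("vol0", [("lun0", 0, 100), ("lun1", 50, 100)])
host = Host(san)
server.publish("vol0", host)
host.write("vol0", 120, b"payload")   # resolves to block 70 of "lun1"
```

Because MapServer has no read or write path at all, it cannot become a data-path bottleneck; the trade-off is that the map must live somewhere on or near each host, which is where the proprietary-chip debate begins.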
The industry has turned a cold shoulder to Compaq's suggestion of a proprietary VersaStor chip, despite the vendor's argument that writing the virtual volume descriptions to application host disk drives would make such descriptions vulnerable to hackers. StoreAge proposes an alternative that circumvents the proprietary chip issue by writing volume descriptions to application host systems.
Today, such asymmetrical solutions must compete with symmetrical and host-based virtualization approaches for market share. No clear winner has emerged, though most vendors can point to a few reference accounts to back their basic claims.