Storage Virtualization Revisited

Will your storage infrastructure ever be as simple as managing one LUN? Probably not, but it's a goal worth pursuing.

June 7, 2004


We agree. In fact, we also devoted our May 27, 2002, cover to the state of storage virtualization technology (see "Nice, Neat Storage: The Reality"), and we have now revisited the topic.

This time around, we liked what we saw from vendors; see "LUN Is the Loneliest Number" for the results from our latest RFI. But two years ago, we withheld our Editor's Choice award because we weren't impressed with any of the virtualization road maps submitted for our review, and we were put off by the oversell of the technology: Vendors were writing checks with their marketing brochures that we doubted their products could cash.

All Virtualization Is In-Band

For example, vendors claimed their wares could bring about a well-managed storage infrastructure, as though virtualization software were all that was required to magically create a stable and well-groomed Fibre Channel SAN that could automatically provision applications with the storage they needed. The problem with this assertion is that virtualization is only an "enabling" layer of software--other layers are required on top of virtualization technology to manage the enterprise storage infrastructure.

Moreover, we perceived serious problems with virtualizing heterogeneous storage shares that were beyond the capacity of virtualization software vendors to address. Although virtualization might let you take a LUN exposed by an EMC platform and combine it with a LUN from a Hitachi Data Systems array to form a virtual volume, common sense dictates that you'd never want to risk it. Given that array designs and onboard features can vary greatly, we thought the resulting virtual volume would be unstable. In addition, many end users told us they wouldn't trust their critical data to a volume constructed without regard for the hostility between the vendors of the source LUNs.


Finally, like many folks, we followed the lead of industry analysts in describing four virtualization categories: host-based software, in-band, out-of-band and array-based. We now realize we had three categories too many. The reality: All virtualization is effectively in-band because it performs its function in the data path. As shown in "All Virtualization Is In-Band", regardless of the approach taken by the virtualization software vendor, all data must pass through a virtualization layer to get to where it needs to go. So, functionally speaking, all virtualization approaches are the same, though architectural variations do exist.

For example, host-based virtualization software works between the application read/write operation and the storage target-device driver. In effect, it intercepts your reads and writes and points them to its own virtual volumes rather than to physical volumes. In so doing, host-based virtualization is in-band virtualization.
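As a rough illustration of that interception, the sketch below translates a block address on a virtual volume into a physical device and block by concatenating extents. It is a minimal model only: the `Extent` and `VirtualVolume` names, the device labels and the extent sizes are hypothetical and are not taken from any vendor's product.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Extent:
    """A contiguous run of blocks on one physical device (hypothetical model)."""
    device: str
    start_block: int
    num_blocks: int


class VirtualVolume:
    """Concatenates extents into one address space, as a host-based layer might."""

    def __init__(self, extents: List[Extent]) -> None:
        self.extents = extents

    @property
    def size(self) -> int:
        # Total number of blocks the application sees.
        return sum(extent.num_blocks for extent in self.extents)

    def resolve(self, virtual_block: int) -> Tuple[str, int]:
        """Translate a virtual block number to (device, physical block)."""
        if virtual_block < 0:
            raise IndexError("negative block number")
        offset = virtual_block
        for extent in self.extents:
            if offset < extent.num_blocks:
                return extent.device, extent.start_block + offset
            offset -= extent.num_blocks
        raise IndexError("block beyond end of virtual volume")


# Made-up example: one virtual volume built from LUNs on two arrays.
volume = VirtualVolume([
    Extent("array_a_lun0", start_block=0, num_blocks=1000),
    Extent("array_b_lun3", start_block=500, num_blocks=2000),
])
```

Every application read or write would pass through something like `resolve()` before reaching a device driver, which is why the layer sits in the data path.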

The key drawback of host-based virtualization is the cost to license software for every host system. There's also the challenge of keeping the virtualization service on one host up-to-date with the virtualization services on all other hosts.

Veritas, a longtime advocate of such a strategy, is moving away from host-centric deployment. In a recent announcement, it said it's working with switch vendors to centralize its virtualization functionality on a SAN switch. Meanwhile, however, Microsoft has added volume virtualization directly to the Windows Server 2003 operating system. So for the foreseeable future, host-based virtualization will likely continue as one approach for simplifying storage.

When Veritas finalizes its migration, it will join a cadre of vendors that already do what the analysts in our last article categorized as in-band virtualization. DataCore Software, FalconStor Software, Hewlett-Packard, StorageTek and even IBM are part of this group, with switch makers Cisco Systems and Brocade Communications Systems leading the way in switch-based offerings.

Other virtualization approaches, notably those of StoreAge, are inappropriately characterized by analysts as "out of band." Because they produce and update in-band device drivers, loaded on servers and through which data must travel to reach a target aggregation of LUNs, their virtualization function is in-band by nature. The only difference with so-called out-of-band approaches is just that--the mechanism for creating the virtualization layer is located outside the data path.

These products function like a coach on the sidelines (the out-of-band virtualization server) calling in plays to the team on the field (servers). Any way you cut it, however, the virtualization layer introduced is still functioning in the data path.
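The coach-and-team split can be sketched in a few lines. In this hypothetical model, a `MappingServer` (the coach) builds and publishes the block map over a control path, while a `HostDriver` (the player) consults only its local copy on every I/O. The class names, the versioning scheme and the sample map are assumptions made for illustration, not a description of StoreAge's or any other vendor's implementation.

```python
from typing import Dict, Tuple

# virtual block -> (LUN name, physical block); a deliberately tiny map format
Mapping = Dict[int, Tuple[str, int]]


class MappingServer:
    """Out-of-band 'coach': builds maps but never touches application data."""

    def __init__(self) -> None:
        self.version = 0
        self.mapping: Mapping = {}

    def publish(self, mapping: Mapping) -> Tuple[int, Mapping]:
        # Each published map gets a higher version number.
        self.version += 1
        self.mapping = dict(mapping)
        return self.version, dict(self.mapping)


class HostDriver:
    """In-band 'player': every read and write passes through resolve()."""

    def __init__(self) -> None:
        self.version = 0
        self.mapping: Mapping = {}
        self.io_count = 0

    def install(self, version: int, mapping: Mapping) -> None:
        # Control-path update; stale or duplicate maps are ignored.
        if version > self.version:
            self.version, self.mapping = version, mapping

    def resolve(self, virtual_block: int) -> Tuple[str, int]:
        # Data-path lookup; uses the local copy only, never the server.
        self.io_count += 1
        return self.mapping[virtual_block]


server = MappingServer()
driver = HostDriver()
driver.install(*server.publish({0: ("lun_a", 100), 1: ("lun_b", 7)}))
```

The server is consulted only when the map changes, but each I/O still goes through `HostDriver.resolve()`, so the translation itself remains in the data path.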

The final category of virtualization described by analysts refers to technology built directly on array controllers. The very fact that an array presents a LUN is a virtualization of the physical storage capacity of disks in the array. Controller-based virtualization is also clearly a form of in-band virtualization. Data entering the array encounters a virtualization layer that directs it to the right disk or combination of disks presented by the controller as a LUN. This technique is shared by virtually all array products except some un-RAID'd JBODs.
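To make the controller's role concrete, here is a minimal sketch of how a LUN block address might be directed to a disk, assuming plain RAID 0-style striping with no parity or mirroring. The `locate` function, the four-disk layout and the eight-block stripe size are illustrative assumptions; real controllers layer parity, caching and sparing on top of this.

```python
from typing import Tuple


def locate(lun_block: int, num_disks: int, stripe_blocks: int) -> Tuple[int, int]:
    """Map a LUN block to (disk index, block on that disk) under simple striping."""
    if lun_block < 0 or num_disks < 1 or stripe_blocks < 1:
        raise ValueError("invalid geometry or block number")
    # Which stripe unit the block falls in, and its offset inside that unit.
    stripe_index, offset = divmod(lun_block, stripe_blocks)
    # Stripe units are dealt out to the disks round-robin.
    disk = stripe_index % num_disks
    # Each full pass over all disks advances one stripe unit on every disk.
    disk_block = (stripe_index // num_disks) * stripe_blocks + offset
    return disk, disk_block


# Made-up geometry: 4 disks, 8 blocks per stripe unit.
print(locate(0, 4, 8))   # (0, 0)
print(locate(8, 4, 8))   # (1, 0)
print(locate(45, 4, 8))  # (1, 13)
```

The host sees a single LUN, yet each request is redirected by the controller to a particular disk, which is the same in-path translation performed by the other approaches.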

If we were cynics, we might argue that analysts carved out these four virtualization techniques to facilitate product differentiation for their vendor clientele, not for any real technical reasons. Regardless, though, they touched off a holy war around the relative merits and demerits of in-band and out-of-band techniques that continues at conferences and trade shows. Enough is enough.

The proof is in the pudding: Both DataCore (in-band) and StoreAge (out-of-band) have posted strong sales during the past 18 months. Their product claims have been validated by the only test that really counts: real-world consumer success. To a one, customers that have deployed and tested each vendor's solution before buying it--something that must be done to validate the fit of this technology with your infrastructure in any case--have reported that it delivers on promises. The complex storage arrangement is rendered simpler and, hence, more manageable, regardless of the in-band or out-of-band location of the virtualization server.

Intuitively, you might expect an out-of-band strategy to be better than an in-band approach. "Queuing theory alone suggests you can't push traffic between 800 servers and 180 TB of disk on high-end storage-array platforms through an in-band virtualization server that is essentially a PC," said an IT manager at a Michigan health-insurance company after reading our previous article. "It's intuitively obvious that you'd quickly saturate the bus, introduce significant latency and insert a potential choke point into the data path."

This view was seized on by out-of-band providers to bolster their case against in-band rivals like DataCore. However, in this case, intuition proved incorrect.

Three factors determine how well in-band services perform, according to StorageTek chief technologist Randy Chalfant: processor efficiency; parallelism, or how many processes can be accomplished at the same time; and data pathing. A cursory read of technical white papers and case studies confirms that in-band virtualization vendors have leveraged increasing processor speeds and, in some cases, parallel server architectures and interrupt-reduced operating systems to avoid bus saturation and choke points. At one point, there was even talk from Softek, a licensee of DataCore's SANsymphony product, about porting the virtualization service to an IBM z/OS mainframe, whose parallel architecture would let it aggregate and manage nearly as many LUNs as you could throw at it.

The bottom line is that the segmentation of the virtualization market into in-band and out-of-band is best relegated to the dustbin of IT miscellany.

The Lego Factor

We remain cautious about virtualization vendor claims of creating stable volumes from any available LUNs. A former CEO of EMC once remarked that there was no difference between his company's arrays and those of his nearest competitor: "We're both selling a box of Seagate hard drives."

This would seem to lend credence to the view that virtualization can be used across any platform, regardless of the vendor logo on the tin. However, the CEO quickly added that the differentiator between arrays would be the value-added software created in microcode and added to the array controller, which increasingly functions like a standalone computing platform. This functionality continues to be the bane of effective management in a heterogeneous storage infrastructure and, in our view, poses a significant challenge to mix-and-match LUN aggregation, the main function of contemporary virtualization products.

Many of the purported cross-platform barriers attributed to different array architectures are more political than technical, Tyrrell says. Although he concedes that most shops don't aggregate LUNs from competing Big Iron array vendors, he adds that storage pools comprising LUNs carved from heterogeneous arrays are increasingly used to provide building blocks for virtual volumes.

Nevertheless, we're skeptical.

Our final gripe--the view articulated by some vendors that virtualization is all that is required to make storage manageable--remains. Vendors have ratcheted back the marketing hype on this point, preferring instead to describe their virtualization products as one component of a storage capacity management service. This is the position of Softek, whose CTO, Nick Tabellion, is the other patent holder on IBM's Systems Managed Storage technology. Softek and others are working to integrate multiple management capabilities into a unified product set, of which virtualization is only one part and not necessarily the most important.

Despite our skepticism, the case for virtualization is a strong one. Given suitable arrays and a Fibre Channel fabric (or an IP storage network in the latest products from FalconStor), virtualization can adjust the storage capacity allocated to different enterprise applications with minimal disruption. It can also scale capacity behind applications that need more elbow room (see "The Potential Capacity Allocation Benefit of Virtualization").

Less often perceived is the role that virtualization can play in improving capacity-utilization efficiency. Companies buy more storage than they need, and they use what they have inefficiently. In part, this is because of the way vendors package storage, but it's also a reflection of application-software's storage-capacity recommendations.

The Potential Capacity Allocation Benefit of Virtualization

As shown in "The Potential Capacity Utilization Benefit of Virtualization," a substantial portion of storage is allocated to an application per the software company's sizing recommendations, but is unused. Still more space is wasted by the additional capacity sold by array vendors as "elbow room" to meet future capacity requirements. The result is that the storage you actually need or use is typically a subset of what you end up buying.

Limitations

Some vendors argue that virtualization can keep track of overall capacity and let you put unused but allocated capacity to use in the service of other applications. There are limitations to this capability, of course. For one, users must know what storage is a candidate for reallocation, and whether the storage has the performance, reliability and other characteristics that would make it suitable for use by a particular application. Plus, we still confront the limitations imposed by heterogeneous storage.
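A small sketch shows what identifying reallocation candidates might involve: measuring how much allocated capacity is actually used, then listing idle capacity only on storage whose characteristics match what the receiving application needs. The `Allocation` record, the tier labels, the 20 percent growth reserve and all of the figures are made-up assumptions for illustration; they are not drawn from any product or from our RFI responses.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Allocation:
    application: str
    tier: str          # hypothetical label for performance/reliability class
    allocated_gb: int
    used_gb: int

    @property
    def idle_gb(self) -> int:
        # Capacity allocated to the application but not yet written to.
        return self.allocated_gb - self.used_gb


def utilization(allocations: List[Allocation]) -> float:
    """Fraction of allocated capacity that is actually in use."""
    allocated = sum(a.allocated_gb for a in allocations)
    used = sum(a.used_gb for a in allocations)
    return used / allocated if allocated else 0.0


def reclaim_candidates(allocations: List[Allocation], required_tier: str,
                       headroom_pct: int = 20) -> Dict[str, int]:
    """Idle GB per application on the required tier, after a growth reserve."""
    candidates: Dict[str, int] = {}
    for a in allocations:
        if a.tier != required_tier:
            continue  # wrong performance or reliability class for the new use
        reserve = a.used_gb * headroom_pct // 100
        spare = a.idle_gb - reserve
        if spare > 0:
            candidates[a.application] = spare
    return candidates


# Made-up inventory.
inventory = [
    Allocation("erp", "fc_raid10", allocated_gb=1000, used_gb=400),
    Allocation("mail", "fc_raid10", allocated_gb=500, used_gb=450),
    Allocation("archive", "sata_raid5", allocated_gb=2000, used_gb=500),
]
print(reclaim_candidates(inventory, "fc_raid10"))  # {'erp': 520}
```

In this example the archive volume has the most idle space, but it is excluded because it sits on a different tier, which is the kind of suitability check described above.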

The Potential Capacity Utilization Benefit of Virtualization

All that said, the capacity utilization management function is an unsung benefit of virtualization, though it may be the most important. Combined with off-loading the point-in-time mirroring functionality that wastes so much storage on our most expensive array platforms, and with effective policy-based data management to remove the stale or contraband data that's taking up valuable space, this virtualization function can help companies get considerably more from their storage platforms.

In the final analysis, there's a more persuasive case to be made for virtualization today than there was at the time of our last review. The hype around the technology has diminished, and the practical benefits are more clearly defined. We remain cautious, however, about some of the particulars--cross-platform LUN aggregation, for example--and recommend that you test before you buy to confirm vendor claims. The soft economy has made most vendors amenable to test drives. Don't be surprised to hear them hum the "one LUN" tune while they work through your test plan.

JON WILLIAM TOIGO is CEO of storage consultancy Toigo Partners International, founder and chairman of the Data Management Institute, and author of 13 books, including Disaster Recovery Planning: Preparing for the Unthinkable (Pearson Education, 2002) and The Holy Grail of Network Storage Management (Prentice Hall PTR, 2003). Write to him at [email protected].

Our editors nicknamed this package "Storage Virtualization II: Son of Storage Virtualization" because we received monster responses from vendors to our RFI. We were a little concerned because, in the Big Daddy of NETWORK COMPUTING virtualization coverage--our May 2002 cover package titled "Nice, Neat Storage: The Reality"--we panned all the responses and withheld our Editor's Choice award, a rare occurrence.

At that time, we concluded that market infighting and over-the-top vendor hype had put a damper on enterprise adoption of storage virtualization technologies. Although we remain skeptical of some claims--for example, that mixing and matching vendor arrays is OK and that virtualization is a storage-management panacea--we liked what we heard from vendors, analysts and IT consumers this time around.

In "The One-Disk Enterprise: Virtualization Revisited," Jon Toigo dispels some remaining FUD surrounding storage virtualization, and in "LUN Is the Loneliest Number", we analyze four vendor takes on engineering the marriage of two disparate, some may even say irreconcilable, storage infrastructures. This time, we awarded our Editor's Choice award to StoreAge, but all the proposals are worth consideration (see the complete RFI responses).
