Hybrid Clouds: No Easy Concoction

If vendors don't help customers mesh their internal clouds with public cloud services, customers will find a way around them.

Charles Babcock

September 3, 2009


Every data center provisions its workloads for a worst-case scenario. IT managers put an application on a server with extra memory, CPU, and storage to make sure the application can meet its heaviest workload of the month, quarter, or year and grow with the business. This approach is so deeply ingrained in IT that, prior to virtualization, applications typically used 15% or less of available CPU and other resources. Storage might reach 30% utilization. Energy was cheap, spinning disks were desirable, and abundant CPU cycles were always kept close at hand.

In today's economic climate, such compulsive overprovisioning and inefficiency are no longer acceptable. What if, instead, applications throughout the data center could run at closer to 90% utilization, with the workload spikes sent to cloud service providers (a process called "cloudbursting")? What if 85% of data center space and capital expenses could be recouped, with a small portion of that savings allocated for the expense of sending those bursts of computing to the public cloud?
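
The arithmetic behind that claim is simple enough to sketch. The back-of-the-envelope calculation below assumes the round utilization figures cited above rather than any measured data, and shows where a number like 85% comes from:

    # Back-of-the-envelope cloudbursting math using the article's round numbers.
    # Assumption: steady-state work that averages ~15% utilization today could run
    # at ~90% on a right-sized internal cloud, with peaks burst to a public cloud.
    current_utilization = 0.15
    target_utilization = 0.90

    # Share of today's provisioned capacity that could be retired or repurposed.
    capacity_freed = 1 - current_utilization / target_utilization
    print(f"Capacity freed: {capacity_freed:.0%}")  # about 83%, in line with the 85% cited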


This tantalizing possibility--enterprise IT organizations managing an internal cloud that meshes seamlessly with a public cloud, which charges on a pay-as-you-go basis--embodies the promise of the amorphous term cloud computing. Step one with virtualization has been server consolidation. The much bigger benefit will come with the ability to move workloads on and off premises. "Anyone can build a private cloud," says Rajesh Ramchandani, a senior manager of cloud computing at Sun Microsystems. "The gain comes if you can leverage the hybrid model."

As Sun CTO and cloud advocate Greg Papadopoulos suggested during Structure 09 in San Francisco on June 25, "It will be really expensive and hard to move legacy pieces over. It's a much better strategy figuring out what are the new pieces that I want to move to the cloud."

Papadopoulos was implicitly pointing out that most public cloud services run virtual machines based on an x86 architecture. Sun's Solaris has been ported to x86, but IBM's AIX and most other Unixes have not, to say nothing of the non-Unix operating systems that preceded them. But those operating systems run mostly large, proprietary databases, the stuff that's hardly ripe for the public cloud anyway.

Other Obstacles
Moving data center workloads to a public cloud would immediately run into two more obstacles: the need to use the same hypervisor in both clouds, and the need to match up server chipsets. If you think you're already paying enough for virtualization software, prepare to pay more if you ship workloads to the public cloud. Call it vendor lock-in.

VMware and other hypervisor vendors have agreed only to create a common "import format," not a neutral runtime format. To avoid the complication of reconverting from the public cloud's format to your own, you'll want to use the same hypervisor if you plan to get your workload back behind the firewall in its original configuration. (Even that wasn't possible with the initial offering of Amazon.com's Elastic Compute Cloud, or EC2. You shipped off a task, it ran, then it disappeared. You got the results, but any special settings or other one-time-only information contained in the configuration and its data were simply lost. Amazon's Elastic Block Storage had to be invented to give the whole workload persistence.)

Did you want the option of using open source Xen or Linux KVM in the cloud, but you use VMware in-house? Too bad. Kiss some of those cloud savings goodbye as you buy more VMware.

Virtualization's live migration feature, where a task is whipped off one physical server and dropped onto another before its users are aware of it, would appear to give you the option of moving workloads at will between your private and public clouds. VMware's VMotion and Citrix Systems' XenMotion offer this capability today; Microsoft says its Hyper-V tools will be able to do so by the end of this year.

But so far, live migration can take place only between physical servers that share exactly the same chipsets. That's because different generations of AMD and Intel chips incorporate minute changes to the x86 instruction set, sometimes even within different iterations of the same product line, such as Xeon. Want to shift a spike in your workload off to the public cloud? First check that you and your provider are running servers with exactly the same chipsets.
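
For Linux shops, a rough compatibility check is easy to sketch, though hypervisor vendors' own tools do it far more thoroughly. In the illustrative example below, the host names are placeholders, and comparing /proc/cpuinfo feature flags is only a first approximation of what live migration actually requires:

    # Illustrative check, not a vendor tool: compare the x86 feature flags two Linux
    # hosts report before attempting live migration from the first to the second.
    # Host names are placeholders; hypervisors perform far more detailed checks.
    import subprocess

    def cpu_flags(host):
        """Return the set of CPU feature flags from a host's /proc/cpuinfo."""
        out = subprocess.check_output(
            ["ssh", host, "grep -m1 '^flags' /proc/cpuinfo"], text=True)
        return set(out.split(":", 1)[1].split())

    source_flags = cpu_flags("source-host.example.com")
    target_flags = cpu_flags("target-host.example.com")
    missing = source_flags - target_flags
    if missing:
        print("Target lacks flags the source exposes:", ", ".join(sorted(missing)))
    else:
        print("Feature flags match; migration is at least plausible.")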

We Will Overcome!
There are signs that some or all of these obstacles eventually will be overcome. The chip manufacturers want to iron out kinks in x86 instructions and make movement of virtual machines possible across different chips. It may take several years and one or two more generations of chips to get there.

Exchanges between hypervisors so far have been the province of the DMTF, the standards body formerly known as the Distributed Management Task Force. Meantime, third-party vendors such as Vizioncore, DynamicOps, and VMLogix offer management tools that cut across hypervisors and manage virtual machines interchangeably. Sameer Dholakia, CEO of VMLogix, says his company is developing tools that will manage VMs in both the private and public cloud "from the same pane of glass"--from one console--and will offer the first version by the end of this year.

Likewise, VMware says its goal with vSphere 4 ultimately is to supply tools not just for its ESX Server in the private cloud, but also for the public cloud. It plans to offer APIs that will let private cloud implementers invoke services from another, external vSphere 4 cloud. The APIs are still in beta with no announced delivery date, but VMware is working with Skytap, Engine Yard, and others to illustrate how the internal and external clouds can be federated--and coordinated.

Not to be outdone, Citrix announced XenServer Cloud Edition and Citrix Cloud Center, or C3, at about the same time VMware launched vCloud. C3 will give cloud providers the tools to manage and load-balance large numbers of virtualized servers and connect them via an enterprise bridge, Citrix Repeater, that can accelerate and optimize application traffic between a cloud and an enterprise data center. Citrix says it will establish open APIs and interfaces through which the private cloud will connect to XenServer-based external clouds. For its part, Microsoft continues to catch up in virtualization management.

Still, as Sun's Papadopoulos suggested, you're going to be limited to new workloads designed for x86 execution, as opposed to all those legacy workloads. So how does the hybrid cloud avoid turning into another one of those pipe dreams that IT chases?

Sun's Ramchandani has priced out a set of IT business expenses to illustrate his case that hybrid cloud savings are real. Ramchandani applies Amazon EC2 pricing to a business that needs a lot of bandwidth to distribute films to customers, and he shows that storing and serving all the video from Simple Storage Service, or S3, and EC2 eats up the savings over a three-year period through Amazon's bandwidth charges. Storing all the video in-house is expensive as well because of the huge amount of storage and servers needed.

But distributing the most frequently requested films, 2% of the total, from the business' data center, combined with storing less frequently sought films on Amazon's S3--the hybrid cloud--is the most effective cost combination, Ramchandani maintains.

Such a business needs lots of bandwidth to deliver that top 2% of videos in high demand, and by storing them in-house, it would pay $102,800 over the course of three years, compared with $343,000 to distribute them from Amazon S3 storage, he estimates. Most of the difference lies in S3's bandwidth charges, Ramchandani says. On the other hand, testing such a large-scale file-moving business can be done more cheaply in the cloud, instead of building out a large-scale, permanent data center. Over the same three-year period, testing would cost $1.29 million in the cloud versus $4.97 million in-house, according to Ramchandani's estimates.

To be profitable, the business needs a combination of in-house storage, at least of its most frequently accessed videos, and cloud-based testing. Such a hybrid operation would cost $1.39 million over three years, compared with $5.1 million for an in-house-only operation, and $1.6 million for a cloud-only approach. Ramchandani acknowledges that his example involves a bandwidth-consuming video business. But the same math applies to any business trying to supply large amounts of content to customers, he maintains.
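
The components of that comparison reconcile, to within rounding, as straightforward addition. The sketch below simply restates the figures quoted above:

    # Three-year cost figures quoted in Ramchandani's example, in dollars.
    distribution_in_house = 102_800    # serving the top 2% of videos from the data center
    distribution_on_s3 = 343_000       # serving those same videos from Amazon S3
    testing_in_cloud = 1_290_000       # large-scale testing on EC2
    testing_in_house = 4_970_000       # building equivalent permanent test capacity

    scenarios = {
        "hybrid": distribution_in_house + testing_in_cloud,         # ~$1.39 million
        "in-house only": distribution_in_house + testing_in_house,  # ~$5.1 million
        "cloud only": distribution_on_s3 + testing_in_cloud,        # ~$1.6 million
    }
    for label, cost in scenarios.items():
        print(f"{label:14s} ${cost:,}")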

Need For Speed
The hybrid cloud advantage also applies to workloads where a service must be provided quickly. Eidetics, a company that conducts research on the marketplace acceptance of new drugs, is one such example. After Eidetics was acquired last year by Quintiles, a company that conducts multimillion-dollar clinical case studies on drugs for big pharmaceutical firms, it was functioning as an independent unit in Boston with its own specialized, column-oriented database. And that database, called Vertica, didn't mesh with the parent company's Oracle systems.

Pieter Sheth-Voss, Eidetics' research director, considered appealing to Quintiles IT for a centrally managed version of Vertica. Then he found that Vertica is offered as a simple-to-use system on Amazon's EC2. "We're a 40-person professional services firm with no idea of how to work with a central IT staff," says Sheth-Voss, with some embarrassment. "Quintiles is extra-rigorous about how it manages data," and it would want to impose its data-handling processes on the Eidetics team.

As an alternative, Sheth-Voss tried uploading a large patient-care data set to Vertica in the Amazon cloud at 9:00 one night and had his research results by 10 p.m., he says. "We had an 8.6 million-patient data set that we tried with Oracle, and it took one-and-a-half minutes to find out what percentage of it was female," he recounts. A typical Eidetics query examines hundreds of factors per patient, and the results are likely to lead to yet another complex query. To accomplish such research quickly, Eidetics needed to move off the Quintiles in-house Oracle systems and into the cloud, and fortunately, its Vertica database is available on EC2.
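
The query itself is ordinary SQL. The hedged sketch below shows the general shape of such a question against a Vertica instance on EC2; the data source, table, and column names are invented for illustration, and an ODBC connection is only one of several ways in:

    # Hypothetical version of the aggregate Sheth-Voss describes: what share of an
    # 8.6 million-row patient table is female. The DSN, table, and column names are
    # invented for illustration; ODBC is just one way to reach Vertica.
    import pyodbc

    conn = pyodbc.connect("DSN=VerticaOnEC2")  # assumes an ODBC data source is configured
    cursor = conn.cursor()
    cursor.execute("""
        SELECT 100.0 * SUM(CASE WHEN gender = 'F' THEN 1 ELSE 0 END) / COUNT(*)
        FROM patients
    """)
    print("Percent female: %.1f%%" % cursor.fetchone()[0])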

Sheth-Voss: Doing the same work in-house would have taken many weeks

Quintiles associates who wanted to see the Eidetics research could access Amazon S3 through a browser. No complex integration problems had to be resolved. All that was required up front was 15 minutes to provision Amazon servers. The same task in-house "would have taken meetings and discussions over many weeks. A lot needs to be considered to provision a new Quintiles server," Sheth-Voss says.

In this example, the hybrid cloud spans Eidetics' in-house version of Vertica and the version running in Amazon's cloud. Much of the value of the hybrid flows from other Quintiles researchers examining the results in the public portion of the hybrid, Sheth-Voss says.

Where Are The Standards?
Hybrid clouds will become more common in enterprise computing only as standards are developed. For starters, the DMTF has established the Open Virtualization Format, or OVF, an "import" format for VMs moving on a one-way street from one hypervisor to another. The major virtualization vendors have agreed to use OVF. The DMTF recently established the Open Cloud Standards Incubator, adding vendors Savvis and Rackspace to its governing board.
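
An OVF package is essentially an XML descriptor plus disk images, which is why it works as an import format but says nothing about runtime behavior. As a rough illustration (the file name here is an assumption), a few lines of code are enough to list the sections a target hypervisor must interpret:

    # Rough look inside an OVF descriptor: list the top-level sections (disk
    # references, networks, virtual hardware) a target hypervisor must interpret
    # on import. The file name is an assumption.
    import xml.etree.ElementTree as ET

    tree = ET.parse("exported-vm.ovf")
    for child in tree.getroot():
        # Drop the XML namespace, e.g. '{...}DiskSection' becomes 'DiskSection'.
        print(child.tag.split("}")[-1])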

DMTF president Winston Bumpus, director of standards at VMware, says the incubator's chief task will be to address "manageability between enterprise data centers and public clouds, or private clouds and public clouds." The OVF is a "key building block," he says, but the Incubator also will have to come up with management interfaces and ways to define security levels common across cloud practitioners. The hybrid cloud is hampered in part by the lack of a taxonomy--terms that mean the same thing to competing vendors and their customers.

The DMTF's activity is spurred in part by an open source project called Eucalyptus, whose work, funded by a National Science Foundation grant, aims to give academic researchers access to public cloud resources. In the process, the project's developers, based in the computer science department at the University of California at Santa Barbara, created open source APIs that match the functions of Amazon's EC2, S3 storage, and Elastic Block Storage (EBS) offerings.

In April, Eucalyptus Systems--a company with $5.5 million in venture capital backing led by Benchmark Capital, which funded eBay and Red Hat--was formed to promote Eucalyptus APIs and code that supports provisioning and other back-end cloud operations. Because it's compatible with EC2, the Eucalyptus code lets enterprises develop internal clouds that interoperate with the leading public cloud services. More recently, Canonical, supplier of Ubuntu Linux, announced it's forming a services unit with Eucalyptus Systems to advise companies on how to build their internal, Amazon-like private clouds.
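
That EC2 compatibility is concrete: client code written against Amazon's API can, in principle, be pointed at a Eucalyptus front end instead. The sketch below uses the open source boto library in a pattern commonly used with Eucalyptus at the time; the endpoint, port, path, and credentials are placeholders, and exact settings varied by release:

    # Sketch of pointing EC2-style client code at a Eucalyptus front end using the
    # boto library of that era. The endpoint, port, path, and credentials are
    # placeholders; exact connection settings varied by Eucalyptus release.
    import boto
    from boto.ec2.regioninfo import RegionInfo

    region = RegionInfo(name="eucalyptus", endpoint="cloud-frontend.internal.example.com")
    conn = boto.connect_ec2(
        aws_access_key_id="YOUR_ACCESS_KEY",
        aws_secret_access_key="YOUR_SECRET_KEY",
        is_secure=False, region=region, port=8773,
        path="/services/Eucalyptus")

    # The same calls work against Amazon EC2 when connected with Amazon credentials.
    for reservation in conn.get_all_instances():
        print(reservation.instances)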

The Eucalyptus open source code is the platform on which the Ubuntu enterprise cloud will be implemented, says Rich Wolski, CTO of Eucalyptus Systems, on leave from UC-Santa Barbara's computer science department. Ubuntu cloud services will in effect encourage the use of hybrid clouds, Wolski maintains. And building on open source technology avoids the issue of vendor lock-in, said Mark Shuttleworth, CEO of Canonical, in announcing the services unit July 1.

Beyond those efforts, the feds have decided that cloud computing may be too important to be left to commercial experts. At the U.S. National Institute of Standards and Technology, Peter Mell, senior computer scientist, and Tim Grance, program manager of cyber and network security, have posted their definition of a hybrid cloud alongside definitions for public and private clouds and are proposing standards to encourage their implementation. The move puts even more pressure on proprietary cloud practices.

"Standards are critical," Grance says. "One of our important charges is to enable that portability between clouds." NIST is loath to arbitrarily set standards, he says, and their creation is "a delicate balance between prescribing something and prescribing too much too early." But in the absence of standards, Grance says, NIST is trying to draw a road map, define requirements, and create "a common vocabulary around the topic. It's easy to say, a challenge to do. There's a lot of turf at stake, a lot of interested parties."

Grance: With cloud standards, "there's a lot of turf at stake"

Still, he says, "the future is much brighter if everyone embraces a certain amount of interoperability in the form of cloud APIs and virtualization."

It will do little good to exchange the inefficiencies and high expenses of the old data center merely for a new set of proprietary practices in the cloud. Customers must have some say over how the workloads will move around. A neutral runtime format for virtual machines, which the leading commercial vendors could easily develop, would let customers migrate from one cloud to another if they found their first choice to be unsatisfactory.

Companies ready to pursue hybrid clouds will want some assurance that the savings will be real and that choices will remain for processing cloud workloads after initial commitments. That's not the case today, given VMware's dominance in virtualization software and all the vendors' reluctance to create a neutral playing field. If the leading vendors aren't willing to rapidly advance the notion of a hybrid cloud, other parties, including powerful customers adopting open source code, may blaze a trail on their own. Lots of proprietary interests could be trampled on the way.

