Data security and compliance concerns are helping to drive enterprise interest in hybrid clouds rather than all-public or all-private cloud solutions. In this hybrid model, some of the company's data resides in the private cloud, some in the public cloud, and much of it in both.
However, the hybrid cloud storage approach has problems. While it helps address compliance concerns, it doesn't erase them: compliance issues still limit what data can go to the public cloud, or live in both clouds. At the same time, copying data to the public cloud on demand takes too much time and a lot of bandwidth, so the idea of "cloud-bursting" to handle load spikes runs into the reality that copying the data needed to feed new instances in the cloud may take longer than the spike itself lasts.
Another issue is data synchronization. When changes are made, which copy is the primary, and how do you prevent public instances from "stepping on" the private data by using or creating out-of-sync information?
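One common answer to the "stepping on" problem is optimistic concurrency control: designate the private copy as primary, version every object, and reject writes based on stale reads. The sketch below is illustrative only (the `VersionedStore` class and its method names are hypothetical, not any vendor's API):

```python
import threading

class VersionedStore:
    """Toy primary-copy store: the private cloud holds the prime copy,
    and each object carries a version number. A public-cloud instance
    must present the version it read; a write based on a stale version
    is rejected rather than silently overwriting newer data."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}  # key -> (version, value)

    def read(self, key):
        # Returns (version, value); version 0 means "never written".
        return self._data.get(key, (0, None))

    def write(self, key, value, expected_version):
        with self._lock:
            current, _ = self._data.get(key, (0, None))
            if current != expected_version:
                # The caller read an out-of-date copy; make it retry.
                raise ValueError(
                    f"stale write for {key!r}: store is at v{current}, "
                    f"caller read v{expected_version}")
            self._data[key] = (current + 1, value)
            return current + 1
```

A public instance that loses the race gets an error and must re-read before retrying, which keeps the private data authoritative at the cost of retries under contention.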
A variety of products from both startups and "big iron" vendors try to address these issues. Companies like Nirvanix, StorSimple, Nasuni and TwinStrata offer caching gateways, but all are limited by WAN speed in general operation. These gateways present themselves either as a NAS filer (NFS) or as cloud storage (REST). Data is written to the local cache, then usually compressed and re-written to the cloud. The local disk or SSD acts as a read cache to speed up delivery of recently used files.
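The data path these gateways share can be sketched in a few lines. This is a minimal illustration of the write-then-compress-then-upload pattern described above, not any vendor's implementation; the `CacheGateway` class is hypothetical, and a dict stands in for the cloud object store:

```python
import zlib

class CacheGateway:
    """Minimal sketch of a caching gateway's data path: writes land in
    the local cache immediately, then are compressed and copied to cloud
    storage; reads of recent data are served from the local cache and
    only fall back to the (slow) cloud on a miss."""

    def __init__(self, cloud):
        self.cloud = cloud   # stands in for the REST object store
        self.cache = {}      # stands in for the local disk/SSD cache

    def write(self, name, data):
        self.cache[name] = data                  # fast local acknowledgement
        self.cloud[name] = zlib.compress(data)   # asynchronous in a real gateway

    def read(self, name):
        if name in self.cache:                   # cache hit: no WAN trip
            return self.cache[name]
        data = zlib.decompress(self.cloud[name]) # cache miss: fetch over the WAN
        self.cache[name] = data
        return data
```

The point of the structure is visible even in the toy: hits never touch the WAN, and everything that does cross it has been compressed first.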
Moving to a new version of the hybrid model, where data remains in the private cloud but can be accessed by new instances in the public cloud, runs directly into the low bandwidth and high latency of the typical WAN connection. Latency can reach milliseconds, compared with the microseconds expected inside the private cloud, and this means public instances will be inefficient and slow.
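The milliseconds-versus-microseconds gap matters because round-trip latency caps how many synchronous storage operations an instance can issue per second. The figures below are illustrative assumptions, not measurements:

```python
# Back-of-envelope: each synchronous I/O must wait one round trip,
# so the ops/sec ceiling is roughly 1 / round-trip-time.
lan_latency_s = 100e-6   # ~100 microseconds inside the private cloud (assumed)
wan_latency_s = 20e-3    # ~20 milliseconds across a typical WAN (assumed)

for name, rtt in [("private cloud", lan_latency_s), ("over the WAN", wan_latency_s)]:
    print(f"{name}: at most {1 / rtt:,.0f} synchronous ops/sec")
```

With these assumed numbers, the same instance drops from roughly 10,000 synchronous operations per second to about 50, which is why WAN-attached storage feels so slow to a chatty workload.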
There are several ways to improve the situation. The most useful is to reduce WAN traffic through a combination of caching and data compression. The benefits of both are use-case dependent, but compression can generally achieve around a 6x reduction in bandwidth (and in the data stored in the cloud), while caching typically cuts traffic by around 4x. The two approaches can be combined, but there are performance challenges, especially when the comparison is with SSD operation, since compression and cache lookup are both compute-intensive.
Increasing the speed of the WAN connection is another alternative. Unfortunately, in the US the telcos decided there was little demand for fiber Internet links, limiting typical WAN links to 50 megabits per second or lower, too slow to keep up with even a single hard drive today.
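The mismatch is easy to quantify. Assuming a 50 Mb/s link and roughly 150 MB/s of sustained throughput for a single modern drive (an illustrative figure, not a benchmark):

```python
# Rough comparison of a typical US WAN uplink with one hard drive's
# sustained throughput. Both figures are assumptions for illustration.
wan_bits_per_sec = 50e6           # 50 Mb/s WAN link
disk_bytes_per_sec = 150e6        # ~150 MB/s sustained read, single HDD
disk_bits_per_sec = disk_bytes_per_sec * 8

ratio = disk_bits_per_sec / wan_bits_per_sec
print(f"One drive outruns the link by {ratio:.0f}x")  # prints 24x
```

Under these assumptions, a single drive can produce data 24 times faster than the WAN can carry it, which is why bulk copies to the cloud stretch into hours or days.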
Co-located private storage addresses WAN bandwidth to an extent, with storage pools located in telco facilities and connected to public clouds by dedicated links. This still leaves a low-speed connection to the private cloud, so the fiber-to-telco issue remains a problem.
There’s a more sobering question already being asked, however. When cloud security reaches the point of convincing users that it’s robust enough, perhaps in just one or two more years, will the agility, cost-effectiveness and sheer scale of public cloud offerings make hybrid cloud impractical?
This question is coupled with the move towards SaaS application mashups running in public clouds. At some point an all-public solution becomes compelling. From a technical point of view, the best possible colocation is on-premises with the public cloud providers themselves, and this may be the bridge to the all-public solution.
This isn't happening today, but AWS and Google could extend their businesses into the private space without exposing any of their current sales, something that Microsoft and VMware can’t do as easily. In a sense, this is what the CIA deal with AWS appears to do. One can hear the CIOs all protesting it will never happen, but the co-location model has opened Pandora’s Box to the possibility.