Some people think web-scale systems are the same as hyperconverged architectures. They’re wrong. While hyperconverged systems are often called web-scale, this is simply inaccurate. We make this assumption because the web-scale pioneers, such as Google, Facebook and Amazon, are often seen as examples of businesses that deploy hyperconverged architectures.
This leads to the following misconception: If they can do it so well, then surely other organizations should take notice of what hyperconverged platforms can do and follow suit. This would be great, aside from the fact that none of their architectures mirrors a hyperconverged design; they are web-scale companies.
Hyperconvergence is a new architecture that integrates both storage and compute in x86 servers and is increasingly adopted in the enterprise. Hyperconverged solutions use replication between these server nodes to improve uptime and availability. Meanwhile, web-scale is a conceptual description of the approaches to data-center management used by Google, Facebook, and Amazon.
While both configurations serve important purposes, they should not be mistaken for accomplishing the same thing. Here’s a closer look at some of the major architectural differences between hyperconverged systems and web-scale deployments.
Hardware vs. software
Consider the layer of abstraction: hardware vs. software. Hyperconverged systems provide a reliable hardware abstraction by replicating between machines and presenting virtual disks over the Small Computer System Interface (SCSI) or Serial ATA (SATA). The reliable hardware layer sits under the application software, while storage and compute are consolidated on the same machines. The environment is designed to be free of custom hardware, with software and hardware decoupled.
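To make the idea of a reliable hardware abstraction concrete, here is a minimal Python sketch of a virtual disk that replicates every write across nodes. The `Node` and `ReplicatedDisk` classes are hypothetical names invented for illustration, and the in-memory dictionaries stand in for real disks; no vendor's product is implemented this way in detail.

```python
class Node:
    """One hypothetical server holding a local copy of the virtual disk's blocks."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}   # block_id -> bytes, stands in for a physical disk
        self.alive = True


class ReplicatedDisk:
    """Virtual disk that stores each write on every live node before acknowledging it."""

    def __init__(self, nodes):
        self.nodes = nodes

    def write(self, block_id, data):
        live = [n for n in self.nodes if n.alive]
        if not live:
            raise IOError("no live replicas")
        for node in live:
            node.blocks[block_id] = data
        return len(live)  # number of copies written

    def read(self, block_id):
        # Any live replica that holds the block can serve the read.
        for node in self.nodes:
            if node.alive and block_id in node.blocks:
                return node.blocks[block_id]
        raise IOError("block unavailable")


nodes = [Node("a"), Node("b")]
disk = ReplicatedDisk(nodes)
copies = disk.write(0, b"vm-data")
nodes[0].alive = False          # simulate losing one server
survived = disk.read(0)         # the application still sees its data
```

The application above the disk never learns that a server failed, which is the point of the abstraction. The cost is that every write waits on every replica, and that is the strong consistency requirement that becomes hard to scale.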
In contrast, web-scale deployments rely on software abstractions, but do not use reliable hardware abstractions. This is largely because hardware abstractions impose strong consistency requirements, which are extremely difficult to scale. Software abstractions can be tuned for a specific workload with relaxed consistency, and can scale while retaining performance and availability. Web-scale vendors might combine compute and storage for given services; however, they usually divide applications into microservices that are separated by the network.
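One common way to picture this kind of tuning is a quorum rule: with `n` replicas, a write waits for `w` acknowledgements and a read consults `r` replicas. The sketch below is illustrative only; the function names are made up here, and it is not a description of how Google, Facebook, or Amazon configure any particular service.

```python
def reads_see_latest_write(n, w, r):
    """True when every read quorum overlaps every write quorum (r + w > n)."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must be between 1 and n")
    return r + w > n


def tolerated_failures(n, w, r):
    """Replicas that can be down while both reads and writes still reach quorum."""
    return n - max(w, r)


# Three replicas, three different tunings of the same software abstraction.
strict = (reads_see_latest_write(3, 3, 1), tolerated_failures(3, 3, 1))
balanced = (reads_see_latest_write(3, 2, 2), tolerated_failures(3, 2, 2))
relaxed = (reads_see_latest_write(3, 1, 1), tolerated_failures(3, 1, 1))
```

The strict setting behaves like the replicated disk, guaranteeing fresh reads but tolerating no failed replica on the write path. The relaxed setting gives up that guarantee in exchange for availability and lower latency, a choice that can be made per workload.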
Commodity vs. custom hardware
The use of off-the-shelf or customized hardware is another point of differentiation. Hyperconverged infrastructure makes a virtue of commodity hardware while web-scale companies rely on a heavy amount of customization.
In many cases, hyperconverged vendors encourage customers to buy off-the-shelf, mass-produced servers from hardware integrators. This approach is suggested as a way to reduce cost; however, some appliances are actually more expensive than the individual hardware and software components combined.
Web-scale companies are the most likely to use custom-built or heavily customized appliances, which they match to very specific application needs. There is a tradeoff between cost and performance gains when using more customized hardware. This customization is only financially feasible for web-scale companies, because Google, Facebook and Amazon build applications at massive scale.
Coupling vs. decoupling of hardware and software
Web-scale companies build hardware, infrastructure, and application software designed purely for their environments. While many web-scale companies run a small number of applications, each one is very large. Tightly integrating the hardware and software can be expensive for companies without huge scale, but it helps to simplify software and hardware management.
On the other hand, hyperconverged infrastructure is designed to decouple software and hardware. Vendors decouple the infrastructure software from the hardware beneath it and the application software from the virtualization infrastructure. The major benefits of this approach are interchangeability and lower cost; however, efficiency and performance are not as high as with the customization used by web-scale companies.
Which is best for you?
Unless a company is trying to achieve the infrastructure size of Facebook or Google, web-scale most likely won’t work for its environment. If you’re not operating at the scale of these cloud companies, you likely have very different needs. Systems designed for large-scale environments won’t work in small organizations and vice versa. It's important to find an architecture that suits an organization’s specific needs. You need to be thinking about your data center in terms of what you need, not what the web-scale giants are doing.
For instance, if the business has virtualized either all of its applications or its most important ones, is it more realistic to consider a scale-out infrastructure that optimizes performance and capacity for virtual machines? How will this VM scale-out infrastructure provide access to metadata and analytics about the applications, and can it predict storage capacity and growth requirements?
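As a rough illustration of what predicting capacity and growth can mean in practice, the sketch below fits a straight line to monthly usage samples and estimates how many months remain before the storage fills. The function name, the sample figures, and the assumption of linear growth are all hypothetical; a real platform would likely use richer models and its own telemetry.

```python
def months_until_full(used_tb, capacity_tb):
    """Fit a least-squares line to monthly usage and estimate months until capacity is reached.

    Returns None when usage is flat or shrinking. Assumes linear growth.
    """
    n = len(used_tb)
    if n < 2:
        raise ValueError("need at least two monthly samples")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(used_tb) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, used_tb))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    # Month index where the fitted line hits capacity, measured from the last sample.
    return (capacity_tb - intercept) / slope - (n - 1)


# Hypothetical samples: terabytes used at the end of each of the last four months.
remaining = months_until_full([10, 12, 14, 16], capacity_tb=24)
```

With these made-up figures, usage grows by 2 TB a month, so the estimate is four months of headroom.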
While there are things we can take away from Google, Facebook or Amazon, we need to be pragmatic about which engineering choices make sense for the particular applications and environments in which we work.