Network Computing is part of the Informa Tech Division of Informa PLC


Is Capacity Planning Dead In The Cloud Era?

Most of us think of the cloud as the end of capacity planning. Why would we need capacity planning if capacity is practically infinite, provided that our applications are built correctly? As more users come in or more capacity is required, we can simply spin up more -- both bandwidth and compute. We take the same attitude toward hybrid cloud environments: at peak times, we can spill into the cloud.

Some great technologies have emerged to make this easier. Ubernetes, while still in the proposal stage today, promises to make the deployment and migration of workloads across hybrid environments, and even between cloud providers, extremely easy.

The automation provided by such systems will do wonders for our systems' availability, and in a perfect world, it will save us money because we won't have to over-provision capacity to meet peak demand. Given this capacity cornucopia and smart automation, why do we need capacity management, let alone in real time?

Capacity costs money

Spinning up capacity on demand, whether to meet peak load or to ride through downtime, is a great case where the cloud can be both a cost-saving and a revenue-generating measure.

But what happens when we use capacity that we shouldn't be using? For example, what if a new upgrade introduces inefficiencies? What if the proverbial infinite loop causes issues, or an inefficient architectural change triples IOPS, CPU usage, or both?

In these days of agile iteration, things like this happen more often than we care to admit. While automation will ensure that your application remains highly available, the downside is the bill you get at the end of the month. Instead of the customary $100,000, you might suddenly see $500,000.

In the old days, when we were confined to the available capacity of our own data centers, we had an implicit limit. That lack of flexibility prevented a myriad of problems like the ones outlined above. Things slowed down and availability suffered, but at least we could plan our expenses for the next two or three quarters with cast-iron certainty.

Real-time capacity planning

Capacity planning usually isn't considered something that happens in real time; it's meant to look well into the future. It's supposed to help us forecast capex budgets and account for long infrastructure build-out lead times. But we're not in Kansas anymore.

In this day and age, when capacity supply is practically infinite, we truly need to start worrying about demand. And demand -- driven by NFV, SDN, and a ton of automation -- can quickly spin out of control. Being able to report on and project consumption in seconds becomes one of the most important functions of an IT department as we move from a capex to an opex world.
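As a rough sketch of what "projecting consumption in seconds" could look like, the snippet below linearly extrapolates a month's spend from a few hours of recent usage samples. The sample figures and the burn-rate math are purely illustrative, not a reflection of any particular cloud provider's billing model.

```python
# Hypothetical sketch: project end-of-month spend from recent usage samples.
# The hourly dollar figures below are made-up illustrative numbers.

HOURLY_SAMPLES = [132.0, 140.5, 155.0, 171.25]  # $ spent in each of the last 4 hours
HOURS_IN_MONTH = 24 * 30

def project_monthly_spend(hourly_samples, hours_in_month=HOURS_IN_MONTH):
    """Naive linear projection: average recent burn rate times hours in a month."""
    burn_rate = sum(hourly_samples) / len(hourly_samples)  # $/hour
    return burn_rate * hours_in_month

projected = project_monthly_spend(HOURLY_SAMPLES)
print(f"Projected monthly spend: ${projected:,.2f}")
```

Even a naive projection like this, refreshed every few minutes, surfaces a runaway burn rate weeks before the invoice does; a production system would obviously weight recent samples and account for daily cycles.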

Waiting hours or days for reports – or not having capacity planning and behavior anomaly detection as a part of your infrastructure performance management system – is a recipe for disaster. At the very least, you’ll get some unpleasant surprises.

Cloud realities

The cloud is here to stay as an integral part of our go-forward infrastructure strategy. Anyone who doubts that is fooling themselves. Moving to the cloud is often thought of as a way to optimize spending and remove the inefficiencies of poorly utilized big iron through the agility of on-demand resource deployment and consumption.

However, the reality of moving to the cloud is quite different. As one study found, CIOs use only about half of the cloud capacity they've bought; the reasons why are not surprising at all. It comes down to habits, and to the ingrained way many think about the cloud as essentially the same kind of resource as the on-premises data center. What's missing is real-time visibility for capacity planning.

The aforementioned survey shows that many CIOs overpay for base load yet, for fear of overages, fail to pay for optimal performance during peak times, which misses the whole point. After all, the busy hour is called the busy hour for a reason; most infrastructures are stressed by peak demand for only an hour or two per day. During the other 22 hours, the systems hum along at base load.

Paying double for base load for 22 hours -- and failing to pay to increase the capacity during that critical peak time -- nullifies many of the benefits that drove us to the cloud in the first place, like optimizing spending and saving money.

The solution is simple: it's crucial to have clear, real-time visibility into both your current state and your predicted behavioral baselines. This will help you avoid unintentional overages due to application problems and optimize cloud resources during normal operations.
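One minimal way to act on such a behavioral baseline, assuming you already collect periodic usage metrics, is to flag readings that stray several standard deviations from recent history. The three-sigma threshold and the sample readings below are illustrative assumptions, not a prescription.

```python
import statistics

def is_anomalous(baseline, current, n_sigmas=3.0):
    """Flag a reading that deviates more than n_sigmas from the baseline mean."""
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    return abs(current - mean) > n_sigmas * stdev

# Hypothetical hourly CPU-hours consumed over the last day (fairly steady).
baseline = [100, 104, 98, 102, 99, 101, 103, 97]
print(is_anomalous(baseline, 102))  # a normal reading: False
print(is_anomalous(baseline, 180))  # e.g. an inefficient release: True
```

A check like this, run continuously against consumption metrics, is what turns the end-of-month bill shock described earlier into a same-hour alert; real products use far richer models, but the principle is the same.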

But let’s not forget that, for many of us, on-premises data centers are also a reality, so whatever we do, we should take a hybrid approach.

In the age of the cloud, capacity planning is not dead. It has only shifted from the supply side to the demand side.