One of the great promises of virtualization is the ability to reclaim the vast amounts of resources that sat forever idle on physical servers. This is an achievable goal, but unfortunately, many sysadmins are not seeing the results they expect, because they are going about provisioning the wrong way. Virtualization, and specifically hardware virtualization using a hypervisor, requires a change in how one thinks about resource allocation.
The three areas where I see the biggest mistakes are vCPU allocation, memory allocation, and templates. Let's discuss each and how it affects the proper sizing of virtual machines in a VMware vSphere environment.
I would hazard a guess that at least 80% of the VMs I have come across never use more than 10% of their provisioned vCPU capacity. Yet at the same time, I see far more VMs provisioned with two vCPUs than with one. Because the vSphere scheduler uses relaxed co-scheduling and does not force all of a VM's vCPUs to run in lockstep, this is not usually an issue. When contention arises, however, an overprovisioned vCPU can cause %ReadyTime (the time a vCPU spends waiting for a physical CPU) to spike and crush performance.
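To see whether ready time is actually a problem, you can convert the raw ready-time summation that vCenter reports into a percentage. A minimal sketch, assuming the standard 20-second real-time sample interval and normalizing per vCPU (the interval and the per-vCPU normalization are assumptions you should match to your own stats source):

```python
def cpu_ready_percent(ready_ms: float, interval_s: int = 20, num_vcpus: int = 1) -> float:
    """Convert a CPU ready summation (milliseconds) into a percentage.

    vCenter real-time stats report ready time as a summation over a
    20-second sample interval. Dividing by interval * 1000 ms gives the
    fraction of the interval spent waiting for a physical CPU; dividing
    by the vCPU count gives a per-vCPU figure.
    """
    return (ready_ms / (interval_s * 1000 * num_vcpus)) * 100

# A 2-vCPU VM reporting 2,000 ms of ready time over a 20 s interval:
print(cpu_ready_percent(2000, num_vcpus=2))  # 5.0
```

A sustained value in the 5% range per vCPU is commonly treated as a warning sign that the host is overcommitted or the VM has more vCPUs than it can get scheduled.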
Adding unnecessary vCPUs also hurts your consolidation ratio. On average, you should see four to six vCPUs per physical core. If every VM carries one more vCPU than it needs, you are effectively getting only two to three useful vCPUs per core.
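The arithmetic behind that claim is simple enough to sketch. The host sizes below are hypothetical; the point is that unneeded vCPUs consume the ratio without doing any work:

```python
def vcpus_per_core(total_vcpus: int, physical_cores: int) -> float:
    """Consolidation ratio: provisioned vCPUs per physical core."""
    return total_vcpus / physical_cores

# 40 single-vCPU VMs on a 10-core host: a healthy 4:1 ratio.
print(vcpus_per_core(40, 10))  # 4.0

# Give each VM a second, unneeded vCPU and the host hits its target
# ratio at 8:1 while doing the same useful work -- so only half the
# provisioned vCPUs per core are actually earning their keep.
print(vcpus_per_core(80, 10))  # 8.0
```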
To properly size the vCPU count for a VM, look at the performance metrics of the workload. If the application is not multi-threaded and peak CPU demand stays below 3,000 MHz, provision a single vCPU.
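That rule of thumb can be expressed as a small sizing function. This is a sketch of the heuristic described above, not a VMware tool; the 3,000 MHz core-speed default and the round-up logic for multi-threaded workloads are illustrative assumptions:

```python
import math

def recommend_vcpus(peak_demand_mhz: float, multithreaded: bool,
                    core_speed_mhz: float = 3000) -> int:
    """Rule-of-thumb vCPU sizing from observed peak CPU demand.

    A single-threaded workload cannot use a second vCPU, so one vCPU
    suffices whenever its peak demand fits in a single core. For
    multi-threaded workloads, round peak demand up to whole cores.
    """
    if not multithreaded and peak_demand_mhz <= core_speed_mhz:
        return 1
    return max(1, math.ceil(peak_demand_mhz / core_speed_mhz))

print(recommend_vcpus(2200, multithreaded=False))  # 1
print(recommend_vcpus(7000, multithreaded=True))   # 3
```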
Sizing memory is a balancing act: too much or too little can force contention. And due to the semi-persistent nature of memory, it is more complicated to size properly than CPU.
When provisioning memory, it's important to understand active versus allocated memory, the boot-time behavior of the guest OS, and paging.
Active memory is the memory the guest OS and applications are actually using. Allocated memory is the amount of physical RAM provisioned to the guest in vSphere. The goal is to keep the delta between these two numbers as small as possible during peak usage. It's better to err on the side of caution and have too much rather than too little, but try to get as close as possible.
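A minimal sketch of that comparison, using a hypothetical 25% buffer above peak active memory as the "err on the side of caution" margin (the buffer value is my assumption, not a VMware recommendation):

```python
def memory_headroom(allocated_mb: int, peak_active_mb: int,
                    buffer_pct: float = 0.25) -> dict:
    """Compare configured memory to peak active memory.

    Suggests a size a modest buffer above peak active use, so the
    delta between allocated and active stays small without risking
    guest-level paging.
    """
    suggested = int(peak_active_mb * (1 + buffer_pct))
    return {
        "delta_mb": allocated_mb - peak_active_mb,
        "suggested_mb": suggested,
        "overprovisioned": allocated_mb > suggested,
    }

# A VM configured with 8 GB whose peak active memory is 3 GB:
print(memory_headroom(8192, 3072))
# {'delta_mb': 5120, 'suggested_mb': 3840, 'overprovisioned': True}
```

In this example the VM could shed roughly 4 GB and still retain comfortable headroom at peak.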
When a Windows server first boots, it touches all addressable memory, which "claims" that memory from vSphere if it is available. If it is not available, the request can trigger memory reclamation and force contention. This is why it's important to provision only as much memory as is actually needed. Linux does not have this issue; it addresses memory only as needed.
Accurate memory provisioning is also important to avoid hypervisor paging. Because the hypervisor is unaware of the applications running inside each workload, it picks pages essentially at random to page out to disk when under pressure. If those pages hold active memory, application performance can be devastated.
One other important point is that the less memory a virtual machine has assigned, the faster it will complete a vMotion event.
A common mistake I see is that VMs deployed from a template are never customized. When cloning from a template, take the time to adjust the provisioned resources to what the VM will actually use; otherwise, resources are wasted. This extra effort goes a long way toward extending the life of your current cluster.
The most important thing that you as a vSphere admin can do is understand the actual needs of the workloads you are hosting. When application owners ask for new VMs, make them prove they need the resources they are requesting (if possible). Just because virtual resources can be over-subscribed does not mean the pool is endless. By properly provisioning VMs, you will have better performance, happier users, and fewer support calls at 2:00 a.m. No one likes those calls.