The Do's And Don'ts Of Virtualizing Database Servers
Virtualization brings numerous benefits to traditional x86 and x64 server environments, but everybody knows that virtualization and heavily utilized databases don't tango; at least that's the consensus. By following some simple best practices and taking advantage of smart features in vSphere, you can bring the flexibility, load balancing and high availability of virtualization to your database servers.
January 21, 2010
Before we talk about getting everything virtualized, let's talk about the objections of the average educated database administrator. Your DBA knows his databases, their user counts, and the type of I/O they typically perform. He knows that storage I/O characteristics rule the day, and if he knows anything about virtualization, he knows this little thing called the hypervisor interferes with the ability of his database to make writes to storage. With management buying in to a serious virtualization initiative, he feels cornered and he's ready to fight. In order to disarm his objections, we need to be armed with facts.
The first order of business for any database rollout is the underlying configuration of storage. Databases make many small frequent reads and writes, and the capability of storage to keep up with I/O governs the performance of your databases. To this end, there are a few simple practices that keep everything running smoothly.
First, know your storage. The type of storage array required to handle your I/O is going to be similar whether or not your environment is virtualized. On Storage Area Networks, satisfying your I/O requirements is a matter of provisioning your LUNs (logical unit numbers) with the correct RAID levels, but the same wisdom applies to local storage configurations. RAID 5 is not a good choice for databases, especially for log files, because the parity calculations involved in striping the data across the disks dramatically slow down writes. RAID 1 is a better performer in this arena, but it's still not great. You can get away with proprietary RAID 6, but make sure that your storage controller offloads the parity calculations to a separate storage processor. Among traditional RAID types, RAID 10 is hands down the best choice. RAID 10 has excellent read and write times, and because it mirrors and stripes without parity, it has no parity calculations to slow it down. There are a number of proprietary RAID variations that are good, and there are other nested RAID configurations like RAID 15, RAID 51, RAID 05 and RAID 50. All of these high-end nested RAID configurations have good read and write performance and are suitable for databases, but are commensurately expensive.
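To put rough numbers on the RAID comparison, here is a minimal sketch of the commonly cited "write penalty" rule of thumb, in which each front-end write costs a fixed number of back-end disk I/Os depending on the RAID level. The penalty figures, the function name `backend_iops` and the sample workload are assumptions for illustration only; they are not measurements from any particular array, and controller caches can change the picture considerably.

```python
# Commonly cited back-end I/Os generated per front-end write, by RAID level.
# Rule-of-thumb figures (assumed for illustration), not vendor measurements.
WRITE_PENALTY = {
    "RAID 0": 1,
    "RAID 1": 2,
    "RAID 10": 2,
    "RAID 5": 4,
    "RAID 6": 6,
}


def backend_iops(total_iops: float, write_fraction: float, raid_level: str) -> float:
    """Estimate the disk-level IOPS an array must deliver for a workload.

    total_iops     -- front-end IOPS the database issues
    write_fraction -- share of those IOPS that are writes (0.0 to 1.0)
    raid_level     -- a key of WRITE_PENALTY, e.g. "RAID 10"
    """
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    reads = total_iops * (1.0 - write_fraction)
    writes = total_iops * write_fraction
    # Reads cost one back-end I/O each; writes are multiplied by the penalty.
    return reads + writes * WRITE_PENALTY[raid_level]


if __name__ == "__main__":
    # Hypothetical workload: 2,000 IOPS, 40% of which are writes.
    for level in ("RAID 10", "RAID 5", "RAID 6"):
        print(f"{level}: {backend_iops(2000, 0.4, level):.0f} back-end IOPS")
```

For that hypothetical workload the sketch estimates about 2,800 back-end IOPS on RAID 10 against about 4,400 on RAID 5, which is the gap the advice above is trying to avoid.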
Databases perform two main types of storage operations: they read data from a transactional database file, and they write data to a log. Because these operations are separate in nature, it is always advantageous to split them across separate arrays. The same caveats about RAID types apply when splitting the reads and writes across separate arrays, and you should take care to match your I/O to the appropriate RAID type for both the data and log LUNs or volumes.
Given that a typical virtualization host has abundant CPU and memory resources available, let's talk about minimizing hypervisor involvement in disk writes and making sure that your database has enough resources to handle whatever we're going to throw at it. With the recommendations below in place, you can relax about database issues on virtual platforms.
First, we need to minimize the involvement of the hypervisor in making writes to storage. There are two ways to accomplish this: we can either allow the virtual machine exclusive access to the RAID datastore, or we can give the database virtual machine a Raw Device Mapping (RDM) to the storage. Giving the VM exclusive access to the datastore is just what it sounds like: we create a datastore on top of the physical array or LUN, and we provision all the available space into a single virtual disk. This minimizes hypervisor involvement in storage I/O because no other virtual machines are using the array. The second option is to grant the virtual machine a Raw Device Mapping. Raw Device Mappings in vSphere allow a particular virtual machine unrestricted and exclusive access to an underlying storage medium. In short, the RDM removes nearly all interference by the hypervisor in storage utilization. For moderately utilized databases, the first method is probably sufficient, but if your database is a monster, then RDMs are your only option. RDMs require additional configuration and zoning or masking on the storage side, but VMware makes it easy to allocate them to a virtual machine.
Finally, we need to make sure your database server has the CPU and memory resources necessary to do its job. In VMware, we can do this with simple CPU and memory reservations. Windows Perfmon and VMware Capacity Planner are both great for getting a starting point for what's required on a physical database server, and the next step is just to set reservations for memory and CPU commensurate with your sample data. If you're really dead set on giving exclusive CPU access to your database server, set affinity for the server to a particular core or cores, and set the affinity of every other virtual machine on the host so that it excludes that core or cores; this prevents any other machines from using that CPU and guarantees exclusive access.
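One way to turn sample data into reservation values is to take a high percentile of the observed usage rather than the peak or the average. The sketch below assumes you have already exported CPU usage in MHz and memory usage in MB from your monitoring tool into plain lists; the function names, the 95th-percentile default and the sample figures are illustrative assumptions, not a VMware recommendation.

```python
import math


def percentile(samples, pct):
    """Return the nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Nearest-rank method: the smallest value with at least pct% of the
    # samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def suggest_reservation(cpu_mhz_samples, mem_mb_samples, pct=95):
    """Suggest CPU (MHz) and memory (MB) reservations from usage samples."""
    return {
        "cpu_mhz": percentile(cpu_mhz_samples, pct),
        "memory_mb": percentile(mem_mb_samples, pct),
    }


if __name__ == "__main__":
    # Hypothetical samples collected from the physical database server.
    cpu = [1200, 1500, 1800, 2400, 1600, 1700, 1300, 2100, 1900, 2000]
    mem = [6000, 6500, 7000, 7200, 6800, 6900, 7100, 6400, 6600, 7050]
    print(suggest_reservation(cpu, mem))
```

Treat the result as a starting point: a reservation set too high takes capacity away from other virtual machines on the host, while one set too low leaves the database competing for resources at its busiest moments.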