The Do's And Don'ts Of Virtualizing Database Servers
Jasmine McTigue and Jake McTigue
January 21, 2010
Virtualization conveys numerous benefits to traditional x86/64 bit server environments, but everybody knows that virtualization and heavily utilized databases don't tango; at least that's the consensus. By utilizing some simple best practices and taking advantage of smart features in vSphere, you can harness the flexibility, load balancing and high availability features of virtualized database servers.
Before we talk about getting everything virtualized, let's talk about the objections of the average educated database administrator. Your DBA knows his databases, their user counts, and the type of I/O they typically perform. He knows that storage I/O characteristics rule the day, and if he knows anything about virtualization, he knows this little thing called the hypervisor interferes with the ability of his database to make writes to storage. With management buying in to a serious virtualization initiative, he feels cornered and he's ready to fight. In order to disarm his objections, we need to be armed with facts.
The first order of business for any database rollout is the underlying configuration of storage. Databases make many small frequent reads and writes, and the capability of storage to keep up with I/O governs the performance of your databases. To this end, there are a few simple practices that keep everything running smoothly.
Firstly, know your storage. The type of storage array required to handle your I/O is going to be similar regardless of whether your environment is virtualized or not. On Storage Area Networks, satisfying your I/O requirements is a matter of provisioning your LUNs (logical unit number) with the correct raid levels, but the same wisdom applies to local storage configurations. Raid 5 is not a good choice for databases, especially for log files, as the parity calculations involved in spanning the data across the disks dramatically slows down write times. Raid 1 is a better performer in this arena, but it's still not great. You can get away with proprietary Raid 6, but make sure that your storage controller offloads the parity calculations to a separate storage processor. Among traditional raid types, Raid 10 is hands down the best choice. Raid 10 has excellent read and write times, and it doesn't slow appreciably for storage calculations. There are a number of proprietary raid variations that are good, and there are other nested raid configurations like Raid 15, Raid 51, Raid 05 and Raid 50. All of these high-end nested Raid configurations have good read and write performance and are suitable for database, but are commensurately expensive.
Databases perform two types of operations: they read data from a transactional database file, and they write data to a log. Because of the separate nature of these operations, it is always advantageous to split these operations across separate arrays. The same caveats about Raid types apply when splitting the reads and writes across separate arrays, and you should take care to match your I/O to the appropriate raid type for both the Data and Log LUNS or volumes.