Exchange 2010 Changes Storage Demands
Exchange has long been a prima donna when it comes to storage. Large monolithic databases requiring high IOPS, and clusters that required shared storage, made Exchange the first Windows app on most SANs. With Exchange 2010, Microsoft has traded space for IOPS, eliminating single instance storage and replacing shared disk with log shipping for high availability. Time to rethink storage for Exchange.
October 26, 2009
Back in the dark ages of the 1990s, when we switched from file-based email systems like cc:Mail and MS Mail to the database-structured Exchange, one of the big selling points was that emails sent to fifty users wouldn't clutter up our email servers with fifty copies of the attached Dilbert cartoon. Instead, Exchange would store a single copy of the message regardless of how many users it was sent to.
With the upcoming Exchange 2010 upgrade, Microsoft is abandoning single instance storage. They're also killing off the unloved shared disk clustering (single copy cluster, or SCC, in Redmondese), pushing instead the cluster technologies first introduced in Exchange 2007. It's therefore time to rethink how we provision storage for Exchange.
While single instance storage always sounded like a good idea, it's never really delivered what admins expected. Exchange stored a single instance of a given email, but if that message was forwarded, or if a user attached the same file to another email, Exchange started storing duplicate data anyway.
Then we got Exchange 2000, which allowed us to split the single information store into several smaller databases, simplifying backups and restores and reducing online database fragmentation. Since Exchange only did single instancing within each information store, duplicate data started appearing in even more places.
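To make the per-store limit concrete, here is a toy model in Python. It is only a sketch: the `InformationStore` class, the `message_id` key and the round-robin placement of mailboxes are assumptions made for illustration, not a description of how the Exchange database engine is actually built.

```python
class InformationStore:
    """Toy stand-in for one Exchange database (hypothetical, for illustration)."""

    def __init__(self):
        self.items = {}  # message_id -> message body, stored once per store
        self.refs = {}   # user -> list of message_ids in that mailbox

    def deliver(self, message_id, body, recipients):
        # Keep one copy of the body per store; each mailbox only gets a pointer.
        self.items.setdefault(message_id, body)
        for user in recipients:
            self.refs.setdefault(user, []).append(message_id)

    def stored_bytes(self):
        return sum(len(body) for body in self.items.values())


def stored_after_delivery(users, body, store_count):
    """Spread mailboxes round-robin over store_count stores, deliver one message."""
    stores = [InformationStore() for _ in range(store_count)]
    for index, store in enumerate(stores):
        local_recipients = users[index::store_count]
        if local_recipients:
            store.deliver("msg-1", body, local_recipients)
    return sum(store.stored_bytes() for store in stores)


users = [f"user{i}" for i in range(50)]
cartoon = b"x" * 100_000  # stand-in for the attached Dilbert cartoon

one_store = stored_after_delivery(users, cartoon, 1)    # one copy in total
five_stores = stored_after_delivery(users, cartoon, 5)  # one copy per store
```

In this model the same fifty recipients cost one copy of the attachment with a single store and five copies once their mailboxes are spread over five stores. A forwarded copy would arrive under a new `message_id` and be stored again, which is the other leak described above.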
Finally, the move mailbox wizard isn't smart enough to identify messages that already exist in the target information store, so every item moved looks like a new message, breaking the single instance paradigm. Since upgrading from Exchange 2000 or 2003 to 2007 or 2010 requires that each mailbox be moved, any messages sent before your last Exchange server upgrade aren't single instanced today.
Given all this, and the fact that the cost of storage capacity has been falling fast while the cost of IOPS has stayed pretty constant, the folks at Microsoft decided that reducing the number of IOPS an Exchange server uses to store an item was more important than the space savings from single instance storage. Where a thousand heavy users on an Exchange 2003 server may have needed six or eight 15K RPM spindles to provide enough IOPS, the same thousand heavy users can happily run on six 7200 RPM drives with Exchange 2010.
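The spindle arithmetic behind that kind of comparison can be sketched in a few lines of Python. Every number below is a placeholder assumption chosen only to make the calculation concrete: the per-user IOPS profiles and per-drive IOPS ratings are not measured values or Microsoft sizing guidance, and the sketch ignores RAID write penalties and capacity requirements.

```python
import math


def spindles_needed(users, iops_per_user, iops_per_drive):
    """Minimum number of drives whose combined IOPS cover the user load."""
    return math.ceil(users * iops_per_user / iops_per_drive)


# Assumed, illustrative figures (not from any sizing guide):
USERS = 1000
IOPS_PER_USER_2003 = 1.0   # hypothetical heavy-user profile on Exchange 2003
IOPS_PER_USER_2010 = 0.4   # hypothetical heavy-user profile on Exchange 2010
IOPS_15K_DRIVE = 175       # hypothetical rating for a 15K RPM drive
IOPS_7200_DRIVE = 75       # hypothetical rating for a 7200 RPM drive

drives_2003 = spindles_needed(USERS, IOPS_PER_USER_2003, IOPS_15K_DRIVE)
drives_2010 = spindles_needed(USERS, IOPS_PER_USER_2010, IOPS_7200_DRIVE)
print(f"Exchange 2003 on 15K RPM drives:  {drives_2003}")
print(f"Exchange 2010 on 7200 RPM drives: {drives_2010}")
```

The point of the exercise is the shape of the trade: once the per-user IOPS figure drops far enough, a similar number of much slower, much larger and much cheaper drives can carry the same mailboxes. Swap in your own measured profile before drawing any conclusions.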
With the new Database Availability Groups, admins can create up to sixteen copies of each Exchange database. In the event of failure, another member of the cluster can start servicing the users in seconds, since it already has a copy of the database. Servers ship log files to one another to keep the various database copies updated, but still use Microsoft server clustering to handle failover.
Microsoft's pitching the combination of clustering and Database Availability Groups as not needing expensive SAN storage since Exchange no longer needs high performance or shared access to the LUN. Many organizations will continue to use shared storage for Exchange to take advantage of snapshots or to maintain a single disaster recovery replication scheme, but Microsoft will use lower cost DAS in their total cost of ownership calculations for Exchange.
Another interesting option is using storage devices that replace single instance storage with data deduplication. NetApp claims Exchange users can see 30% storage reductions with their basic deduplication. I'm hoping to get a GreenBytes array or Exar/HiFn BitWackr into the lab and see just how well Exchange 2010 and data dedupe get along.