All Snapshots Are Not Created Equal


Howard Marks

September 30, 2010


Over the past decade or so, snapshots have become a standard feature of disk arrays, volume managers, file systems and even PCI RAID controllers. The pitch from vendors of these products is pretty much the same: "With our technology you can take a snapshot in just a second or so and it will hold only the changed blocks, taking up much less disk space than a full copy." While that statement may typically be true, there are big differences in snapshot implementations.

When you are considering snapshot technology, ask these questions of your vendor:
1. What technology do you use? Copy-on-write, redirect-on-write, black magic?
2. How will my performance be affected by having 5 snapshots of a volume? 20?
3. Do I have to dedicate space to snapshots ahead of time?
4. How much space does the first snapshot take?
5. What is the block granularity of your snapshot technology?

The biggest difference to consider is the underlying technology. With the most common snapshot technique, copy-on-write, an application write causes the snapshot provider to copy the contents of the block being overwritten to a new location in the snapshot area before the new data lands. Copy-on-write therefore requires three I/Os for each write: one to read the current contents of the block, one to write the old data to the snapshot area, and one to write the new data. Redirect-on-write snaps, found in NetApp's Write Anywhere File Layout (WAFL) and ZFS, among others, instead write the new data to free space on the disk and update the file system, or volume, metadata to include the new block in the current data set and the old one in the snapshot.

Where copy-on-write requires three I/Os per write, redirect-on-write, like a system without snaps, still performs only one. Note that both techniques require metadata updates, but those aren't significant to system performance because they're almost always cached. Copy-on-write snaps can usually be sent to a different RAID set or disk tier, while redirect-on-write snaps usually have to live in the same tier as the running data.
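To make that I/O arithmetic concrete, here is a toy Python model of both techniques that counts data I/Os per logical write. All the class and field names are mine, and real array firmware is far more involved than this sketch:

    # Toy model contrasting copy-on-write and redirect-on-write snapshots.
    class CopyOnWriteVolume:
        def __init__(self, blocks):
            self.blocks = dict(blocks)     # live volume: block number -> contents
            self.snapshot = {}             # preserved pre-snapshot blocks
            self.io_count = 0              # data I/Os only; metadata assumed cached

        def write(self, block_no, data):
            if block_no not in self.snapshot:           # first overwrite since the snap
                self.io_count += 1                      # I/O 1: read current contents
                self.snapshot[block_no] = self.blocks[block_no]
                self.io_count += 1                      # I/O 2: write old data to snap area
            self.blocks[block_no] = data
            self.io_count += 1                          # I/O 3: write new data in place

    class RedirectOnWriteVolume:
        def __init__(self, blocks):
            self.store = dict(blocks)                   # physical block store
            self.live_map = {n: n for n in blocks}      # logical -> physical, live volume
            self.snap_map = dict(self.live_map)         # snapshot keeps the old mapping
            self.next_free = max(blocks) + 1
            self.io_count = 0

        def write(self, block_no, data):
            self.store[self.next_free] = data           # the only data I/O: write to free space
            self.io_count += 1
            self.live_map[block_no] = self.next_free    # metadata update, almost always cached
            self.next_free += 1

    cow = CopyOnWriteVolume({0: "a", 1: "b"})
    row = RedirectOnWriteVolume({0: "a", 1: "b"})
    cow.write(0, "A")
    row.write(0, "A")
    print(cow.io_count, row.io_count)                   # prints: 3 1

The same logical write costs three data I/Os on the copy-on-write volume and one on the redirect-on-write volume, which is the whole performance argument in miniature.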

Then there are VMware's log-file snapshots. These store block changes in a log file, freezing the original data in the virtual machine disk (VMDK). While this technique creates snapshots quickly and can be space-efficient, it means that every disk I/O must check each snapshot in turn to see if it holds the latest version of the disk block. VMware snaps can significantly slow down your system if you keep them around too long or create more than two or three snapshots of a VM; the sketch after the next paragraph shows why longer chains mean more lookups.

Some days it seems that array vendors see their snapshot facilities as much a way to sell more capacity as a way to help their customers protect their data. The most egregious example of this was a major vendor's "unified" storage system that a client of mine was thinking about buying a few years ago. On that system the first snapshot of an iSCSI LUN was actually a full copy of the LUN's data, so storing 1TB of data and a reasonable number of snaps would take 2.5-3TB of disk. A more efficient system could need just 1.5TB.
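Here is a minimal sketch of that chained read path in Python. The chain-walk logic is my own simplification of log-style snapshots in general, not VMware's actual VMDK delta format:

    # Simplified read path through a chain of log-style snapshot deltas.
    def read_block(block_no, deltas, base):
        """deltas: snapshot logs, newest first; base: the frozen original disk."""
        for delta in deltas:                     # every read probes each snapshot in turn
            if block_no in delta:
                return delta[block_no]
        return base[block_no]                    # fall through to the base disk

    base = {0: "a", 1: "b", 2: "c"}
    snap1 = {1: "B"}                             # blocks changed while snapshot 1 was newest
    snap2 = {2: "C"}                             # blocks changed while snapshot 2 was newest
    print(read_block(2, [snap2, snap1], base))   # hit in the newest delta: 1 lookup
    print(read_block(0, [snap2, snap1], base))   # misses both deltas: 3 lookups

A read that hits the newest delta resolves in one lookup, but a block untouched since the base was frozen costs one lookup per snapshot plus the base read, which is why long chains hurt.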

Requiring that snapshot space be allocated, or even reserved, on a per-volume basis is another pet peeve of mine. Like fat provisioning, snapshot reservations lead to inefficient disk usage by dedicating disk space to functions based on anticipated peak usage, not real demand. I want snapshot providers that take space from the free pool but let me set a limit on how much space snaps can consume.
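As a rough sketch of the policy I'm asking for, with entirely hypothetical names: snapshot blocks come out of the shared free pool, bounded by an admin-set cap, rather than a carved-out per-volume reserve:

    # Hypothetical allocator: snaps draw from the shared free pool up to a cap.
    class Pool:
        def __init__(self, total_blocks, snap_limit):
            self.free = total_blocks
            self.snap_used = 0
            self.snap_limit = snap_limit         # admin-set ceiling on snapshot usage

        def alloc_data(self, blocks):
            if blocks > self.free:
                raise RuntimeError("pool exhausted")
            self.free -= blocks

        def alloc_snap(self, blocks):
            if self.snap_used + blocks > self.snap_limit:
                raise RuntimeError("snap quota reached")   # or expire the oldest snapshot
            self.alloc_data(blocks)              # same pool as live data, no fixed reserve
            self.snap_used += blocks

    pool = Pool(total_blocks=1000, snap_limit=200)
    pool.alloc_data(500)                         # production data grows on real demand
    pool.alloc_snap(150)                         # snaps grow on demand too, up to the cap

Nothing is set aside until it's actually needed, yet snapshots can never crowd out production data past the limit I chose.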

Last, but certainly not least: how big a block does the snapshot technology use? Storage guys hear "block" and think 512 bytes, but no storage system that I'm aware of uses chunks that small to allocate space or manage snapshots. NetApp's WAFL manages data in 4K "paragraphs," while I've seen other systems use page- or even chapter-sized chunks of 32K to 512K bytes. Host a database with 4K or 8K pages on a system with 512K chapters, and your snapshots will take up several times the amount of space they would on one using 64K pages.
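The back-of-the-envelope arithmetic, assuming each scattered changed page preserves at least one full snapshot chunk:

    page = 8 * 1024                              # an 8K database page
    for chunk in (4 * 1024, 64 * 1024, 512 * 1024):
        amplification = max(chunk, page) / page  # one dirty page preserves a whole chunk
        print(f"{chunk // 1024:>4}K chunks: {amplification:5.1f}x the changed data")

That prints 1.0x, 8.0x and 64.0x: the 512K-chunk system burns eight times the snapshot space of the 64K one for the same random database workload.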

Allocation size becomes an even bigger issue if you're using a system that bases its replication on snapshots, as many lower-end systems from Dell/EqualLogic to Overland do. It's a rude awakening to discover that you need a bigger WAN link to your DR site because 4K updates generate 256K replications.
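The same arithmetic applied to a snapshot-based replica, with made-up workload numbers: scattered 4K updates each dirty one chunk, and the whole chunk crosses the WAN:

    updates = 100_000                            # scattered, non-adjacent 4K writes per day
    for chunk in (4 * 1024, 64 * 1024, 256 * 1024):
        shipped = updates * chunk                # worst case: each update dirties one chunk
        print(f"{chunk // 1024:>4}K granularity: {shipped / 2**30:6.2f} GiB over the WAN")

At 256K granularity a day of changes that amounts to roughly 0.38 GiB of actual data ships about 24 GiB to the DR site, 64 times the changed bytes.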

If you're just using snapshots to store a consistent image of your data while a backup job completes, you don't have to worry too much about how your storage system creates and manages snapshots. If you want to be able to spawn test copies of your production servers, replicate your snapshots or keep multiple snapshots for days as low-RPO (recovery point objective) restore points, the difference between ordinary snapshots and great ones will be significant.

About the Author(s)

Howard Marks

Network Computing Blogger

Howard Marks is founder and chief scientist at Deepstorage LLC, a storage consultancy and independent test lab based in Santa Fe, N.M. and concentrating on storage and data center networking. In more than 25 years of consulting, Marks has designed and implemented storage systems, networks, management systems and Internet strategies at organizations including American Express, J.P. Morgan, Borden Foods, U.S. Tobacco, BBDO Worldwide, Foxwoods Resort Casino and the State University of New York at Purchase. The testing at DeepStorage Labs is informed by that real-world experience.

He has been a frequent contributor to Network Computing and InformationWeek since 1999 and a speaker at industry conferences including Comnet, PC Expo, Interop and Microsoft's TechEd since 1990. He is the author of Networking Windows and co-author of Windows NT Unleashed (Sams).

He is co-host, with Ray Lucchesi, of the monthly Greybeards on Storage podcast, where the voices of experience discuss the latest issues in the storage world with industry leaders. You can find the podcast at: http://www.deepstorage.net/NEW/GBoS
