• 09/30/2010
    9:00 AM

All Snapshots Are Not Created Equal

Over the past decade or so, snapshots have become a standard feature of disk arrays, volume managers, file systems and even PCI RAID controllers. The pitch from vendors of these products is pretty much the same: "With our technology you can take a snapshot in just a second or so and it will hold only the changed blocks, taking up much less disk space than a full copy." While that statement may typically be true, there are big differences in snapshot implementations.

Some days it seems that array vendors look at their snapshot facilities as much as a way to sell more capacity as a way to help their customers protect their data. The most egregious example of this was a major vendor's "unified" storage system that a client of mine was thinking about buying a few years ago. On this system, the first snapshot of an iSCSI LUN was actually a full copy of the LUN's data, so storing 1TB of data and a reasonable number of snaps would take 2.5-3TB of disk. A more efficient system could need just 1.5TB.
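To make that arithmetic concrete, here is a rough sketch in Python. The function name and the 0.5TB of retained changed blocks are my own assumptions for illustration (the 0.5TB figure is simply what makes the efficient case come out to 1.5TB); they are not any vendor's published numbers.

```python
def capacity_tb(data_tb, changed_tb, first_snap_is_full_copy):
    """Total disk (TB) for a volume plus its snapshots.

    changed_tb is the space held by changed blocks across all retained
    snapshots (an assumed input, not a measured value).
    """
    full_copy_tb = data_tb if first_snap_is_full_copy else 0.0
    return data_tb + full_copy_tb + changed_tb


# 1TB of data, with an assumed 0.5TB of changed blocks kept in snapshots
wasteful = capacity_tb(1.0, 0.5, first_snap_is_full_copy=True)
efficient = capacity_tb(1.0, 0.5, first_snap_is_full_copy=False)
print(f"full-copy first snapshot: {wasteful} TB")
print(f"changed blocks only:      {efficient} TB")
```

The entire gap between the two results is the full copy, which is capacity you pay for without getting any additional restore points.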

Requiring that snap space be allocated, or even reserved, on a per-volume basis is another pet peeve of mine. Like fat provisioning, snapshot reservations lead to inefficient disk usage by dedicating disk space to functions based on anticipated peak usage, not real demand. I want snapshot providers that take space from the free pool, but let me set a limit on how much space will be taken by snaps.
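The behavior I'm asking for can be sketched as a toy model. The `Pool` class, its method names and the capacity numbers below are all hypothetical, made up for illustration; no real array exposes this interface.

```python
class Pool:
    """Toy model: snapshots draw from the shared free pool, up to a cap."""

    def __init__(self, capacity_gb, snap_limit_gb):
        self.capacity_gb = capacity_gb
        self.snap_limit_gb = snap_limit_gb  # a ceiling, not a reservation
        self.data_gb = 0
        self.snap_gb = 0

    @property
    def free_gb(self):
        # Only space actually consumed is subtracted from the pool
        return self.capacity_gb - self.data_gb - self.snap_gb

    def write_data(self, gb):
        if gb > self.free_gb:
            raise RuntimeError("pool full")
        self.data_gb += gb

    def grow_snapshots(self, gb):
        if self.snap_gb + gb > self.snap_limit_gb:
            raise RuntimeError("snapshot limit reached")
        if gb > self.free_gb:
            raise RuntimeError("pool full")
        self.snap_gb += gb


pool = Pool(capacity_gb=1000, snap_limit_gb=300)
pool.write_data(600)
pool.grow_snapshots(50)
print(pool.free_gb)  # space still available to data or snapshots
```

With a fixed 300GB reservation instead of a limit, only 100GB would remain for new data in this scenario, even though the snapshots are actually using just 50GB.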

Last, but certainly not least, is the question of how big a block the snapshot technology uses. Storage guys hear "block" and think 512 bytes, but no storage system that I'm aware of uses chunks that small to allocate space or manage snapshots. NetApp's WAFL manages data in 4K paragraphs, while I've seen other systems use page or even chapter size chunks of 32K to 512K bytes. Host a database with 4K or 8K pages on a system with 512K chapters, and your snapshots will take up several times the amount of space they would on one using 64K pages.
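A quick way to see the effect is to count how many chunks a set of small writes dirties. This is a simplified sketch: the function name and the workload (1,000 scattered 8K page updates) are assumptions of mine, and it models only the space a snapshot must preserve, not any particular vendor's implementation.

```python
KB = 1024


def snapshot_bytes(write_offsets, write_size, chunk_size):
    """Space a snapshot must preserve: each chunk touched by a write, counted once."""
    dirty_chunks = set()
    for offset in write_offsets:
        first = offset // chunk_size
        last = (offset + write_size - 1) // chunk_size
        dirty_chunks.update(range(first, last + 1))
    return len(dirty_chunks) * chunk_size


# Assumed workload: 1,000 database page updates of 8K each, spaced 1MB apart
# so that no two updates share a chunk (the worst case for large chunks)
offsets = [i * 1024 * KB for i in range(1000)]

for chunk in (4 * KB, 64 * KB, 512 * KB):
    used = snapshot_bytes(offsets, 8 * KB, chunk)
    print(f"{chunk // KB:>3}K chunks: {used // KB:,}K of snapshot space")
```

This is the worst case, where every update lands in a different chunk. Updates that cluster together share chunks, so a real workload will fall somewhere between this and the raw amount of changed data.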

Allocation size becomes an even bigger issue if you're using a system that bases its replication on snapshots, as many lower-end systems from Dell/EqualLogic to Overland do. It's a rude awakening to discover that you need a bigger WAN link to your DR site because 4K updates generate 256K replications.
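For a back-of-the-envelope feel for what that does to a WAN link, consider the sketch below. The update rate of 100 per second is a number I picked for illustration, and the calculation assumes the worst case where each 4K update lands in a different 256K chunk; protocol overhead, compression and deduplication are ignored.

```python
KB = 1024


def wan_mbps(updates_per_sec, bytes_per_update):
    """Sustained link speed (megabits/sec) needed to ship each update as it occurs."""
    return updates_per_sec * bytes_per_update * 8 / 1_000_000


update_size = 4 * KB       # what the application actually changed
replica_size = 256 * KB    # what a snapshot-based replicator sends
amplification = replica_size // update_size

rate = 100  # assumed updates per second, purely illustrative
print(f"amplification: {amplification}x")
print(f"link needed at 4K granularity:   {wan_mbps(rate, update_size):.1f} Mbps")
print(f"link needed at 256K granularity: {wan_mbps(rate, replica_size):.1f} Mbps")
```

Systems that replicate only at scheduled intervals can coalesce repeated updates to the same chunk, so the real-world penalty depends heavily on how scattered your writes are.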

If you're just using snapshots to store a consistent image of your data while a backup job completes, you don't have to worry too much about how your storage system creates and manages snapshots. If you want to be able to spawn test copies of your production servers, replicate your snapshots or keep multiple snapshots for days as low-RPO (recovery point objective) restore points, the difference between ordinary snapshots and great ones will be significant.
