Block-Level Incremental Backup

BLI sacrifices some initial backup storage efficiencies to deliver a backup process that has minimal impact on the client

November 5, 2008


10:30 AM -- While Block-Level Incremental backup (BLI) is not exactly de-duplication, it is often compared to source-side de-duplication, and it does reduce the amount of data that goes across the network.

BLI acceptance often comes down to one key drawback that users must decide whether to accept: the initial backup data set consumes as much capacity as the data you are backing up. That is essentially a 1:1 data cost on the first backup, whereas source-side de-dupe may, depending on the data set, reduce the initial backup set by 3X or more.
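To put rough numbers on that trade-off, here is a minimal sketch of the first-backup arithmetic. The 1 TB volume size and the function name are assumptions made for illustration; the 3X figure is simply the example ratio mentioned above, not a measured result.

```python
def initial_backup_size(source_gb: float, reduction_ratio: float = 1.0) -> float:
    """Capacity consumed by the first backup for a given reduction ratio.

    A ratio of 1.0 means no reduction (the 1:1 cost of a BLI first backup).
    """
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return source_gb / reduction_ratio


source_gb = 1000.0  # hypothetical 1 TB volume (assumption, not from the article)

bli_first = initial_backup_size(source_gb)          # BLI: stored 1:1
dedupe_first = initial_backup_size(source_gb, 3.0)  # source-side de-dupe at 3X

print(f"BLI first backup:     {bli_first:.0f} GB")
print(f"De-dupe first backup: {dedupe_first:.0f} GB")
```

Under these assumed figures, the BLI first backup holds roughly three times as much data as the de-duplicated one; the actual gap depends entirely on how redundant the data set is.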

From there, BLI gets interesting -- each subsequent backup contains only the blocks that have changed since the previous backup, so data growth should slow dramatically. Typically, a snapshot is taken prior to each subsequent backup to preserve a versioning capability. Testing has shown that the client-side impact of these backups is minimal and that they are fairly quick, generally taking five to ten minutes or less to transmit the data. Client-side impact is minimized because only changed blocks need to be identified; there is no comparison process with the backup target. Low client impact and slow additional growth mean that multiple backups throughout the day are not out of the question.
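The changed-block idea can be sketched in a few lines. This is a toy model, not any vendor's implementation: the `TrackedVolume` class, the tiny 4-byte block size, and the in-memory dictionaries standing in for a backup target are all assumptions, and it assumes changes are tracked relative to the most recent backup. The point it illustrates is that the client only records which blocks were written, so an incremental needs no comparison against the target.

```python
BLOCK_SIZE = 4  # tiny block size for illustration; real systems use far larger blocks


class TrackedVolume:
    """Toy volume that records which blocks were written since the last backup."""

    def __init__(self, data: bytes):
        self.data = bytearray(data)
        self.dirty = set()  # indexes of blocks changed since the last backup

    def write(self, offset: int, payload: bytes) -> None:
        """Apply a write and mark every block it touches as changed."""
        self.data[offset:offset + len(payload)] = payload
        first = offset // BLOCK_SIZE
        last = (offset + len(payload) - 1) // BLOCK_SIZE
        self.dirty.update(range(first, last + 1))

    def block(self, index: int) -> bytes:
        start = index * BLOCK_SIZE
        return bytes(self.data[start:start + BLOCK_SIZE])

    def full_backup(self) -> dict:
        """First backup: every block is copied (the 1:1 cost)."""
        count = (len(self.data) + BLOCK_SIZE - 1) // BLOCK_SIZE
        self.dirty.clear()
        return {i: self.block(i) for i in range(count)}

    def incremental_backup(self) -> dict:
        """Later backups: only blocks marked as changed are copied."""
        changed = {i: self.block(i) for i in sorted(self.dirty)}
        self.dirty.clear()
        return changed


def restore(full: dict, incrementals: list) -> bytes:
    """Rebuild the volume by layering incrementals over the full backup in order."""
    blocks = dict(full)
    for inc in incrementals:
        blocks.update(inc)
    return b"".join(blocks[i] for i in sorted(blocks))


vol = TrackedVolume(b"AAAABBBBCCCCDDDD")
full = vol.full_backup()          # four blocks copied
vol.write(5, b"xy")               # touches only block 1
inc = vol.incremental_backup()    # one block copied
restored = restore(full, [inc])
```

In this example the incremental carries a single block even though the volume has four, and layering it over the full backup reproduces the current volume contents.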

Even though blocks are being backed up, recoveries can be granular to the file level, and, with proper integration with the backup application, those recoveries can all be driven from a single GUI.

In some systems, the backup target is also "active," meaning that the data being backed up is stored as a file system and is accessible outside of the backup application. The ability to use the backup data for more than just storing backups can dramatically improve the ROI of the backup process. This active target can be used for in-place recovery, directly accessing the backup set via an iSCSI mount. Not having to restore data could be a huge time saver. In addition, snapshots of this active target can be mounted to alternate servers for development, litigation support, or decision support.

BLI only works on a volume-by-volume basis. So, unlike source-side de-duplication, it does not eliminate redundant data that resides on different volumes. How much redundant data actually exists between volumes will vary from customer to customer, so it is hard to generalize whether this is a big issue or not.
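The difference can be illustrated with a small sketch. The volume names and block contents below are made up, and content hashing with SHA-256 is used here only as one common way to spot identical blocks; it is not a description of how any particular de-duplication product works.

```python
import hashlib


def per_volume_blocks(volumes: dict) -> int:
    """Blocks stored when each volume's first backup is kept independently."""
    return sum(len(blocks) for blocks in volumes.values())


def cross_volume_unique_blocks(volumes: dict) -> int:
    """Blocks stored when identical blocks are kept only once across all volumes."""
    seen = set()
    for blocks in volumes.values():
        for block in blocks:
            seen.add(hashlib.sha256(block).hexdigest())
    return len(seen)


# Hypothetical volumes that share two identical blocks.
volumes = {
    "vol_a": [b"os-image", b"app-data", b"logs-a"],
    "vol_b": [b"os-image", b"app-data", b"logs-b"],
}

stored_per_volume = per_volume_blocks(volumes)
stored_cross_volume = cross_volume_unique_blocks(volumes)
print(stored_per_volume, stored_cross_volume)
```

With these made-up volumes, the per-volume approach stores six blocks while the cross-volume approach stores four. How large that gap is in practice depends on how much data the volumes really have in common.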

Although some vendors like Symantec have targeted BLI backups at databases like Oracle, others are promoting broad coverage of OSs and platforms. Today, NetApp and backup software vendors that support Open Systems SnapVault can provide this capability. Syncsort has a stand-alone system that can work with NetApp or without it. Both companies have an integrated move to tape for further protection. There are others that offer BLI in other forms, but the integration with the backup application is either limited or non-existent.

In short, BLI sacrifices some initial backup storage efficiencies to deliver a backup process that has minimal impact on the client.

-- George Crump is founder of Storage Switzerland, which provides strategic consulting and analysis to storage users, suppliers, and integrators. Prior to Storage Switzerland, he was CTO at one of the nation's largest integrators. Previous installments of his discussion on data de-duplication can be found here.
