Network Computing is part of the Informa Tech Division of Informa PLC


Analysis: Data De-Duping

These backup solutions could reach even higher data-reduction levels than the backup targets because they de-duplicate data across the entire enterprise, not just across the set of servers backed up to a single target or cluster of targets. If the CEO sends a 100-MB PowerPoint presentation to all 500 branch offices, it will be backed up from the office whose backup schedule runs first. All the others will just send hashes to the home office and be told, "We already got that, thanks."
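The exchange described above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's protocol: the `CentralStore` class, the `backup` and `simulate` functions, the use of SHA-256 and the 128-KB block size are all assumptions chosen for the example.

```python
import hashlib

BLOCK_SIZE = 128 * 1024  # assumed block size, for illustration only


class CentralStore:
    """Hypothetical home-office store holding one copy of each unique block."""

    def __init__(self):
        self.blocks = {}  # hash (hex string) -> block bytes

    def missing(self, hashes):
        """Return the hashes the store has not seen yet."""
        return [h for h in hashes if h not in self.blocks]

    def upload(self, block):
        self.blocks[hashlib.sha256(block).hexdigest()] = block


def backup(data, store):
    """Back up `data` from one branch; return the block bytes actually sent."""
    local = {}
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        local[hashlib.sha256(block).hexdigest()] = block
    sent = 0
    # Only hashes cross the WAN first; blocks follow only when the store asks.
    for h in store.missing(list(local)):
        store.upload(local[h])
        sent += len(local[h])
    return sent


def simulate(branches, data):
    """Run the same backup from `branches` offices against one central store."""
    store = CentralStore()
    return store, [backup(data, store) for _ in range(branches)]
```

Calling `simulate(500, presentation)` with the same bytes for every office would show the first branch sending the whole file and the remaining 499 sending only hashes. A real product would also have to handle authentication, retries and the cost of transmitting the hashes themselves, none of which is modeled here.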

This approach is also less susceptible to the scalability issues that affect hash-based systems. Since each remote server only caches the hashes for its local data, that hash table shouldn't outgrow available space, and since the disk I/O system at the central site is much faster than the WAN feeding the backups, even searching a huge hash index on disk is much faster than sending the data.

Although Televaulting, Avamar Axion and NetBackup PureDisk all share a similar architecture and are priced based on the size of the de-duplicated data store, there are some differences. NetBackup PureDisk uses a fixed 128-KB block size, whereas Televaulting and Avamar Axion use variable block sizes, which should result in greater de-duplication. PureDisk can be managed from NetBackup, and Symantec promises greater integration in the future, which we hope means de-duplication integrated into data center backup jobs. Asigra also markets Televaulting for service providers so small businesses that don't want to set up their own infrastructure can take advantage of de-duplication too.
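To see why variable block sizes should de-duplicate better, consider what happens when a single byte is inserted at the front of a file. The sketch below compares fixed-size blocks with a simple content-defined chunker. It is only an illustration under stated assumptions: the function names, the 16-byte window, the CRC32 boundary test and the small block sizes are invented for the example and do not describe how Televaulting, Avamar Axion or PureDisk actually choose boundaries.

```python
import random
import zlib

FIXED_SIZE = 1024  # assumed fixed block size (far smaller than 128 KB, for speed)
WINDOW = 16        # assumed number of trailing bytes examined at each position
DIVISOR = 64       # assumed: roughly one boundary per 64 positions


def fixed_chunks(data, size=FIXED_SIZE):
    """Split data at fixed offsets, regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def variable_chunks(data, window=WINDOW, divisor=DIVISOR):
    """Cut wherever a checksum of the preceding `window` bytes hits a pattern.

    The decision depends only on nearby content, so boundaries move with
    the data when bytes are inserted or deleted earlier in the file.
    """
    chunks, start = [], 0
    for end in range(window, len(data) + 1):
        if zlib.crc32(data[end - window:end]) % divisor == 0:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last boundary
    return chunks


def shared(before, after):
    """Count distinct chunks present in both versions of the file."""
    return len(set(before) & set(after))


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(64 * 1024))
edited = b"X" + original  # one byte inserted at the very front

fixed_shared = shared(fixed_chunks(original), fixed_chunks(edited))
variable_shared = shared(variable_chunks(original), variable_chunks(edited))
```

With fixed blocks, the inserted byte shifts every block boundary, so the edited file shares essentially nothing with the original. With content-defined boundaries, only the first chunk changes and every later chunk is still recognized as a duplicate.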

Backup targets, including FalconStor's VTL, Quantum's DXi series and Data Domain's appliances that can replicate data after it has been de-duped, can see the same kind of bandwidth reductions for branch data center off-site backups and disaster recovery of applications that don't require real-time replication.

Data de-duplication is here to stay for at least a while. We spoke to several users who report they really do get 20-to-1 and greater data-reduction factors without making major changes to their backup processes. Small organizations can use the new-generation backup programs from Asigra, EMC and Symantec to replace their conventional backup solutions. Midsize organizations can use backup targets in the data center. Large enterprises with very high backup performance needs may have to wait for the next generation.

Don't Fear Collisions