Network Computing is part of the Informa Tech Division of Informa PLC

Now That We Can Dedupe Everywhere, Where to Dedupe?

Data deduplication first appeared on specialized appliances designed to be used as the target of an existing backup application like NetBackup or NetWorker. My friend W. Curtis Preston recently posted a chart to his Mr. Backup Blog comparing the performance of the current generation of these appliances. While this helps answer some questions you may have about deduplicating appliances, it raises another: in an era where I can dedupe data at just about any stage in the backup process, where is the best place for me to dedupe my data? Today we just don't have answers to these questions. Maybe someday.

For large data centers with many terabytes of data to back up every night, appliances are the way to go. If you're generating enough backup traffic to keep fed a high-end appliance like Data Domain's DD800 or Quantum's DXi8500, which can ingest data at 1.5GB/s or better, you're likely doing it through multiple media servers, if not multiple backup applications.

In those environments, using a single large appliance, or an array like NEC's HydraStor that can accept data at a mind-boggling 27GB/s (97TB/hr), gives you a single device to manage that holds all your fresh backup data. It also means all your data is in a single globally deduplicated storage pool, so the Windows C: drives you back up from physical servers with NetBackup and the virtual ones you back up with Veeam Backup will, if the stars align just right, result in WINSOCK.DLL being stored just once.
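As a quick sanity check on those ingest figures, here is a minimal sketch of the GB/s to TB/hr conversion. It assumes decimal units (1TB = 1,000GB); the function name is purely illustrative and the rates are the ones quoted above.

```python
def gb_per_sec_to_tb_per_hour(gb_per_sec: float) -> float:
    """Convert an ingest rate from GB/s to TB/hr (assumes 1 TB = 1,000 GB)."""
    seconds_per_hour = 3600
    return gb_per_sec * seconds_per_hour / 1000


# Ingest rates quoted in the article, in GB/s.
quoted_rates = {
    "high-end appliance": 1.5,
    "HydraStor array": 27.0,
}

for name, rate in quoted_rates.items():
    print(f"{name}: {rate} GB/s is about {gb_per_sec_to_tb_per_hour(rate):.1f} TB/hr")
```

By this arithmetic 27GB/s works out to roughly 97TB/hr, matching the figure above, while a 1.5GB/s appliance lands near 5.4TB/hr.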

I'm less sure about the smaller outfits with 2-50TB of data to protect. Should such an outfit buy a Quantum DXi6510 or Data Domain DD610, which will give them 6-TB of net disk space (after RAID but before deduplication) for $50,000, or use the deduplication feature of their backup software and a relatively low-end disk array from Overland Storage, Promise, Infortrend or the like?

Depending on what backup software they use, the deduplication option will add $2,000-20,000 to their costs, and a low-end array with twelve 1TB drives another $8,000-12,000. Is the mid-range appliance worth twice the price?
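To put rough numbers on that question, the sketch below adds up the price ranges quoted above. It is back-of-the-envelope arithmetic only: the variable and function names are illustrative, and it ignores anything not mentioned here, such as support contracts, media server hardware or discounts.

```python
# Price figures quoted in the article, in US dollars.
APPLIANCE_PRICE = 50_000                  # mid-range deduplicating appliance
SOFTWARE_DEDUPE_OPTION = (2_000, 20_000)  # backup software dedupe option (low, high)
LOW_END_ARRAY = (8_000, 12_000)           # low-end array with twelve 1TB drives (low, high)


def software_route_cost_range() -> tuple[int, int]:
    """Return the (low, high) total for software dedupe plus a low-end array."""
    low = SOFTWARE_DEDUPE_OPTION[0] + LOW_END_ARRAY[0]
    high = SOFTWARE_DEDUPE_OPTION[1] + LOW_END_ARRAY[1]
    return low, high


low, high = software_route_cost_range()
print(f"Software plus array: ${low:,} to ${high:,}")
print(f"Appliance premium: {APPLIANCE_PRICE / high:.1f}x to {APPLIANCE_PRICE / low:.1f}x")
```

On these figures the software-plus-array route totals $10,000-32,000, so the appliance costs somewhere between about 1.6 and 5 times as much, depending mostly on how the backup vendor prices its deduplication option.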
