The acquisition battle over Data Domain made business headlines for a number of weeks. Its culmination, with EMC's successful bid, signifies that while this particular skirmish is over, the data deduplication wars are going to heat up even more. In this difficult economic climate, making a powerful economic case for enterprises to actually spend money on anything is challenging at best. Data deduplication is one of those rare opportunities where the economic and technological benefits are well recognized, so it should come as no surprise that vendors are moving troops into this market as quickly as they can.
Note that EMC's acquisition of Data Domain is by no means the first acquisition of a data deduplication company by an information infrastructure vendor, nor is it likely to be the last. Recall that IBM bought Diligent Technologies, one of the leading companies in the data deduplication space, well over a year ago. IBM has announced new capabilities for its TS7650 ProtecTIER gateway and appliance family, which uses data deduplication to support virtual tape library (VTL) technology. The announcement had been planned for some time, so its timing relative to EMC's Data Domain acquisition news is simply coincidental.
A core use of data deduplication technology has been in conjunction with disk-to-disk backup using a VTL. Without deduplication, storing multiple full backups on disk is not economically feasible, so older copies of backups have to be kept on tape. Although most recoveries are from recently stored data, there are occasions when older data has to be recovered, and doing that from tape can be very time-consuming. Eliminating redundant data on disk through data deduplication means that older backup data can be stored economically on disk. That, in turn, makes it faster to recover older data should that prove necessary.
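To make the storage argument concrete, here is a minimal sketch of how eliminating redundant data lets two full backups occupy little more space than one. It assumes fixed-size chunks identified by SHA-256 hashes, which is only one of several possible approaches and is not a description of how ProtecTIER, Data Domain, or any other product works. The names `DedupStore`, `make_full_backup`, and `CHUNK_SIZE` are hypothetical and exist only for this illustration.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size in bytes (an assumption)


class DedupStore:
    """Toy content-addressed store: each unique chunk is kept only once."""

    def __init__(self):
        self.chunks = {}   # SHA-256 hex digest -> chunk bytes
        self.backups = {}  # backup name -> ordered list of digests

    def add_backup(self, name, data):
        """Split data into chunks and store only chunks not seen before."""
        digests = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # no-op if already stored
            digests.append(digest)
        self.backups[name] = digests

    def restore(self, name):
        """Rebuild a backup by concatenating its chunks in order."""
        return b"".join(self.chunks[d] for d in self.backups[name])

    def stored_bytes(self):
        """Physical bytes actually held on disk by the store."""
        return sum(len(c) for c in self.chunks.values())


def make_full_backup(changed_block=None):
    """Build a synthetic 100-chunk backup; optionally alter one chunk."""
    blocks = [bytes([i]) * CHUNK_SIZE for i in range(100)]
    if changed_block is not None:
        blocks[changed_block] = b"\xff" * CHUNK_SIZE
    return b"".join(blocks)


store = DedupStore()
monday = make_full_backup()
tuesday = make_full_backup(changed_block=7)  # one chunk differs from Monday
store.add_backup("monday", monday)
store.add_backup("tuesday", tuesday)

logical = len(monday) + len(tuesday)  # bytes the two full backups represent
physical = store.stored_bytes()       # bytes the store actually keeps
print(f"logical={logical} physical={physical}")
```

In this synthetic case the second full backup adds only the single changed chunk to the store, yet either backup can still be restored in full from disk. Real backup streams are far less regular, so the savings here should not be read as typical of any product.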
However, those benefits have tended to be realized at the local level. If data is needed at a remote site for disaster recovery (DR) purposes, the backup data on disk is first written to a tape library at the local site. The backup tapes are "exported" (i.e., physically removed) from the tape library and then physically transported (typically by truck) to the DR site. Transporting data this way involves a transportation cost, security issues (such as lost or stolen tapes), and time (say, 24 hours when all elements of the transportation process are taken into account).
This transportation process is called vaulting. Electronic vaulting, by contrast, sends the data electronically from disk at the local site to disk at the DR site rather than physically transporting tapes. This speeds up the process and improves both security and reliability. In addition, recoverability planning is a lot easier. The problem is that the high bandwidth needed to transfer all the data tends to be expensive. Enter data deduplication, which requires significantly less bandwidth to transfer all that backup data, and, lo and behold, electronic vaulting is now economically viable as well as managerially attractive.
David Hill is principal of Mesabi Group LLC, which focuses on helping organizations make complex IT infrastructure decisions simpler and easier to understand. He is the author of the book "Data Protection: Governance, Risk Management, and Compliance."