Network Computing is part of the Informa Tech Division of Informa PLC

A Data De-Duplication Survival Guide: Part 1: Page 2 of 7

General purpose de-duplication systems

Several vendors, including Data Domain and Quantum, offer de-duplication systems that are not associated with particular VTLs or backup appliances. These devices can be termed general-purpose de-duplicators.

The advantage of a general-purpose data de-duplication storage system is that it is designed solely to de-duplicate data. As a result, these systems are source-neutral: the backup data can come from multiple sources (backup software, application utilities, archiving applications, or directly from the user).

General-purpose systems provide multiple data access protocols (NFS, CIFS, or tape emulation) and offer multiple types of physical connectivity (Ethernet or Fibre Channel). In the real-world data center, there are many sources of backup data, and there is a distinct advantage in being source-neutral.

Although a general-purpose system takes input from multiple sources, it applies de-duplication across all of them. For example, an administrator may back up the Microsoft SQL environment through the backup application to the general-purpose data de-duplication system. Later, the SQL DBA may dump the same data to the data de-duplication system. After that, the data may be captured yet again as part of a VMware image, using a VMware backup utility to move it to the data de-duplication system.

In the above example, the data from all three sources is largely the same, and the redundant segments from each source are eliminated before the data is stored. Be aware that this example covers just one file that changed slightly on one day. This type of multi-protection is not uncommon in today’s data center, so the space savings across a week or month could be staggering.
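
To make the cross-source effect concrete, here is a minimal Python sketch of a segment store that receives the same database from three sources. It is an illustration under stated assumptions, not a description of how any vendor's product works: the SegmentStore class, the fixed 4 KB segment size, the SHA-256 fingerprints, and the synthetic data are all hypothetical choices made for the example. Commercial systems typically use variable-length segments so that shifted data still lines up, which this sketch does not attempt.

```python
import hashlib

SEGMENT_SIZE = 4096  # assumption: fixed-size segments, chosen for simplicity


class SegmentStore:
    """Hypothetical source-neutral store: every stream shares one segment pool."""

    def __init__(self):
        self.segments = {}       # fingerprint -> segment bytes (stored once)
        self.streams = {}        # stream name -> ordered list of fingerprints
        self.bytes_received = 0  # total bytes sent by all sources

    def ingest(self, name, data):
        """Split data into segments and store only segments not seen before."""
        recipe = []
        for offset in range(0, len(data), SEGMENT_SIZE):
            segment = data[offset:offset + SEGMENT_SIZE]
            fingerprint = hashlib.sha256(segment).hexdigest()
            self.segments.setdefault(fingerprint, segment)
            recipe.append(fingerprint)
        self.streams[name] = recipe
        self.bytes_received += len(data)

    def restore(self, name):
        """Rebuild a stream from its list of fingerprints."""
        return b"".join(self.segments[fp] for fp in self.streams[name])

    def bytes_stored(self):
        """Bytes actually kept after redundant segments are eliminated."""
        return sum(len(segment) for segment in self.segments.values())


def make_blocks(start, count):
    """Synthetic data: `count` distinct 4 KB blocks (32-byte digest x 128)."""
    return b"".join(
        hashlib.sha256(str(i).encode()).digest() * 128
        for i in range(start, start + count)
    )


database = make_blocks(0, 100)      # stands in for the SQL data
os_blocks = make_blocks(1000, 20)   # stands in for other VM image contents

store = SegmentStore()
store.ingest("backup-application", database)
store.ingest("dba-dump", database)
# Assumption: the database sits on a segment boundary inside the VM image.
store.ingest("vmware-image", os_blocks + database)

ratio = store.bytes_received / store.bytes_stored()
print(f"received {store.bytes_received} bytes, stored {store.bytes_stored()} bytes")
print(f"reduction ratio {ratio:.2f}:1")
```

The three streams arrive through different paths, but because they share one pool of segments, the database is stored only once and each stream can still be restored in full from its own list of fingerprints.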