Network Computing is part of the Informa Tech Division of Informa PLC

Source-Side Deduplication: Symantec

With the combination of NetBackup and Backup Exec, Symantec has a market-dominating 43 percent share. Many target-side appliance vendors consider support for Symantec's OpenStorage (OST) an important capability, as we discussed in our series on target-side deduplication. Symantec's latest move is to integrate deduplication capabilities directly into NetBackup and Backup Exec.

Symantec acquired PureDisk a few years ago and has been integrating it into its back-up software framework ever since. The integration began with using the software as a remote-office back-up application, continued with allowing it to dedupe a disk repository created by NetBackup, and, in the most recent releases, extends to full integration with the NetBackup and Backup Exec clients and media servers.

In the basic architecture, the NetBackup client is installed on the server to be protected. Deduplication is then an option that the administrator can select on a client-by-client basis. For systems where performance is a concern, admins can choose not to dedupe client-side and instead let the media server do the deduplication, or not to dedupe in the back-up software at all, possibly letting a back-up appliance do it. In addition, the block size of the deduplication engine can be adjusted. The smaller the block size, the better the chances of finding redundant data, but the more processing power is used. Larger segments also mean less work on the target disk, since there are fewer data segments to organize. This fine-tuning allows the administrator to match the data set to the power of the client for the right balance of storage efficiency and performance.
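The block-size trade-off can be seen in a small sketch. This is not Symantec's engine: it assumes simple fixed-size blocks fingerprinted with SHA-256, and the function names and the synthetic data are hypothetical, chosen only to show how block size changes both the work done and the redundancy found.

```python
import hashlib


def dedupe_stats(data: bytes, block_size: int) -> tuple[int, int]:
    """Split data into fixed-size blocks; return (blocks hashed, unique blocks)."""
    unique = set()
    hashed = 0
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # One fingerprint per block: smaller blocks mean more hashing work.
        unique.add(hashlib.sha256(block).digest())
        hashed += 1
    return hashed, len(unique)


def filled(tag: int, size: int = 4096) -> bytes:
    """Hypothetical 4 KB chunk of test data filled with a single byte value."""
    return bytes([tag]) * size


# Synthetic data: one recurring 4 KB chunk interleaved with four distinct ones.
sample = b"".join(filled(0) + filled(i) for i in range(1, 5))

for size in (4096, 8192):
    hashed, unique = dedupe_stats(sample, size)
    print(f"block={size}: hashed={hashed}, unique={unique}, "
          f"stored={unique * size} bytes")
```

On this data, 4 KB blocks require eight fingerprints but store only five unique blocks (20,480 bytes). With 8 KB blocks there are half as many fingerprints to compute and half as many segments to track, but no block repeats, so all 32,768 bytes are stored.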

The repository for the deduplicated data can be up to 32TB in size. That is raw capacity, not the total amount of data protected after deduplication. If a larger repository is needed, a second media server can be set up to handle the data flow, or you can select the native version of PureDisk, which scales beyond 32TB.
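To make the raw-versus-protected distinction concrete, here is a rough sizing sketch. The 32TB figure comes from the text above; the 10:1 deduplication ratio and the function names are assumptions for illustration only, since real ratios depend heavily on the data set and retention policy.

```python
import math

RAW_LIMIT_TB = 32  # raw repository limit per media server, as stated above


def protected_capacity_tb(dedupe_ratio: float,
                          raw_tb: float = RAW_LIMIT_TB) -> float:
    """Pre-dedupe data that fits in raw_tb at an assumed dedupe ratio."""
    return raw_tb * dedupe_ratio


def repositories_needed(protected_tb: float, dedupe_ratio: float,
                        raw_limit_tb: float = RAW_LIMIT_TB) -> int:
    """Repositories (one per media server) needed for protected_tb of data."""
    raw_needed_tb = protected_tb / dedupe_ratio
    return max(1, math.ceil(raw_needed_tb / raw_limit_tb))


# Assumed 10:1 ratio, purely illustrative.
print(protected_capacity_tb(10))     # data one 32TB repository could protect
print(repositories_needed(300, 10))  # 30TB raw fits in a single repository
print(repositories_needed(500, 10))  # 50TB raw exceeds the 32TB limit
```

Under that assumed ratio, a single repository protects about 320TB of back-up data, and 500TB would call for a second media server or the native PureDisk option.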

The choice between client-side deduplication, media server deduplication, or leveraging OST with a target system comes down to the environment and the situations within it. For example, for servers that are backed up across a thin IP network segment, such as 1GbE or slower, source-side dedupe can be selected. The reasoning is that it's better to burn local processing power than to squeeze a large amount of data through a very thin pipe. Servers on a SAN or a 10GbE connection may be better off sending all the data to the media server and letting it do the deduplication work. If the media server is overworked, it may make more sense to offload that work to a target deduplication system that leverages OST.
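That reasoning can be written down as a simple rule of thumb. The sketch below is not a NetBackup feature or API. The function, its parameters, and the 1 Gbps cutoff (taken from the 1GbE example above) are hypothetical, and treating every faster link as a "fat pipe" is an assumption a real environment would need to tune.

```python
from enum import Enum


class DedupeLocation(Enum):
    CLIENT = "client-side (source)"
    MEDIA_SERVER = "media server"
    OST_TARGET = "target appliance via OST"


def choose_dedupe_location(link_gbps: float,
                           on_san: bool = False,
                           client_cpu_constrained: bool = False,
                           media_server_overloaded: bool = False,
                           thin_link_gbps: float = 1.0) -> DedupeLocation:
    """Rule-of-thumb placement of deduplication for one protected server."""
    thin_pipe = (not on_san) and link_gbps <= thin_link_gbps
    # Thin pipe: spend local CPU rather than push all the data over the wire,
    # unless the client itself has a performance concern.
    if thin_pipe and not client_cpu_constrained:
        return DedupeLocation.CLIENT
    # Otherwise the data goes to the media server, unless it is overworked,
    # in which case the work is offloaded to an OST-aware target system.
    if media_server_overloaded:
        return DedupeLocation.OST_TARGET
    return DedupeLocation.MEDIA_SERVER


print(choose_dedupe_location(1.0).value)
print(choose_dedupe_location(10.0).value)
print(choose_dedupe_location(10.0, media_server_overloaded=True).value)
```

Because the option is selectable client by client, a rule like this would be evaluated per server rather than once for the whole environment.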
