Network Computing is part of the Informa Tech Division of Informa PLC

Your Mileage Will Vary: Chunking: Page 2 of 2

Let us, for the sake of clarity, assume that the backup application always backs up the files in the root folder first. Then let us further assume that we create a 129-byte file in the root. (I know, bad server administration, but let me get away with it for the sake of a simple example.) When we run another full backup of the server, everything after that file in the backup stream will be shifted 129 bytes from where it was in the previous backup, because our little file was backed up early in the process.

The fixed-chunk dedupe engine will break the data into its fixed-size chunks. Because those chunks now contain data shifted 129 bytes from what they contained the last time, they will generate different hash values, and the system will get a very low deduplication ratio. The variable-chunk system may start off forming chunks that are similarly offset and therefore different, but because it picks its chunk boundaries from the data itself rather than from the position in the stream, it will soon establish boundaries in the same places it did the last time. Therefore it should deliver a higher data reduction ratio.
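
To make the 129-byte example concrete, here is a minimal Python sketch of both approaches. Everything in it is my own assumption for illustration, not any vendor's algorithm: the 1 KB fixed chunk size, the 16-byte rolling-hash window, the boundary rule (cut when the low 10 bits of the hash are zero), the function names, and the random bytes standing in for server data. It counts how many chunks of the second backup already exist in the chunk store built from the first.

```python
import hashlib
import random

CHUNK_SIZE = 1024                   # fixed chunk size (assumed for the demo)
WINDOW = 16                         # bytes covered by the rolling hash (assumed)
BASE = 257
MOD = (1 << 31) - 1
TOP = pow(BASE, WINDOW - 1, MOD)    # weight of the byte leaving the window
MASK = 0x3FF                        # cut when low 10 bits are zero (~1 KB average)


def fixed_chunks(data):
    """Cut the stream every CHUNK_SIZE bytes, regardless of content."""
    return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]


def variable_chunks(data):
    """Cut wherever a rolling hash of the last WINDOW bytes matches MASK."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        if i >= WINDOW:
            # Drop the byte that just slid out of the window.
            h = (h - data[i - WINDOW] * TOP) % MOD
        h = (h * BASE + byte) % MOD
        # Only test for a boundary once the window is full.
        if i + 1 >= WINDOW and (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # whatever is left after the last cut
    return chunks


def reuse_fraction(chunker, old, new):
    """Fraction of chunks in `new` whose hash is already stored from `old`."""
    store = {hashlib.sha256(c).digest() for c in chunker(old)}
    new_chunks = chunker(new)
    hits = sum(hashlib.sha256(c).digest() in store for c in new_chunks)
    return hits / len(new_chunks)


rng = random.Random(42)  # fixed seed so the demo is repeatable
first_backup = bytes(rng.getrandbits(8) for _ in range(64 * 1024))
small_file = bytes(rng.getrandbits(8) for _ in range(129))
second_backup = small_file + first_backup  # the new file is backed up first

fixed_ratio = reuse_fraction(fixed_chunks, first_backup, second_backup)
cdc_ratio = reuse_fraction(variable_chunks, first_backup, second_backup)
print(f"fixed chunks reused:    {fixed_ratio:.0%}")
print(f"variable chunks reused: {cdc_ratio:.0%}")
```

The fixed chunker should find nothing to reuse, while the variable chunker should reuse nearly everything after the first boundary that follows the new file. A production engine would typically also enforce minimum and maximum chunk sizes and use a faster hash. I have left those out to keep the resynchronization easy to see.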

Of course, your mileage may vary, and fixed-chunk-size systems can work well when they're used with the right kind of data or tweaked by using some context from the available metadata.