Deduplicating Replication - Quantum


George Crump

January 19, 2010


Quantum's deduplication method is called adaptive inline deduplication, meaning the system automatically shifts between inline and post-process deduplication as needed. If the system decides that deduplication work is throttling ingest too much, it drops into post-process mode for as long as necessary. Post-process deduplication can also be forced by defining a backup window that defers deduplication until after the window closes; Quantum calls this the deferred mode. The mode the unit is in when performing its deduplication also affects how it replicates data.
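Conceptually, the adaptive switch is a per-interval policy decision. The sketch below is illustrative only; the `choose_mode` function, the backlog metric, and the threshold are invented for this example and are not Quantum's actual internals:

```python
from enum import Enum

class DedupeMode(Enum):
    INLINE = "inline"        # deduplicate data as it arrives
    POST_PROCESS = "post"    # land data first, deduplicate later

def choose_mode(ingest_backlog_mb: float, backlog_limit_mb: float,
                in_backup_window: bool) -> DedupeMode:
    """Pick a deduplication mode for the next scheduling interval.

    A configured backup window (deferred mode) always forces
    post-process; otherwise fall back to post-process only while
    inline deduplication cannot keep up with ingest.
    """
    if in_backup_window:                      # deferred mode
        return DedupeMode.POST_PROCESS
    if ingest_backlog_mb > backlog_limit_mb:  # inline is throttling ingest
        return DedupeMode.POST_PROCESS
    return DedupeMode.INLINE
```

The key design point is that the decision is revisited continuously, so the unit can return to inline mode as soon as the load subsides.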

Quantum's deduplication leverages disk. When the backup application sends a backup job, the data is chopped into chunks and stored on disk rather than held in memory while the deduplication process executes. The system can also store most data in native, undeduplicated format. Assuming you have the space, this could help with recoveries of complete systems by avoiding the need to re-inflate, or undeduplicate, data as it is being recovered.
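Variable-size chunking is typically done with content-defined boundaries found by a rolling checksum. The toy version below illustrates the general technique only; the hash, mask, and size limits are arbitrary and do not reflect Quantum's undisclosed algorithm:

```python
def chunk_stream(data: bytes, mask: int = 0xFF,
                 min_size: int = 64, max_size: int = 4096) -> list:
    """Split a byte stream into variable-size chunks.

    A boundary is declared when the low bits of a toy rolling hash
    hit zero (content-defined), so identical runs of data tend to
    produce identical chunks even when surrounding bytes shift.
    """
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) ^ byte) & 0xFFFFFFFF   # toy rolling hash
        length = i - start + 1
        at_boundary = length >= min_size and (h & mask) == 0
        if at_boundary or length >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])          # trailing partial chunk
    return chunks
```

Because boundaries depend on content rather than fixed offsets, an insertion early in a file only perturbs nearby chunks instead of shifting every chunk after it.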

As it relates to replication, with the unit in adaptive inline mode you do not have to wait until the whole job is done to start the replication process. As soon as the first chunk of data lands on disk it is deduplicated, so the unique variable-size blocks are ready to be replicated to the secondary site.
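That pipelining can be sketched as follows. Everything here is a hypothetical stand-in (the `ingest_inline` function, the fingerprint set, and the queue are invented for illustration); the point is simply that unique chunks become replication-ready one at a time, not at end of job:

```python
import hashlib
from queue import Queue

def ingest_inline(chunks, seen: set, replication_queue: Queue) -> None:
    """Inline mode: fingerprint each chunk as it lands on disk.

    Chunks whose fingerprint has never been seen are queued for
    replication immediately, without waiting for the full backup
    job to finish.
    """
    for chunk in chunks:
        fp = hashlib.sha256(chunk).hexdigest()
        if fp not in seen:
            seen.add(fp)
            replication_queue.put((fp, chunk))  # ready to ship to DR
```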

If the unit is in post-process mode, you have to wait until the entire job is complete. Since post-process introduces a delay before replication can start, as it does with other post-process deduplication products, you may want to break your backups into more, smaller jobs to distribute the load more evenly where possible.

The replication process itself fits my definition of global deduplication, which is that if three sites are sending data to the disaster recovery site, only data that is unique across the three sites will be sent to the DR site. For example, each site will send a list of blocks that need to be replicated to the DR site. If the DR site realizes that it has already seen a block, regardless of source, it will tell that site not to send it. This process helps further thin the replication bandwidth requirements in many-to-one replication strategies.

Another interesting wrinkle is Quantum's ability to snapshot the remote replicated volume, allowing for multiple recovery points from the replicated data. They can store multiple snapshots of the replicated data independent of the deduplication method.
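The fingerprint exchange described above for many-to-one replication might look like this (a simplification; `dr_index` is a hypothetical in-memory stand-in for the DR site's global block index):

```python
def negotiate_replication(site_fingerprints: list, dr_index: set) -> list:
    """Many-to-one replication handshake.

    A source site sends only the fingerprints of the blocks it wants
    to replicate; the DR site answers with the subset it has never
    seen from ANY site, and only those blocks cross the wire.
    """
    missing = [fp for fp in site_fingerprints if fp not in dr_index]
    dr_index.update(missing)  # DR will now expect these blocks
    return missing
```

In this model, once site A has replicated a block, sites B and C never ship it again; they pay only the cost of sending the fingerprint.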

Quantum allows for ten versions, or points in time. These take no measurable extra space since they are really only namespace snapshots. At any point, the backup administrator could pick from the GUI among Monday's, Tuesday's, or Wednesday's replication and roll back to whichever version they wanted. This would be ideal in the event of a corruption or virus sneaking into a data set and invalidating that day's backup.
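Why namespace snapshots are nearly free can be shown with a small sketch: each recovery point is just a copy of the name-to-fingerprint map, while the deduplicated blocks themselves are shared. The class below is invented for illustration and is not Quantum's actual data structure:

```python
from collections import deque

class NamespaceSnapshots:
    """Keep a bounded set of recovery points (ten, in Quantum's case).

    Each snapshot copies only the namespace (file name -> block
    fingerprint map), so storing it costs almost no extra space.
    """
    def __init__(self, limit: int = 10):
        self.snapshots = deque(maxlen=limit)  # oldest dropped first

    def take(self, label: str, namespace: dict) -> None:
        self.snapshots.append((label, dict(namespace)))

    def rollback(self, label: str) -> dict:
        """Return the namespace for a saved recovery point."""
        for name, ns in reversed(self.snapshots):
            if name == label:
                return dict(ns)
        raise KeyError(label)
```

Rolling back to "Monday" is then a matter of swapping the active namespace for Monday's copy; no block data is moved or re-inflated.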

Finally, support of Symantec's OpenStorage Technology (OST) for NetBackup (NBU) has also been important in these discussions, and Quantum has good support of OST at this point. Quantum uses OST both to manage replication jobs and to provide an integrated path to tape. This gives NBU the ability to manage retention policies on two backup copies of the same data set, with the data movement handled by the Quantum system. While Quantum does not yet support OST's capability as a high-end transfer protocol across IP networks, it sees adding this support as a high priority and expects it sometime this year.
