George Crump
Commentary

Deduplicating Replication - Quantum

Quantum's deduplication method is called adaptive inline deduplication, meaning the system automatically shifts between inline and post-process deduplication as needed. If the appliance determines that deduplication work is throttling ingest too much, it switches into post-process mode for as long as necessary. Post-process deduplication can also be forced by configuring a backup window that defers deduplication until after backups complete; Quantum calls this deferred mode. The mode in which the unit performs its deduplication affects how it replicates data.
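The mode-switching decision described above can be illustrated with a minimal sketch. The function name, parameters, and threshold below are hypothetical, chosen only to show the adaptive logic, not Quantum's actual implementation:

```python
# Illustrative sketch of adaptive mode selection. All names and the
# 50% slowdown threshold are assumptions, not vendor specifics.

INLINE, POST_PROCESS = "inline", "post-process"

def choose_mode(ingest_mb_s: float, baseline_mb_s: float,
                deferred: bool, slowdown_threshold: float = 0.5) -> str:
    """Fall back to post-process when deduplication throttles ingest
    below a fraction of the baseline rate, or when a deferred backup
    window has been configured."""
    if deferred:
        return POST_PROCESS          # deferred mode forces post-process
    if ingest_mb_s < baseline_mb_s * slowdown_threshold:
        return POST_PROCESS          # dedup is throttling ingest; defer it
    return INLINE
```

In this sketch a deferred backup window always wins, which mirrors the article's point that deferred mode forces post-process behavior regardless of load.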

Quantum's deduplication leverages disk. When the backup application sends a backup job, the job is split into chunks that are stored on disk rather than held in memory while the deduplication process executes. The system can also store most data in native, undeduplicated format. Assuming you have the space, this can speed up recoveries of complete systems by avoiding the need to re-inflate (rehydrate) deduplicated data as it is restored.
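The chunk-and-index step at the heart of this process can be sketched as follows. This is a simplified illustration, fixed-size chunks and an in-memory dictionary stand in for the variable-size chunking and disk-backed index a real appliance would use:

```python
# Simplified dedup sketch: fixed-size chunking plus a hash index.
# Real systems use variable-size chunking and persist the index on disk.
import hashlib

def chunk(data: bytes, size: int = 4096) -> list[bytes]:
    """Split a backup stream into fixed-size chunks."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def dedupe(chunks: list[bytes], index: dict[str, bytes]) -> list[str]:
    """Store only chunks whose hash is not already in the index;
    return the hashes of the newly stored (unique) chunks."""
    unique = []
    for c in chunks:
        h = hashlib.sha256(c).hexdigest()
        if h not in index:
            index[h] = c        # stand-in for writing the chunk to disk
            unique.append(h)
    return unique
```

For example, a stream of two identical 4 KB chunks yields only one stored chunk; the second is recognized by its hash and skipped.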

As it relates to replication, with the unit in adaptive inline mode you do not have to wait until the whole job is done to start the replication process. As soon as the first chunk of data lands on disk it is deduplicated, so the unique variable-size blocks are ready to be replicated to the secondary site.

If the unit is in post-process mode, you have to wait until the entire job is complete. Because post-process deduplication can delay the start of replication, as it does with other post-process products, you may want to break large backup jobs into several smaller ones to distribute the load more evenly, if possible.

The replication process itself fits my definition of global deduplication: if three sites are sending data to the disaster recovery site, only data that is unique across all three sites is sent. For example, each site sends a list of the blocks it needs to replicate to the DR site. If the DR site realizes it has already seen a block, regardless of source, it tells the sending site not to transmit it. This further thins the replication bandwidth requirements in many-to-one replication strategies.
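The hash-list negotiation described above can be sketched in a few lines. The function and variable names are hypothetical, and the "network exchange" is reduced to dictionary lookups, but the flow matches the article: the site offers hashes, the DR site answers with the ones it lacks, and only those chunk bodies cross the wire:

```python
# Illustrative many-to-one replication sketch. Names are assumptions;
# dr_store stands in for the DR site's global chunk index.
import hashlib

def replicate(site_chunks: list[bytes], dr_store: dict[str, bytes]) -> int:
    """Offer chunk hashes to the DR site; transfer only the chunk
    bodies the DR site has never seen. Returns chunks transferred."""
    by_hash = {hashlib.sha256(c).hexdigest(): c for c in site_chunks}
    needed = [h for h in by_hash if h not in dr_store]  # DR site's reply
    for h in needed:
        dr_store[h] = by_hash[h]                        # send the body
    return len(needed)
```

If site A replicates chunks a and b, and site B later replicates b and c, site B transfers only c; the DR site already holds b from site A, which is exactly the global, source-independent behavior described above.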

George Crump is president and founder of Storage Switzerland, an IT analyst firm focused on the storage and virtualization segments. With 25 years of experience designing storage solutions for datacenters across the US, he has seen the birth of such technologies as RAID, NAS, ...