Deduplication's Replication Mode

George Crump

November 5, 2009

Network Computing

According to every deduplication supplier that I talk to, replication has a high attach rate for deduplication products. In most cases, over 50 percent of their systems are sold with the replication module or capabilities enabled. Over the next couple of entries I'll review some of the specific vendors' claims and name names as they relate to replication. If you're in the dedupe space and I have not spoken to you, please reach out to me so I can include you in the conversation.

While moving backup jobs to a remote site electronically is a key capability for deduplication products, it should not be your sole method of DR. It's important to keep in mind that the data at the remote site is in a backup format and needs to be recovered to DR servers to be of value. Moving this data from the disk deduplication device to the production server still takes time, and that time may push you outside of your recovery service level agreement. For many data centers, having a data set that goes off-site inexpensively, a few hours after the local backup is complete, may be all they can afford and may still represent a huge improvement in recoverability.
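To make the recovery-time concern concrete, here is a rough back-of-the-envelope sketch in Python. The data size, restore throughput, and recovery time objective are hypothetical numbers chosen for illustration, not figures from any vendor, and the model ignores real-world factors such as rehydration overhead and network contention.

```python
def restore_hours(data_tb: float, throughput_mb_per_s: float) -> float:
    """Estimate the hours needed to copy data_tb terabytes at a sustained rate."""
    data_mb = data_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return data_mb / throughput_mb_per_s / 3600


def meets_sla(data_tb: float, throughput_mb_per_s: float, rto_hours: float) -> bool:
    """Return True if the estimated restore fits inside the recovery SLA."""
    return restore_hours(data_tb, throughput_mb_per_s) <= rto_hours


if __name__ == "__main__":
    # Hypothetical: 10 TB to recover at 200 MB/s against an 8-hour SLA.
    print(f"{restore_hours(10, 200):.1f} hours")  # 13.9 hours
    print(meets_sla(10, 200, rto_hours=8))        # False
```

Even with the data already sitting at the DR site, this hypothetical restore would miss an 8-hour objective by several hours, which is the gap the paragraph above warns about.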

There is one exception to the recover-first problem: server virtualization. Since some of the appliance-based devices present themselves as disk targets via CIFS or NFS, you could mount server images via NFS at the DR site and be back in production. None of the appliance-based systems bill themselves as primary storage, so the intent would be to use a capability like VMware Storage VMotion to move those images quickly to production storage. This concept is worth an article all by itself and is something I will dive into later.

While some of the deduplication vendors that I spoke with are relatively new to adding replication capabilities to their solutions, all of them seem to have something. Some of the deduplication providers are delivering replication via a basic file system replication technique. Basically, they are leveraging the fact that deduplication only writes unique blocks, and they are using file system replication to identify those writes and then replicate them across the wire. While this certainly works from a "point A to point B" perspective, it does cause some problems when you are trying to do a many-to-one or cascaded type of replication.
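The idea of replicating only the unique blocks can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any vendor's implementation: `DedupStore`, `chunk`, and `replicate` are hypothetical names, blocks are fixed-size 4 KB chunks fingerprinted with SHA-256, and only the simple "point A to point B" case is covered.

```python
import hashlib


def chunk(data: bytes, size: int = 4096) -> list[bytes]:
    """Split data into fixed-size blocks (an assumption; products may differ)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class DedupStore:
    """Toy block store that keeps one copy of each unique block."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}

    def write(self, data: bytes) -> list[str]:
        """Store data; return fingerprints of blocks that were new to this store."""
        new_fingerprints = []
        for block in chunk(data):
            fp = hashlib.sha256(block).hexdigest()
            if fp not in self.blocks:
                self.blocks[fp] = block
                new_fingerprints.append(fp)
        return new_fingerprints


def replicate(source: DedupStore, target: DedupStore, fingerprints: list[str]) -> int:
    """Copy the listed blocks to the target; return the bytes sent over the wire."""
    sent = 0
    for fp in fingerprints:
        if fp not in target.blocks:
            target.blocks[fp] = source.blocks[fp]
            sent += len(source.blocks[fp])
    return sent


if __name__ == "__main__":
    primary, remote = DedupStore(), DedupStore()
    monday = b"A" * 4096 + b"B" * 4096
    tuesday = b"A" * 4096 + b"C" * 4096  # only the second block changed
    print(replicate(primary, remote, primary.write(monday)))   # 8192
    print(replicate(primary, remote, primary.write(tuesday)))  # 4096
```

The second backup sends half as much data because one of its two blocks already exists at the remote site. The sketch says nothing about many-to-one or cascaded topologies, where tracking which site holds which blocks becomes harder.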

How the vendor does deduplication, the old inline vs. post-processing debate, will also affect how the replication mode works. Most vendors will agree that both methods have their strong points and weak points. It's how they take advantage of the strengths and design around the weaknesses that matters. For example, when it comes to replication, an inline system, or even an adaptive inline system, should be able to replicate data either as it is written to the device or as soon as the specific backup stream ends and provides a file closure. In typical post-process deduplication, the entire backup has to complete before deduplication occurs. Replication then occurs as unique blocks are identified and written to disk.

How data is ingested by the system or software affects how and when the replication job can occur. What matters most is how soon you need your DR copy to be available for restores. As I mentioned earlier, for some, having any data at a DR site or DR hosting provider is better than what they have now. It's important to remember that while it seems like deduplication is everywhere, most studies that I have seen put penetration below 25 percent. With only about half of those systems sold with replication, that means less than 12 percent or so of potential data centers are using replicated deduplication.
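The timing difference between inline and post-process replication can be shown with a simplified model. The sketch below assumes a backup job made of several streams that each close at a known hour. The function names and the stream end times are hypothetical, and the model looks only at the earliest moment replication could begin, not at how long it takes.

```python
def inline_replication_starts(stream_end_hours: list[float]) -> list[float]:
    """Inline: each stream can begin replicating once its file is closed."""
    return list(stream_end_hours)


def post_process_replication_start(stream_end_hours: list[float]) -> float:
    """Post-process: nothing replicates until the entire backup has completed."""
    return max(stream_end_hours)


def inline_head_start(stream_end_hours: list[float]) -> list[float]:
    """Hours by which each stream's replication could start earlier when inline."""
    post_start = post_process_replication_start(stream_end_hours)
    return [post_start - end for end in stream_end_hours]


if __name__ == "__main__":
    ends = [1.0, 2.5, 4.0]  # hypothetical hours at which each stream closes
    print(inline_replication_starts(ends))       # [1.0, 2.5, 4.0]
    print(post_process_replication_start(ends))  # 4.0
    print(inline_head_start(ends))               # [3.0, 1.5, 0.0]
```

In this toy example the first stream could start moving off-site three hours sooner on an inline system. Whether that head start matters depends on how soon you need the DR copy.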
