Deduplication's Replication Mode
Posted by
George Crump
November 05, 2009
According to every deduplication supplier that I talk to, replication has a high attach rate for deduplication products. In most cases, over 50 percent of their systems are sold with the replication module or capabilities enabled. Over the next couple of entries I'll review some of the specific vendors' claims and name names as they relate to replication. If you're in the dedupe space and I have not spoken to you, please reach out to me so I can include you in the conversation.
While moving backup jobs to a remote site electronically is a key capability for deduplication products, it should not be your sole method of DR. It's important to keep in mind that the data in the remote site is in a backup format and needs to be recovered to DR servers to be of value. Moving this data from the disk deduplication device to the production server takes time, and that time may push you outside of your recovery service level agreement. For many data centers, having a data set that goes off-site inexpensively a few hours after the local backup completes may be all they can afford, and it may still represent a huge improvement in recoverability.
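To make the recovery-time concern concrete, here is a rough back-of-the-envelope sketch in Python. The function names, the decimal unit convention, and all of the figures in the usage lines are hypothetical and chosen only for illustration; real restore rates depend on the device, the network, and how quickly deduplicated data can be reassembled.

```python
def restore_hours(data_tb, throughput_mb_s):
    """Hours needed to copy data_tb terabytes at throughput_mb_s MB/s.

    Assumes decimal units (1 TB = 1,000,000 MB) and a constant
    transfer rate, which is a simplification.
    """
    total_mb = data_tb * 1_000_000
    return total_mb / throughput_mb_s / 3600


def meets_sla(data_tb, throughput_mb_s, sla_hours):
    """True if the estimated restore time fits inside the recovery SLA."""
    return restore_hours(data_tb, throughput_mb_s) <= sla_hours


# Hypothetical example: 10 TB restored at 200 MB/s against an 8-hour SLA.
hours = restore_hours(10, 200)
print(f"Estimated restore time: {hours:.1f} hours")
print("Within SLA" if meets_sla(10, 200, 8) else "Outside SLA")
```

Even a crude estimate like this shows whether restoring from the replicated backup copy alone can plausibly satisfy the service level agreement.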
There is one exception to the recover-first problem: server virtualization. Since some of the appliance-based devices present themselves as disk targets via CIFS or NFS, you could mount server images via NFS at the DR site and be back in production. None of the appliance-based systems bill themselves as primary storage, so the intent would be to use a capability like VMware Storage VMotion to move those images quickly to production storage. This concept is worth an article all by itself and is something I will dive into later.
While some of the deduplication vendors that I spoke with are relatively new to adding replication capabilities to their solutions, all of them seem to have something. Some of the deduplication providers deliver replication via a basic file system replication technique. They leverage the fact that deduplication only writes unique blocks, and they use file system replication to identify those writes and replicate them across the wire. While this certainly works from a "point A to point B" perspective, it does cause some problems when you are trying to do a many-to-one or cascaded type of replication.
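The idea of replicating only newly written unique blocks can be sketched in a few lines of Python. This is a toy model, not any vendor's implementation: the `DedupStore` class, the `replicate` function, the fixed 4 KB block size, and the use of SHA-256 fingerprints are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical fixed block size


class DedupStore:
    """Toy deduplicating store that keeps one copy of each unique block."""

    def __init__(self):
        self.blocks = {}  # fingerprint -> block bytes

    def write(self, data):
        """Store data and return fingerprints of blocks new to this store."""
        new_fingerprints = []
        for offset in range(0, len(data), BLOCK_SIZE):
            chunk = data[offset:offset + BLOCK_SIZE]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            if fingerprint not in self.blocks:
                self.blocks[fingerprint] = chunk
                new_fingerprints.append(fingerprint)
        return new_fingerprints


def replicate(source, target, fingerprints):
    """Copy the listed blocks to the target, skipping ones it already holds.

    Returns the number of bytes that actually crossed the wire.
    """
    bytes_sent = 0
    for fingerprint in fingerprints:
        if fingerprint not in target.blocks:
            target.blocks[fingerprint] = source.blocks[fingerprint]
            bytes_sent += len(source.blocks[fingerprint])
    return bytes_sent


local, remote = DedupStore(), DedupStore()
first_backup = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"A" * BLOCK_SIZE
second_backup = b"A" * BLOCK_SIZE + b"C" * BLOCK_SIZE
for backup in (first_backup, second_backup):
    sent = replicate(local, remote, local.write(backup))
    print(f"backup of {len(backup)} bytes sent {sent} bytes")
```

Because the target is checked by fingerprint, several source stores could call `replicate` against one target in this model. A scheme that only forwards file system writes has no such shared index, which hints at why many-to-one and cascaded topologies are harder for it.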
How the vendor does deduplication, the old inline vs. post-processing debate, also affects how the replication mode works. Most vendors will agree that both methods have their strong points and weak points. It's how they take advantage of the strengths and design around the weaknesses that matters. For example, when it comes to replication, an inline system or even an adaptive inline system should be able to replicate data either as it is written to the device or as soon as a specific backup stream ends and provides a file closure. In typical post-process data deduplication, the entire backup has to complete before deduplication occurs. Replication then occurs as unique blocks are identified and written to disk.
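The timing difference between the two approaches can be illustrated with a small Python sketch. It is a deliberate simplification under stated assumptions: the function name, the mode strings, and the stream close times are hypothetical, and the post-process case ignores the time the deduplication pass itself takes, so it shows only the earliest point at which replication could begin.

```python
def replication_start_times(stream_close_times, mode):
    """Earliest time (in hours) each backup stream can begin replicating.

    stream_close_times maps a stream name to the hour its file closes.
    """
    if mode == "inline":
        # Each stream can replicate as soon as its own file closes.
        return dict(stream_close_times)
    if mode == "post-process":
        # Deduplication waits for the entire backup, so no stream can
        # replicate before the last one has finished.
        backup_done = max(stream_close_times.values())
        return {name: backup_done for name in stream_close_times}
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical backup window with three streams.
closes = {"db": 1.0, "mail": 2.5, "files": 4.0}
for mode in ("inline", "post-process"):
    print(mode, replication_start_times(closes, mode))
```

In this model the short streams gain the most from inline processing, since they no longer wait for the longest stream before going off-site.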











