De-Dupe Fragmentation

For something that was supposed to become 'just a feature,' people sure do get passionate about data de-duplication

October 28, 2008


Welcome to the Hot Zone, our new blog on Byte and Switch where we will discuss the storage and virtualization trends that are affecting the data center. I'll use the first several entries to finish up the de-duplication topic. Well, maybe not finish -- how about continue? I'm sure we will revisit the topic from time to time.

For something that was supposed to become "just a feature," people sure do get passionate about de-duplication.

One of the areas I wanted to address was source-side de-dupe, which a few posters have proclaimed the be-all and end-all of de-duplication. But before we handle that firecracker, let's discuss the fragmentation of the de-dupe market.

First, de-duplication is showing up everywhere: primary storage, backup storage, archive storage, and even the wide-area network. Of course, there are different types of de-dupe implementations on each of these platforms, and each of the vendors thinks its solution is the best.
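Whatever the platform, most of these implementations share the same basic mechanics: carve data into chunks, fingerprint each chunk with a strong hash, and store a chunk only if its fingerprint hasn't been seen before. Here's a minimal sketch of that idea in Python -- a toy fixed-size-chunk version, not any particular vendor's method (the names dedupe_write and rehydrate are mine):

```python
import hashlib

CHUNK_SIZE = 4096  # toy fixed-size chunks; real products often chunk at variable boundaries

def dedupe_write(data, chunk_store):
    """Split a stream into chunks, store only chunks the index hasn't seen,
    and return the 'recipe' (ordered fingerprints) needed to rebuild it."""
    recipe = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        fp = hashlib.sha256(chunk).hexdigest()  # content fingerprint
        if fp not in chunk_store:               # a duplicate chunk costs no new capacity
            chunk_store[fp] = chunk
        recipe.append(fp)
    return recipe

def rehydrate(recipe, chunk_store):
    """Rebuild the original stream from its recipe (i.e., un-de-dupe it)."""
    return b"".join(chunk_store[fp] for fp in recipe)

store = {}
data = b"A" * 8192 + b"B" * 4096   # three logical chunks, two unique
recipe = dedupe_write(data, store)
assert rehydrate(recipe, store) == data
print(len(recipe), len(store))     # 3 logical chunks, 2 chunks actually stored
```

In this toy store, 100 copies of the same 4-KByte block cost 4 KBytes of capacity plus a recipe of fingerprints. Real products layer on variable-size chunking, compression, and on-disk indexes, but the accounting works the same way.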

The guys that OEM technology seem to be at a disadvantage as the de-duplication use case expands. They end up with one de-dupe technology for primary storage, a different one for NAS storage, another for backup (maybe two? or three?), and yet another for archives. This has to be confusing. Their advantage is that they can move into this market faster -- but at what cost? A totally fragmented data reduction strategy?

What is interesting is that the vendors who have had de-duplication as part of their core -- Data Domain Inc. (Nasdaq: DDUP), NetApp Inc. (Nasdaq: NTAP), and Riverbed Technology Inc. (Nasdaq: RVBD) come to mind -- seem to have a better chance of putting together a seamless data de-duplication strategy. It will probably take them longer to cover all the de-dupe angles, but a single de-dupe approach across all storage types would seem easier to use.

Let's focus on the NetApp example for a minute. NetApp seems to have been successful with its de-duplication strategy on primary storage, and now it's planning to add de-dupe technology to its VTL solution. It plans to do so by leveraging essentially the same storage hardware, but it looks as if the VTL will be an additional system.

Clearly, leveraging the same hardware platform could be an advantage for NetApp going forward. Its challenge is that customers have different objectives for de-duplication. Saving space, reducing backup windows, and increasing file retention times are all potential benefits of de-duplication, but they may require different implementations. NetApp will need to be prepared to craft a system that meets customer needs. For instance, if a client is using NetApp's data de-duplication technology on primary storage, and that data needs to be replicated and/or backed up, NetApp must be prepared to offer a system that addresses this need and fully capitalizes on its de-duplication capabilities. Can it do so without un-de-duping data, sending it to the VTL, and then having to re-de-dupe?
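To see why that last question matters, extend the toy chunk store from earlier. Assume -- purely hypothetically, not how any shipping product works -- that the primary array and the VTL could share a fingerprint index. A de-dupe-aware replication path ships the recipe plus only the chunks the target lacks, while the rehydrate-and-re-de-dupe path pushes the full logical stream across the wire even though the target ends up storing the same chunks:

```python
def replicate_deduped(recipe, src_store, dst_store):
    """Hypothetical de-dupe-aware replication: send the recipe plus only
    the unique chunks the target doesn't already hold."""
    bytes_sent = 0
    for fp in set(recipe):
        if fp not in dst_store:
            dst_store[fp] = src_store[fp]
            bytes_sent += len(src_store[fp])
    return bytes_sent

def replicate_rehydrated(recipe, src_store, dst_store):
    """The path the article worries about: un-de-dupe at the source,
    ship the full stream, then re-de-dupe at the target."""
    stream = rehydrate(recipe, src_store)  # full logical size crosses the wire
    dedupe_write(stream, dst_store)
    return len(stream)

src = {}
recipe = dedupe_write(b"A" * 4096 * 100, src)  # 100 logical chunks, 1 unique
print(replicate_deduped(recipe, src, {}))      # 4096 bytes moved
print(replicate_rehydrated(recipe, src, {}))   # 409600 bytes moved
```

Same end state on the target, two orders of magnitude difference in data moved. Whether NetApp -- or anyone -- can keep data de-duped end to end across primary storage, replication, and the VTL is exactly the question.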

We should know soon.

— George Crump is founder of Storage Switzerland, which provides strategic consulting and analysis to storage users, suppliers, and integrators. Prior to Storage Switzerland, he was CTO at one of the nation's largest integrators. Previous installments of his discussion on data de-duplication can be found here.
