Despite the recent jump in attention because of EMC's purchase of Data Domain, data deduplication is old news to the typical Byte and Switch reader. Interestingly, though, the overall penetration is not what you may think. Our research indicates fewer than 20% of data centers have implemented any form of the technology. This is supported by the current poll on dedupe2.com, which indicates that just over 33% of respondents have actually implemented data deduplication in their environment. What's holding deduplication back?
Part of the problem, as I indicated in my previous two posts, is that end-users today have to choose where to implement it. Do you use it for backup, archive or primary storage? Today, while a few companies claim ubiquitous deduplication, the reality is that at best one supplier may cover two tiers of storage with their current technology, if that. It is likely that if you implement deduplication on three or more tiers of storage, you are going to support different deduplication engines.
As a result, deduplication is something you actually have to think about before implementing, but it shouldn't be. Ironically, for deduplication to break through the 50% level of adoption, it has to be either a feature that you simply turn on or one that is transparently on all the time. All of the planning around leveraging deduplication (which storage tiers to dedupe, which data types to dedupe, which data types to compress) should just be handled for you. You shouldn't have to think that much about the impact of using deduplication on your data.
This doesn't minimize the deduplication investment on the part of suppliers. Getting to the necessary level of transparency is going to require additional research and development, which really ups the ante for providing deduplication to customers. There has long been a drumbeat that deduplication is just a feature; ironically, making that true is going to require a massive investment by each supplier to get its deduplication engine to the point of being feature-like. As the many failed attempts to add deduplication to a product set have proven, deduplication is not an easy feature to perfect.
This presents an interesting opportunity for the companies that have been providing deduplication in their products for much of this decade. If they could compartmentalize their deduplication technology and package it so that bigger storage manufacturers could easily integrate it into their storage offerings, they could in effect let those manufacturers outsource the deduplication process to them.
As many manufacturers have discovered, adding dedupe to a product is harder than they thought it was going to be. Leveraging someone else to provide that component may be the most cost-effective way to get there.
George Crump is president and founder of Storage Switzerland, an IT analyst firm focused on the storage and virtualization segments. With 25 years of experience designing storage solutions for datacenters across the US, he has seen the birth of such technologies as RAID, NAS, ...