The Reality Of Primary Storage Deduplication

Should you be deduplicating your primary storage? For storage-obsessed IT, primary deduplication technologies are too sweet to ignore: eliminate duplicate data in high-priced, tier-one storage and cut storage costs by as much as 20:1. Deduplication for backup and offline storage is a natural fit. Still, the access demands placed on primary storage mean the reality of primary storage deduplication is a lot less rosy than you might expect.

With deduplication, software breaks files down into blocks and then inspects those blocks for duplicate patterns in the data. Once a duplicate is found, the software replaces copies of the pattern with pointers in its file system to the initial instance. Deduplication began in backup storage, but given IT's storage worries, it was only a matter of time before the technology was applied to primary storage. Today, EMC, Ocarina, Nexenta, GreenBytes and HiFn, to name a few, are all bringing deduplication to primary storage.
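The block-and-pointer mechanism described above can be sketched in a few lines of Python. This is only an illustration, not how any of the named vendors implement it: the fixed 4 KB block size, the use of SHA-256 fingerprints to spot duplicates, and the `DedupStore` class and its method names are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size; real products differ


class DedupStore:
    """Toy block-level deduplicating store (hypothetical, for illustration)."""

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.blocks = {}  # fingerprint -> block bytes, each stored once
        self.files = {}   # file name -> list of fingerprints ("pointers")

    def write(self, name, data):
        pointers = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            fingerprint = hashlib.sha256(block).hexdigest()
            # Keep only the first instance of a block; later copies
            # become pointers to it.
            self.blocks.setdefault(fingerprint, block)
            pointers.append(fingerprint)
        self.files[name] = pointers

    def read(self, name):
        # Follow the pointers to rebuild the original file contents.
        return b"".join(self.blocks[fp] for fp in self.files[name])

    def ratio(self):
        # Logical bytes written divided by physical bytes actually stored.
        logical = sum(
            len(self.blocks[fp]) for ptrs in self.files.values() for fp in ptrs
        )
        physical = sum(len(block) for block in self.blocks.values())
        return logical / physical if physical else 1.0
```

Writing two files that share a block stores that block only once, and `read` still returns each file byte for byte. Every read has to follow pointers and every write has to compute a fingerprint, which is where the performance questions for primary storage come from.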

More specifically, Network Computing contributors George Crump and Howard Marks point to three factors driving this phenomenon. First, storage is growing too fast for IT staffs to manage. Extra copies of data are going to occur, such as several copies of data dumps, multiple versions of files and duplicate image files. Primary storage deduplication catches these instances. The second play for primary storage deduplication is in the storage of virtualized server and desktop images. The redundancy between these image files is very high. Primary storage deduplication will eliminate this redundancy as well, potentially saving terabytes of capacity. In many cases, reading back deduplicated data carries little or no performance impact.

The third and potentially biggest payoff is that deduplicating primary storage optimizes everything downstream of it. Copies of data, backups, snapshots and even replication jobs should all require less capacity. This does not remove the need for a secondary backup; every so often it is still a good idea to keep a stand-alone copy of data that is not tied back to any deduplication or snapshot metadata. Being able to deduplicate data earlier in the process does, however, potentially reduce how often a separate device is used, especially if the primary storage system replicates to a similarly enabled system in a DR location.

Deduplicating primary storage isn't without risks and misconceptions. Deduplication ratios depend on the type of data being operated on. Backup data, for example, is highly repetitive, letting deduplication ratios run as high as 20:1, but those opportunities don't exist in primary storage, where ratios tend to run closer to 2:1. There's also a performance penalty with deduplication that won't be acceptable in certain situations, such as online transaction processing (OLTP) applications. Finally, as Marks points out, deduplication systems aren't all alike, and using a primary deduplication system with the wrong backup application could result in significant problems.
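To put the two ratios in perspective, a ratio of N:1 means the stored data occupies 1/N of its logical size, so the fraction of capacity saved is 1 - 1/N. The short sketch below works that arithmetic for the 20:1 and 2:1 figures quoted above; the function names and the 10 TB data set are hypothetical and chosen only for illustration.

```python
def capacity_saved(ratio):
    """Fraction of capacity saved at a deduplication ratio of ratio:1."""
    if ratio < 1:
        raise ValueError("a deduplication ratio cannot be below 1:1")
    return 1.0 - 1.0 / ratio


def physical_capacity(logical_tb, ratio):
    """Physical terabytes needed to hold logical_tb at the given ratio."""
    if ratio < 1:
        raise ValueError("a deduplication ratio cannot be below 1:1")
    return logical_tb / ratio


# Ratios quoted in the article; the 10 TB data set is a made-up example.
for label, ratio in (("backup", 20), ("primary", 2)):
    saved = capacity_saved(ratio)
    needed = physical_capacity(10, ratio)
    print(f"{label}: {ratio}:1 saves {saved:.0%}, 10 TB fits in {needed} TB")
```

At 20:1 the saving is 95 percent, while at 2:1 it is 50 percent. Halving primary capacity is still worthwhile, but it is a far smaller return to weigh against the performance penalty.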