Do We Need Primary Storage Deduplication?

With the recent buzz around a few new primary storage deduplication products, I've seen the value of primary storage deduplication questioned more than once. After all, if you are managing your storage correctly there shouldn't be much duplicate data, especially on primary storage, right? Sure, and we all archive our old data to tape as soon as it has gone 90 days without being accessed. Even in a well-managed system there is redundant data on primary storage, so deduplication's benefits can be enormous.

First, as I hinted at earlier, storage is growing too fast and IT staff are too overworked to manage it all. Extra copies of data are going to sneak in. That DBA is going to keep several copies of dumps, and users are going to save versions 1, 2 and 3 of a file under different names and never go back to clean out the older copies. You get the picture. Then there are the more legitimate cases, like the company logo that is inserted into every slide of every presentation and every memo stored on your servers. Primary storage deduplication will catch all of these instances for you when you don't have time to.

The second area where primary storage deduplication will have a role to play is the storage of virtualized server and desktop images. The redundancy between these image files is very high. Primary storage deduplication will eliminate this redundancy as well, potentially saving terabytes of capacity. In many cases, reading back deduplicated data carries little or no performance penalty.
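To make the image-file case concrete, here is a minimal sketch of block-level deduplication. It assumes fixed 4 KB blocks and SHA-256 fingerprints. Both are illustrative choices, not a description of how any particular product works, and the function names and file names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for illustration; real systems vary


def dedupe(files, block_size=BLOCK_SIZE):
    """Store each unique block once and keep a per-file list of fingerprints."""
    store = {}    # fingerprint -> block bytes (the only physical copy)
    recipes = {}  # file name -> ordered list of fingerprints
    for name, data in files.items():
        fingerprints = []
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            fp = hashlib.sha256(block).hexdigest()
            store.setdefault(fp, block)  # keep the block only if it is new
            fingerprints.append(fp)
        recipes[name] = fingerprints
    return store, recipes


def restore(name, store, recipes):
    """Rebuild a file by looking up its blocks in order."""
    return b"".join(store[fp] for fp in recipes[name])


if __name__ == "__main__":
    # Hypothetical images: two VMs cloned from one base, each with one unique block.
    base = b"".join(bytes([i]) * BLOCK_SIZE for i in range(8))
    files = {
        "base.img": base,
        "vm1.img": base + b"\xaa" * BLOCK_SIZE,
        "vm2.img": base + b"\xbb" * BLOCK_SIZE,
    }
    store, recipes = dedupe(files)
    logical = sum(len(d) for d in files.values())
    physical = sum(len(b) for b in store.values())
    print(logical, physical)  # 106496 40960
```

In this toy case the three images take 26 blocks logically but only 10 physically, because the eight base blocks are stored once. Reading a file back is a series of lookups, which hints at why read performance need not suffer much.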

The third and potentially the biggest payoff is that deduplicating primary storage carries the optimization downstream: copies of data, backups, snapshots and even replication jobs should all require less capacity. This does not remove the need for a secondary backup; every so often it is still a good idea to make a standalone copy of the data that is not tied back to any deduplication or snapshot metadata. Being able to deduplicate data earlier in the process does potentially reduce how often a separate device is used, especially if the primary storage system replicates to a similarly enabled system in a DR location.

This effect makes backups merely copies of the same data. The backup application could back up to the same storage system, with no need for a second one. Archives become copies of files, maybe with a Write Once Read Many (WORM) flag set on them, and the archive application would copy that data to the same storage system.
