Network Computing is part of the Informa Tech Division of Informa PLC


Dedupe's Next Era - Part Two

In my last entry we began to frame the next era of deduplication. Its most significant component will be the move beyond backup-focused deduplication, up the storage stack to the secondary and primary storage tiers. There is already significant work going on in these tiers, but in this second era deduplication on secondary and primary storage will become a requirement: every primary storage system vendor will need a solution in this space.

These higher tiers are also where deduplication gets interesting, because the data set is not as ideal for it. There is simply less duplicate data, and any performance impact will be more noticeable. The algorithms will need to be smarter, either more content-aware or more granular, and of course faster or less resource-intensive.
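To make that concrete, here is a rough sketch of why backup data dedupes so much better than primary data. The fixed 4 KB block size and SHA-256 fingerprinting are illustrative assumptions, not any vendor's implementation; real engines use variable-size or content-defined chunking.

```python
import hashlib
import os

def dedupe_ratio(data: bytes, block_size: int = 4096) -> float:
    """Split data into fixed-size blocks and fingerprint each one.
    Returns the fraction of blocks that were duplicates (0.0 = no savings)."""
    seen = set()
    total = 0
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        seen.add(hashlib.sha256(block).digest())
        total += 1
    return 1 - len(seen) / total if total else 0.0

# Backup-style data: repeated full copies mean most blocks are duplicates.
backup_like = b"A" * 4096 * 8 + b"B" * 4096 * 2

# Primary-storage-style data: mostly unique blocks, little for dedupe to find.
primary_like = os.urandom(4096 * 10)
```

Running `dedupe_ratio` over both inputs shows a high ratio for the backup-style data and essentially zero for the unique data, which is the gap a primary-tier engine has to work harder to close.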
Moving up the storage stack will rekindle the inline vs. post-process debate because of those performance concerns. Can you make the dedupe engine fast enough to dedupe inline on primary storage, or does it make more sense to dedupe post-process? There is an alternate method, not yet commonly used, that performs a parallel dedupe at near-inline speeds; if, under heavy write conditions, the dedupe process begins to affect performance, it can shift out of the way and become a post-process dedupe until it catches up.
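A minimal sketch of that adaptive approach might look like the following. The class name, the queue-depth signal, and the load threshold are all hypothetical; the point is only the switch between the inline path and a deferred queue that a background pass drains later.

```python
import hashlib
from collections import deque

class AdaptiveDedupeStore:
    """Sketch of adaptive dedupe: hash inline when write pressure is low;
    under heavy load, defer blocks to a queue and dedupe them post-process."""

    def __init__(self, load_threshold: int = 4):
        self.index = {}          # fingerprint -> block (the dedupe index)
        self.pending = deque()   # blocks written but not yet deduped
        self.load_threshold = load_threshold

    def write(self, block: bytes, queue_depth: int) -> None:
        """queue_depth models current write pressure (e.g. outstanding I/Os)."""
        if queue_depth <= self.load_threshold:
            self._dedupe(block)          # inline path: fingerprint now
        else:
            self.pending.append(block)   # shift out of the way: defer

    def catch_up(self) -> None:
        """Background pass: dedupe everything deferred under load."""
        while self.pending:
            self._dedupe(self.pending.popleft())

    def _dedupe(self, block: bytes) -> None:
        self.index.setdefault(hashlib.sha256(block).digest(), block)
```

Under light load every write is deduped immediately; under heavy load writes land untouched and `catch_up` reclaims the duplicates later, which is exactly the "until it catches up" behavior described above.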
The next era of dedupe, when used on primary storage, will also need to be able to move that data. Archiving can be a huge cost-control mechanism for IT administrators, but finding the time to implement those processes has been a challenge; building it into a primary storage dedupe makes a lot of sense. Extending that capability to offer an optimized migration to the cloud, as part of a comprehensive migration and archive strategy, can be very appealing, as we discuss in our article Deduplicating Cloud Storage.

Finally, compression has to enter this conversation at some point. Our findings have repeatedly shown that compression, especially on primary storage and possibly on archive storage, can deliver efficiencies as good as, if not greater than, deduplication alone. Deduplication requires redundant data to be effective; compression compresses, to varying degrees, just about everything.
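The distinction is easy to demonstrate: data whose blocks are all unique gives block-level dedupe nothing to work with, yet still compresses well if it is internally repetitive. The synthetic log-style data below is an assumption for illustration.

```python
import zlib

def compression_ratio(data: bytes) -> float:
    """Fraction of bytes saved by zlib compression (0.0 = no savings)."""
    return 1 - len(zlib.compress(data)) / len(data)

# Structured but non-duplicate data: the sequence number makes every 4 KB
# block unique, so dedupe finds nothing, yet the data compresses heavily.
log_like = b"".join(b"record=%06d,status=OK;\n" % i for i in range(4000))
```

Here compression recovers most of the space that deduplication alone would have missed, which is why the two techniques are complementary rather than competing.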
The next era of dedupe has begun; the suppliers are already jockeying for position, and it starts the moment the ink dries on the Data Domain acquisition.