Network Computing is part of the Informa Tech Division of Informa PLC


DeDupe's Next Era -- Part One

As the Data Domain, NetApp, EMC saga draws to a close, it appears we are ready to enter a new era in data deduplication. The first era was backup. Deduplication has quickly become a capability delivered by just about every supplier in the backup space, with some offering several solutions. In this era, Data Domain has about 2 billion reasons to claim victory. As we move into the next era of deduplication, what should we be looking for?

The first era of dedupe solved a problem that caused users massive amounts of pain, one they wanted to fix simply and with as little disruption as possible. The ability to easily extend the existing architecture rather than replace it was appealing. Now, as we move to the next era, there will be greater areas of concern, partly because dedupe will enter more critical areas of the environment and partly because the amount of data being put through a dedupe engine will continue to increase.

Many will say that the next area of focus for dedupe will be archive and primary storage -- and it will. But understand that backup will continue to be a major focus, as it remains the greatest source of redundant data sets. Also, don't assume that all vendors have completed their first era of work; many, for example, can't replicate well or handle a variety of data types from different data sources.

In backup, we will need to see more of a scale-out architecture that allows either a large single dedupe repository or independent nodes that communicate so that redundant data only needs to be stored once. These scale-out architectures will allow for greater inbound and outbound performance.

The other big battle will be over what exactly does the dedupe. Will it be an appliance-type architecture that lets multiple software applications leverage the same dedupe repository, will dedupe be built into the backup software, which could require a single-vendor approach, or would an API extension to the backup software make more sense?
As we discuss in our recent article "Integrating Deduplication," the API approach gives backup administrators the flexibility to pick the deduplication strategy that makes the most sense for them.
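To make the shared-repository idea concrete, here is a minimal sketch of chunk-level deduplication in Python. It is not any vendor's implementation; the class name, fixed-size chunking, and SHA-256 fingerprinting are illustrative assumptions. The key point matches the article: a unique chunk is stored once, and every later backup that contains the same chunk adds only a reference to it.

```python
import hashlib

class DedupeStore:
    """Toy dedupe repository: each unique chunk is stored once, keyed by
    its SHA-256 fingerprint; duplicate chunks add only a reference.
    (Illustrative sketch, not a vendor implementation.)"""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}      # fingerprint -> chunk bytes (stored once)
        self.manifests = {}   # backup name -> ordered list of fingerprints

    def put(self, name, data):
        fingerprints = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            fp = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(fp, chunk)   # store only if unseen
            fingerprints.append(fp)
        self.manifests[name] = fingerprints

    def get(self, name):
        """Reassemble a backup from its chunk fingerprints."""
        return b"".join(self.chunks[fp] for fp in self.manifests[name])

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())

store = DedupeStore(chunk_size=4)
store.put("backup-mon", b"AAAABBBBCCCC")
store.put("backup-tue", b"AAAABBBBDDDD")  # first two chunks already stored
print(store.stored_bytes())  # 16 bytes kept instead of 24
```

A scale-out version of this idea would spread the fingerprint index across cooperating nodes, so a chunk already stored anywhere in the cluster is never stored again; the single-node sketch above only shows the storage-savings mechanism itself.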
