Network Computing is part of the Informa Tech Division of Informa PLC

Enterprises Begin To Embrace Primary Data Deduplication

GreenBytes on Wednesday joined a growing group of vendors that deduplicate primary data rather than secondary backup data. The company introduced a new deduplication appliance, the HA-3000, which deduplicates both structured and unstructured data.

The company's announcement is the latest in the burgeoning world of primary data deduplication. Many businesses hit with increasing expenditures for additional storage capacity are looking to appliances that allow them to reduce the amount of storage they consume by reducing or eliminating redundant data and files.

Primary data deduplication reduces the amount of data stored by identifying unique instances of data and storing each of them only once; duplicates are replaced with references to the stored copy. Primary storage systems typically hold a significant amount of duplicate data, such as multiple versions of files, virtual servers running the same or similar operating systems, application files, and user mailboxes.
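The idea can be illustrated with a minimal Python sketch. It is not any vendor's implementation: the fixed 4 KB block size, the SHA-256 fingerprint, and the `DedupStore` class and its method names are all assumptions chosen for illustration. Each incoming block is fingerprinted, stored only if its fingerprint has not been seen before, and referenced by fingerprint from every file that contains it.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, for illustration only


class DedupStore:
    """Toy content-addressed store: each unique block is kept once (hypothetical)."""

    def __init__(self):
        self.blocks = {}  # SHA-256 hex digest -> block bytes
        self.files = {}   # file name -> ordered list of digests

    def write(self, name, data):
        """Split data into blocks and store only blocks not seen before."""
        digests = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # no-op if already stored
            digests.append(digest)
        self.files[name] = digests

    def read(self, name):
        """Rebuild a file from its list of block references."""
        return b"".join(self.blocks[d] for d in self.files[name])

    def logical_bytes(self):
        """Bytes the files would occupy without deduplication."""
        return sum(len(self.blocks[d]) for ds in self.files.values() for d in ds)

    def physical_bytes(self):
        """Bytes actually held after deduplication."""
        return sum(len(block) for block in self.blocks.values())


# Two virtual-server images that share most of their blocks.
store = DedupStore()
base_image = b"A" * 8192 + b"B" * 4096
store.write("vm1", base_image)
store.write("vm2", base_image + b"C" * 4096)
print(store.logical_bytes(), store.physical_bytes())
```

In this toy case the two images occupy 28,672 logical bytes but only 12,288 physical bytes, because the repeated blocks are stored once. Shipping products differ in block size, fingerprinting method, and whether deduplication runs inline or as a background process.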

Vendors of primary data deduplication gear estimate that their products can reduce storage consumption by as much as 80%, depending on the types of data. Preliminary results from an end-user adoption survey conducted by Storage Strategies NOW and the Storage Networking Industry Association (SNIA) indicate that 46% of respondents are using primary storage deduplication, while a larger share, 59%, are using appliances to deduplicate secondary backup data.
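To put the vendors' upper-bound figure in more familiar terms, the short Python sketch below converts a percentage reduction into the "N:1" ratio often quoted for deduplication and into the physical capacity that remains. The function names and the 50 TB workload are hypothetical, and 80% is the vendors' best-case estimate rather than a measured result.

```python
def dedup_ratio(percent_reduction):
    """Convert a percentage reduction into an N:1 deduplication ratio."""
    if not 0 <= percent_reduction < 100:
        raise ValueError("percent_reduction must be in [0, 100)")
    return 100 / (100 - percent_reduction)


def physical_tb(logical_tb, percent_reduction):
    """Capacity still consumed after the given percentage reduction."""
    return logical_tb * (100 - percent_reduction) / 100


# Hypothetical 50 TB of primary data at the vendors' best-case 80% reduction.
print(dedup_ratio(80))      # 5.0, i.e. a 5:1 ratio
print(physical_tb(50, 80))  # 10.0 TB still consumed
```

Because the ratio grows nonlinearly, a more modest 50% reduction corresponds to only 2:1, which is why results depend so heavily on how repetitive the data is.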

A large number of vendors have jumped into the market with products for deduplicating primary storage, including GreenBytes, Actifio, Exagrid, HP, IBM, NetApp, Permabit, Quantum, and Symantec. Their implementations vary.

GreenBytes' new HA-3000, which deduplicates data stored on storage area networks, is an inline iSCSI-based appliance that uses solid state drives for caching frequently accessed data and pairs them with less-expensive but still fast hard disk drives. Each HA-3000 also has dual controllers for fault-tolerance and is designed to support databases and virtualized environments. The appliance connects to the network with four 1-Gbps Ethernet network ports and two 10-Gbps Ethernet network ports. The HA-3000 has raw capacities ranging from 26 TB to 78 TB and is expected to be available in January 2012.

Dell last week also announced that it has integrated its Ocarina deduplication and compression technology into its DX Object Storage. The company has big hopes for primary data deduplication--it will also integrate Ocarina into its Dell Compellent Storage Center arrays, its EqualLogic storage systems, and into its Dell PowerVault systems.

Further, Actifio and Symantec recently announced systems to deduplicate primary data. Actifio does this by handing out virtual copies of data rather than making a separate physical copy for each application that needs it. Symantec's FileStore appliance deduplicates unstructured data.

Finally, IBM and NetApp, which can be credited with starting this flurry of primary data deduplication, both have data reduction technologies built into their systems. IBM has its Real-time Compression software and its STN6500 and STN6800 appliances, which shrink primary NAS data. NetApp has built its deduplication and compression technology into the Data ONTAP operating system that runs on its FAS and V-Series storage systems.

Deni Connor is founding analyst for Storage Strategies NOW, an industry analyst firm that focuses on storage, virtualization, and servers.