Primary Storage Deduplication: NetApp

One of the first entrants into the primary storage deduplication market was NetApp, with its Advanced Single Instance Storage (A-SIS, commonly known as NetApp deduplication). To my knowledge, NetApp was the first to provide deduplication of active storage, as opposed to data that had been previously stored. NetApp deduplication has certainly gained traction within the NetApp customer base: the company recently claimed that more than 87,000 deduplicated storage systems have been deployed, with about 12,000 customers benefiting from its storage efficiency technology.

NetApp deduplication is unusual in that it is part of a vertically integrated software stack built on the company's OS, Data ONTAP, and its file system, Write Anywhere File Layout (WAFL). WAFL, like any other file system, uses a series of inodes and pointers, commonly called extents, to manage the information that the file system holds. Everything stored on a NetApp system is stored as a file, whether it is actual file data or a blob that presents itself as an iSCSI or FC LUN. All of these files are broken down into blocks, or chunks, of data, and in the WAFL file system every block is 4 KB in size.
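To make the block-and-pointer model concrete, here is a minimal Python sketch. It is an illustration only and not WAFL's actual on-disk layout: the `BlockStore` class and its `store_file` and `read_file` methods are hypothetical names, and the only detail taken from the text is the fixed 4 KB block size.

```python
BLOCK_SIZE = 4096  # fixed block size described in the text


class BlockStore:
    """Toy block store (hypothetical): files are lists of block numbers."""

    def __init__(self):
        self.blocks = []  # block number -> bytes
        self.files = {}   # file name -> list of block numbers (the "pointers")

    def store_file(self, name, data):
        """Split data into fixed-size blocks and record a pointer to each."""
        pointers = []
        for offset in range(0, len(data), BLOCK_SIZE):
            self.blocks.append(data[offset:offset + BLOCK_SIZE])
            pointers.append(len(self.blocks) - 1)
        self.files[name] = pointers
        return pointers

    def read_file(self, name):
        """Reassemble a file by following its pointers in order."""
        return b"".join(self.blocks[p] for p in self.files[name])


store = BlockStore()
ptrs = store.store_file("lun0", b"a" * 10000)  # file data or a LUN: same model
```

Because a file is nothing more than an ordered list of pointers, two files (or a file and a snapshot) can reference the same block without copying it, which is the property deduplication relies on.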

As a result, each time a file is stored, its blocks are associated with a system of pointers. NetApp leverages these 4 KB chunks to implement technologies such as snapshots and cloning. NetApp deduplication is enabled at the volume level. Once deduplication is enabled on a volume, the system begins an inline process of gathering fingerprints for each of these 4 KB chunks via a proprietary hashing algorithm. At intervals, either specified by the user or triggered automatically by data growth rates, a post-processing routine kicks in to look for matching fingerprints; a match means that potentially redundant data has been found.

After a byte-level validation check confirms that the data is identical, the pointer to the redundant block is updated to point back to the original block, and the redundant block is released in the same way that a block attached to an expired snapshot is released. The fingerprint itself leverages NetApp's existing "write block checksum" code, which WAFL has used since its inception. The bottom line is that NetApp should be commended for leveraging the capabilities of its existing operating system to deliver a modern capability.
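The flow described above (fingerprints gathered at write time, a later pass that matches them, a byte-level check, then a pointer update and block release) can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not NetApp's implementation: the `Volume` class and its methods are hypothetical, and a truncated SHA-256 digest stands in for the proprietary fingerprint, which the article does not describe.

```python
import hashlib
from collections import defaultdict

BLOCK_SIZE = 4096  # block size from the text


def fingerprint(block):
    # Assumption: truncated SHA-256 as a stand-in for the proprietary hash.
    return hashlib.sha256(block).digest()[:8]


class Volume:
    """Toy volume (hypothetical) with post-process deduplication."""

    def __init__(self):
        self.blocks = {}        # block number -> bytes
        self.files = {}         # file name -> list of block numbers
        self.fingerprints = []  # (fingerprint, block number) pairs
        self.next_block = 0

    def write_file(self, name, data):
        """Store blocks and gather a fingerprint for each one as it is written."""
        pointers = []
        for off in range(0, len(data), BLOCK_SIZE):
            chunk = data[off:off + BLOCK_SIZE]
            num = self.next_block
            self.next_block += 1
            self.blocks[num] = chunk
            self.fingerprints.append((fingerprint(chunk), num))
            pointers.append(num)
        self.files[name] = pointers

    def read_file(self, name):
        return b"".join(self.blocks[p] for p in self.files[name])

    def dedupe(self):
        """Post-process pass; returns the number of blocks released."""
        by_fp = defaultdict(list)
        for fp, num in self.fingerprints:
            by_fp[fp].append(num)
        remap = {}  # redundant block number -> original block number
        for nums in by_fp.values():
            originals = []
            for num in nums:
                for keep in originals:
                    # Byte-level check: a fingerprint match alone is not trusted.
                    if self.blocks[keep] == self.blocks[num]:
                        remap[num] = keep
                        break
                else:
                    originals.append(num)
        # Repoint every file at the original block, then release the duplicates.
        for name, pointers in self.files.items():
            self.files[name] = [remap.get(p, p) for p in pointers]
        for num in remap:
            del self.blocks[num]
        self.fingerprints = [(fp, n) for fp, n in self.fingerprints
                             if n not in remap]
        return len(remap)


vol = Volume()
vol.write_file("a", b"x" * 8192 + b"y" * 4096)
vol.write_file("b", b"x" * 4096 + b"z" * 4096)
released = vol.dedupe()
```

In this example three of the five written blocks hold the same bytes, so two are released and both files still read back unchanged. The byte-for-byte comparison matters because a short fingerprint can collide; only a confirmed match may share a block.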

Adding deduplication is a two-step process that, according to NetApp and in our experience, should take about 10 minutes in total. The first step is to enable deduplication by installing the license. NetApp still does not charge for deduplication, so enabling the license is mostly a reporting function that lets NetApp know who is using the feature. Once the license is enabled, the behavior of the box does not change; the license simply allows the system to execute the various deduplication commands.
