Network Computing is part of the Informa Tech Division of Informa PLC
More On Chunking
The last time we looked at the chunking process in data deduplication engines ("Your Mileage Will Vary: Chunking"), we looked pretty favorably on variable chunking, which uses the contents of the data to assign chunk boundaries. However, as deduplication moves out of backup appliances that accept data in tape or other backup application-specific formats, and into backup applications and primary storage, the advantages of fixed-chunk deduplication start to become apparent.
The primary advantage of fixed-chunk deduplication is lower CPU overhead. Fixed-chunk systems don't have to spend any CPU cycles examining data to determine where the chunk boundaries should be. They just break data up into fixed-size chunks, the way any file system breaks files into blocks. In fact, some primary storage deduplication, like NetApp's, simply uses the underlying file system's blocks as its chunks.
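To make the idea concrete, here is a minimal sketch of fixed-chunk deduplication. The 4,096-byte chunk size, the use of SHA-256 as the fingerprint, and the names `fixed_chunks` and `dedupe` are all assumptions chosen for illustration, not a description of any vendor's implementation.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed chunk size, for illustration only


def fixed_chunks(data: bytes, size: int = CHUNK_SIZE) -> list:
    """Split data at fixed offsets; the content is never inspected."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def dedupe(data: bytes, store: dict) -> list:
    """Keep one copy of each unique chunk; return the ordered list of digests."""
    recipe = []
    for chunk in fixed_chunks(data):
        digest = hashlib.sha256(chunk).hexdigest()  # assumed fingerprint function
        store.setdefault(digest, chunk)  # store the chunk only if it is new
        recipe.append(digest)
    return recipe


store = {}
data = b"A" * (3 * CHUNK_SIZE) + b"B" * CHUNK_SIZE
recipe = dedupe(data, store)
restored = b"".join(store[d] for d in recipe)
```

In this sketch the four-chunk input is stored as two unique chunks, and the list of digests is enough to rebuild the original data. The only work done to find chunk boundaries is the offset arithmetic in `fixed_chunks`.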
Lower overhead also means lower latency; computing where to put the chunk boundaries takes some time. While vendors have done their best to reduce this additional latency, and will claim it's not noticeable, it exists and might be a problem for primary storage deduplication systems.
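For contrast, the sketch below shows the kind of per-byte work a content-defined chunker has to do. It is a toy: it uses a simple rolling sum over an assumed 16-byte window and an assumed boundary mask, where real systems typically use stronger rolling hashes such as Rabin fingerprints. The function name and all parameters are hypothetical.

```python
WINDOW = 16   # assumed sliding-window size in bytes
MASK = 0xFF   # assumed mask: cut when the low 8 bits of the rolling sum are zero


def variable_chunks(data: bytes, window: int = WINDOW, mask: int = MASK) -> list:
    """Toy content-defined chunker driven by a rolling sum of recent bytes."""
    chunks = []
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        # Every input byte costs an add, usually a subtract, and a comparison.
        rolling += byte
        if i >= window:
            rolling -= data[i - window]
        long_enough = (i + 1 - start) >= window
        if long_enough and (rolling & mask) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # whatever is left becomes the last chunk
    return chunks


sample = bytes((i * 37 + 11) % 251 for i in range(20000))
pieces = variable_chunks(sample)
```

The loop touches every byte before a single chunk can be fingerprinted, which is the overhead a fixed-chunk system avoids.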
Backup applications are a simple lot. In their heart of hearts, they just want to send a stream of data to a tape drive somewhere. Since they're making large sequential write requests to a small number of large files, a few milliseconds of latency per request won't have a big impact. For conventional backup applications like NetBackup or NetWorker, throughput is all-important, and latency less so.
Primary storage applications, even simple ones like hosting users' home folders, are much more latency-sensitive. In addition, rather than writing to a small number of very large files as backup applications do, primary storage environments hold millions of files of all sizes. Since each file begins on a fresh data chunk, an insertion or other change that throws off the chunk alignment will affect only one file's worth of data. Every new file realigns the chunking.
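The realignment effect can be sketched with a small experiment. The code below assumes 4,096-byte fixed chunks and two made-up files of eight chunks each, then inserts a single byte at the front of the first file. It compares chunking the two files as one concatenated stream, backup-style, against chunking each file separately, as a file system would. All names and sizes are hypothetical.

```python
import hashlib

CHUNK = 4096  # assumed fixed chunk size


def make_bytes(label: str, n: int) -> bytes:
    """Build n bytes of deterministic, non-repeating filler data."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(f"{label}-{counter}".encode()).digest()
        counter += 1
    return bytes(out[:n])


def chunk_digests(data: bytes) -> set:
    """Fingerprint every fixed-size chunk of data."""
    return {
        hashlib.sha256(data[i:i + CHUNK]).digest()
        for i in range(0, len(data), CHUNK)
    }


file_a = make_bytes("a", 8 * CHUNK)
file_b = make_bytes("b", 8 * CHUNK)
edited_a = b"!" + file_a  # one byte inserted at the start of file A

# One continuous stream: the insertion shifts every later chunk, file B included.
shared_stream = len(chunk_digests(file_a + file_b) & chunk_digests(edited_a + file_b))

# Per-file chunking: file B starts on a fresh chunk, so it is unaffected.
before = chunk_digests(file_a) | chunk_digests(file_b)
after = chunk_digests(edited_a) | chunk_digests(file_b)
shared_per_file = len(before & after)
```

With this data, the concatenated stream shares no chunks with its pre-edit version, while per-file chunking still matches all eight of file B's chunks. The damage from the insertion stops at the file boundary.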