Arkeia Source Side And Sliding Window Deduplication

Arkeia Software is next up on the examination table as we plow our way through the source-side deduplication vendors. Arkeia is a backup application software supplier known mostly for their Linux product, but they also have a strong solution set for Windows and virtualized environments. They are adding deduplication to their offerings through the acquisition of Kadena Systems.

George Crump

May 19, 2010


Kadena Systems developed, patented and distributed a block-based deduplication method that promises to improve space efficiency over traditional deduplication methods. Instead of using the more common fixed-block or variable-block methods, the Kadena method uses a 'fixed-length, sliding window' to parse the file and see if there is new data in it. This method may provide greater efficiency in finding redundant data. The technique is similar to what rsync does prior to synchronizing files. The size of this window can be adjusted either automatically or manually based on the type of file, which should allow for a greater chance of finding redundant data within the file. For example, image files, which typically contain no redundancy, will use a large window, allowing for a quick examination that is not very CPU-intensive. Database files, by contrast, will use a much smaller examination window because the size of the data repetition is generally the length of a table row. PowerPoint files would use a medium-size examination window.
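To make the sliding-window idea concrete, here is a minimal Python sketch. It is not Kadena's patented method; it assumes the stored data was indexed in aligned, fixed-length blocks and, naively, fingerprints every candidate window with SHA-256. The names `index_blocks` and `sliding_scan` and the 64-byte window are hypothetical, chosen only for illustration.

```python
import hashlib

def index_blocks(stored: bytes, window: int) -> dict:
    """Map digest -> offset for each aligned, fixed-length block of stored data."""
    return {
        hashlib.sha256(stored[i:i + window]).digest(): i
        for i in range(0, len(stored) - window + 1, window)
    }

def sliding_scan(new: bytes, known: dict, window: int):
    """Slide a fixed-length window over `new`, one byte at a time.

    Returns (dups, literals): dups are (new_offset, stored_offset) pairs for
    windows that match a stored block; literals are (offset, bytes) runs of
    data that matched nothing.
    """
    dups, literals = [], []
    pos = lit_start = 0
    while pos + window <= len(new):
        digest = hashlib.sha256(new[pos:pos + window]).digest()
        if digest in known:
            if lit_start < pos:                  # flush unmatched bytes first
                literals.append((lit_start, new[lit_start:pos]))
            dups.append((pos, known[digest]))
            pos += window                        # jump past the matched block
            lit_start = pos
        else:
            pos += 1                             # slide the window one byte
    if lit_start < len(new):
        literals.append((lit_start, new[lit_start:]))
    return dups, literals
```

In this sketch, a file that gains three bytes in the middle still matches every stored block, and only the three inserted bytes come back as new data. The cost is one digest per byte position in the worst case.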

Arkeia makes the case that its sliding window technology offers the speed of fixed-block deduplication, because all the blocks are the same size, with the compression benefits of variable-block because it accommodates file changes due to byte inserts. The traditional concern with the sliding window approach is that it typically requires significantly more CPU horsepower to process than either fixed-block or variable-block deduplication because each time the window slides one byte, a new fingerprint (e.g. a message digest) must be calculated. Arkeia gets around this concern by doing something called progressive matching.
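The byte-insert problem and the CPU concern can both be shown in a few lines of Python. This is a rough illustration under assumed parameters, not a model of Arkeia's product: `aligned_matches` and `worst_case_fingerprints` are hypothetical helper names, and SHA-256 stands in for whatever fingerprint a real product uses.

```python
import hashlib

def block_digests(data: bytes, size: int) -> list:
    """Digest of each fixed-size block, cut at aligned offsets 0, size, 2*size, ..."""
    return [hashlib.sha256(data[i:i + size]).digest()
            for i in range(0, len(data), size)]

def aligned_matches(old: bytes, new: bytes, size: int, shift: int = 0) -> int:
    """Count fixed-size blocks of new[shift:] that also appear as blocks of old."""
    known = set(block_digests(old, size))
    return sum(d in known for d in block_digests(new[shift:], size))

def worst_case_fingerprints(length: int, size: int) -> tuple:
    """(fixed-block count, naive sliding-window count) for one file."""
    fixed = -(-length // size)            # ceiling division: one per block
    sliding = max(length - size + 1, 0)   # one per byte position of the window
    return fixed, sliding
```

With a single byte inserted at the front of a file, `aligned_matches` finds no duplicate blocks at `shift=0`, because every block boundary has moved, but finds all of them at `shift=1`. A sliding window discovers that realignment on its own, at the price of far more fingerprint calculations if each slide is hashed naively.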

An example of this technique is using checksums to quickly determine if a block is new or if it is potentially a duplicate. If the latter case is true, then a more CPU-intensive calculation (e.g. a message digest) is run to determine the level of redundancy. As a result, heavier CPU resources are only expended when there is a greater likelihood of redundancy. Arkeia believes that the net effect is greater deduplication efficiency without greater than normal CPU requirements.
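The two-stage check described above can be sketched as follows. Arkeia has not published its implementation here, so this borrows the rsync-style rolling checksum as the cheap first stage and SHA-256 as the expensive second stage; both choices, and the names `weak_checksum`, `roll`, `build_index` and `progressive_scan`, are assumptions made for illustration.

```python
import hashlib

MOD = 1 << 16

def weak_checksum(block: bytes) -> tuple:
    """rsync-style two-part checksum (a, b) computed over a whole block."""
    a = sum(block) % MOD
    b = sum((len(block) - i) * x for i, x in enumerate(block)) % MOD
    return a, b

def roll(a: int, b: int, out_byte: int, in_byte: int, window: int) -> tuple:
    """Slide the window one byte in O(1): drop out_byte, take in in_byte."""
    a = (a - out_byte + in_byte) % MOD
    b = (b - window * out_byte + a) % MOD
    return a, b

def build_index(stored: bytes, window: int) -> dict:
    """weak checksum -> {strong digest -> offset} for aligned stored blocks."""
    index = {}
    for off in range(0, len(stored) - window + 1, window):
        block = stored[off:off + window]
        index.setdefault(weak_checksum(block), {})[hashlib.sha256(block).digest()] = off
    return index

def progressive_scan(new: bytes, index: dict, window: int):
    """Return (matches, strong_calls); the digest runs only on weak-checksum hits."""
    matches, strong_calls, pos = [], 0, 0
    if len(new) < window:
        return matches, strong_calls
    a, b = weak_checksum(new[:window])
    while True:
        hit = None
        candidates = index.get((a, b))
        if candidates is not None:               # cheap test says "maybe duplicate"
            strong_calls += 1
            hit = candidates.get(hashlib.sha256(new[pos:pos + window]).digest())
        if hit is not None:
            matches.append((pos, hit))
            pos += window                        # skip past the confirmed block
            if pos + window > len(new):
                break
            a, b = weak_checksum(new[pos:pos + window])
        else:
            if pos + window >= len(new):
                break
            a, b = roll(a, b, new[pos], new[pos + window], window)
            pos += 1
    return matches, strong_calls
```

The returned `strong_calls` count shows the point of the design: the window still slides byte by byte, but the expensive digest is computed only where the cheap checksum suggests a likely duplicate.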

Once the non-redundant data is identified, it's compressed and sent across the network to the backup server, where it is stored in its deduplicated state. As with other source-side deduplication products, this allows for effective use of network bandwidth as well as a reduction in the amount of storage required. It also enables the use of disk as a longer-term repository for backups.

Once the data is stored by the backup server, Arkeia can leverage its replication module to transfer it to a remote site. Alternatively, if a copy to tape is desired, moving data to tape is integrated into the Arkeia backup solution via a copy or clone command. The data is re-inflated when it is written to tape, although whether deduplicated data on tape has value is a matter of mixed opinion.
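The source-side flow described above (identify the non-redundant data, compress it, send it, and keep references for everything else) can be sketched in Python. The record layout and the names `encode` and `restore` are hypothetical, and zlib is only a stand-in; the article does not say which compressor or wire format Arkeia uses.

```python
import zlib

def encode(dups, literals, window):
    """Build an ordered backup stream from a scan result.

    dups:     (new_offset, stored_offset) pairs for blocks already on the server
    literals: (new_offset, bytes) runs of data the server has not seen
    Only the literal runs carry payload, and they are compressed before sending.
    """
    items = [(off, ("ref", stored_off, window)) for off, stored_off in dups]
    items += [(off, ("data", zlib.compress(chunk))) for off, chunk in literals]
    return [record for _, record in sorted(items, key=lambda item: item[0])]

def restore(stream, stored: bytes) -> bytes:
    """Re-inflate a stream on the server side using the stored blocks."""
    out = bytearray()
    for record in stream:
        if record[0] == "ref":
            _, off, length = record
            out += stored[off:off + length]      # copy an already-stored block
        else:
            out += zlib.decompress(record[1])    # unpack the new data
    return bytes(out)
```

In this sketch, `restore` is also what a copy to tape would amount to: the references are resolved and the full, re-inflated data is written out.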

Arkeia has a unique deduplication technology integrated into a robust backup application, which may mean even better storage and network bandwidth efficiency thanks to the sliding window technology. Beyond deduplication, it has all the application and device modules that you would expect, and its support for Linux is more thorough than the average backup application. It also has complete support for VMware via the vStorage API as well as Hyper-V and Xen.
