Cofio's Unique Approach To Deduplication
June 15, 2010
Deduplication, at least from a backup standpoint, is about storing data efficiently on a backup device. Some suppliers use block-based incremental backup, continuous data protection (CDP), or source-side deduplication to reduce the amount of data sent across the network to the backup target. Cofio's AIMstor application takes the unique approach of combining all of these techniques to optimize both network transfer and secondary storage.
Cofio's AIMstor is a software-based solution with a client-side and a target-side component. The application first performs source-side deduplication for the initial seeding of data to the target, which provides maximum network and storage efficiency for that first pass of data. The deduplication process is also content-aware: because it knows how to examine an Exchange store differently from a Word document, it should achieve better deduplication ratios.
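To make the seeding step concrete, here is a minimal sketch of source-side deduplication in general. It is not Cofio's implementation: the fixed 4 KB chunk size, the SHA-256 fingerprints, and the `Target`, `seed`, and `restore` names are all assumptions made for illustration, and it ignores content awareness entirely.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; real products often chunk by content


class Target:
    """Hypothetical stand-in for the target-side component."""

    def __init__(self):
        self.chunks = {}  # digest -> chunk bytes

    def missing(self, digests):
        # Report which fingerprints the target has never seen.
        return {d for d in digests if d not in self.chunks}

    def store(self, digest, chunk):
        self.chunks[digest] = chunk


def seed(data, target):
    """Send only the chunks the target lacks. Returns (recipe, bytes_sent)."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    recipe = [hashlib.sha256(c).hexdigest() for c in chunks]
    needed = target.missing(recipe)
    sent = 0
    for digest, chunk in zip(recipe, chunks):
        if digest in needed:
            target.store(digest, chunk)
            needed.discard(digest)  # never send the same chunk twice in one pass
            sent += len(chunk)
    return recipe, sent


def restore(recipe, target):
    """Rebuild the original data from its list of fingerprints."""
    return b"".join(target.chunks[d] for d in recipe)
```

The client pays the CPU cost of hashing every chunk, which is the source-side performance impact, but only fingerprints and previously unseen chunks cross the network.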
As with all source-side deduplication, identifying and filtering duplicate data has some performance impact on the source server. What makes the Cofio solution interesting is that all subsequent backups of the source disk are done via a CDP process, meaning that updates are captured at the byte level and transferred in real time when the change occurs, or at a scheduled time if desired. CDP and block-level incremental (BLI) examination of a server do not typically impact performance as much as source-side deduplication does. As a result, Cofio gets the network and storage efficiency gains of deduplication while also getting the resource efficiency and frequency of protection that CDP or BLI provides.
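The byte-level capture described above can be pictured as a journal of writes that is replayed against the target copy. The sketch below is a toy model under that assumption, not a description of how AIMstor intercepts I/O; the `ReplicatedFile` class and its methods are hypothetical.

```python
class ReplicatedFile:
    """Toy model: a source file, its replica on the target, and a write journal."""

    def __init__(self, initial):
        self.source = bytearray(initial)
        self.replica = bytearray(initial)  # state of the target after the initial seed
        self.journal = []                  # pending (offset, data) change records
        self.bytes_shipped = 0

    @staticmethod
    def _apply(buf, offset, data):
        end = offset + len(data)
        if end > len(buf):
            buf.extend(b"\0" * (end - len(buf)))  # grow the buffer if the write extends it
        buf[offset:end] = data

    def write(self, offset, data):
        # Apply the write locally and record only the changed bytes.
        self._apply(self.source, offset, data)
        self.journal.append((offset, bytes(data)))

    def flush(self):
        # Ship journaled changes to the target, in real time or on a schedule.
        for offset, data in self.journal:
            self._apply(self.replica, offset, data)
            self.bytes_shipped += len(data)
        self.journal.clear()
```

Calling `flush()` after every `write()` approximates real-time protection, while calling it on a timer approximates the scheduled mode. Either way, no hashing of unchanged data is required, which is why this style of capture tends to be lighter on the source server.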
The typical challenge for CDP or BLI technologies is maximizing storage efficiency on the target device. The same file could be added to multiple servers and then stored redundantly on the backup target. A good example is VMware, where multiple hosts may have VMs that receive the same OS patch update. Under the typical CDP or BLI use case, each copy would be stored redundantly. To address this issue, Cofio performs a scheduled post-process deduplication pass on the disk repository, typically once per night. This pass identifies redundant data segments that have been stored since the initial seed or since the last post-process deduplication pass.
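A post-process pass of this kind can be sketched as a sweep over the repository that replaces raw segments with references into a shared segment store. The data layout, the SHA-256 fingerprints, and the `post_process_dedupe` function below are illustrative assumptions, not details of the AIMstor repository format.

```python
import hashlib


def post_process_dedupe(repository, segment_store):
    """Replace raw segments with digest references; return bytes reclaimed.

    repository: dict mapping a backup name to a list whose items are either
                raw bytes (not yet deduplicated) or a str digest (a reference).
    segment_store: dict mapping digest -> segment bytes, shared by all backups.
    """
    reclaimed = 0
    for segments in repository.values():
        for i, seg in enumerate(segments):
            if isinstance(seg, str):
                continue  # already processed by an earlier pass
            digest = hashlib.sha256(seg).hexdigest()
            if digest in segment_store:
                reclaimed += len(seg)  # duplicate: keep only the reference
            else:
                segment_store[digest] = seg
            segments[i] = digest
    return reclaimed
```

For example, if three VMs each backed up the same 800-byte patch segment, the first copy is kept in the store and the pass reclaims the other 1,600 bytes. Because already-referenced segments are skipped, a later run only examines data that arrived since the previous pass.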
Cofio will soon provide VMware integration that leverages the vSphere API, but the client is also lightweight enough to run within the guest OS. Running at the guest OS level should provide greater granularity of data examination and recovery. While you can't launch a protected VM directly from the AIMstor environment, as you can with some CDP tools, you can restore the backup of a physical server into the virtual environment, making physical-to-virtual machine migration another capability of AIMstor.