Druva Brings App-aware Dedupe To The Laptop

Howard Marks

February 10, 2011

Network Computing

Given that almost half the computers in corporate America are laptops, and that a significant amount of data is created on those portable devices, I am constantly amazed at how often organizations leave the data on those laptops essentially unprotected. By taking a unique application-aware approach to data deduplication, Druva's Insync may be the best solution to date to this seemingly intractable problem.

When I talk to corporate IT managers about backing up data on laptops, I as often as not get resistance to the very concept. Users, they say, should be responsible for backing up that data; after all, it is their data. Unfortunately, we all know what happens when you try to make users responsible for backing up their own data. Some users will get incredibly compulsive about the process and back up everything, including 47 copies of the latest Justin Bieber video, but most will continue in blissful ignorance of the process until something goes wrong.

When something does go wrong it will, of course, be corporate IT's problem. Not only will corporate IT spend thousands of dollars on data recovery at OnTrack, but the company will end up spending even more if that recovery is incomplete or fails and the user has to spend time doing the work over or, even worse, the order gets lost.

It's not like there haven't been tools for backing up laptops. Way back in the 20th century I was writing whitepapers about Seagate Software's Client Exec, and several generations of products have come and gone in the meantime. While some of these products, including Client Exec, did do single-instance storage at some level, they still took a huge amount of bandwidth to back up the poor road warrior's laptop.

Source-based deduplication in products like Symantec's PureDisk or Avamar addresses the bandwidth problem not only by eliminating duplicate data from a single laptop but also by deduplicating across the whole set of machines being backed up to a single repository. Druva's application-aware deduplication promises to be even more bandwidth- and storage-efficient.

All source deduplication systems break data into chunks, generate hashes for the chunks and compare those hashes against the data already at the central repository to determine which data is new on this client and has to be backed up. Other deduplication systems use fixed-size chunks or a simple algorithm to chunk the data. That can be effective when the same data appears in different places. But when the same data appears in multiple files, it may sit at a different offset within the chunk than it did the last time, so it generates different hash values, which reduces the deduplication factor.
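To make the offset problem concrete, here is a minimal Python sketch of fixed-size chunking with hash comparison. It is an illustration only, not the algorithm used by any of the products named above; the 64-byte chunk size, the in-memory `repository` set and the helper names `chunk_hashes` and `new_chunks` are all assumptions made for the example.

```python
import hashlib
import random

CHUNK_SIZE = 64  # illustrative only; real systems use much larger chunks


def chunk_hashes(data, size=CHUNK_SIZE):
    """Split data into fixed-size chunks and return each chunk's SHA-256 digest."""
    return [hashlib.sha256(data[i:i + size]).hexdigest()
            for i in range(0, len(data), size)]


def new_chunks(data, repository):
    """Count the chunks whose hashes the repository has not seen yet."""
    return sum(1 for h in chunk_hashes(data) if h not in repository)


# Hypothetical "file": 16 chunks of pseudo-random bytes, seeded for repeatability.
rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(CHUNK_SIZE * 16))

# The central repository already holds every chunk of the original file.
repository = set(chunk_hashes(original))

# Same bytes at the same offsets: nothing new has to be sent.
unchanged = new_chunks(original, repository)

# Same bytes pushed one position later: every chunk boundary moves.
shifted = new_chunks(b"\x00" + original, repository)

print(unchanged, "new chunks for the unchanged file")
print(shifted, "new chunks after a one-byte shift")
```

A single inserted byte is enough to make every fixed-size chunk look new, even though almost all of the data is already in the repository.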

Druva's Insync knows the file formats for common applications and breaks the files down into their component parts. So if the same JPEG image is embedded in multiple emails, PowerPoint presentations and Word documents, Insync will identify the JPEG as a single chunk and get 100% deduplication on it. This process can also reduce the CPU load of chunking and hashing, as embedded objects will generate fewer, larger chunks than the alternative. The technique sounds similar to what Ocarina was pitching before they were acquired by Dell and went quiet last year.
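The idea of chunking along format boundaries can be sketched in a few lines of Python. This is a hedged illustration and not Druva's implementation: it relies only on the fact that modern Office files (.pptx, .docx) are ZIP containers, and it hashes each embedded member as one chunk. The helper names `make_container` and `component_hashes`, the member paths and the stand-in image bytes are assumptions for the example.

```python
import hashlib
import io
import zipfile


def make_container(members):
    """Build an in-memory ZIP archive standing in for a .pptx or .docx file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, payload in members.items():
            zf.writestr(name, payload)
    return buf.getvalue()


def component_hashes(container):
    """Hash each embedded member's uncompressed bytes as a single chunk."""
    with zipfile.ZipFile(io.BytesIO(container)) as zf:
        return {hashlib.sha256(zf.read(name)).hexdigest()
                for name in zf.namelist()}


# Stand-in bytes for an embedded picture; not a valid JPEG.
image = b"\xff\xd8\xff" + b"stand-in image payload" * 50

# Two different documents that embed the same image under different names.
deck = make_container({
    "ppt/slides/slide1.xml": b"<slide one/>",
    "ppt/media/image1.jpeg": image,
})
memo = make_container({
    "word/document.xml": b"<memo/>",
    "word/media/image7.jpeg": image,
})

# The image hashes identically in both containers, so it is stored once.
shared = component_hashes(deck) & component_hashes(memo)
print(len(shared), "component shared between the two documents")
```

Because the image is hashed as a whole object, its position inside each container, and the compression applied around it, no longer affect whether it deduplicates.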

An agent on the user's laptop backs the system up whenever the road warrior coughs up the $12.95 a day for hotel WiFi. Up to 2,000 remote systems can back up to a single destination server, which can have up to 16TB of storage in 4TB deduplication realms. Administrators can define what to back up and how much CPU and bandwidth the backup process may use. Users can restore from the agent, through the web or via iPad self-service apps. Admins can also burn DVDs or do a bare-metal restore for the lost, stolen or totally mangled system.

So no more excuses, folks. Laptop data is a corporate asset just like data on the server. Time to start backing that stuff up.

About the Author(s)

Howard Marks

Network Computing Blogger

Howard Marks is founder and chief scientist at Deepstorage LLC, a storage consultancy and independent test lab based in Santa Fe, N.M. and concentrating on storage and data center networking. In more than 25 years of consulting, Marks has designed and implemented storage systems, networks, management systems and Internet strategies at organizations including American Express, J.P. Morgan, Borden Foods, U.S. Tobacco, BBDO Worldwide, Foxwoods Resort Casino and the State University of New York at Purchase. The testing at DeepStorage Labs is informed by that real world experience.

He has been a frequent contributor to Network Computing and InformationWeek since 1999 and a speaker at industry conferences including Comnet, PC Expo, Interop and Microsoft's TechEd since 1990. He is the author of Networking Windows and co-author of Windows NT Unleashed (Sams).

He is co-host, with Ray Lucchesi, of the monthly Greybeards on Storage podcast where the voices of experience discuss the latest issues in the storage world with industry leaders. You can find the podcast at: http://www.deepstorage.net/NEW/GBoS
