Cloud Backup Not as Easy as It Looks

The problem is scale. Building an architecture to handle all that traffic and not lose anyone's data is a challenge.

Howard Marks

March 31, 2009


1:00 PM -- At first glance, online backup is one of the simpler cloud applications to get up and running. Write or buy some software that uses the NTFS change journal and VSS to create block-level incremental backups, compress and encrypt the data, and send it to a data center for storage. In the data center you need a whole bunch of SATA drives to hold the data. Simple, right? I mean, it's the same architecture you use to back up your execs' laptops using Asigra Televaulting, Atempo Live Backup, or any of the other products that have been around to do the job since I wrote the documentation for Seagate Software's Client Exec lo these many moons ago. You do back up laptops, right? Bueller? Bueller?
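The block-incremental idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any of the products named above work: it finds changed blocks by comparing SHA-256 hashes against the previous run instead of reading the NTFS change journal, it skips VSS snapshots entirely, and it leaves out the encryption step because the standard library has no cipher. The `BLOCK_SIZE` value and the function names are made up for illustration.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def block_hashes(data: bytes) -> list[str]:
    """Return a SHA-256 digest for each fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
        for i in range(0, len(data), BLOCK_SIZE)
    ]


def incremental_backup(data: bytes, previous: list[str]):
    """Compress only the blocks whose hash differs from the previous run.

    Returns (changed_blocks, manifest): changed_blocks maps a block index to
    its compressed bytes, and manifest is the hash list to keep for next time.
    A real client would encrypt each block before sending it anywhere.
    """
    current = block_hashes(data)
    changed = {}
    for index, digest in enumerate(current):
        if index >= len(previous) or previous[index] != digest:
            start = index * BLOCK_SIZE
            changed[index] = zlib.compress(data[start:start + BLOCK_SIZE])
    return changed, current


# First run has no manifest, so every block is sent.
original = b"A" * BLOCK_SIZE * 3
full, manifest = incremental_backup(original, [])

# Change only the middle block; the second run sends just that one.
edited = original[:BLOCK_SIZE] + b"B" * BLOCK_SIZE + original[2 * BLOCK_SIZE:]
delta, manifest = incremental_backup(edited, manifest)
print(sorted(full), sorted(delta))  # [0, 1, 2] [1]
```

Hashing every block means rereading the whole file on each run, which is exactly the work a change journal lets real backup software avoid.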

The problem, of course, is scale. As HP learned with its Upline fiasco, now thankfully put out of its misery, and EMC learned with the long restore delays some Mozy users have reported, what's easy for 10 or 100 corporate laptops becomes a much bigger deal when tens of thousands of tech-ignorant consumers are accessing your systems at the same time. Building an architecture to handle all that traffic and not lose anyone's data is a challenge. Being able to do it and make a profit at $50/PC/yr requires ingenuity, at the very least.

In the latest chapter of "As the Online Backup Service Turns," Carbonite is suing Promise Technology, claiming Promise's VTrak M500i external RAID arrays suffered complete data loss and Promise was unable to correct the problem. This case points out the risk in relying on low-cost components in mission-critical applications. While Byte and Switch readers may snicker and insist they would use Clariions or FASs with real-time replication, that architecture would put their services in the high end of the online backup market at best. According to comments made by Carbonite president Dave Friend in response to a story on the Boston Globe site, Carbonite now uses Dell kit running RAID-6.

It seems to me that a Google-, Cleversafe-, Atmos-, or Parascale-type storage grid makes more sense for Web 2.0 applications. These grids use standard servers and SATA drives, and distribute multiple copies of the data across multiple servers rather than relying on RAID and RAID controllers. That suits applications where there may be huge aggregate data flows, but any individual user's perceived performance is throttled by his or her Internet connection speed. By using low-cost servers -- OK, that leaves out Atmos -- these architectures can survive equipment failures pretty gracefully.
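To make "multiple copies across multiple servers" concrete, here is a minimal sketch of one way a grid could decide where copies live. It uses rendezvous (highest-random-weight) hashing, which is my choice for illustration and not a description of how any of the vendors above actually place data. The server names, the `place_replicas` function, and the three-copy default are all assumptions.

```python
import hashlib


def place_replicas(block_id: str, servers: list[str], copies: int = 3) -> list[str]:
    """Pick `copies` distinct servers for a block using rendezvous hashing.

    Every server gets a pseudo-random score for this block; the highest
    scores win. No RAID controller or central lookup table is involved.
    """
    def score(server: str) -> int:
        digest = hashlib.sha256(f"{server}:{block_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(servers, key=score, reverse=True)[:copies]


# Hypothetical six-node grid of cheap servers.
servers = [f"node{n}" for n in range(1, 7)]
homes = place_replicas("user42/block0007", servers)

# Simulate losing one of the servers that holds a copy.
failed = homes[0]
remaining = [s for s in servers if s != failed]
new_homes = place_replicas("user42/block0007", remaining)

# The two surviving copies stay where they were; only one new copy is needed.
survivors = [s for s in homes if s != failed]
print(all(s in new_homes for s in survivors))  # True
```

The property worth noticing is the last line: when a box dies, the surviving copies do not move, so the grid only has to re-create the one copy it lost.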

So what do I tell my friends, relatives, and former in-laws about backing up their valuable data?

  • Don't keep the only copy of your data in the cloud. Even the best cloud providers have issues. If you're really mobile, a provider that stores data in multiple data centers, or multiple providers, may be in order.

  • Have local backups for fast restores and online backups for disaster recovery.

  • Using a provider that allows Web retrieval, in addition to restore through their software, lets you get to data when you're away from your PC.

  • Remember that your data is encrypted with your password -- lose your password and you lose your data. Just like in the Harry Potterverse, you need a trusted secret keeper for the password. (Yes, it's over-simplified, but how long do you want to spend explaining crypto to your mother-in-law?)
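For readers who want slightly more than the mother-in-law version of the password point, here is a hedged sketch of why a lost password means lost data. It assumes the encryption key is derived from the password with PBKDF2, which is a common approach but not something any particular provider has confirmed to me; the passwords, the salt size, and the iteration count are placeholders.

```python
import hashlib
import hmac
import os


def derive_key(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Stretch a password into a 32-byte key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


salt = os.urandom(16)  # stored alongside the backup; it is not a secret

key = derive_key("correct horse", salt)
same = derive_key("correct horse", salt)
wrong = derive_key("correct hors", salt)  # one character short

print(hmac.compare_digest(key, same))   # True
print(hmac.compare_digest(key, wrong))  # False
```

If the provider never sees the password, it cannot regenerate the key for you, which is the whole point and also the whole risk.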

I've even gone so far as to set up my own buddy backup system. More on that next time.

Howard Marks is chief scientist at Networks Are Our Lives Inc., a Hoboken, N.J.-based consultancy where he's been beating storage network systems into submission and writing about it in computer magazines since 1987. He currently writes for InformationWeek, which is published by the same company as Byte and Switch.

About the Author(s)

Howard Marks

Network Computing Blogger

Howard Marks is founder and chief scientist at DeepStorage LLC, a storage consultancy and independent test lab based in Santa Fe, N.M. and concentrating on storage and data center networking. In more than 25 years of consulting, Marks has designed and implemented storage systems, networks, management systems and Internet strategies at organizations including American Express, J.P. Morgan, Borden Foods, U.S. Tobacco, BBDO Worldwide, Foxwoods Resort Casino and the State University of New York at Purchase. The testing at DeepStorage Labs is informed by that real-world experience.

He has been a frequent contributor to Network Computing and InformationWeek since 1999 and a speaker at industry conferences including Comnet, PC Expo, Interop and Microsoft's TechEd since 1990. He is the author of Networking Windows and co-author of Windows NT Unleashed (Sams).

He is co-host, with Ray Lucchesi, of the monthly Greybeards on Storage podcast, where the voices of experience discuss the latest issues in the storage world with industry leaders. You can find the podcast at: http://www.deepstorage.net/NEW/GBoS
