Market Analysis: Continuous Data Protection

Restoring from days- and even hours-old backup won't get your company back in the game. Continuous data protection is the IT equivalent of no harm, no foul. We explain what

January 27, 2006

Network Computing

The technologies we use for backing up data have come a long way from nine-track tape, but too many companies' policies haven't kept pace with the real-time, online applications on which today's businesses depend.

A database that stores hundreds of customer transactions every hour needs more than a routine nightly backup. The solution is continuous data protection. CDP offerings aren't replacements for conventional backups, but they can save your butt if a high-traffic system suddenly goes south.

Companies are embracing the concept, slowly. In our reader poll for this article, 40 percent of respondents said they have CDP in place now or will within 12 months. Forty-one percent said they have no plans, and 19 percent cited a 24-month implementation window.

So where does CDP fit into your IT strategy? A comprehensive data-protection plan must include three basic tasks. First, protect data from the host and applications that create and manage it, typically by making a copy. If data is corrupted by a rogue process--say, a worm or virus on your database server--you must be able to access this protected copy to restore the application. These protected data copies, or views, must be sufficiently granular so you can revert an application's data to a usable view without losing too much data. Some vendors would argue that snapshot technologies cover this need, but snapshots have limitations (more on those later).

Second, remove backup data views from the storage devices where the primary data copy is stored. This prevents a single disk-array failure from destroying, or preventing access to, both your primary data and backups.

Third, remove data from the local premises--the proverbial off-site backup. Distance is key to disaster recovery. Indeed, in our poll, off-site backup was the process most often named as something readers would like to integrate with CDP.

Data-replication systems, which can be provided by host software, combined with an intelligent SAN device or disk-array features, can copy data from your data center by duplicating it from primary to secondary storage, essentially in real time. But replication doesn't take the data out of the control of your users and applications. In our practice, we've seen an executive decide that he doesn't want two copies of a sensitive document on the server, so he deletes one of them. That deletion removes the data from both primary and secondary systems in the blink of an eye.

The conventional method for taking data out of range of users, malware and other sources of corruption while providing granular restore points has been to take periodic snapshots. With a host-based--or, more commonly, a disk-array-based--setup, we can create multiple restore points using split-mirror or copy-on-write technology. As a data-protection mechanism, however, snapshots have significant limitations. They're additional volumes on the same host that's running the database you're trying to protect, and therefore vulnerable to the software glitches and malware that corrupted your data in the first place, or they're in the same disk-array subsystem, so that a failure of the disk system destroys both your data and all your snapshots.

Another problem with disk-array-based snapshots is that all you can generally do is mount a snapshot as an additional volume, typically read-only, or revert the primary volume to the state it was in when the snapshot was taken. If you're an experienced database administrator restoring a production database from the snapshot before a wayward developer corrupted it, and carefully playing log files forward to capture the data between snapshots, this can be enough. Note that in the IT dictionary, carefully means "pain in the ass that will keep you up past 4 a.m."

But if you're trying to find the Excel spreadsheet a senior VP needs restored right away, and she can't remember exactly when she saw it last, you could spend an inordinate amount of time mounting snapshots until you find the right version of the file. In addition, most systems keep only a few snapshots. If you want to take hourly snapshots during the 12 hours a day someone may be working, a typical cache of 64 snapshots will let you recover data only from the past week.
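The retention arithmetic for a snapshot cache is easy to check. Here is a minimal sketch, assuming snapshots are taken at a fixed rate and the array overwrites the oldest snapshot once the cache is full; the function name is ours, for illustration only.

```python
def snapshot_retention_days(max_snapshots: int, snapshots_per_day: int) -> float:
    """Days of history covered when the oldest snapshot is overwritten first."""
    if snapshots_per_day <= 0:
        raise ValueError("snapshots_per_day must be positive")
    return max_snapshots / snapshots_per_day


# 64 snapshots, one per hour across a 12-hour working day
days = snapshot_retention_days(64, 12)
print(f"{days:.1f} working days of history")
```

Sixty-four hourly snapshots spread over 12-hour days cover a little more than five working days, which is where the "past week" figure comes from.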

When designing a data-protection plan, realize that not all data is the same. Of course, older data should be archived (see "Archive Or Else"). For fresh data, rather than obsessing about mission criticality or the assumed value of any given set, think in terms of reproducibility. It can, for example, be argued that Microsoft Word documents are a law firm's product. Critical, yes, but also reproducible. If we have to revert the file server at a law firm to a backup or snapshot that's an hour--even two hours--old, users will grumble, but they'll be able to rewrite their briefs and contracts.

On the other hand, if the e-mail server has to be rolled back an hour, no one will ever know that opposing counsel offered to settle if terms could be wrapped up before court convenes tomorrow. Similarly, if the database behind your e-commerce site loses an hour's worth of transactions, you'll forfeit not only that hour's sales but also some of those customers. In our poll, transactional databases were cited more than twice as often as files when we asked what readers want to protect with their CDP systems.

To address these topics, vendors ranging from giants like Microsoft, Symantec and IBM to the proverbial garage start-up are announcing products that provide CDP, or continuous data protection. However, from where we sit, the only thing most of these offerings have in common is the use of the CDP abbreviation.

A Rose by Any Other Name

Our first exposure to CDP was at NetWorld in 1990, where Vortex Systems was showing its RetroChron, an external disk subsystem for NetWare file servers that journaled writes to a second disk drive. Through command-line utilities, you could mount an additional volume that contained the data from the primary volume at an arbitrary point in time.

Being way ahead of its time--and, if we remember right, rather slow as a disk subsystem--the RetroChron was a market failure, and CDP flew low on our radar screen. Now, advancing technology has made affordable what just 15 years ago looked like the mythical element unobtainium.

Today, the CDP Special Interest Group (SIG) of the Storage Networking Industry Association (SNIA) provides a working definition for CDP: "A methodology that continuously captures or tracks data modifications and stores changes independent of the primary data, enabling recovery points from any point in the past. CDP systems may be block-, file- or application-based and can provide fine granularities of restorable objects to infinitely variable recovery points."
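To make the definition concrete, here is a toy sketch of block-level journaling with point-in-time views. It is our own illustration, not any vendor's design: the class and method names are hypothetical, the "volume" is a Python dictionary, and we assume writes arrive in timestamp order.

```python
from dataclasses import dataclass, field


@dataclass
class BlockJournal:
    """Toy CDP journal: every write is logged with a timestamp, apart from the primary volume."""
    baseline: dict[int, bytes] = field(default_factory=dict)   # block number -> contents at time zero
    entries: list[tuple[float, int, bytes]] = field(default_factory=list)  # (timestamp, block, data)

    def record_write(self, timestamp: float, block: int, data: bytes) -> None:
        # Assumption: writes are appended in timestamp order, as a live write splitter would send them.
        self.entries.append((timestamp, block, data))

    def view_at(self, timestamp: float) -> dict[int, bytes]:
        """Rebuild the volume as it looked at `timestamp` by replaying the journal over the baseline."""
        volume = dict(self.baseline)
        for ts, block, data in self.entries:
            if ts > timestamp:
                break
            volume[block] = data
        return volume


journal = BlockJournal(baseline={0: b"orders-v1"})
journal.record_write(10.0, 0, b"orders-v2")
journal.record_write(20.0, 0, b"corrupted")
print(journal.view_at(15.0)[0])  # b'orders-v2'
```

Because the journal lives apart from the primary copy, any timestamp between the baseline and the latest write is a usable recovery point, which is the property the SNIA wording describes.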

Some industry purists have latched on to that definition and insist that to be called CDP, a product must let data be rolled back to an infinite number of restore points. We take a less hard-line view, and we take exception to the term infinite. In reality, every time a write occurs, changes are recorded, and you can roll back to an unlimited number of those change points. We consider products that use a snapshot-and-export model, or other methods that don't send changes to the backup repository in real time, pseudo-CDP. Pseudo-CDP offerings, including FilesX's Xpress Restore, Microsoft's Data Protection Manager and Mimosa Systems' NearPoint, can make managing backups for less-than-critical irreproducible data a lot easier, but be aware that relying on one may result in the loss of several minutes of data in the event of a disaster.

We're a little more generous with products like Symantec's Backup Exec Continuous Protection Server, which replicate data to the backup repository in real time but don't provide unlimited restore points, instead taking periodic snapshots. Although they might not be the right solution for busy online transaction processing systems, we'll let those vendors get away with calling their products continuous.

Indeed, having unlimited restore points may look better on paper than it works out in practice. If you're trying to restore your SQL Server database to a point before your developer tested a new inventory routine on the production server and subsequently corrupted your data, you don't want to guess at when that was. You want your CDP system to be SQL Server--or Oracle or DB2--aware and to annotate the restore timeline with system checkpoints, Named Transactions and other application-specific events. Easier said than done, which is why we really like the products that can manage this.
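An annotated timeline turns "guess when it broke" into a lookup. The sketch below assumes the CDP system exposes its annotations as a time-sorted list of (timestamp, label) pairs; that layout, the function name and the sample labels are ours, not any product's API.

```python
from bisect import bisect_left


def last_checkpoint_before(checkpoints: list[tuple[float, str]], incident_time: float):
    """Return the latest (time, label) annotation strictly before incident_time, or None."""
    times = [t for t, _ in checkpoints]        # assumes checkpoints are sorted by time
    idx = bisect_left(times, incident_time)    # index of the first annotation at or after the incident
    return checkpoints[idx - 1] if idx > 0 else None


timeline = [
    (900.0, "checkpoint"),
    (1800.0, "named transaction: pre_inventory_test"),
    (2700.0, "checkpoint"),
]
print(last_checkpoint_before(timeline, 2000.0))
# (1800.0, 'named transaction: pre_inventory_test')
```

The restore target is the last known-good application event rather than an arbitrary second on the clock.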

Factor in that, if you're just picking an arbitrary time and restoring the files that make up a database, you're more likely than not to choose a time when there were one or more partially posted transactions or other internal inconsistencies. Database designers call this state "crash inconsistency" and build recovery routines into the database engine that recognize this state and clean up after it by discarding partially posted transactions. A consistency trigger, like the one provided by Windows Volume Shadow Copy Service (VSS), can force a database into a consistent state every 10 minutes and you can roll forward from transaction logs.
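Cleaning up after an arbitrary restore point can be sketched with a toy transaction log. This is a deliberate simplification of what a real database engine does (production engines use more elaborate undo and redo passes); the record layout and the function name are assumptions made for illustration.

```python
def recover(log):
    """Replay a toy transaction log, keeping only writes from transactions that committed."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    state = {}
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _txn, key, value = rec
            state[key] = value
    return state


# The restore point was picked after t2 began but before it committed.
log = [
    ("begin", "t1"), ("write", "t1", "stock:widget", 9), ("commit", "t1"),
    ("begin", "t2"), ("write", "t2", "stock:widget", 8),
]
print(recover(log))  # {'stock:widget': 9}
```

The half-posted write from t2 is discarded, so the recovered data reflects only complete transactions.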

Vendors have taken several different approaches to creating CDP products. Replication vendors, including XOsoft, FalconStor and Kashya, have modified their existing technologies, which capture data changes in real time and continually transmit to the remote system, enabling CDP by journaling changes, providing time- and/or application-specific annotations to the journal, and building a rollback engine.

XOsoft has gone so far as to make rollback a standard feature of its WANSync host-based replication products. It also has created a product called Enterprise Rewinder, combining data capture and rollback features without the replication. Enterprise Rewinder stores its journal on the system being protected.

In addition to host software offerings, including those reviewed in detail in "CDP: Can-Do Protection," several vendors now offer CDP appliances. Putting an application like CDP in an appliance speeds up implementation, leaves valuable processor cycles for host applications, lets the vendor spend its R&D money on functions, features and testing rather than porting and, for an agentless appliance, lets users with less-popular applications or OSs take advantage of CDP. In fact, more than half of respondents to our poll said they'd prefer their CDP products as appliances, with the remainder split evenly between an application on a separate server and one integrated in storage arrays.

Revivio's CPS 1200 appliances, for example, look to host servers and applications like just another disk array. Unlike the old RetroChron, Revivio's appliances don't store your server's data but instead use your own Fibre Channel arrays to store their CDP data while your server talks directly to your primary storage array. All you need to do is set up your volume manager to mirror your application's primary disks to the LUNs presented by the appliance.

Because there's no agent, Revivio can't know when your application reaches checkpoints or other significant events. If you want access to an older version of the volumes used by an application, you create a set of virtual volumes for a point in time, down to the second. Once the view is created you can mount it on your host and validate the data.

Although several products can create this kind of restore view, most simply provide a read-only view from which data can be copied and restored. On a large database, that could take a long time. Revivio not only provides read-write virtual volumes that can be used right away, it also rolls the primary copy of your data to match the recovery view in the background, copying just the blocks that need to be changed, so you can return to being fully protected as soon as possible.
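The idea of rolling the primary copy back by touching only the blocks that differ can be sketched in a few lines. This is our illustration of the general technique, not Revivio's implementation; it assumes both volumes expose the same fixed set of block numbers and models each as a dictionary.

```python
def blocks_to_copy(primary: dict[int, bytes], recovery_view: dict[int, bytes]) -> dict[int, bytes]:
    """Blocks whose contents differ between the primary volume and the recovery view."""
    return {blk: data for blk, data in recovery_view.items() if primary.get(blk) != data}


def roll_back(primary: dict[int, bytes], recovery_view: dict[int, bytes]) -> int:
    """Bring the primary in line with the recovery view; return how many blocks were rewritten."""
    changed = blocks_to_copy(primary, recovery_view)
    primary.update(changed)
    return len(changed)


primary = {0: b"A", 1: b"bad", 2: b"C"}
view = {0: b"A", 1: b"B", 2: b"C"}
print(roll_back(primary, view))  # 1
print(primary == view)           # True
```

On a large database where only a small fraction of blocks changed after the chosen recovery point, copying the difference is far less work than copying the whole volume.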

For replication, you'll need two appliances, FCIP or dark fiber FC, and lots of bandwidth. Security is typically provided by zoning, LUN masking and other standard FC techniques. If you want data encryption, you'll need a separate device, like those from NeoScale Systems or Kasten Chase (see "Tape Encryption Devices: Host-based vs. Appliance").

Kashya's KBX5000, similarly, sits in a Fibre Channel SAN outside the data path between your servers and their primary storage. Using a host agent or an intelligent SAN switch, like Cisco's MDS 9000 with SANtap, to duplicate writes, the KBX5000 can replicate synchronously or asynchronously, over IP or Fibre Channel, to a second KBX5000 for remote and local protection. If you use the host agent, it will annotate the time stream with events for Exchange, Oracle, SQL Server and other VSS applications.

Mendocino Software is an up-and-coming vendor that recently signed deals with Hewlett-Packard to resell its RecoveryOne software bundled into an HP appliance, and with EMC, which uses Mendocino's technology as the core of its Recovery Point appliance. With agents for Unix, Linux and Windows, Mendocino is championing application-aware CDP. The Mendocino agent duplicates disk writes at the block level, sending this data to the appliance to journal. It also sends metadata on database checkpoints and other events, which the RecoveryOne appliance displays along the timeline when you choose a point at which you want to create a recovery view.

There are appliances for all price ranges as well. Lasso Logic, recently acquired by SonicWall, put together a series of CDP appliances that start around $2,000 for 160 GB of capacity and an optional integrated remote backup service; small and midsize businesses with minimal archiving needs could get continuous protection for roughly the cost of a tape autoloader. In terms of how much data can be stored for how long, that depends on how fast your data changes--the starting rule of thumb is three iterations of 50 GB of data for 30 days. Although Lasso Logic does disk-to-disk SQL Server and Exchange backups periodically, continuous protection is just for files.

Other vendors, capitalizing on Exchange's not entirely undeserved reputation for being difficult to restore, have developed CDP systems dedicated to protecting Exchange servers. Storactive's LiveServ, for example, meets the SNIA CDP definition for Exchange Server protection while adding easy extractions of individual mailboxes and/or messages.

Lucid8's DigiVault provides continuous protection but can't restore to unlimited points in time, while Mimosa Systems' NearPoint uses the unique technique of replicating each 5-MB transaction log file as it's filled to the protection server, where it can be played forward into a hybrid backup and message archiving system.

If you have to deal with executives calling to get messages restored to their mailboxes, one of these devices will make your life easier, especially if your users are vague about exactly where the message they want was last seen. Finding a lone message when you must restore the whole Exchange data store could take days, even with tools, like OnTrack's EasyRecovery Email Repair or Quest's Recovery Manager, that let you search a data store without mounting it on an Exchange server.

Although saving every change to all your data online forever might sound attractive, it's neither practical nor desirable. A simple CDP system will run its change journal as a FIFO (first in, first out) queue, posting the oldest changes to repository data irrevocably and discarding them from the journal as it reaches either an age or space limitation. If you want to restore older data, you'd better have made conventional backups.
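A FIFO change journal of this kind can be sketched as follows. It is a toy model under stated assumptions: the space limit is expressed as a maximum number of journal entries, and the class and attribute names are hypothetical.

```python
from collections import deque


class FifoJournal:
    """Toy change journal: the oldest changes are folded into the repository once the journal is full."""

    def __init__(self, max_entries: int):
        self.max_entries = max_entries   # stand-in for a space limit
        self.repository = {}             # block -> data; the oldest state still recoverable
        self.repository_time = None      # timestamp of the newest change folded into the repository
        self.journal = deque()           # (timestamp, block, data), oldest first

    def record_write(self, timestamp, block, data):
        self.journal.append((timestamp, block, data))
        while len(self.journal) > self.max_entries:
            ts, old_block, old_data = self.journal.popleft()
            self.repository[old_block] = old_data   # irrevocable: earlier versions are gone
            self.repository_time = ts


j = FifoJournal(max_entries=2)
for ts, data in [(1, b"v1"), (2, b"v2"), (3, b"v3")]:
    j.record_write(ts, 0, data)
print(j.repository_time, len(j.journal))  # 1 2
```

After the third write, nothing earlier than the state at time 1 can be restored from this system, which is why older data has to come from conventional backups.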

Some more sophisticated offerings, like Symantec's Backup Exec CPS, TimeSpring's TimeData and LiveVault's InControl, will consolidate their change journals. This reduces the number of restore points as your data ages and keeps, for example, unlimited versions for a day, daily versions for 10 days and weekly versions for a month. Experience shows most restore requests are for files that were recently modified, deleted or corrupted, so consolidated journals will let you fulfill these requests with great granularity while giving you access to older data with lower granularity.
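Journal consolidation amounts to thinning restore points by age. The sketch below applies the example policy from the paragraph above (everything for a day, daily for 10 days, weekly for a month, taken here as 30 days). The bucketing scheme and the function name are our assumptions, not a description of how any of these products work.

```python
DAY = 86_400  # seconds


def consolidate(restore_points, now):
    """Thin restore-point timestamps: keep all for a day, one per day to 10 days, one per week to 30."""
    recent = []
    buckets = {}
    for ts in sorted(restore_points):
        age = now - ts
        if age < 0 or age > 30 * DAY:
            continue                                       # in the future, or expired
        if age <= DAY:
            recent.append(ts)                              # full granularity for the last day
        elif age <= 10 * DAY:
            buckets[("day", age // DAY)] = ts              # newest point in each day-sized bucket wins
        else:
            buckets[("week", age // (7 * DAY))] = ts       # newest point in each week-sized bucket wins
    return sorted(recent + list(buckets.values()))


HOUR = 3_600
now = 40 * DAY
ages_in_hours = [2, 12, 53, 62, 360, 384, 960]
points = [now - h * HOUR for h in ages_in_hours]
kept = consolidate(points, now)
print(sorted((now - ts) // HOUR for ts in kept))  # [2, 12, 53, 360]
```

Both points from the last day survive, the two points from the same day-sized bucket collapse to the newer one, the two from the same week-sized bucket collapse likewise, and the 40-day-old point is dropped.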

One critical difference between file-oriented systems, like Microsoft's DPM and Symantec's Backup Exec 10d, and block/volume-oriented products is the restore methods available. Block systems will typically create a virtual volume from which you can mount and extract data. File-based offerings frequently provide more granular restore options that will let you search for all versions of a file without mounting multiple volumes. Some are even designed to let end users view, search and restore their own files through a Web or other user interface.

CDP can be a great solution for a whole host of data-protection problems, but it's still a maturing technology. With the exception of LiveVault, vendors are relying on server access controls and network security to protect data from prying eyes rather than including encryption in their products.

We don't recommend replacing conventional backups with a CDP offering. Keep your old backup system around for backing up systems and applications that have lower data-change rates and for protecting system, as opposed to application, data.

Some CDP systems support bare-metal restores. But even if brand-new CDP offerings were as good at that difficult job as conventional backup products, using CDP for system drives would clutter up the CDP data repository as a result of its tracking of every change to system temp files and logs.

The latest three-letter acronym to shake up the storage world is CDP. Continuous data protection products let IT recover a current version of a file--and with some offerings, e-mail messages or database transactions--in the wake of an accidental deletion or corruption. Some purists insist that to be called CDP, a product must provide infinite restore points. We say, stop misusing the term infinite and focus on what matters: Getting the business up and running after a mishap with minimal data loss.

As we discuss in "Data Do-Over," we consider products that replicate data to a backup repository in real time, then take periodic snapshots, worthy of the CDP moniker because they provide comprehensive protection, and that's what it's all about. Pseudo-CDP products, defined as those that don't send changes to the backup repository in real time but instead use, for example, a snapshot-then-export model, beat nightly backups even though you could lose a few minutes of data.

In "CDP: Can-Do Protection," we tested eight CDP and pseudo-CDP products to see which could best protect our file server. Microsoft DPM, LiveVault InControl and FilesX Xpress fall under the pseudo-CDP category; XOsoft's WANSync and TimeSpring's TimeData are true CDP. Availl and Tivoli CDP for Files store changes when files are saved; this is good for documents, not so good for databases or mail servers.

Our Editor's Choice, Symantec 10d, stakes out the middle ground by sending data to the backup server continually and creating hourly snapshot restore points. We also like 10d's ease of use, and its price can't be beat. Companies interested in CDP for Exchange and SQL Server should check out FilesX >>

