Storage Pipeline: Data-Replication Software
Enterprises spend a fortune setting up data-replication systems. We'll show you how to make the most of this critical element in your disaster-recovery plan, and offer overviews of available products.
May 20, 2005
We reviewed the products in our New Jersey partner labs; our test network included a Gigabit Ethernet switch and six servers running Windows 2000 and 2003 Server. We looked for maximum efficiency in WAN bandwidth and CPU utilization, as well as initial replication performance. We also evaluated ease of installation and management, features, failover and price.
Table: Replication Products
Considerations
When designing a disaster-recovery environment, pay attention to three main factors:
• RTO (recovery time objective) defines the maximum time your organization can do without a given app. Even if you have a replacement server ready, restoring a multigigabyte file store or database from tape or disk can take hours, while replicating data to a hot-spare server may let you fully recover in minutes or even seconds. Of course, keeping a hot spare costs more.
• RPO (recovery point objective) reflects what point in time your data must be restored to.
• DLO (data loss objective), which is directly related to RPO, defines how much data your organization can afford to lose. Although a DLO of zero is attainable, it may not be cost effective.
One factor to consider when defining your RPO is data reproducibility. Generally, if you lose documents like legal briefs, those who created them can generate functional replacements. However, online operations like sales, ATM transactions and stock trades may be lost forever. You also must consider performance. Depending on your requirements and budget, synchronous, asynchronous and snapshot-oriented replication can be used and affect both data reproducibility and performance. See "Distributed Storage: Extending the SAN Over the WAN," page 12, for an in-depth description of these techniques.
Also, remember that disaster preparedness isn't the only reason to implement a data-replication system. You can use these products to distribute information, such as price and inventory updates, to stores and branch offices, and you can collect data, like unit sales, from remote sites for consolidated reporting. In addition, organizations that have branch locations without local IT staff can replicate branch-office data to the central office for backup.
Helping Hands
We didn't perform head-to-head comparative tests or award an Editor's Choice. Rather, we aim to present an overview of what product options are available and rate each product on ease of use and management, features and failover time versus cost, bandwidth consumption, and CPU usage. Product reviews are in alphabetical order by type.
NSI Software Double-Take for Windows 4.4 Advanced Edition
NSI's Double-Take, among the first replication products for Windows, remains the market leader. This prototypical asynchronous replication system uses a file filter to capture data updates and both in-memory and on-disk journals to hold updates at the source until the target can receive them.
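To make the journaling idea concrete, here is a minimal Python sketch of a source-side journal that holds captured updates in memory and spills to disk when memory fills. It is not NSI's implementation: the class name, the memory_limit setting and the pickle-based disk format are our own assumptions, chosen only to illustrate how updates can wait at the source in capture order until the target can receive them.

```python
import collections
import os
import pickle
import tempfile


class UpdateJournal:
    """Hypothetical source-side journal: memory first, disk on overflow."""

    def __init__(self, memory_limit=4):
        self.memory_limit = memory_limit      # assumed knob: max updates held in RAM
        self.memory = collections.deque()     # in-memory journal, oldest first
        self.disk = tempfile.TemporaryFile()  # on-disk journal for overflow
        self.disk_read_pos = 0                # where the next unsent disk record starts
        self.disk_pending = 0                 # unsent records currently on disk

    def capture(self, path, offset, data):
        """Record one write intercepted on the source."""
        update = (path, offset, data)
        # Once anything has spilled to disk, newer updates must follow it
        # there, otherwise they would be sent ahead of older ones.
        if self.disk_pending or len(self.memory) >= self.memory_limit:
            self.disk.seek(0, os.SEEK_END)
            pickle.dump(update, self.disk)
            self.disk_pending += 1
        else:
            self.memory.append(update)

    def drain(self, send):
        """Pass every queued update to send() in capture order."""
        sent = 0
        while self.memory:
            send(self.memory.popleft())
            sent += 1
        self.disk.seek(self.disk_read_pos)
        while self.disk_pending:
            send(pickle.load(self.disk))
            self.disk_pending -= 1
            sent += 1
        self.disk_read_pos = self.disk.tell()
        return sent
```

In use, a file filter would call capture() for each intercepted write, and a sender would call drain() whenever the link to the target is available. Everything in memory is older than everything on disk, so draining memory first preserves write order.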
Double-Take was launched in the dark ages of Windows NT networking, before Microsoft added clustering to Windows Server, and many early installations weren't for long-distance replication but to provide failover between servers in the same data center! As a result, Double-Take's automatic failover capabilities are extensive. We especially liked its failover control center, which gave us a bird's eye view of our servers' states. In testing, Double-Take quickly determined we'd knocked our source server off the network and had the target server automatically assume the source's name, IP address and file shares. If the failed server had been more than just a file and print server, Double-Take could run scripts at failover to configure and start services on the target to have it take over additional tasks. We could even have the target server take over for multiple source file servers simultaneously, as long as all the share names were unique.
Table: Replication Software Features
After installation, we fired up the console and dived into creating a replication set by right-clicking on the source server. We chose the folders we wanted to replicate and created a connection to the target server. As you might expect from a product with such a long history, Double-Take has a host of configuration options. For example, we could specify data compression and whether we wanted files deleted from the target as they were deleted from the source.
Double-Take can run in many-to-one, one-to-many and daisy-chained configurations, but they're all implemented and displayed in the user interface as a series of one-to-one connections. As a result, it took more effort to build and maintain these configurations in Double-Take than in Replication Exec or WANSyncHA, which show these setups explicitly.
Rather than building knowledge of SQL Server and Exchange into Double-Take, NSI has developed an extensive series of application notes for supporting common applications and Windows services. The company also provides an external utility that does the heavy lifting when failing over to a backup Exchange server, including updating Active Directory.
• Double-Take for Windows 4.4 Advanced Edition, starts at $4,495 per server. NSI Software, (800) 775-4674, (201) 656-2121. www.nsisoftware.com
Software Pursuits SureSync, SPIAgent and Advanced Open File Support
If you're looking for a simple, lightweight way to replicate files, Software Pursuits' SureSync is just the ticket. In its basic configuration, SureSync periodically scans directories on source servers and copies changed files without requiring an agent on the target server.
Once we installed SureSync and got the console running, we created a relation between a directory on our source server and a directory on one of our target servers. The wizard helped us create a rule to specify which files to copy and whether we wanted to simply mirror the files from source to destination. The user interface for selecting multiple directories and file types made us type several include- and exclude-file specifications; we prefer point-and-click selections. Finally, we created a schedule to synchronize files every four hours, then sat back and watched the data fly.
A SureSync rule can use one of eight methods for synchronizing files, ranging from "copy all files from master to replica even if they already exist" to "delete all files on all replicas matching the wildcard" to full-blown multimaster replication: "copy changed files that change on any replica to all the other replicas."
Next, we installed the optional SPIAgent, which provides features such as replication-stream encryption and compression, and delta file replication that uses block checksums to replicate only changed blocks in a file. Although SureSync with SPIAgent will automatically synchronize files as they're modified, this is just a trigger to start a modified block synchronization, not asynchronous replication that journals the changes and maintains write-order integrity.
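For readers unfamiliar with delta replication by block checksum, the following Python sketch shows the general technique. It is an illustration only, not Software Pursuits' algorithm: the 4-KB block size, the SHA-256 checksum and all function names are assumptions on our part. The target reports a checksum per block, the source compares those against its own copy, and only the blocks that differ cross the wire.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; real products choose their own


def block_checksums(data, block_size=BLOCK_SIZE):
    """Return one checksum per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(source, target_checksums, block_size=BLOCK_SIZE):
    """Compare the source against the target's checksums.

    Returns (block_index, block_bytes) pairs for blocks that are new
    or whose checksum differs from the target's.
    """
    changes = []
    for index, checksum in enumerate(block_checksums(source, block_size)):
        if index >= len(target_checksums) or checksum != target_checksums[index]:
            start = index * block_size
            changes.append((index, source[start:start + block_size]))
    return changes


def apply_changes(target, changes, new_length, block_size=BLOCK_SIZE):
    """Patch the target copy with the changed blocks and fix its length."""
    patched = bytearray(target)
    if len(patched) < new_length:
        patched.extend(bytes(new_length - len(patched)))  # grow with zero fill
    del patched[new_length:]                              # or shrink
    for index, block in changes:
        start = index * block_size
        patched[start:start + len(block)] = block
    return bytes(patched)
```

Changing one byte in a large file produces a single changed block, so only that block's worth of data needs to travel rather than the whole file. Note that this sketch, like the trigger-based synchronization described above, says nothing about the order in which writes occurred.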
SureSync does provide bandwidth throttling, but we had to specify it as a percentage of our source-server bandwidth. Unfortunately, it assumed Gigabit Ethernet links on remote servers, and we wanted to allocate half a T1 line, but the minimum of 1 percent would be 10 Mbps. Still, SureSync's flexibility and low price make it a good choice for apps that don't require write-order fidelity or short RPOs.
• SureSync, $495 per source and destination server; SPIAgent, $150 per source and destination server; Advanced Open File Support, $299 per source machine. Software Pursuits, (800) 367-4823, (650) 372-0900. www.softwarepursuits.com
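The throttling mismatch in the SureSync review comes down to simple arithmetic, worked through in the short Python sketch below. The function name is ours, and we assume the standard T1 line rate of 1.544 Mbps; the 1 percent minimum and the Gigabit Ethernet link speed come from our testing.

```python
GIGABIT_BPS = 1_000_000_000  # Gigabit Ethernet link speed, bits per second
T1_BPS = 1_544_000           # T1 line rate, bits per second (assumed standard rate)


def throttle_floor_bps(link_bps, min_percent=1):
    """Lowest rate available when throttling is a whole percentage of the link."""
    return link_bps * min_percent // 100


floor_bps = throttle_floor_bps(GIGABIT_BPS)  # smallest throttle setting available
wanted_bps = T1_BPS // 2                     # half a T1, what we wanted to allocate
overshoot = floor_bps / wanted_bps           # how far the floor exceeds the target
```

The lowest available setting works out to 10 Mbps against a target of 772 Kbps, so the floor is roughly 13 times what we wanted to allocate.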
Veritas Replication Exec 3.1
Veritas has wisely renamed its asynchronous file-based replication tool back to Replication Exec and added a user interface consistent with its popular Backup Exec.
A setup wizard asked if we wanted a mirror, distribution (one-to-many) or centralization (many-to-one) job. Specifying the files and folders to include required more mouse clicks than it should have. Compared with Double-Take or WANSyncHA, Replication Exec has fewer configuration options and lacks compression and the ability to force file-checksum checks during initial synchronization when a job starts.
We were disappointed in Replication Exec's monitoring as well. Other tools would tell us how much data was backed up in the journal buffers or what percentage of the initial sync was complete, while Replication Exec told us only that the job was running and when the sync was complete.
Replication Exec doesn't have any application-specific features to support Exchange or SQL Server, and Veritas' white paper on Exchange recovery outlines a lengthy process that will take most admins an hour.
Veritas has integrated Replication Exec with its Backup Exec so that users can create backup jobs that are linked to replication sets. Backup Exec can then check that the replica it's backing up is current when running the backup and report problems with the replication in the Backup Exec console.
Now that it's got the replication basics down, Veritas should develop a failover process. WANSyncHA and Double-Take are both well ahead in that area.
• Veritas Replication Exec 3.1, starts at $1,495 per Windows Server. Veritas Software, (800) 327-2232, (650) 527-8000. www.veritas.com
XOsoft WANSyncHA
XOsoft's WANSyncHA is the friendliest, if most clunkily named, product of this group. WANSyncHA's user interface made it easy for us to configure one-to-one, one-to-many and daisy-chained replication scenarios. Behind the scenes is a file-filter asynchronous replication engine that runs on Solaris and AIX as well as Windows.
Rather than build a single all-purpose application, XOsoft has turned its core technology into an array of variations on a theme. WANSync is the basic replication product; application-specific versions add intelligence for SQL Server, Exchange and Oracle servers. WANSyncHA adds automatic failover, and WANSyncCD is optimized for content-distribution applications with added support for many-to-one replication scenarios. Finally, Enterprise Rewinder, which uses the XOFS file system filter driver for continuous backup, let us roll a server back to a point before data was corrupted or deleted.
After running the setup program and typing the ridiculously long license key, we started WANSyncHA's console--without the usual reboot--and created file and Exchange replication scenarios. To set up a session, we right-clicked in the left of three vertical panes and added a scenario, specified the hosts in the middle pane, and specified the directories to replicate and connection options in the rightmost pane.
We found all the options we wanted, from using block checksums to select data to replicate, to data compression, replication scheduling and bandwidth throttling. The latter elicited our one complaint: We could specify line rates from 56 Kbps to 100 Mbps, but only by choosing one of 12 preselected rates that mimic common line rates but don't provide flexibility.
WANSyncHA is the only product we reviewed that's truly application-aware. While most rivals can protect Exchange and SQL Server, they treat these databases as just another set of files. But to protect our Exchange server with WANSyncHA, we simply created an Exchange scenario and let it automatically discover the Exchange database locations, registry entries and other resources. Very convenient.
WANSyncHA also provided a unique rewind feature that let us roll our target server back to a point before data was damaged. When we rewound our Exchange server, we could choose any of several Exchange database checkpoints WANSyncHA displayed.
• WANSyncHA Server, starts at $3,500; WANSyncHA Exchange/SQL, $5,200. XOsoft, (866) WANSYNC, (781) 685-4965. www.xosoft.com
Veritas Storage Foundation with Veritas Volume Replicator
Veritas Volume Replicator (VVR) optionally extends Storage Foundation's mirroring across IP networks. Storage Foundation, previously known as Volume Manager, extends the Windows logical disk manager to add features like N-way mirroring, split-mirror snapshots and dynamic expansion of RAID sets.
The block-based VVR supports both synchronous and asynchronous replication. However, unlike truly synchronous array-based systems, when in synchronous mode VVR sends a write-complete message to the host application before it receives a write-complete message from the remote host. VVR writes the data to a replicator log on the primary host and data volume, and sends the data to the secondary host immediately. The secondary host acknowledges it has received the data, and VVR sends a write complete to the application. Finally, when the data has been written to the secondary server, it sends a write complete to the primary, which removes the data from its logs.
This modified synchronous replication reduced the delay we typically found with synchronous setups, improving app performance while only slightly increasing the risk of the secondary being out of date. In our testing, we didn't see a significant performance loss, even with moderate latency in the link.
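The difference between these acknowledgment schemes is easiest to see as a latency sum. The Python sketch below is a simplified model, not a description of VVR internals: the mode names are our own labels, the timing parameters are illustrative, and the "async" branch assumes the application is released as soon as the local log write finishes.

```python
def app_write_latency_ms(mode, log_write_ms, one_way_ms, remote_write_ms):
    """Estimate how long the application waits for one write.

    mode is one of our own labels (not Veritas terminology):
      "async"       - acknowledge after the local log write (assumed behavior)
      "receipt_ack" - acknowledge once the secondary confirms it received the data
      "write_ack"   - acknowledge only after the secondary has written to disk
    """
    round_trip_ms = 2 * one_way_ms
    if mode == "async":
        return log_write_ms
    if mode == "receipt_ack":
        return log_write_ms + round_trip_ms
    if mode == "write_ack":
        return log_write_ms + round_trip_ms + remote_write_ms
    raise ValueError(f"unknown mode: {mode}")
```

In this model the receipt-acknowledged mode described above saves exactly the remote disk-write time on every application write compared with a fully synchronous scheme. The trade-off is the short window in which data has reached the secondary host but has not yet been written to its disk.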
Installing VVR was painless because Veritas sent us a field engineer, standard operating procedure for all VVR customers. We quickly created volume replication groups and selected targets on our secondary servers. Block-based VVR's replication logs require a dedicated volume for each replication group, rather than the usual files, which means you'll need to do a fair amount of upfront planning.
VVR is the only product we reviewed that let us choose safety over app performance. We could set VVR to return a disk-write failure when secondary servers are offline and have it stall app writes when its journal fills up, ensuring no data is created that isn't protected. If you haven't chosen these options and the journal overflows, say because the line went down, VVR will keep a modified block bitmap and replicate the modified blocks when it's able to do so. Nice.
• Storage Foundation High Availability for Windows with Veritas Volume Replicator, starts at $3,995. Veritas Software, (800) 327-2232, (650) 527-8000. www.veritas.com
Softek Replicator
Softek Replicator takes a block asynchronous approach to data replication. As a host application writes to a protected volume, Replicator journals disk writes to a memory structure Softek humorously calls the BAB--for Big Asynchronous Buffer--and updates its PSTORES file's bitmap as to which disk blocks have been modified. It then sends the data to the target system.
If the traffic level exceeds the available bandwidth or the target server is unavailable and the memory buffer overflows, Replicator must resynchronize the volumes by sending all the blocks marked as dirty in PSTORES. Replicator will similarly resync dirty blocks if the source server crashes and the in-memory buffer is lost. Unlike file-based products, Replicator can't throttle its bandwidth utilization, and the Windows version won't journal to disk.
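The interplay between a bounded memory buffer and a dirty-block bitmap can be sketched in a few lines of Python. This is a toy model under our own assumptions, not Softek's code: a deque stands in for the BAB, a set of block numbers stands in for the PSTORES bitmap, and the class and method names are hypothetical.

```python
import collections


class BlockReplicator:
    """Toy model: bounded in-memory journal plus a persistent dirty-block map."""

    def __init__(self, buffer_capacity):
        self.capacity = buffer_capacity
        self.buffer = collections.deque()     # stand-in for the in-memory buffer
        self.pending = collections.Counter()  # buffered writes per block
        self.dirty = set()                    # stand-in for the dirty-block bitmap
        self.overflowed = False

    def write(self, block_no, data):
        """Record an application write to a protected volume."""
        self.dirty.add(block_no)              # the bitmap is always updated
        if self.overflowed:
            return                            # journal already lost; bitmap only
        if len(self.buffer) >= self.capacity:
            self.buffer.clear()               # buffer overflow: ordered journal is gone
            self.pending.clear()
            self.overflowed = True
            return
        self.buffer.append((block_no, data))
        self.pending[block_no] += 1

    def transmit(self, read_block, send):
        """Send journaled writes in order, or resync dirty blocks after overflow."""
        if self.overflowed:
            for block_no in sorted(self.dirty):
                send(block_no, read_block(block_no))  # current contents, not history
            self.dirty.clear()
            self.overflowed = False
            return
        while self.buffer:
            block_no, data = self.buffer.popleft()
            send(block_no, data)
            self.pending[block_no] -= 1
            if self.pending[block_no] == 0:
                self.dirty.discard(block_no)
```

While the buffer keeps up, writes reach the target in the order the application issued them. After an overflow the model can only send the current contents of every dirty block, in block-number order, so write order is not preserved until the resync finishes.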
Given enough bandwidth, Replicator's design delivered an RPO of just a few seconds. But in marginal bandwidth situations, we'd prefer a system that journals to disk so that disk I/O peaks don't force it over the cliff to resynchronizing the entire volume. Replicator's volume approach has a few other limitations: We couldn't replicate the system volume or any volumes that contain paging files. Although we could replicate between systems running different Windows versions, the limitations imposed by variations in NTFS are so significant we don't recommend it.
In addition to simplifying administration by grouping volumes to be replicated, Replication Groups ensure data integrity for applications that spread their data across multiple volumes by imposing write-order integrity across the entire group. Most of the products we reviewed provide logging and error reporting at the source and/or destination server. Softek Replicator took that one step further with a centralized collector server that stores this data along with system and replication set information in an MSDE database.
• Softek Replicator Unix V2, starts at $3,495; Softek Replicator Windows V2 (tested), starts at $2,495; Softek Replicator z/OS V3, starts at $22,401. Softek Storage Solutions, (703) 288-5800. www.softek.com
LeftHand Networks NSM 150 with SAN/iQ
LeftHand Networks' SAN/iQ turns an industry-standard 1U Xeon server into the full-featured NSM 150 IP storage array by presenting the Xeon's four 250-GB ATA drives to servers via iSCSI and LeftHand's proprietary AESP protocol, which LeftHand developed before the iSCSI standard was released. As an iSCSI SAN array, the NSM 150 covers all the bases with multiple RAID levels, snapshots, dynamic LUN expansion and more.
SAN/iQ has two replication features: For high bandwidth environments, we could configure a group of NSMs into a cluster, then create volumes to have two- or three-way replication. A replicated volume will have its replicas spread across all the NSMs in the cluster and will synchronously replicate changes among them.
The other option is LeftHand's Remote IP Copy feature, which creates copy-on-write snapshots of volumes on a schedule and then replicates snapshots to a remote volume on another NSM.
LeftHand normally installs NSM 150s as part of the purchase, so its SE racked up our two NSM 150s and walked us through configuration.
Because each volume has a separate snapshot schedule, Remote IP Copy may not be the best choice for apps like Exchange or SQL Server that store same-database elements across multiple volumes. Once LeftHand supports Windows Volume Shadow Copy Service, it will be able to coordinate the snapshots. Until then, we recommend snapping the database volume a couple of minutes before the log volume as the reverse may create a data-integrity problem.
• LeftHand SAN NSM 150, $25,400 (includes Xeon server). LeftHand Networks, (866) 4-IPSANS, (303) 449-4100. www.lefthandnetworks.com
Howard Marks is founder and chief scientist at Networks Are Our Lives, a network design and consulting firm in Hoboken, N.J. Write to him at [email protected].
The ideal for data replication is to have up-to-the-second copies--right down to the latest online transaction and last byte of e-mail--all stored at an ultrasecure location far, but not too far, from your main data center. Unfortunately, like "happily ever after," it's an elusive goal. Not only is this level of backup cost-prohibitive, the technology required to achieve it suffers from limitations.
We set out to explore data-replication options by creating a fictional law firm looking to link to a remote disaster-recovery site, and we invited a wide range of vendors to submit their systems to our New Jersey partner labs. This is not a head-to-head comparative; rather, we sought to present the major categories of replication software.
NSI Software's Double-Take for Windows 4.4, Software Pursuits' SureSync, Veritas Software's Replication Exec 3.1 and XOsoft's WANSyncHA Server all perform scheduled file copies and file-filter asynchronous replication. Softek Storage Solutions' Softek Replicator and Veritas' Volume Replicator cover the block-based replication angle, and LeftHand Networks' LeftHand SAN NSM 150 with SAN/iQ performs snapshot replication.
We also examine other factors, like recovery-time objective, recovery-point objective and data-loss objective as well as data reproducibility. Although you may not achieve a fairy tale ending after a catastrophic system failure, our tips can help ensure you don't get cast as the villain either.
To test replication offerings, we created a fictional law firm and emulated an enterprise data center and a remote disaster-recovery location. In each location we set up a Fast Ethernet LAN with a Dell 1600SC serving as a file server and domain controller, and a P4 2.4-GHz white-box server running Exchange 2003. We connected the two LANs through a Shunra Virtual Enterprise WAN emulator configured to act like a T1 line with 10 ms of latency.
We then replicated a 4-GB file system and 2-GB Exchange database using each product. Once initial replication was complete, we ran scripts that used Word and Excel to modify a group of files on the source file server and a script that sent e-mail to the Exchange server; we monitored the amount of data each product sent across the WAN link by querying the SNMP counters on our Extreme Summit7i switch. We ran an additional script that modified a 1-GB Access database to check each participant's ability to replicate only the changed portions of files.
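To put the initial-replication load in perspective, the back-of-the-envelope Python sketch below estimates how long the test data sets would take to cross the emulated T1. The assumptions are ours: a 1.544-Mbps line rate, decimal gigabytes, no compression and no protocol overhead. The function name and the efficiency parameter are hypothetical conveniences, not measurements from our tests.

```python
T1_BPS = 1_544_000  # assumed T1 line rate, bits per second
GB = 10**9          # decimal gigabyte, in bytes


def transfer_hours(size_bytes, link_bps=T1_BPS, efficiency=1.0):
    """Hours needed to move size_bytes over the link.

    efficiency is the assumed fraction of the line rate that is usable
    (1.0 means no overhead; values below 1.0 lengthen the transfer).
    """
    return size_bytes * 8 / (link_bps * efficiency) / 3600


file_system_hours = transfer_hours(4 * GB)  # the 4-GB file system
exchange_hours = transfer_hours(2 * GB)     # the 2-GB Exchange database
total_hours = file_system_hours + exchange_hours
```

Under these assumptions the 4-GB file system alone needs well over five hours and the two data sets together more than eight, which is why initial replication performance and compression figure in our evaluation criteria.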
All Storage Pipeline product reviews are conducted by current or former IT professionals in our Real-World Labs or partner labs, according to our own test criteria. Vendor involvement is limited to assistance in configuration and troubleshooting. Storage Pipeline schedules reviews based solely on our editorial judgment of reader needs, and we conduct tests and publish results without vendor influence.