Adventures In Data Replication
Jeff Hall

Replication has been practiced in mainframe and minicomputer environments for at least a decade. However, only recently have vendors produced general tools that perform replication on behalf of applications based on predefined rules. Users and application developers like replication because it offers high database availability, features fault tolerance, provides subsets of databases to remote systems and supports peer-to-peer database access.

You must recognize that data replication is not just a networking issue, but also an operations and application issue. The importance of coordinating these areas cannot be overstated. Without cooperation, a data replication project will result in either failure or unacceptable processing levels.

What Is Replication?

In a nutshell, replication is the copying of information to multiple computer systems in such a way that the information is consistent across systems. There are two basic types of replication, synchronous and asynchronous.

Synchronous replication uses the two-phase commit process available with most Relational Database Management System (RDBMS) products. In a two-phase commit environment, when an update to the master database is processed, the master system connects to all other systems (slave databases) that require the update, locks those databases at the record level and then updates them simultaneously. If any connection to another system is not available, the update will be rejected.

Asynchronous replication (also referred to as "store-and-forward") comes in two types, periodic and aperiodic. Periodic replications are executed at specific intervals; aperiodic replications are executed only when necessary (usually based on a triggering mechanism).
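The all-or-nothing behavior of the synchronous approach described above can be illustrated with a minimal Python sketch. It is not any vendor's implementation: the `Replica` class, the `synchronous_update` function and the in-memory dictionaries standing in for databases are all hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical in-memory stand-in for one slave database."""
    name: str
    reachable: bool = True
    rows: dict = field(default_factory=dict)
    locked: set = field(default_factory=set)

    def prepare(self, key):
        """Phase 1: vote yes only if reachable and the record is not locked."""
        if not self.reachable or key in self.locked:
            return False
        self.locked.add(key)  # record-level lock
        return True

    def commit(self, key, value):
        """Phase 2: apply the update and release the record lock."""
        self.rows[key] = value
        self.locked.discard(key)

    def rollback(self, key):
        """Release the record lock without changing any data."""
        self.locked.discard(key)


def synchronous_update(master, replicas, key, value):
    """Apply the update everywhere or nowhere; return True if it committed."""
    prepared = []
    for replica in replicas:
        if replica.prepare(key):
            prepared.append(replica)
        else:
            # One unavailable system is enough to reject the whole update.
            for done in prepared:
                done.rollback(key)
            return False
    master[key] = value
    for replica in prepared:
        replica.commit(key, value)
    return True
```

In this sketch, marking a single replica as unreachable causes `synchronous_update` to return `False` and leaves the master and every other replica unchanged, which is the dependency on availability that the synchronous approach implies.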
Synchronous Replication Issues

Because the synchronous approach requires access to all slave databases at the time of update, 100 percent network availability is required to ensure that all transactions are completed successfully. The implication of 100 percent availability is obvious: without the network connections, a synchronous replication will not process, causing all update processing to be suspended and potentially locking the application.

Network managers working with synchronous replication need to plan for high network availability with some sort of circuit redundancy, either high-speed dial-back (Switched 56 or higher speed) or self-healing networks. The application's users are the people to decide what type of redundancy will be required (with the network group's assistance) so that an appropriate cost justification can be prepared.

Transaction volume and the speed at which transactions occur and are processed will dictate network requirements. High-speed connections do not guarantee successful implementation of an application. If the slave database's processor is not capable of processing the transactions in a timely manner, the network capacity will only increase the transaction processing backlog of the slave system and slow down the processing of the master system.

Oracle supports replication two ways: a real-time data replication service and a store-and-forward process that will be discussed later. Under the first approach, Oracle assumes that the network and connected computer systems are extremely reliable. When an update is executed, a trigger associated with the updated table is activated, causing the update to invoke the two-phase commit process with all other databases that require the update.

Asynchronous Replication Issues

Asynchronous replication is what most vendors recommend and is the type of replication currently in vogue.
Asynchronous replication typically involves some form of replication engine or server that tracks all updates and then ensures that these updates are shared with all other systems. If a system is unavailable, the server continues to track the unapplied updates and will update the system when it is available. Until the replication process has been completed with all systems, some databases may not reflect current information.

Asynchronous replication can be highly demanding on a network because of the number of variables that need to be considered. Besides transaction volumes, other variables include line speeds, type of connections, speed and number of processors involved, data timeliness and number of replication servers. With any replication server approach, the server should be situated on the same network segment as the database server to speed the capture of replication transactions. You probably also will find it necessary to run multiple network interface cards in the database and replication servers to ensure performance of both the capture and replication processes.

While the implementation of asynchronous replication eases the real-time, immediate demands on a network, the overall demand on the network is either shifted or spread out. Shifting demand could involve processing replication overnight; in that case, the network capacity needs to be sufficient to handle high-volume updates to the remote databases. In spreading out demand, the network manager needs to provide reliable connections to ensure that replication queues do not become overly large, which would cause a high-volume update situation.

Various RDBMS vendors have tackled the asynchronous replication issue. Sybase and Software AG implement asynchronous replication through a replication server. Ingres and Informix are also taking the replication server approach; however, neither vendor has officially released its replication product.
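As a rough illustration of the store-and-forward idea, the following Python sketch keeps one queue of unapplied updates per target system and drains a queue only when that target is reachable. It is a simplified model, not the design of any product named in this article; the `ReplicationServer` class, its method names and the dictionaries standing in for remote databases are assumptions made for the example.

```python
from collections import deque


class ReplicationServer:
    """Hypothetical store-and-forward engine that tracks updates per target."""

    def __init__(self, targets):
        # One FIFO queue of unapplied updates for each target database.
        self.pending = {name: deque() for name in targets}

    def capture(self, key, value):
        """Record an update made on the master for every target."""
        for queue in self.pending.values():
            queue.append((key, value))

    def forward(self, name, database, available):
        """Drain one target's queue if it is reachable; return the count applied."""
        if not available:
            return 0  # keep tracking; the target stays out of date for now
        queue = self.pending[name]
        applied = 0
        while queue:
            key, value = queue.popleft()
            database[key] = value  # updates are applied in capture order
            applied += 1
        return applied

    def backlog(self, name):
        """Number of updates still waiting for one target."""
        return len(self.pending[name])
```

Calling `forward` on a schedule corresponds to periodic replication, and calling it from a trigger corresponds to aperiodic replication. The value returned by `backlog` is the queue size that grows when connections are unreliable.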
Oracle's second form of replication is a store-and-forward process that requires the definition of a special user-defined table and the development of a user-written process that periodically invokes itself and propagates the updates.

In addition to RDBMS vendors, there are a number of software vendors using replication, including Lotus Development Corp. with Notes and cc:Mail and Jensen Jones with Commence.

Lotus Notes

Notes creates the biggest networking problems when there are multiple replication servers. In this scenario, updates are made anywhere on the network and the Notes servers replicate changes between one another to remain synchronized. As a result, in a system with 10 Notes servers, every server can initiate replication, so that 10 replications can be driven simultaneously between all servers. To minimize the impact Notes has on a network, the number of servers that initiate replication should be reduced to as few as possible, preferably to one replication server.

The replication solution I used with one client was to enforce a manual procedure that required users to replicate their information at least once a week from their office. This allowed the out-of-office personnel to use the high speed of the local area network to expedite the replication process. Dial-up replication was forbidden, except in extreme emergencies. The initial implementation of this solution caused problems on Monday and Friday mornings; over time, however, the users began to spread out the replication processing as they found more open processing times. This solution did have a downside in that some users did not have current information, but we attempted to minimize this by controlling when updates were posted to the Notes databases.

Lotus cc:Mail Mobile

Lotus cc:Mail Mobile is another example of asynchronous replication. In this environment, the cc:Mail Router stores directory updates and messages for remote users for later retrieval.
By the same token, the cc:Mail Mobile client functions as a local post office, so that messages created by the client are stored locally until sent. When a user dials in, cc:Mail Mobile sends the electronic mail created by the remote user and then the Router delivers the directory updates and messages to the remote user.

The problem we have been seeing is lengthy transfer times for messages that include attachments such as word processing files and spreadsheets. While cc:Mail does have extensive filtering capability built in, being able to reject messages based on size or header information is not always an option for those on the road. Even though cc:Mail allows for attachment compression, the majority of users do not compress attachments and probably would not know how to expand a compressed attachment. While compression will speed transfer times, message storage on the server and local system is also a problem, since all received mail is stored on the local system and transmitted mail is not deleted automatically from the server. As a result, the user must manage both a local post office and his or her server-based in-box. One solution is to allow dial-up access to our Novell NetWare 3.11 LAN, thus allowing remote users to access cc:Mail as if they were on the network. Another solution is user retraining.

Jensen Jones Commence

Jensen Jones Commence is a customizable personal information manager (PIM). Commence allows for a centralized repository for client, engagement, appointment and other consulting practice information. Commence uses a Windows-based arbiter to control the updates to the central database and to control the replication of data to individual systems.

For remote users, Commence offers two ways to perform replication: direct replication and indirect replication. Direct replication is similar to the standard asynchronous replication model described earlier.
Indirect replication allows for the creation of a replication working set file that can be transmitted via electronic mail and then processed by the remote client. The working set file contains either complete or incremental data, depending on the option selected. Commence automatically compresses the data to reduce file size, thus reducing transmission time. Under our implementation, we use cc:Mail to transmit the replication working sets to remote users. This scheme allows us to avoid costly telecommunications solutions and leverage our existing technology infrastructure.

Jeff Hall is a manager in McGladrey & Pullen's Information Technology practice in Minneapolis, Minn. He can be reached via the Internet at 75270.2163@compuserve.com.

Network Disaster Recovery Planning
Jeff Hall

Replication can be a big part of an effective disaster recovery solution. If the databases in question are replicated in real time to a backup server at a different site, then switching the users over to that site can be quick and does not require restoring data. Replication is a key technology that helps make disasters easier to overcome.

Synchronous replication always requires a rapid level of recovery, since this type of replication is real time. Organizations considering synchronous replication had better plan on some form of self-healing network or high-speed dial backup capability. Asynchronous replication is more forgiving but, depending on user requirements, may be just as demanding as synchronous replication.

The key to successful network disaster recovery planning is the use of a comprehensive planning methodology. The critical components of a network disaster recovery planning methodology are determining the business recovery requirements by application and assessing disaster threats.
To assess application recovery requirements, the immediacy of application recovery needs to be determined and the priority of recovery needs to be agreed upon. As an example, while the payroll application is very critical to all employees, physical recovery of that application need not be immediate as long as the last payroll check information is available for reprinting until the application can be recovered. The assessment of disaster threats involves examining your organization and determining where a disaster is most likely to strike your facility. The outcomes of these two analyses will determine which network recovery alternatives are available and what the alternative networking requirements are.





