The IGS group had designed the unique failover architecture for its SAP, product-warranty and other servers in an IBM RTP campus building that had been powered down for renovations over the Memorial Day weekend. When the IBM Sysplex servers in Building 500 went black, a new IBM Parallel Sysplex server in Building 302 picked up the applications and their transactions automatically. IBM's Commercial Desk Top (CDT) and Server manufacturing lines in RTP that use these applications for order processing didn't blink, nor did IBM's sales and distribution operations in RTP and in Guadalajara, Mexico, which also use these systems.
"Normally, you have your own individual backup" for these types of systems, says Tim Corcoran, an ERP/SAP architect for the IGS group in RTP. "We were not able to find anyone who had ever done anything similar to this." The project gave IBM a new, permanent failover architecture for the SAP manufacturing systems that takes only seconds to execute, Corcoran says. Other organizations within IBM, such as its server group, are designing similar failover architectures with the Parallel Sysplex, too.
Ready to Act
When IBM's Personal Computing Division (PCD) approached IGS in January for help keeping the ERP applications for its CDT and server manufacturing online over the Memorial Day weekend, Corcoran and his team at IGS had already been mapping out a long-term backup plan for the SAP servers. PCD's requirements added a sense of urgency to the concept: at stake were manufacturing lines for IBM's CDTs and servers in RTP as well as the U.S. distribution of ThinkPad laptops built in Mexico. Just one hour of downtime on the manufacturing line costs IBM $48,000 in salaries for workers on the line, and that's not including managers and support, Corcoran says.
There were some tense moments early on in the project. In February, after Corcoran's group had ordered much of the server and network equipment, the PCD group added another request: that its three in-house warranty applications in Building 500 also be up and running during the holiday shutdown. The tricky part was mixing these internal applications with the notoriously demanding SAP suite, which generates a large amount of I/O traffic between application servers and the host.
Interestingly, IBM went with Gigabit Ethernet for the network instead of its proprietary Escon technology because it wasn't able to get the Escon fiber in time for the big weekend. "There are not many people running Sysplex on Ethernet," Corcoran says. "We were used to Escon, but we learned the gigabit technology."
Because Gigabit Ethernet is faster than Escon, IBM set the parameters in its SAP applications to handle the smaller packets. "If the SAP code detected that a link was down, it can reroute to a backup. We wanted to minimize the impact on the user and his or her work," Corcoran says.
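The rerouting Corcoran describes can be pictured with a short sketch. This is only an illustration of the general idea (try the primary link, fall back to a backup if it is down), not SAP's or IBM's actual code; the link names and the `pick_route` function are hypothetical.

```python
def pick_route(links, is_up):
    """Return the first link, in priority order, that is currently up.

    `links` is an ordered list of link names (hypothetical) and
    `is_up` is a callable that reports whether a link is usable.
    """
    for link in links:
        if is_up(link):
            return link
    raise RuntimeError("no usable link to the host")


# Hypothetical link names; the primary is listed first.
links = ["gigabit_primary", "gigabit_backup"]

# Simulated link status: the primary has gone down.
status = {"gigabit_primary": False, "gigabit_backup": True}

route = pick_route(links, lambda link: status[link])
print(route)
```

Because the check happens before traffic is sent, the user's session simply continues over the backup path, which matches the stated goal of minimizing the impact on the user.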
And running SAP on a Parallel Sysplex required different workload management from that of the traditional MVS (Multiple Virtual Storage) system. Corcoran and his team assigned different logon groups to different CPUs in the Sysplex, based on their processing requirements. SAP sales and distribution, for instance, typically generate the heaviest traffic load, so they were assigned to a newer machine. Doing this prevents cross-system interference, where two users of the same SAP application simultaneously try to access the same records and cause a deadlock. "By putting similar users and workloads on one system, you can minimize this," Corcoran says.
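As a rough illustration of that assignment, the sketch below places each logon group on exactly one system, heaviest group first, so users of the same application stay together. The group names, load figures, capacities and the `assign_logon_groups` function are all assumptions made for the example; the article does not describe how IBM actually computed its assignments.

```python
def assign_logon_groups(group_load, system_capacity):
    """Place each logon group on one system, heaviest group first.

    Each group goes to whichever system has the most spare capacity
    at that moment, so a group is never split across systems.
    """
    remaining = dict(system_capacity)
    placement = {}
    for group in sorted(group_load, key=group_load.get, reverse=True):
        target = max(remaining, key=remaining.get)
        placement[group] = target
        remaining[target] -= group_load[group]
    return placement


# Hypothetical relative loads and capacities (arbitrary units).
group_load = {"sales_distribution": 60, "materials": 25, "finance": 15}
system_capacity = {"newer_machine": 100, "older_machine": 50}

placement = assign_logon_groups(group_load, system_capacity)
for group, system in placement.items():
    print(f"{group} -> {system}")
```

With these made-up numbers, the heaviest group (sales and distribution) lands on the newer machine, mirroring the choice described above. Keeping a whole group on one system is what limits contention between systems for the same records.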
Perhaps the biggest trade-off of the architecture is that while it is a true parallel system, it uses data sharing, meaning the two systems use a single image of the data. Corcoran's group is exploring ways to mirror the data, though. "We'd like to have a logical copy of the data in sync with the production system so we could fail over to the copy while we do table and other routine maintenance [during the time] the production image of data is offline," Corcoran says.
IT Department Info
- Size of IBM Global Services IT Staff: 30
- Corcoran's Average Workweek: 40 hours (80+ hours during Memorial Day project)
- Biggest Challenge: Helping IBM's Personal Computing Division and the rest of the company remain competitive, and staying on the leading edge with new solutions to old problems.
- Latest Projects: Investigating the elimination of SP2 frames by bringing up two Linux LPARs (logical partitions) and running the application servers in them.
- Coolest Part of the Job: Always learning something new and working with people who now have a whole new set of skills.