
A SYSTEM WITH MUSCLE
Even though we had Marathon Technologies' Endurance 4000 in our lab for only a few days, we saw enough to be impressed by the engineering that went into this system. It differs from the other solutions we tested because it runs a single image of Windows NT Server 4.0, unmodified except for the addition of device drivers; the others run a complete second copy of NT on the second server. Take note, however: Endurance 4000 is not cheap--it's $25,000, and that price does not include the four servers and multiple copies of NT.
But if you worry about hardware failures, Endurance 4000 could be for you. Marathon Technologies supports hardware from all major vendors; the system we tested was based on Intel OEM servers. Endurance 4000 consists of two Compute Elements (CEs), each with main memory and a single CPU that runs NT. Each CE is connected to two I/O processors (IOPs). Each IOP also runs a copy of NT and contains the network interface cards and mass storage subsystem (see diagram at right). Our server came with a single 1-GB SCSI disk in each system. This configuration is unique because NT and every application run in lockstep on both CEs. If one CE fails, there is no interruption in service, nor is there one if an IOP fails.
We must, however, highlight one very important weakness. If the application running on Endurance 4000 hangs, or NT displays the BSOD--blue screen of death--because of a bad Service Pack, your system will go down. Other failover solutions allow the application to continue running after a brief pause of one to five minutes, depending on which applications need to be brought online.
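The trade-off between hardware and software faults can be sketched as a toy model. This is our own illustration, not Marathon's logic: the class and field names are hypothetical, and we assume service continues as long as at least one CE and at least one IOP remain up, which goes slightly beyond the single-component failures described here.

```python
from dataclasses import dataclass, field


@dataclass
class LockstepSystem:
    """Toy model of a two-CE, two-IOP lockstep system (names are hypothetical)."""
    # Assumption: one surviving CE plus one surviving IOP is enough for service.
    ce_up: list = field(default_factory=lambda: [True, True])
    iop_up: list = field(default_factory=lambda: [True, True])
    # Both CEs run the same NT image in lockstep, so a hang or blue screen
    # hits both at once; redundancy does not help with software faults.
    software_ok: bool = True

    def in_service(self) -> bool:
        return self.software_ok and any(self.ce_up) and any(self.iop_up)


system = LockstepSystem()
system.ce_up[0] = False    # lose the first CE
system.iop_up[1] = False   # lose the second IOP
print(system.in_service())  # hardware faults are masked

system.software_ok = False  # NT blue-screens on the single image
print(system.in_service())  # the whole system is down
```

The point of the sketch is that the hardware checks and the software check are combined with "and": no amount of duplicated hardware protects the one NT image.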
Marathon Technologies is positioning the Endurance 4000 as a very highly available node. When a node recovers from a failure, the system temporarily pauses to copy the contents of main memory from the running system to the recovered system. We were quite impressed with this feature. Also, the contents of the disk are copied while the system is running, but this has only a marginal effect on system performance.
To see how well this works, we ran a custom SQL Server 6.5 application against a database on the Endurance 4000 and pulled the power on the first CE and on the second IOP. No problem. We then reconnected the CE and IOP and everything continued as normal, except for a 10-second pause as main memory was copied. The delay depends on how much memory is in the CE and the speed of the interconnect.
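As a rough rule of thumb, the pause should scale with memory size divided by interconnect throughput. The sketch below is a back-of-the-envelope estimate only: the function name is ours, the figures are hypothetical (we did not measure the interconnect speed), and it ignores any protocol overhead.

```python
def resync_pause_seconds(memory_mb: float, link_mb_per_s: float) -> float:
    """Estimate the pause while main memory is copied to a recovered CE.

    Assumes the whole of main memory is copied at a constant rate;
    both inputs are hypothetical figures, not vendor specifications.
    """
    if memory_mb < 0 or link_mb_per_s <= 0:
        raise ValueError("memory must be >= 0 and link speed must be > 0")
    return memory_mb / link_mb_per_s


# Hypothetical example: 256 MB of memory over a 32-MB/s link.
print(resync_pause_seconds(256, 32))  # 8.0
# Doubling the memory doubles the estimated pause.
print(resync_pause_seconds(512, 32))  # 16.0
```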
You must purchase four copies of NT for this solution to work, but only one copy of the application. And you also must purchase four servers and the related Endurance 4000 hardware. Although Endurance 4000 is not ideal for every application, in instances where the trade-off is more-expensive downtime, the expense may be justified.