Business Applications
FEATURE
Special Report: Are We There Yet?

  November 26, 2001
  By Kevin Novak and Patrick Mueller

Clustering Defined, Linux Style


Once content to bum around the basements of übergeek hobbyists, Linux is lately finding itself in some posh and sophisticated sites previously reserved for commercial Unix platforms. High-availability Web farms and massively parallel processing projects in both the scientific and commercial realms are examples.

The word "clustering" is hopelessly overloaded: it gets thrown around by anyone who hooks up two or more computers to work together in any way. For our purposes, we divide the Linux clustering space into the following categories: parallel processing, batch processing, and load-balancing and failover.

Parallel Processing

Certain scientific calculations (simulating the movement of a liquid at the atomic level, for example) require that the problem be parallelized at a very low level. This used to require expensive supercomputers and, depending on the exact nature of the computation, sometimes still does. Certain applications, however, can run on clusters of regular workstations connected over normal, but private, network connections. The advantage of these parallel-processing clusters is their high bang for the buck: by employing a large number of conventional workstations, you can keep costs low while assembling an amazing amount of processing power.
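The idea of splitting one computation across many inexpensive workers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, a thread pool stands in for the cluster nodes, and a real parallel-processing cluster would typically pass messages between separate machines rather than share one process.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(n, parts):
    """Split range(n) into `parts` contiguous (start, stop) chunks of near-equal size."""
    base, extra = divmod(n, parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks each take one leftover element.
        stop = start + base + (1 if i < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def partial_sum_of_squares(bounds):
    """Work done by one worker: sum i*i over its own slice of the problem."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def parallel_sum_of_squares(n, workers=4):
    """Fan the chunks out to the workers, then combine the partial results."""
    chunks = split_range(n, workers)
    # Assumption: each thread plays the role of one cluster node.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum_of_squares, chunks))
```

The pattern is the same at any scale: decompose the problem, compute the pieces independently, and combine the partial results at the end.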

The Beowulf Project, started in the mid-1990s, offers parallel-processing options for the Linux platform. Donald Becker, the founder of Beowulf, has now moved on to make a commercial version of his brainchild, offered by Scyld Computing. The Scyld distribution provides a fast and polished installation, commercial support, and professional services for hire. The version we attempted to install in the lab was plagued with hardware issues. We hope customers holding commercial support contracts will have better luck than we did.

Batch Processing

Beowulf, in general, can run not only low-level parallel applications but also batch-oriented applications, such as data mining, 3-D rendering and engineering simulations. If you run a scheduler on top of Beowulf, any of these large batch jobs can be crunched on a cluster. These schedulers include Condor, from the University of Wisconsin's Computer Science Department, and Portable Batch System, by Veridian Systems.

A new player on the scene is Project Nimrod, whose commercial incarnation is featured in TurboLinux's EnFuzion product.
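To make the role of a batch scheduler concrete, here is a minimal sketch of one possible policy: hand the largest remaining job to whichever node currently has the least work. It is not how Condor, Portable Batch System or EnFuzion decide placement; the job names, node names and cost estimates are all hypothetical.

```python
import heapq


def schedule(jobs, nodes):
    """Assign each (name, estimated_cost) job to the least-loaded node.

    Returns a dict mapping node name -> list of job names, in assignment order.
    """
    # Min-heap of (current_load, node) so the idlest node is always on top.
    heap = [(0, node) for node in nodes]
    heapq.heapify(heap)
    assignment = {node: [] for node in nodes}

    # Place the most expensive jobs first so small jobs can fill the gaps.
    for name, cost in sorted(jobs, key=lambda job: job[1], reverse=True):
        load, node = heapq.heappop(heap)
        assignment[node].append(name)
        heapq.heappush(heap, (load + cost, node))
    return assignment
```

With four jobs costing 8, 5, 4 and 1 units and two nodes, this policy ends with both nodes carrying 9 units of work.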

Load-Balancing and Failover

Linux clustering solutions for load-balancing and failover seem to be popping up just as fast as new Linux distributions. In addition to offering EnFuzion for true clustering, TurboLinux also offers Turbo Clustering Server (TCS). TCS supports load-balancing and failover clustering for HTTP, FTP, SMTP/POP3/IMAP, NNTP, DNS and LDAP. The cluster manager node, called Advanced Traffic Manager, can be configured in failover mode so there is no single point of failure in the system. While some of TCS's code may be derived from the community, most of what we saw appears to be new code developed by TurboLinux. It is being released under the GPL (GNU General Public License), so it may show up in other projects as well.
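As a rough illustration of what load-balancing with failover means in practice, the sketch below rotates requests across a list of back-end servers and skips any server that has been marked down. It is a generic round-robin example, not TCS's actual algorithm; the class and server names are hypothetical.

```python
class RoundRobinBalancer:
    """Hand out back ends in rotation, skipping any that are marked down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._next = 0  # index of the next back end to try

    def mark_down(self, backend):
        """Record a failed back end so it stops receiving traffic."""
        self.healthy.discard(backend)

    def mark_up(self, backend):
        """Return a recovered back end to the rotation."""
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        """Return the next healthy back end, or raise if none are left."""
        for _ in range(len(self.backends)):
            candidate = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")
```

Note that the balancer itself is a single point of failure in this sketch, which is the problem a redundant manager node in failover mode is meant to solve.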

The High-Availability Linux (Linux-HA) project's Web site isn't fancy but is comprehensive in scope and has links to many other Linux projects under way. The Linux-HA project spawned the popular Heartbeat tool now included in several mainstream distributions, including SuSE and Mandrake. Heartbeat supports serial and Ethernet communications for failover of simple applications, including DNS and Web proxy caching services.
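The general heartbeat technique can be sketched as follows: a standby node records when it last heard from the primary and takes over once the silence exceeds a timeout. This is a simplified illustration, not the Heartbeat tool's protocol or configuration; the class name, the timeout value and the explicit `now` timestamps are assumptions made to keep the example deterministic.

```python
class HeartbeatMonitor:
    """Standby-side view of a primary node's heartbeat."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds of silence tolerated before takeover
        self.last_seen = None   # timestamp of the most recent heartbeat
        self.active = False     # the standby starts out passive

    def beat(self, now):
        """Record a heartbeat received from the primary at time `now`."""
        self.last_seen = now

    def check(self, now):
        """Take over if the primary has been silent longer than the timeout.

        Returns True once this standby has become the active node.
        This sketch never fails back; a real tool would handle that case.
        """
        if self.last_seen is not None and now - self.last_seen > self.timeout:
            self.active = True
        return self.active
```

In a real deployment the heartbeats would arrive over a serial or Ethernet link, and the takeover step would start the failed-over service and claim its address.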

The Linux Virtual Server (LVS) project borrows the Heartbeat code and is collaborating with the Linux-HA folks. In turn, Ultra Monkey provides some Layer 4 switching capabilities by building off LVS.

Finally, as always, commercial support is essential. Companies like SuSE and Silicon Graphics Inc. (SGI) are teaming up in an effort to port SGI's FailSafe product to Linux, so enterprise support should be a nonissue.

