A Solution to Linux Management

There's no need to spend big bucks upgrading and troubleshooting your Linux machines. Plenty of free, open-source tools will centrally manage your servers and workstations. The key is choosing the

July 1, 2005

Network Computing

Commercial Linux management products like Red Hat Network and Novell's ZENworks Linux Management are a step up, offering system updates and configuration management. These tools are great if you're using that vendor's products exclusively, but they'll cost you: $192 per year per machine for Red Hat Network (with advanced provisioning), $130 per year per machine for Novell's ZENworks.

Instead, you might want to try one of the many open-source solutions for centrally managing your Linux systems. And even if you use commercial tools, these products still can come in handy. Our favorites for managing Linux servers and workstations are ClusterSSH, rsync and cfengine, which aren't specific to Linux or Unix and let you manage different types of machines from one console.

Cluster Your SSH

Secure shell--ssh--is the standard for remotely managing Linux machines securely. But ssh alone connects to only one computer at a time. To manage a bunch of machines, use ClusterSSH. ClusterSSH speeds up the process by opening a window for each connection as well as a master repeater window that repeats your keystrokes on each of the ssh connections simultaneously.

So if you have three machines--huey, dewey and louie, for instance--and you type:

cssh huey dewey louie


ClusterSSH will then open with four windows, one for each host plus one window for you to type your commands. As you type in the main window, these keystrokes are automatically replicated at each of the hosts with which you're connected. You can even do complex things like edit a file with your favorite text-based file editor.

The ClusterSSH tool goes a long way for relatively simple commands such as editing files, restarting services and configuring identical Linux machines. But it's not necessarily a fit if your machines have different configurations and require different management tasks--you won't be able to replicate keystrokes if you must issue separate commands.

Because ClusterSSH opens a window for each host, there's obviously a limit to the number of windows you can have open at any one time. ClusterSSH resizes the windows to make them fit neatly on your desktop, but the more you have, the smaller they'll be. We've found that using ClusterSSH with more than 20 computers at a time makes things a little dicey.

Get in rsync

If you have more than a handful of identically configured Linux servers and workstations, consider a management tool like rsync, which synchronizes files between computers.

[Figure: Cfengine Up Close]

We use rsync to synchronize our machines at the University of Wisconsin-Madison's Computer-Aided Engineering Center (CAE) to a common image we store on our install-image server. If we want to update the configuration files on our fleet of Linux machines, for example, we merely update the master image, and the next time rsync runs, it finds the files that are no longer identical to those on the server and corrects them. We use rsync to synchronize everything on an instructional lab workstation's hard drive, except /tmp, /var, and certain kernel and device files. Our machines run 24/7, so rsync keeps them all in line without interrupting our users with machine reboots.

There are a couple of drawbacks with rsync, however. If a large number of computers are synchronizing against one server, many rsync connections started at exactly the same time can bog down the server. This is especially true if you're keeping the time on all your machines in sync with the ntp (Network Time Protocol) daemon, because scheduled jobs then start at the same instant on every machine. One quick remedy is to make each machine wait a random period of time before starting its rsync process. At CAE, for instance, we start rsync and other overnight processes at random times between midnight and 6 a.m., which spreads the load on our servers evenly across the overnight hours. See "Mixing It Up With Rsync" for a simple script to randomize the wait time.
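As a rough illustration of the kind of synchronization described above, here is a minimal Python sketch that assembles an rsync command line mirroring a master image onto a workstation while skipping /tmp and /var. The source name imageserver::lab-image/ is hypothetical, and the choice of --archive and --delete is an assumption, not CAE's actual invocation. The kernel and device files the article mentions are not listed because it does not name them.

```python
import shlex


def build_rsync_command(source, dest="/", excludes=("/tmp", "/var"), dry_run=True):
    """Return an rsync argument list that mirrors `source` onto `dest`.

    The flag selection is an assumption made for illustration only.
    """
    cmd = ["rsync", "--archive", "--delete"]
    if dry_run:
        # Report what would change without touching any files.
        cmd.append("--dry-run")
    for path in excludes:
        # Excluded paths are neither updated nor deleted on the client.
        cmd.append(f"--exclude={path}")
    cmd += [source, dest]
    return cmd


if __name__ == "__main__":
    # "imageserver::lab-image/" is a hypothetical rsync daemon module.
    print(shlex.join(build_rsync_command("imageserver::lab-image/")))
```

The function only builds the command; it defaults to a dry run so that trying it out never modifies a machine. Passing the resulting list to subprocess.run would execute it.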

Start Your cfengines

Secure shell, ClusterSSH and rsync work great if you have a large number of workstations that are configured identically or nearly the same. But if your Linux machines have very different configurations, you need a stronger management tool, such as the open-source cfengine. Unfortunately, more power means greater complexity. Just as a car mechanic invests time in learning how to use the tools in the shop, you can expect to spend at least a week or two learning cfengine. But as the cfengine documentation says, once you start using it, you'll wonder how you ever lived without it.

Cfengine comprises several programs that execute various tasks. Cfagent is the main agent that configures a machine--changing files, restarting services, running shell scripts and so on. Cfservd is the program that listens and can share files on the network--configuration files and any other file you share with your cfagents. Cfexecd schedules tasks to be run, and cfrun can poke at a machine to run cfagent right away. Finally, cfkey creates the public and private keys that cfagent and cfservd use to authenticate each other, and cfenvd keeps track of the processes, network connections, memory, swap and other properties of the computer it runs on.

Cfengine's configuration files use a very high-level language (see a short summary of some of the configuration roles at left). To configure the actions cfagent should take, describe how a machine should be configured. For example, here's how to make cfengine manage the ntp daemon configuration file, ntp.conf: First, configure your cfagent to manage the ntp.conf file. Then tell it to restart the daemon whenever it changes that file, so ntpd picks up the new settings. Each time cfagent runs, it checks to see if your ntp.conf file matches the master copy. If not, cfagent overwrites the local copy with the master one and restarts the ntp daemon.

All of cfengine's actions are repeatable: Each step can be interrupted, done later or repeated multiple times. Furthermore, any action cfengine takes can trigger another cfengine action. The tool can order tasks with no human interaction, and over time, your computers will reach what cfengine's author, Mark Burgess, calls "convergence," or their ideal running configuration. If any machine managed by cfengine diverges from its optimal state, the tool will take the corrective actions you specify.
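The ntp.conf example above can be sketched in Python to show what "repeatable" means in practice. This is an illustration of the idea, not cfengine's implementation or its configuration syntax. The function names and the use of SHA-256 are assumptions made for the sketch.

```python
import hashlib
import shutil
import subprocess
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, or None if it does not exist."""
    p = Path(path)
    if not p.exists():
        return None
    return hashlib.sha256(p.read_bytes()).hexdigest()


def converge_file(master, local):
    """Copy `master` over `local` only if they differ.

    Returns True if a change was made, False if the file already matched.
    """
    if file_digest(master) == file_digest(local):
        return False
    shutil.copyfile(master, local)
    return True


def manage_file(master, local, restart_cmd):
    """Bring `local` in line with `master`; run `restart_cmd` only on change."""
    if converge_file(master, local):
        # The restart is triggered by the file change, never on its own.
        subprocess.run(restart_cmd, check=True)
        return True
    return False
```

Running manage_file twice in a row changes the file and restarts the service at most once; the second run finds nothing to do. That property is what lets each step be interrupted or repeated safely.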

Cfengine can copy files from a central repository as described above, run shell scripts and even warn the administrator when a file's MD5 checksum changes. It comes with scriptable file editing, with commands that include CommentLinesMatching, CommentLinesStarting, HashCommentLinesContaining and ReplaceAll. What's more, most of cfengine's configuration options accept regular expressions.

File editing is handy in instances where you want to change only part of a file but leave the rest alone. For example, if you want to replace all instances of 192.168.42.2 with 192.168.42.1 in a certain file, you would use the following command:

ReplaceAll "192.168.42.2" with "192.168.42.1"

The CommentLinesStarting operator is also very powerful. You will probably want to use it to comment out the entries in inetd.conf (the Internet superserver's configuration file) for services you don't normally run, such as telnetd.

One restriction with cfengine is that cfagent must be able to tell if it has already performed an action, and this is especially important when editing files. For example, the cfengine configuration line

ReplaceAll "server 192.168.*" with "server 192.168.42.1"

will not work. The problem is that the regular expression "server 192.168.*" matches both the string cfagent is searching for and the replacement text, so cfagent could never tell whether the edit had already been made. Cfengine will therefore refuse that action.
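The restriction is easy to reproduce outside cfengine. The following Python sketch applies the same test to both ReplaceAll examples: an edit is safe to repeat only if the search pattern no longer matches once the replacement is in place. It uses Python's re module, whose regular-expression dialect may differ from cfengine's, and the function names are hypothetical.

```python
import re


def is_convergent(pattern, replacement):
    """True if `pattern` does not match the text that replaces it."""
    return re.search(pattern, replacement) is None


def replace_all(text, pattern, replacement):
    """Replace every match of `pattern`, refusing edits that cannot converge."""
    if not is_convergent(pattern, replacement):
        raise ValueError(
            f"pattern {pattern!r} also matches replacement {replacement!r}"
        )
    # A function is used so the replacement is inserted literally.
    return re.sub(pattern, lambda match: replacement, text)


if __name__ == "__main__":
    print(is_convergent("192.168.42.2", "192.168.42.1"))
    print(is_convergent("server 192.168.*", "server 192.168.42.1"))
```

The first check succeeds because the replacement ends in 1 rather than 2, so a second run finds nothing left to change. The second fails because "server 192.168.42.1" is itself a match for "server 192.168.*".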

Cfengine is sensitive to how multiple machines running identical tasks simultaneously can wreak havoc on servers, so it has a configuration option called SplayTime (it's off by default, but you will most likely want to turn it on). SplayTime directs cfengine to wait a random amount of time before beginning its work. Additionally, cfagent by default will not repeat an action more than once per minute, which protects against loops.
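One way to spread start times, sketched below in Python, is to derive each machine's delay from a hash of its hostname. This is a variation on the random wait described above, chosen because it gives every host a stable slot while still scattering the fleet across the window. It is an assumption for illustration, not a description of how SplayTime computes its delay.

```python
import hashlib


def splay_seconds(hostname, window_minutes):
    """Map a hostname to a repeatable delay within the splay window.

    The result is at least 0 and less than window_minutes * 60 seconds.
    """
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    # Turn the first 8 bytes of the hash into a fraction in [0, 1).
    fraction = int.from_bytes(digest[:8], "big") / 2**64
    return fraction * window_minutes * 60


if __name__ == "__main__":
    for host in ("huey", "dewey", "louie"):
        print(host, round(splay_seconds(host, 30)), "seconds")
```

A machine would sleep for the returned number of seconds before contacting the server.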

Security is an important feature of cfengine. No communication can occur until you set up the proper public key exchanges, and the cfagent can operate only in "pull" mode. Client cfengines can respond only to the cfrun command "do some work," and only when the client and server exchange the correct encryption keys.

Getting your initial configuration in place can be difficult if you're new to it. It also can be frustrating to figure out whether cfengine is responding to your commands (see "Step By Step," page 76).

Then, once you have cfengine running, adding hosts is easy. Simply add the new host's public key to the ppkeys directory on your cfengine server, and the cfengine server's public key to the ppkeys directory on the host.

Version 2 of cfengine includes software-package management, which lets cfagent install, for example, the most current version of the image-manipulation program ImageMagick on your Web servers. If you bring up a new Web server, cfengine will install that software package for you automatically.

Last and certainly not least is cfengine's ability to do environmental monitoring, letting cfengine's cfenvd examine different aspects of your system, from the number of root processes to the number of connections on well-known services like HTTP. It also tracks the average of each of these levels over time, along with the standard deviation. Any reading three or more standard deviations away from the average is considered an anomaly and reported to the cfagent next time it runs. For instance, in the case of a huge crush of HTTP connections to your Web server, you can have cfagent send an e-mail alert, log this data to syslog, throttle the Web server or halt backups until the load decreases.
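The three-standard-deviation rule can be sketched in a few lines of Python. This is a simplified illustration that uses a plain mean and population standard deviation over a list of past readings. How cfenvd actually stores and weights its history is not covered in this article, so treat the details as assumptions.

```python
import statistics


def is_anomaly(history, current, threshold=3.0):
    """True if `current` is `threshold` or more standard deviations from the mean."""
    if len(history) < 2:
        # Too little history to judge what is normal.
        return False
    mean = statistics.fmean(history)
    stdev = statistics.pstdev(history)
    if stdev == 0:
        # A perfectly flat history: any different reading stands out.
        return current != mean
    return abs(current - mean) >= threshold * stdev


if __name__ == "__main__":
    # Hypothetical counts of HTTP connections from earlier samples.
    http_connections = [10, 12, 11, 9, 10, 11, 12, 10]
    print(is_anomaly(http_connections, 11))
    print(is_anomaly(http_connections, 50))
```

With the sample history shown, a reading of 11 is within the normal range, while a reading of 50 is flagged.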

Cfengine is clearly the most sophisticated of the open-source tools for centrally managing your Linux machines. The key is to choose the right tool for your environment and make sure you get the most out of it.

How to Deploy cfengine

Run the cfengine server initially with cfservd -v -F. This tells cfengine to run its server daemon in verbose mode in the foreground, which helps you determine the source of any problems you encounter.

Configure security properly. Most initial problems are in the public/private key exchange.

Be sure to run the other cfengine programs in debug mode (by adding -d). That's often the only way to find out why cfengine is not performing the way you want it to.

Add IfElapsed = ( 0 ) to the control section of your cfagent's configuration file while testing. This will override the once-per-minute default time between repeating an action, which you don't need while testing.

Cfengine information, www.cfwiki.org

Cfengine's main Web site, www.cfengine.org

ClusterSSH home page, clusterssh.sf.net

Rsync home page, rsync.samba.org

Jeff Ballard is the Unix systems manager for the Computer-Aided Engineering Center at the University of Wisconsin-Madison. Send your comments on this article to [email protected].

Mixing It Up With Rsync

Here's a simple script for setting a random wait time for an rsync task. Call it at the top of the script you are about to run. The maximum amount of time to wait, in minutes, is its "argument"--so randomwait.pl 5, for example, will wait for a random time up to five minutes.

randomwait.pl:

#!/usr/bin/perl
# Sleep for a random time, up to the number of minutes given as the first argument.
sleep ( rand( $ARGV[0] * 60 ) );
