Consider NetSaint a distant cousin to Computer Associates International's Unicenter TNG or Hewlett-Packard Co.'s HP OpenView, but realize that it is nowhere near as extensive as those products. Out of the box, NetSaint's data-gathering abilities are geared toward ascertaining whether common IT network services are up or down. Its real power lies in its expandability. You can use it to monitor the status of common services, and you can create your own plug-ins or use existing plug-ins to extend its capabilities. Not only is NetSaint inexpensive, it's much easier to set up than OpenView or Unicenter.
NetSaint uses active testing to determine if a service is running. For example, when NetSaint tests an enterprise's FTP server to discover its status, the program connects to the FTP server and times how long the connection takes. If NetSaint can't connect or if the time it takes to connect is longer than acceptable, NetSaint will activate whatever notification system is in place. While most installations of NetSaint are run on Linux with an Apache Web server, the software will compile and run on any Unix machine and work on any standard Web server. We have had NetSaint compiled and running on HP-UX, Linux and Sun Microsystems' Solaris using Apache as the Web server without any problems.
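The active test described above can be sketched in a few lines of Python. This is only an illustration of the idea, not NetSaint's own code: the function names, the "UP"/"SLOW"/"DOWN" labels and the two-second threshold are assumptions made for the example.

```python
import socket
import time


def classify(connected, elapsed, max_seconds):
    """Turn a connection attempt into a coarse state (labels are hypothetical)."""
    if not connected:
        return "DOWN"
    return "SLOW" if elapsed > max_seconds else "UP"


def check_tcp(host, port, max_seconds=2.0, timeout=10.0):
    """Open a TCP connection, time it, and return (state, elapsed_seconds)."""
    start = time.monotonic()
    try:
        # create_connection raises OSError on refusal, timeout or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            connected = True
    except OSError:
        connected = False
    elapsed = time.monotonic() - start
    return classify(connected, elapsed, max_seconds), elapsed
```

Calling check_tcp("ftp.example.com", 21) would time a connection to a hypothetical FTP server. A monitoring system would hand any result other than "UP" to whatever notification mechanism is configured.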
Under the Halo
NetSaint's power comes from its core daemon, which calls out to many different programs and modules to perform the required monitoring. The core daemon is similar to plumbing; once you have your plumbing in place, you can add many different types of faucets for various tasks. The NetSaint daemon can be configured to call plug-ins at defined times. For instance, you may want to check your HTTP (Web) server every five minutes, but check your users' desktop machines once an hour. The NetSaint daemon controls the running of the plug-ins and collects their results. It also notifies you by Web page, e-mail message or page in the case of error.
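To make the scheduling idea concrete, here is a minimal sketch of a loop that runs each check at its own interval and collects the results. It uses a simulated clock so it finishes instantly. The function name, the check names and the lambda "plug-ins" are all hypothetical, and the real daemon's scheduler is certainly more involved.

```python
import heapq


def run_schedule(checks, until):
    """Run checks on a simulated clock and return the collected results.

    checks maps a name to (interval_seconds, plugin_callable).
    Returns a list of (time_due, name, plugin_result) tuples.
    """
    # Every check is first due at time 0; the heap keeps the next one on top.
    queue = [(0, name) for name in sorted(checks)]
    heapq.heapify(queue)
    results = []
    while queue and queue[0][0] <= until:
        due, name = heapq.heappop(queue)
        interval, plugin = checks[name]
        results.append((due, name, plugin()))
        # Reschedule the same check one interval later.
        heapq.heappush(queue, (due + interval, name))
    return results


checks = {
    "http": (300, lambda: "OK"),      # every five minutes
    "desktop": (3600, lambda: "OK"),  # once an hour
}
results = run_schedule(checks, until=3600)
```

Over one simulated hour, the HTTP check runs far more often than the desktop check, which is the behavior the intervals are meant to produce.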
Plug-ins are available for the typical monitoring chores ("NetSaint Plug-Ins" gives a partial list). For instance, the check_ping module sends an ICMP Echo Request (commonly called a ping) to a host and waits for its reply. The module sounds an alarm if it does not receive a reply or if the reply comes too late. For each machine you plan to monitor, you'll want to create a container that lets you add definitions of services to check. Common network services to check are ping (network connectivity), SMTP, POP3, IMAP and HTTP.
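The "container" idea can be pictured as a host record that holds a list of service definitions. The sketch below is not NetSaint's configuration syntax; the class names, the host name and the address (taken from the documentation-only 192.0.2.0/24 range) are assumptions for illustration. The port numbers are the standard ones for each protocol.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Service:
    name: str
    port: Optional[int] = None  # None for ping, which uses ICMP rather than TCP


@dataclass
class Host:
    name: str
    address: str
    services: List[Service] = field(default_factory=list)

    def add_service(self, name, port=None):
        """Attach one service definition and return self for chaining."""
        self.services.append(Service(name, port))
        return self


# Hypothetical mail server with the common checks attached.
mail = Host("mail1", "192.0.2.10")
mail.add_service("ping").add_service("SMTP", 25)
mail.add_service("POP3", 110).add_service("IMAP", 143)
```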
One of the systems you are likely to monitor is the network itself. To monitor services on the hosts, NetSaint must talk over the network. Let's say there is a malfunctioning network switch between the NetSaint machine and the service you are trying to monitor. If you tell NetSaint a little about your topology, it will know that the main switch is down, and it won't worry that the hosts behind that switch are unreachable. This ability can extend between pieces of network equipment as well. That means you could define a chain of dependencies: first the main switch, then the switch for the rack that the server is in, then the machine, then the services on the machine. If the main switch goes down, NetSaint will report that switch down and will report "unknown" for the services that depend on it.
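The dependency logic can be sketched as a walk up a parent chain. In this hypothetical example, every node names the node it sits behind. A node that fails its own check is "DOWN" only if everything above it is up, otherwise it is "UNKNOWN". The function name and node names are made up, and as a simplification the sketch treats hosts and services alike.

```python
def resolve_states(parents, is_up):
    """Map each node to "UP", "DOWN" or "UNKNOWN".

    parents maps a node to the node it depends on (None for the top).
    is_up maps a node to the result of its own direct check.
    """
    states = {}

    def state(node):
        if node not in states:
            parent = parents[node]
            if parent is not None and state(parent) != "UP":
                # Something upstream has failed, so this result can't be trusted.
                states[node] = "UNKNOWN"
            else:
                states[node] = "UP" if is_up[node] else "DOWN"
        return states[node]

    for node in parents:
        state(node)
    return states


parents = {
    "main-switch": None,
    "rack-switch": "main-switch",
    "web1": "rack-switch",
    "web1/http": "web1",
}
# With the main switch dead, every direct check behind it fails too.
all_failed = {node: False for node in parents}
states = resolve_states(parents, all_failed)
```

Only the main switch is reported as down. The three entries behind it come back "UNKNOWN", so a single failure produces a single alarm.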
Also included with the core distribution is a Web interface plug-in. This lets you view the network status, a history of problems (including when and how notifications were sent) and NetSaint's log file. The Web interface should work with most available Web servers.
In all, the plug-in model is extremely agile because it lets you insert different smaller programs to do the dirty work.
With the latest stable version of NetSaint, 0.0.6, you can also set up NetSaint to escalate notifications if specific errors continue. For instance, if your DHCP server goes down, NetSaint can e-mail you; then, if the server is still down the next time it checks, NetSaint can send you an SMS (short message service) e-page. If the server is still not fixed at the check after that, NetSaint can page a backup contact, then the group, and so forth.
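The escalation sequence amounts to a lookup keyed on how many checks in a row have failed. The sketch below assumes a simple ordered list of steps. The contact names, the channel labels and the choice to keep repeating the last step are illustrative assumptions, not NetSaint's actual behavior.

```python
# Hypothetical escalation ladder: (who to notify, how).
STEPS = [
    ("admin", "email"),
    ("admin", "sms"),
    ("backup", "pager"),
    ("group", "pager"),
]


def notification_for(consecutive_failures, steps=STEPS):
    """Return the (contact, channel) for the Nth failed check in a row."""
    if consecutive_failures < 1:
        return None  # service is healthy, nobody to notify
    # Past the end of the ladder, stay on the final step.
    index = min(consecutive_failures, len(steps)) - 1
    return steps[index]
```

The first failure e-mails the administrator, the second sends an SMS, the third pages the backup contact, and every failure after that goes to the group.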