May 22, 2000

The success of the Internet lies in its ability to provide information
quickly and cheaply to anyone at any time, and to facilitate fast communication
on a global scale. As a result, computers that can connect to the Internet and
draw on an almost unlimited supply of information have been installed in homes
and workplaces. Indeed, every business that wants to succeed well into the
twenty-first century needs to make use of, and contribute to, this technology.
At the center of this communication network are the web servers that supply
information back to the client. However, web servers should not just supply
static information (that is, unchangeable text and graphics on a standard HTML
page); they should also be able to respond to the needs of the user. This
requires a web server to interpret the user's request and respond accordingly.
Technologies now exist that enable the web server itself to fetch and process
data before sending it to the client.
Telnet was originally used to run programs on remote computers, e-mail was used to communicate, and FTP or Gopher was used to transfer large files. Telnet and e-mail are still
used, and FTP has become the preferred way to make files available across the
Internet, but the real revolution in Internet use over the past few years has
been the introduction of a new method of searching and viewing information, the
World Wide Web (WWW).
The WWW was originally developed at the European Laboratory for Particle
Physics (CERN). CERN developed a program to serve information in HTML format
across HTTP -- a web server, and another program to receive and view it -- a web
browser. These programs were subsequently developed by the National Center for
Supercomputing Applications at the University of Illinois and renamed NCSA and
Mosaic respectively. Mosaic went on to become the commercial Netscape Navigator product, which from version 5 onwards has been
decommercialized and is now open source again. On the other hand, the NCSA
server has been open source all along. The server has been improved by
successive patches, and has been renamed Apache. In over ten years of dedicated development, Apache has been
extended to implement numerous technologies and is considered to be the most
stable web server available when run on its native Unix or Linux operating
systems.
Apache's scalability, zero cost and customizability make it the
most popular web server available -- running over 4.3 million web sites, over
half of the WWW, as of October 1999 (figures from Netcraft).
The development of FTP servers has not followed the same pattern as
that of web servers. However, the ability to resume broken downloads, by
starting the download process part way through a file, is a major recent
advance. There are numerous FTP servers available for Linux, but the most fully
featured and well-tested server is the one developed at Washington University. WU-FTP provides a complete
FTP server solution, and its large user base ensures continued development
and updates where necessary.
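The idea of resuming a broken download can be sketched from the client side. The following is a minimal illustration rather than anything specific to WU-FTP: it uses Python's standard ftplib module, whose retrbinary call accepts a rest argument that asks the server to start part way through the file. The function names resume_offset and resume_download are invented for this example, and it assumes the server supports restarting transfers.

```python
import os
from ftplib import FTP  # standard-library FTP client


def resume_offset(local_path):
    """Return how many bytes of the file are already on disk (0 if none)."""
    return os.path.getsize(local_path) if os.path.exists(local_path) else 0


def resume_download(ftp, remote_name, local_path):
    """Fetch remote_name, continuing from whatever is already in local_path.

    `ftp` is a logged-in ftplib.FTP connection. Returns the byte offset
    at which the transfer was resumed.
    """
    offset = resume_offset(local_path)
    # Append to the partial file rather than overwriting it.
    with open(local_path, "ab") as out:
        # A non-zero rest value makes ftplib send a REST command before
        # RETR, so the server starts the transfer at that byte position.
        ftp.retrbinary("RETR " + remote_name, out.write, rest=offset or None)
    return offset
```

In use, you would open a connection with FTP(host), call login(), and then pass the connection to resume_download; the host and file names are whatever applies to your own site.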
As I said at the beginning of this section, web servers now need to respond
to the user through server-side processing. The CGI (Common Gateway Interface)
allows you to execute scripts written in languages such as Perl on your server,
providing particularly powerful text handling capabilities. More recently, the
Java servlet has provided the ability to execute more complex routines on the
server. Either technology allows you to provide different content and perform
different actions depending on the user's actions. JavaServer Pages (JSP) are
the latest addition to the Java family; they do the same task as Microsoft's
Active Server Pages (ASP) -- processing user requests using server-side
code and returning the information to the user as plain HTML.
Apache has excellent implementations of each of these technologies, with the
mod_cgi and mod_perl modules (for fast CGI script execution) and ApacheJServ
(a servlet engine), and with JSP support through Jakarta becoming available at
the time of writing. There is another advantage to these technologies: if you
decide to change operating system or web server, you will be able to transfer
these scripts with very little editing -- the same cannot be said of ASP!
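To make the CGI idea concrete, here is a minimal sketch of a script that returns different HTML depending on what the user sent. It is written in Python rather than the Perl mentioned above, purely for illustration; the form field "name" and the function build_response are assumptions made for this example. The script relies only on the CGI convention that the server places the text after the "?" of the URL in the QUERY_STRING environment variable, and that the script writes a header, a blank line, and then the page.

```python
#!/usr/bin/env python3
import html
import os
from urllib.parse import parse_qs


def build_response(query_string):
    """Return a complete CGI response: header, blank line, HTML body."""
    params = parse_qs(query_string)
    # "name" is a hypothetical form field chosen for this example.
    name = params.get("name", ["visitor"])[0]
    # Escape user input so it cannot inject markup into the page.
    body = "<html><body><p>Hello, %s!</p></body></html>" % html.escape(name)
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    # The web server passes the part of the URL after "?" in QUERY_STRING.
    print(build_response(os.environ.get("QUERY_STRING", "")))
```

Because the script touches nothing but environment variables and standard output, the same file should run unchanged under any web server that implements CGI, which is the portability advantage described above.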
This chapter will not only demonstrate how to install
both the Apache web server and the WU-FTP server, but will also show you how to
configure the installations to meet your requirements, including providing
basic security for your servers. It will also show you how to perform essential
administration tasks, such as analysing log data, suggest potential
options to replace proprietary technologies such as ASP, and cover
technologies such as ApacheJServ, CGI and SSI. Finally, you will be given tips
on how to get the server working as quickly as possible following a crash.
Deploying the Apache Web Server
In this section you will learn how to set up Apache on a Linux machine.
You will be shown the system requirements for setting up the server and the
modifications you will need to make to the Linux operating system configuration
files to prepare it for installation. In addition to stepping through the
installation process itself, you will find out how to configure Apache to meet
your own requirements. Then, you will learn how to add a virtual host to the
server, review some technologies that provide interactivity on your web site,
and see how to produce useful reports on web site usage by examining the
contents of access log files.
System Requirements
The following list gives the requirements that need to be met when
setting up an effective production web server. For a test web server, it is
possible to install on any machine with as little as 30 MB of free disk space,
any processor speed, and a static IP address if you want to test it online.
Permanent Internet connection -- As an
experienced system administrator, you will probably already know exactly the
requirements for your site. If not, a good rule of thumb is to allow a minimum
of 10 kilobits per second per simultaneous user. So if you expect a maximum of
twelve users on your site at any one time (120 kilobits per second in total), a
128K leased line would be the minimum requirement. Obviously the content of
your pages will make a big difference, and experience is the best teacher as
far as choosing a connection is concerned. If your site is purely for intranet
use, this will not concern you.
URL and IP address -- You will need to purchase
at least one domain name and IP address. Entering your primary IP address and
host name when originally installing Linux is the quickest way to gain a
bulletproof basic network configuration -- if you didn't set them during the
installation, they can be edited later. You will need a further IP address and
hostname for each additional virtual host you wish to run -- although these hostnames can be
subdomains of a single registered domain name (e.g. apache.wrox.co.uk is a subdomain of wrox.co.uk). An FTP
server and a web server may share a host name, but they will use different
ports.
DNS server -- You will need access to a
DNS server to allow Internet users to resolve your domain names. For many
companies, this access will be provided through a corporate account with a
major network provider, in which case adding entries will be as simple as
contacting that provider. Linux is itself capable of running fully featured
name servers, if you make sufficient use of DNS to justify the maintenance
effort.
Linux machine -- The ideal requirements here are plenty of memory and a fast,
ultra-wide SCSI hard disk, since these are what will see the most work -- IDE
hard disks may prove to be a bottleneck if server demand increases. The
software (including the operating system) is unlikely to take more than 1GB, so
a typical 10GB hard disk available today will be more than adequate and allow
plenty of space for growth. 64MB of RAM should prove enough for up to 10,000
hits per day; if you find memory swapping is taking place, it is a simple
matter to add more RAM (and you should do this because swapping causes serious
performance degradation). If using server-side Java technologies, add at least
another 32MB for the Java Virtual Machine (JVM). Where significant server-side processing
is used, the speed of the processor is also important -- the faster the better.
If connected to the Internet, the Linux machine should not be used for any
non-Internet purposes. If you are using the server as part of an intranet, then
the installation of a firewall will be necessary; this way the effect of any
security breach will be minimized.
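The rules of thumb in this list amount to simple arithmetic, sketched below. The per-user figure is read here as 10 kilobits per second, which is the reading under which twelve users fit a 128K line; the function names and default values are assumptions made for this illustration, and real sites should be sized from measured traffic.

```python
def required_line_kbps(simultaneous_users, kbps_per_user=10):
    """Minimum line speed in kilobits per second for the expected peak load."""
    return simultaneous_users * kbps_per_user


def recommended_ram_mb(uses_server_side_java=False, base_mb=64, jvm_mb=32):
    """Starting RAM estimate: 64MB base, plus 32MB if a JVM will be running."""
    return base_mb + (jvm_mb if uses_server_side_java else 0)


if __name__ == "__main__":
    # Twelve simultaneous users need 120 kbit/s, so a 128K line is the
    # smallest that fits.
    print(required_line_kbps(12))
    # A server running servlets starts at 64MB + 32MB.
    print(recommended_ram_mb(uses_server_side_java=True))
```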
Preparing Linux for Installation
The installation should always be performed in a clean environment --
this involves formatting the hard disk and installing Linux from scratch. Make
sure your version of Linux is up-to-date -- bugs in the TCP/IP stack are
occasionally found in the Linux kernel, and an up-to-date distribution will
keep you one step ahead of the hackers. The latest security bulletins are
available from CERT, and it is useful to subscribe to their security mailing
lists. This chapter assumes you are using Red Hat Linux 6.1, but the steps are
very similar for all Linux distributions.
It is crucial that only essential services are running; Linux is capable of
running everything from ping to fully featured name
servers, and by default, it will. When not fully configured, these represent a
security loophole -- so it is best to deselect all the obvious communication
services during the Linux installation (anything with ftp, mail or web in the title). Make sure that make (a build utility) and gcc (the C compiler) are selected -- we will use them later.
Once the install is completed, edit the /etc/inetd.conf file, or equivalent,
which contains a listing of all the network services started on the machine.
Insert a # symbol at the start of the line for each service that is not needed;
particular services to disable include finger, cfinger and portmap, which
provide useful information for hackers, and, if you don't require it, telnet.
Telnet allows easy remote management, but if you don't intend to use it,
removing it closes one more possible security loophole. Finally, after a reboot,
we can type netstat -a | more for a listing of remaining
services:
| tcp | 0 | 0 | *:printer | *:* | LISTEN |
| tcp | 0 | 0 | *:linuxconf | *:* | LISTEN |
| tcp | 0 | 0 | *:auth | *:* | LISTEN |
| tcp | 0 | 0 | *:login | *:* | LISTEN |
The first 20 or so lines contain the useful listing of services -- the lines to
look for have the word LISTEN at the end, signifying that
they are ready to accept connections. As long as you don't see portmap, finger, sendmail or ftpd (unless deliberately installed and configured), we
have a relatively safe environment to continue with.
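The two manual steps described in this section, commenting out unwanted lines in /etc/inetd.conf and scanning the netstat listing for LISTEN entries, can be sketched as a pair of small text-processing helpers. This is only an illustration under stated assumptions: the function names are invented, the helpers work on text you pass in rather than touching system files, and any inetd.conf lines you try them on are your own. Editing the file by hand, as described above, achieves the same result.

```python
def comment_out_services(inetd_conf_text, services):
    """Prefix '#' to every active inetd.conf line whose service is listed.

    The service name is the first whitespace-separated field of a line.
    Lines that are already commented out are left unchanged.
    """
    result = []
    for line in inetd_conf_text.splitlines():
        fields = line.split()
        already_disabled = line.lstrip().startswith("#")
        if fields and not already_disabled and fields[0] in services:
            line = "#" + line
        result.append(line)
    return "".join(line + "\n" for line in result)


def listening_services(netstat_text):
    """Return the local-address column of every line that ends in LISTEN."""
    found = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Columns: protocol, Recv-Q, Send-Q, local address, foreign
        # address, state.
        if len(fields) >= 6 and fields[-1] == "LISTEN":
            found.append(fields[3])
    return found
```

Passing the saved output of netstat -a to listening_services gives a short list that is easy to check for entries such as finger or ftp that should no longer be there.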
©1998 Wrox Press Limited, US and UK.