Chapter 5: Deploying Web and FTP Servers

May 22, 2000


Configuring Your Web Server

The greatest asset of Apache is its flexibility in configuration -- you will never be limited to the settings the original developers thought you would want! Everything can be configured to a per-directory level, or even a per-file level if necessary. All Apache configuration is performed using one file: httpd.conf. There are many configuration commands, but all follow a similar format.

 

Apache will work fine with its default settings, as long as the ServerName directive is set, so you can begin with the original settings and move gradually toward your requirements, changing a few settings at a time and restarting Apache each time to see the results (Apache only reads the httpd.conf file when starting). In fact, most of the default settings will probably never need changing, and you will find yourself using a handful of commands repeatedly. We will cover all these essential commands.

 

An important general rule is to avoid using hostnames (e.g. www.trampolining.net) unless the command requires them. Hostnames in the configuration file will work most of the time, so long as you have told Linux where to find a DNS server. However, if you have filtered any content using hostnames, Apache will need to perform a DNS check against every client IP address, increasing server load. On the other hand, if you filter against IP addresses, Apache already has the information in order to function without forcing an unwanted DNS check. Also, if your DNS server is down when Apache is started, any configuration directives containing hostnames won't be parsed, and parts of the server will not be started. This can cause intermittent problems when the DNS server is down or contains bad data.
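
As a minimal sketch of the difference, the container below restricts a hypothetical directory by address rather than by name. The directory path, the 192.168.1 network and the example.com domain are assumptions made purely for illustration; the commented-out line shows the hostname form that would force a DNS check for each client.

```apache
# Hypothetical private area; the path and addresses are illustrative only
<Directory /home/www/trampolining.net/private>
  Order deny,allow
  Deny from all
  # Address-based rule: Apache already knows the client IP, so no DNS lookup
  Allow from 192.168.1
  # Hostname-based rule (avoid where possible): forces a DNS check per client
  # Allow from .example.com
</Directory>
```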

 

The effect of a command depends on where it is placed. The first section of httpd.conf contains global environment directives, which affect the entire server and all virtual hosts running on it. The next section configures the main, or primary, server. Settings here also provide the default settings for all virtual hosts. The third and final section configures the virtual hosts themselves.

 

Within section two or three you might want to apply settings to a single directory only. This is achieved using a directory container; the settings are placed within a pair of HTML-like tags, which define which directory to apply the settings to. (We will meet other containers when configuring virtual hosts.) Note there is no trailing slash on the directory.

 

<Directory /any/directory>

  Settings here

</Directory>

Section 1: Global Environment

The first directive that you will come across is this one:

 

ServerType standalone

 

ServerType may be either standalone or inetd. For all but the lowest-use servers, use standalone as Apache will be permanently ready and waiting for any requests itself. The inetd option caters for users who wish to start Apache when requests are received on a specified port; this introduces start-up delays which will only be acceptable if the server rarely functions as a web server.

 

The following directives tell Apache to maintain a pool of between 5 and 10 spare server processes, ready for new requests:

 

MinSpareServers 5

MaxSpareServers 10

 

Adaptive spawning, implemented in Apache versions 1.3 and upwards, means there should be no reason to change these except on very high load servers. However, be wary of over-trusting benchmarking utilities, as these generate a step change in request volume over a few seconds, which does not occur in reality. The directive below prevents more than 150 clients connecting simultaneously, to prevent the server locking during periods of high usage:

 

MaxClients 150

 

If you find that clients are being refused a connection during periods of high usage, try increasing this number. If you find the server is locking up or becoming very slow during periods of high usage, you may consider lowering this number as a temporary measure to keep the server running until you can provide higher capacity. More information on optimizing for high loads can be found in Professional Apache by Peter Wainwright, published by Wrox Press (ISBN 1861003021).
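
The process-pool directives above normally sit together in httpd.conf. The sketch below shows how they might be raised for a busier server. The numbers are illustrative assumptions, not recommendations, and StartServers (the number of server processes created at start-up) is a standard directive that is not discussed in the text above.

```apache
# Illustrative values only; tune against real traffic rather than benchmarks
StartServers 10
MinSpareServers 10
MaxSpareServers 20
# Raise if clients are refused; lower temporarily if the server becomes very slow
MaxClients 200
```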

Section 2: Primary Server Configuration

To reduce the chance of malicious damage to your system, we give Apache processes the minimum possible security privileges on your machine:

 

User nobody

Group nobody

 

Any CGI scripts run by Apache will inherit these settings. CGI scripts will be discussed later in the chapter.

 

The following e-mail address will be appended to any error messages Apache sends to the client. Setting it to your address ensures visitors have a way to inform you of problems with your site. You may of course feel this is not a good thing!

 

ServerAdmin richard@trampolining.net

 

Obviously you would put your address in here, not mine.

 

The ServerName directive tells Apache the hostname of the primary host. It is essential, and proves a common cause of headaches if it is set wrongly:

 

ServerName www.trampolining.net

 

Suppose a client requests http://www.trampolining.net/news, where news is a directory. The URL is incomplete, as the trailing slash is missing. Apache will ask the browser to visit the correct URL, http://www.trampolining.net/news/, using ServerName to reconstruct the URL. If ServerName is not correctly set, the redirect URL returned will be invalid (e.g. http://not-set/news/) and the client will see either a 404 'File not found' error or a DNS resolution failure. If you do not yet have a valid hostname, you can use the machine IP address here instead.
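
If no hostname has been registered yet, the fallback just mentioned might look like the line below. It assumes the machine's address is 123.123.123.122, the address used for the primary host in the network examples in this chapter.

```apache
# Temporary setting until a hostname is registered in DNS
ServerName 123.123.123.122
```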

 

This directive specifies where to look for the contents of the web site; this directory will appear as the root directory of your web site, and is where all the HTML files will be placed.

 

DocumentRoot "/home/www/trampolining.net/"

 

The following container contains directives that apply to the entire system:

 

<Directory />

  Options FollowSymLinks

  AllowOverride None

</Directory>

 

These permissions prevent anyone browsing around private system files. Because of this default denial of access, we will need to explicitly allow access to any directories we intend to use using another directory container. Subdirectories inherit the same Apache permissions as their parent unless refined by later directory containers. The following tag specifies the beginning of the main directory container.

 

<Directory /home/www/trampolining.net>

 

This is used to allow access to the main server root directory. Later directory containers may refine the permissions we have set here for the whole of /home/www/trampolining.net.

 

The Options directive tells Apache what it is allowed to serve to the user.

 

Options Indexes FollowSymLinks Includes ExecCGI

 

An index allows Apache to produce a listing of most of the files in a directory if there is no index.html or other file specified in DirectoryIndex. If Indexes is not included here, a 'forbidden' HTTP response will be returned. FollowSymLinks will allow users to follow any symbolic links you create to other directories.

 

Includes and ExecCGI are the other two options used here, and will be covered in more detail in the section called Technologies for Effective Sites later in this chapter. Briefly, Includes tells Apache to allow Server Side Includes (SSI) in this directory to be parsed (IncludesNoExec is the same except it will not honor SSI exec commands). In the same way, ExecCGI allows CGI files in this directory to be executed, which is not the most secure way to provide CGI support, but is the most convenient.

 

Now take a look at this directive:

 

AllowOverride None

 

If AllowOverride is set to All, then wherever a file called .htaccess exists in a directory, the httpd.conf settings for that directory will be overridden by the settings in that file. This allows you to keep httpd.conf to a reasonable length by defining per-directory settings in individual .htaccess files. However, this is a poor method of configuration as making changes can become a chore -- any number of .htaccess files may require editing, even for a simple update. Setting AllowOverride to All will also mean that every time a request is received for a directory, whether or not an .htaccess file is present, Apache will search for one, thereby increasing server load.

 

A better alternative, if you are concerned about the length of httpd.conf, is to group extra settings in another file (e.g. /usr/local/apache/conf/football-websites.conf), and then force Apache to read this by placing 'Include /usr/local/apache/conf/football-websites.conf' at the end of httpd.conf. We will use this principle later in this chapter when we configure ApacheJServ.
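
The Include approach can be sketched as two fragments, shown here in one listing. The Include line is the one quoted above and belongs at the end of httpd.conf, outside any container; the contents of the included file, including the football directory path, are hypothetical.

```apache
# --- at the end of /usr/local/apache/conf/httpd.conf ---
Include /usr/local/apache/conf/football-websites.conf

# --- contents of /usr/local/apache/conf/football-websites.conf (hypothetical) ---
<Directory /home/www/trampolining.net/football>
  Options Indexes
  AllowOverride None
</Directory>
```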

 

This section defines who is allowed to view your web site. Setting 'Allow from all' will allow anyone to visit your web site -- there is more about restricting access in the Logs and Analysis section.

 

Order allow,deny

Allow from all

 

This is the end of the directory container -- we have finished setting the default permissions!

 

</Directory>
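
With the main container complete, a later container can refine these permissions for one subdirectory. The sketch below assumes a hypothetical downloads directory in which directory listings and CGI execution should be switched off; anything it does not mention is inherited from the parent container.

```apache
# Hypothetical subdirectory; inherits from /home/www/trampolining.net
<Directory /home/www/trampolining.net/downloads>
  # Remove two of the inherited options, leaving FollowSymLinks and Includes
  Options -Indexes -ExecCGI
</Directory>
```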

 

When a client requests a directory, Apache first checks whether the directory contains one of the files named in the DirectoryIndex directive, which it can serve instead. So when requesting http://www.trampolining.net, the actual file returned is http://www.trampolining.net/index.html. Apache will check for each of the named files in order; in this example we have chosen to serve index.html in preference to index.htm. If neither index.html nor index.htm is present, either a 403 'forbidden' response or a directory index will be returned, depending on whether Indexes is set in the Options command as above. If you use SSI or JSPs, you might want to add index.shtml or index.jsp to this list.

 

DirectoryIndex index.html index.htm
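
For a site that uses SSI or JSPs, the extended list suggested above might look like this; the order shown is only one possible choice.

```apache
# Files are tried from left to right; the first one found is served
DirectoryIndex index.html index.htm index.shtml index.jsp
```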

 

The next two directives work together. AccessFileName sets the name of the per-directory configuration file, and the Files container that follows it applies to any matching file, anywhere on this host or any of the virtual hosts. It prevents clients from viewing any .htaccess configuration file, which could provide useful information to a hacker.

 

AccessFileName .htaccess

<Files .htaccess>

   Order allow,deny

   Deny from all

</Files>
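
A slightly broader variant uses the regular-expression form of the Files container to match every file whose name begins with .ht. It is offered as a sketch; whether you need it depends on which other .ht files (password files, for example) you keep in web directories.

```apache
# The tilde switches <Files> to regular-expression matching
<Files ~ "^\.ht">
  Order allow,deny
  Deny from all
</Files>
```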

 

The next directive of note is:

 

HostnameLookups Off

 

When set to On, Apache will attempt to resolve the IP address of every client before writing to the logs, so that the logs contain 'machine.isp.net' instead of '123.123.123.123'. This heavily increases server load, and the load upon your Internet connection. In the Logs and Analysis section we will configure Analog to do this much more efficiently.

 

This directive tells Apache where to log errors:

 

ErrorLog /usr/local/apache/logs/error_log

 

This log collects most error messages, including CGI errors and errors occurring on virtual hosts (unless your virtual host has its own ErrorLog command). A useful trick when troubleshooting is to type tail -f /usr/local/apache/logs/error_log, which will show the end of the log in real time. Then browse around the site, seeing exactly what is causing errors and when the errors occurred.

 

The following directive tells Apache where to log accesses.

 

CustomLog /usr/local/apache/logs/access_log common

 

It takes an added parameter that tells it which of the predefined log formats to use. This may be common (the NCSA standard log format), combined (the most useful format, which provides nearly all information in one log), referer (which records only referrer details) or agent (which records only user agent details). It is even possible to create several different logs concurrently using separate CustomLog commands, and much more -- see Professional Apache for more details.
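
The format names are nicknames defined by LogFormat directives, which are typically already present in the default httpd.conf. The sketch below repeats the usual definitions of common and combined and then writes two logs concurrently; the second log file name is hypothetical.

```apache
# Nicknames for two standard formats (normally already in the default httpd.conf)
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

# Two logs written concurrently; the second file name is hypothetical
CustomLog /usr/local/apache/logs/access_log common
CustomLog /usr/local/apache/logs/combined_log combined
```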

 

ErrorDocument commands allow you to return a pre-selected page in place of standard Apache error pages. This preserves the corporate image of a site and minimizes any unprofessional impression which would inevitably be created in this situation. ErrorDocuments can be created for any or all HTTP error responses, but the important ones to cover are 404 ('File not found') and 500 ('Server error', usually caused by script or servlet failure).

 

ErrorDocument 404 /missing.html
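
To cover both of the important responses named above, a second directive can be added for 500 errors. The page name server-error.html is a hypothetical example; like missing.html, it is a path relative to the DocumentRoot.

```apache
ErrorDocument 404 /missing.html
# Hypothetical page shown when a script or servlet fails
ErrorDocument 500 /server-error.html
```
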

Adding Virtual Hosts

Virtual hosts allow you to provide another web site from the same server. To the user, the virtual host looks and feels identical to how it would if it was the primary host. For example, in addition to providing www.trampolining.net on my server, I might want to run a completely different web site called www.sport-science.net.

 

Since HTTP 1.1, there have been two ways to reach a virtual host: each host can listen on its own IP address (the traditional approach), or all hosts can share one IP address, with an HTTP header telling the server which virtual host to serve. While the second approach is well supported by Apache, the small number of browsers still in use that are not compliant with HTTP 1.1 forces us to adopt the traditional approach.
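
For comparison only, the second (name-based) approach might be sketched as below, using the NameVirtualHost directive and the two sites from this chapter sharing the primary address. This is an assumption-laden illustration rather than the configuration this chapter goes on to build.

```apache
# Both sites share one address; the HTTP Host header selects between them
NameVirtualHost 123.123.123.122

<VirtualHost 123.123.123.122>
  ServerName www.trampolining.net
  DocumentRoot /home/www/trampolining.net
</VirtualHost>

<VirtualHost 123.123.123.122>
  ServerName www.sport-science.net
  DocumentRoot /home/www/sport-science.net
</VirtualHost>
```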

 

Linux allows up to 256 IP-based virtual hosts per network card, although the number of file descriptors available will probably limit us to a mere 200 or so. There are two parts to adding a virtual host. First, you must set up the network configuration to force Linux to 'listen' to the other IP addresses. Second, you will also need to configure Apache to listen.

Network Configuration

The primary host listens to the machine's IP address as specified during Linux setup (or subsequently through Linuxconf, for example). We will need another IP address for each virtual host, and will also need to register each hostname and its IP address with a Domain Name Server.

 

By default, a network interface only listens to information addressed to it. After all, why waste time listening to information meant for other machines? To provide virtual hosts, we need to listen to requests sent to our virtual hosts' IP addresses as well. Luckily, Linux provides direct support for this, called IP masquerading in this section and more commonly known as IP aliasing. (The ability of an interface to listen for traffic beyond its own address is one reason why network transmissions, like e-mail and telnet, are so easy to intercept.)

 

Typing ifconfig will produce a listing of network interfaces, along with technical information about each of them. We are interested in the device eth0, which is the first to be listed (eth0 represents your first network card). If there are masquerades already defined, these will be listed underneath as eth0:0, eth0:1, and so on. If you have multiple network cards, you may reference them as eth1, eth1:1 and so on.

 

eth0      Link encap:Ethernet  HWaddr 00:50:04:86:89:61

          inet addr:123.123.123.122  Bcast:123.255.255.95  Mask:255.255.255.0

          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

          RX packets:184263 errors:4 dropped:0 overruns:0 frame:5

          TX packets:104402 errors:0 dropped:0 overruns:0 carrier:0

          collisions:762 txqueuelen:100

          Interrupt:11 Base address:0x1000

 

lo        Link encap:Local Loopback

          inet addr:127.0.0.1  Mask:255.0.0.0

          UP LOOPBACK RUNNING  MTU:3924  Metric:1

          RX packets:7858 errors:0 dropped:0 overruns:0 frame:0

          TX packets:7858 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:0

 

We will set up a virtual host for www.sport-science.net with an IP address of 123.123.123.123. Type ifconfig eth0:0 123.123.123.123, where 0 is the first available masquerade (set to 0 here since none have been defined yet) and 123.123.123.123 is the IP address of your virtual host. Next we need to set up routing: type route add -host 123.123.123.123 dev eth0:0. If all goes well, typing ifconfig should now produce this:

 

eth0      Link encap:Ethernet  HWaddr 00:50:04:86:89:61

          inet addr:123.123.123.122  Bcast:123.255.255.95  Mask:255.255.255.0

          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

          RX packets:184263 errors:4 dropped:0 overruns:0 frame:5

          TX packets:104402 errors:0 dropped:0 overruns:0 carrier:0

          collisions:762 txqueuelen:100

          Interrupt:11 Base address:0x1000

 

eth0:0    Link encap:Ethernet  HWaddr 00:50:04:86:89:61

          inet addr:123.123.123.123  Bcast:123.123.123.255  Mask:255.255.255.0

          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

          Interrupt:11 Base address:0x1000

 

lo        Link encap:Local Loopback

          inet addr:127.0.0.1  Mask:255.0.0.0

          UP LOOPBACK RUNNING  MTU:3924  Metric:1

          RX packets:7858 errors:0 dropped:0 overruns:0 frame:0

          TX packets:7858 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:0

 

That's it -- our network configuration is ready for virtual hosting.

Apache Configuration

Apache configuration consists of adding a few lines to httpd.conf and restarting the server:

 

<VirtualHost 123.123.123.123>

 

Just as <Directory> containers contained directives applying to that directory, <VirtualHost> containers contain all the directives related to that virtual host. Any directives not included take as default the settings assigned in the primary server section. A notable exception is the directive Options +Includes (explained in the later section entitled Server-Side Includes), which must be placed in the Directory container of each virtual host where it is used -- it is not inherited.

 

These directives are required, and have the same effect as when used before for the primary host.

 

DocumentRoot /home/www/sport-science.net

ServerName www.sport-science.net

 

These two directives create logs specifically for this virtual host. If no logging command is given, log messages will be redirected to the primary host logs.

 

CustomLog /usr/local/apache/logs/sports_science_access_log combined

ErrorLog /usr/local/apache/logs/sports_science_error_log

 

This Directory container gives the client permission to access the DocumentRoot. Without it, a forbidden HTTP response is returned.

 

<Directory "/home/www/sport-science.net">

  allow from All

</Directory>

 

This tag ends the VirtualHost container. Restart the server and, assuming your DNS entry has had 12-24 hours or so to propagate across the Internet, your virtual host should be up and working.

 

</VirtualHost>
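
Assembled in one place, the pieces walked through above form the following container. Nothing here is new; the comments simply restate the explanations given for each directive.

```apache
<VirtualHost 123.123.123.123>
  DocumentRoot /home/www/sport-science.net
  ServerName www.sport-science.net

  # Per-host logs; without these, entries go to the primary host logs
  CustomLog /usr/local/apache/logs/sports_science_access_log combined
  ErrorLog /usr/local/apache/logs/sports_science_error_log

  # Grant clients access to the DocumentRoot
  <Directory "/home/www/sport-science.net">
    Allow from all
  </Directory>
</VirtualHost>
```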

end part one...

©1998 Wrox Press Limited, US and UK.
