Sendmail V8: A (Smoother) Engine Powers Network Email
Sendmail, with its cryptic single-character option tags and notorious rewriting rule sets, has nevertheless always been the premier Internet mail transfer agent. But the latest releases, with macro-based configuration and spelled-out options, add some ease-of-use and sophistication to the program's traditional power.
By Richard Reich
Please address questions regarding this article to the author at email@example.com.
Sendmail is the most common SMTP mail transfer agent on the thousands of mostly Unix-based Internet hosts that handle mail routing and serve as post offices. Millions of e-mail messages are handled by Sendmail every day. Although it is very popular, Sendmail has been obscure and difficult to configure during much of its long history. Recent versions of Sendmail, however, have a much improved configuration system, based on the m4 macro processor and a large set of predefined m4 macros.
This tutorial does not pretend to be a complete treatment of Sendmail. But it will try to show the average system administrator that Sendmail, with macro-based configuration, can be set up usefully with a reasonable amount of study and attention.
The focus will be on Berkeley Sendmail version 8.7, the freely distributed version maintained and improved by its original author, Eric Allman. Most Unix system vendors include ``Sendmail,'' but often these are old versions, almost always lacking the m4-based configuration environment and other improvements. In addition, older versions of Sendmail have well-known security problems that are repaired in the later versions available from Berkeley. Although there are generally valid arguments against early adoption of new versions of critical software, Sendmail may be an exception to the rule.
This tutorial first describes Internet mail basics and a common strategy for SMTP mail handling on an Internet-connected local network. Sendmail configuration is treated in the context of implementing the example mail strategy. Sendmail's UUCP capabilities, perhaps less relevant than they were a few years ago, are outside the scope of this presentation. (Sendmail, even with tractable configuration tools, is too large a topic to present in the abstract in this limited space.)
The rules that permit heterogeneous computer systems to interoperate smoothly on the global Internet are set forth in documents called Requests For Comments, or RFCs. The format of Internet mail messages is defined by RFC 822 (see Reference 6). Thus, Internet e-mail is often called ``RFC 822'' mail. The protocol used to send RFC 822 e-mail between host computers is referred to as the Simple Mail Transfer Protocol, or SMTP, and is defined in RFC 821 (see Reference 5).
The format of Internet mail is fundamentally very simple: various required and optional message attributes come first in a ``header,'' followed by a blank line, then the ``body'' of the message. The header fields predominate in the short example message shown here:
Editor's Note: The long ``received'' lines were wrapped then indented so they'll fit in the average window.
    Return-Path: firstname.lastname@example.org
    Received: from tempo.maclean.com (tempo.maclean.com [126.96.36.199])
        by goldengate.reich.com (8.7.1/8.7.1/FultonSt-gg0916) with ESMTP
        id WAA01451 for <email@example.com>; Sun, 15 Oct 1995 22:09:10 -0700
    Received: from petewin95.maclean.com (petewin95.maclean.com [188.8.131.52])
        by tempo.maclean.com (8.7.Beta.10/8.7.Beta.10/FultonSt-tempo0806) with SMTP
        id WAA12144 for <firstname.lastname@example.org>; Sun, 15 Oct 1995 22:09:08 -0700
    Message-Id: <199510160509.WAA12144@tempo.maclean.com>
    X-Sender: email@example.com
    X-Mailer: Windows Eudora Pro Version 2.1.2
    Mime-Version: 1.0
    Content-Type: text/plain; charset="us-ascii"
    Date: Sun, 15 Oct 1995 22:06:05 -0700
    To: firstname.lastname@example.org
    From: Pete Maclean <email@example.com>
    Subject: A question...

    Just wondered if this message will appear in your Sendmail article?

    -p
The blank line after the ``Subject'' line divides the header
from the message body that follows. Any subsequent blank line is
part of the message body and has no structural significance.
Most header fields are brief and have an intuitively obvious
Each header line consists of a ``keyword-value pair'' that declares one specific characteristic of the message. For instance, the required line that specifies the recipient of the message consists of the keyword ``To:'', one or more space or tab (white space) characters, followed by the value that specifies the mailing address of the recipient, here ``firstname.lastname@example.org''.
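As a rough illustration of this layout, the following Python sketch splits a small message at its first blank line and lists the keyword-value pairs found in the header. It relies on the `email` package from the Python standard library; the sample message, its addresses and the function name `split_message` are invented for the example and do not come from the message shown above.

```python
from email import message_from_string

# Hypothetical sample message: header fields, one blank line, then the body.
RAW_MESSAGE = (
    "From: Pete <pete@example.com>\n"
    "To: richard@example.org\n"
    "Subject: A question...\n"
    "\n"
    "Just wondered if this will appear?\n"
    "\n"
    "-p\n"
)


def split_message(raw):
    """Return (headers, body); headers is a list of (keyword, value) pairs."""
    msg = message_from_string(raw)
    return msg.items(), msg.get_payload()


headers, body = split_message(RAW_MESSAGE)
for keyword, value in headers:
    print(f"{keyword}: {value}")
print("--- body ---")
print(body)
```

Note that the second blank line (before the signature) ends up inside the body, as the text above describes: only the first blank line has structural meaning.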
SMTP is a TCP-based, client-server protocol. Its operation is really quite simple: after a reliable connection is established, the client initiates a brief handshake sequence. Then the client sends one or more messages to the server. Preceding each message, the remote system is given a list of the message's local recipients as well as the sender's address. This information is referred to as the message's ``envelope''. The natural metaphor of physical letters is instructive: to send a letter to several people at different locations, for each recipient place a copy of the letter in an envelope, which bears both the recipient's address and the return address of the sender, and post individually to each envelope addressee.
This exchange of information takes place in a formal language of four-character commands and three-digit reply codes, but it is usually replete with human-readable comments that render transcripts of SMTP sessions quite easy to follow. A somewhat improved version of SMTP, Extended SMTP, or ESMTP, is now in wide use. Here's a real example of an ESMTP mail exchange log. Don't worry about what everything means, but note the basic simplicity of the conversation.
Editor's Note: The long lines in this example were wrapped and then indented four spaces so they'll fit on the average window.
    $ /usr/sbin/sendmail -v email@example.com < message
    firstname.lastname@example.org... Connecting to tempo.maclean.com. via smtp...
    220 tempo.maclean.com ESMTP Sendmail 8.7/8.7/FultonSt-tempo0806;
        Sun, 15 Oct 1995 22:47:52 -0700
    >>> EHLO goldengate.reich.com
    250 tempo.maclean.com Hello email@example.com [184.108.40.206],
        pleased to meet you
    >>> MAIL From:<firstname.lastname@example.org>
    250 <email@example.com>... Sender ok
    >>> RCPT To:<firstname.lastname@example.org>
    250 Recipient ok
    >>> DATA
    354 Enter mail, end with "." on a line by itself
    >>> .
    250 WAA12161 Message accepted for delivery
    email@example.com... Sent (WAA12161 Message accepted for delivery)
    Closing connection to tempo.maclean.com.
    >>> QUIT
    221 tempo.maclean.com closing connection
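To make the ordering of the conversation explicit, here is a small Python sketch that builds only the client's half of such a session: the envelope (sender, then one line per recipient) comes first, and the message follows the DATA command. The function name `smtp_client_commands` and all host names and addresses are hypothetical. The sketch opens no network connection and ignores the server's reply codes, which a real client would have to check after every command.

```python
def smtp_client_commands(client_host, sender, recipients, message_lines):
    """Build the client's side of a one-message (E)SMTP dialogue, in order."""
    commands = [f"EHLO {client_host}"]
    # The envelope: the sender first, then one RCPT line per recipient.
    commands.append(f"MAIL From:<{sender}>")
    for rcpt in recipients:
        commands.append(f"RCPT To:<{rcpt}>")
    # The message itself (header, blank line, body) follows DATA.
    commands.append("DATA")
    for line in message_lines:
        # A leading "." is doubled so it cannot be mistaken for the terminator.
        commands.append("." + line if line.startswith(".") else line)
    commands.append(".")  # a lone dot ends the message
    commands.append("QUIT")
    return commands


session = smtp_client_commands(
    "client.example.org",
    "sender@example.org",
    ["first@example.com", "second@example.com"],
    ["Subject: test", "", "Hello."],
)
for command in session:
    print(">>>", command)
```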
The ``life cycle'' of an e-mail message involves several distinct stages. Writing a mail message is quite different from sorting envelopes, which, in turn, is different from delivering mail. This is true in the realm of electronic mail as well as in the world of surface postal mail (a.k.a. snail mail).
Preparing and reading e-mail is done with a Mail User Agent (MUA). The qualities that people prefer in a MUA vary, as do the platforms on which MUAs are implemented. This leads to a wide variety of different MUAs, catering to different tastes in user interfaces, capabilities and platforms. Some examples include: Eudora, elm, pine, mh, exmh, xmail, mailx, Mail, mail, etc.
Delivering e-mail is generally handled by programs (Mail
Delivery Agents, or MDAs) that do one specific type of delivery.
For example, putting mail into a local mailbox file on Unix
systems is often handled by so called
because it classically had path name
Mail Transfer Agents, like Sendmail, handle everything else. An MTA determines how a message has to be routed to get to a recipient. It accepts mail from another transfer agent and relays it to an agent closer to the ultimate recipient. It handles the interpretation of address aliases. It transforms addresses so that the panoply of incompatible delivery agents can deal with them properly. It handles special actions required by certain header fields (for instance, ``Bcc:'' for blind-carbon copy, and ``Return-Receipt-To:'' to verify delivery). It queues messages when delivery can't be done immediately and handles them later. It recognizes bad addresses and other errors and reroutes or bounces mail as needed. And more.
Let's follow the path a message might take, starting after it's been composed and is handed to Sendmail by a Mail User Agent.
MUA Sends a Message. We've composed this simple message:
    From: Richard Reich <firstname.lastname@example.org>
    To: email@example.com
    Bcc: me
    Subject: My Sendmail article

    Please read the draft of my Sendmail article. It will be in the usual place by tonight. Thanks.

    -r
This simple note is intended for a single recipient, with a blind carbon copy for my records. The MUA that composed it will start Sendmail and give it the message and the list of recipients.
Aliases. An alias is a convenient abbreviation for one or more full mailing addresses. That is, an alias can just be a nickname for an address or it can be the name of a list of recipients. Aliases can be maintained and expanded by a MUA or by Sendmail. Most MUAs keep alias information in their own version of an alias file. So, if you use, say, elm ordinarily, its alias file will not be available to Netscape when you use Netscape's ``Mail Document'' function. However, aliases maintained centrally by Sendmail will be recognized and expanded regardless of which MUA is used to compose a message.
There is an alias among the recipients of our example message: me. Sendmail will expand it and discover that the full address for ``me'' is the local mailbox ``richard''.
Handling Mail. The two recipient addresses are examined by Sendmail. One is local (me) and the other (firstname.lastname@example.org) is an address at a remote host.
The message must be transformed slightly to handle the ``Bcc:'' header properly. Blind copying requires that the primary recipients not be informed of the blind copy recipient. So Sendmail, after having added my address to its internal list of recipients, deletes the ``Bcc:'' header field from the message.
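The two steps just described (collect every recipient, then remove the blind-copy field) can be sketched in a few lines of Python using the standard `email` package. This is only an illustration of the idea, not a description of Sendmail's internals; the function name `envelope_and_strip_bcc` and the addresses are made up for the example.

```python
from email import message_from_string
from email.utils import getaddresses


def envelope_and_strip_bcc(raw):
    """Collect recipients from To/Cc/Bcc, then delete Bcc from the header."""
    msg = message_from_string(raw)
    fields = []
    for name in ("To", "Cc", "Bcc"):
        fields.extend(msg.get_all(name, []))
    # getaddresses() yields (display name, address) pairs; keep the addresses.
    recipients = [addr for _, addr in getaddresses(fields) if addr]
    del msg["Bcc"]  # primary recipients must not see the blind-copy list
    return recipients, msg.as_string()


raw = (
    "From: Richard <richard@example.org>\n"
    "To: pete@example.com\n"
    "Bcc: me\n"
    "Subject: Draft\n"
    "\n"
    "Please read the draft.\n"
)
recipients, outgoing = envelope_and_strip_bcc(raw)
print(recipients)
print(outgoing)
```

The recipient list still contains the alias ``me''; expanding it to a real mailbox is a separate step, handled by alias processing.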
Local Delivery. Assuming Sendmail has been configured to use ``bin mail'' for local delivery, it directs this program to save a copy of the message in my mailbox.
Local delivery is not always so dull, however. A user can
keep private aliases in the
Remote Delivery. Returning to our example, Sendmail now has an address that it determines--by examining its format--is probably intended for a remote Internet recipient. For each remote recipient, Sendmail will call upon its domain name resolver to find out the Internet host to which the message should be sent (that is, a mail exchanger--MX--query will be made). Then, to actually transfer the message, an SMTP session will be initiated with the MTA (perhaps another Sendmail) at each remote mail handling host. A failed transfer will result in the message being queued for later delivery.
The Sendmail daemon. Our sample message will be accepted by a Sendmail daemon on a remote host (the mail exchanger for maclean.com). The message will then go through Sendmail's handling process on the remote system (assuming it's running Sendmail). Presumably, the message will be delivered locally to the mailbox of the intended recipient. Alias processing, forwarding, or other kinds of required relaying, however, might result in the message being passed to still another transfer agent.
Sendmail uses the Domain Name System to help it deliver mail. Proper implementation of a domain's mail handling strategy requires that the configurations of both Sendmail and DNS be accurate and coordinated. If a message is to be sent to a non-local recipient, the domain name portion of the recipient's address must be examined to determine the host where the message should be sent.
First, Sendmail queries the local DNS resolver to find so-called ``Mail Exchange,'' or MX, records for the recipient's domain. For example, to decide where to send a message addressed to email@example.com, Sendmail will look for MX records for the domain name maclean.com. The DNS resolver will return any MX records it finds, often more than one. In the event that the recipient domain has no MX records defined, Sendmail will query DNS for CNAME or A records to arrive at a possible mail exchanger host. Multiple MX records--each specifying an alternative mail-handling host--can be defined for a domain name. An MX record contains a preference field that ranks its mail exchanger host relative to others for the same domain name. (The preference field's value is like a golf score: lower numbers are preferred, with zero the best. The maximum value is 65535.) A mail transfer agent is required to choose the most preferred mail-exchange host among those that are currently functioning. Given a choice among several equally preferred hosts, Sendmail will choose one at random.
Continuing with our example (sending a message to firstname.lastname@example.org), the DNS resolver might return to Sendmail MX records for maclean.com like the following (rendered here in the textual form used by BIND's configuration file):
    maclean.com.  85676  IN  MX  10  tempo.maclean.com.
    maclean.com.  85676  IN  MX  20  goldengate.reich.com.
The fields here are the recipient domain name, the TTL (time-to-live value in seconds), the data class, the record type, the preference value and, finally, the mail exchange host.
The records define a preferred mail exchanger at tempo.maclean.com and a less preferred one at goldengate.reich.com. This means that Sendmail will try to send the message to tempo.maclean.com--and failing that--goldengate.reich.com. If it gets to goldengate, the Sendmail daemon there will make its own attempt to deliver the message. Unless tempo has recovered in time, goldengate also fails to relay the mail as we'll see below.
Then a crucial bit of special handling is invoked to avoid sending mail about pointlessly. Sendmail will not relay mail to a mail exchanger that has an equal or greater preference value than its own. As long as tempo is unreachable, goldengate won't be able to relay the message because it can't find any other acceptable host. It will queue the message to disk and try to deliver it later.
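The selection logic of the last few paragraphs can be summarized in a short Python sketch. It is a simplified model under stated assumptions, not Sendmail's actual code: the function name `candidate_exchangers` is invented, MX records are passed in as plain (preference, host) pairs rather than fetched from DNS, and host names are compared as exact strings (no case folding or trailing-dot handling).

```python
import random


def candidate_exchangers(mx_records, this_host=None):
    """Return the MX hosts worth trying, most preferred first.

    mx_records: list of (preference, hostname) pairs for the recipient domain.
    this_host:  name of the relaying host, if it is itself one of the exchangers.
    """
    records = list(mx_records)
    if this_host is not None:
        own = [pref for pref, host in records if host == this_host]
        if own:
            # Never relay to ourselves or to anything equally or less preferred.
            limit = min(own)
            records = [(p, h) for p, h in records if p < limit]
    # Shuffle first so that hosts with equal preference end up in random order;
    # sorted() is stable, so ties keep their shuffled order.
    random.shuffle(records)
    return [host for _, host in sorted(records, key=lambda rec: rec[0])]


records = [(10, "tempo.maclean.com"), (20, "goldengate.reich.com")]
# An arbitrary sender may try both hosts, tempo first.
print(candidate_exchangers(records))
# goldengate itself may only pass the message on to tempo.
print(candidate_exchangers(records, this_host="goldengate.reich.com"))
```

If tempo is down, the second list gives goldengate nothing else to try, which is exactly the situation in which it queues the message and retries later.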
Thus it's crucial to get the MX records right. If your domain has an erroneous MX record in its DNS server configuration, your perfectly configured Sendmail daemon may never see an incoming message. Remote Sendmails (or other transfer agents) may not find out that your host handles mail at all!
To get the very latest version of Sendmail, you may want to download the source package from its home at Berkeley. Compilation and installation of the Berkeley distribution is a relatively smooth operation. The source package includes make-description files tailored for many different systems and a ``build'' script that automatically chooses the correct one. Often one or two simple changes are necessary to the appropriate make-description file to match the configuration of a particular system, but these are usually quite obvious. (See below for where Sendmail and its helpers can be found.)
Berkeley ``db'' is a library for manipulation of indexed data records, such as the aliases file. Sendmail can get by with weaker data management packages (for instance, ndbm) or with none at all. But db does enhance Sendmail's efficiency and robustness.
Sendmail handles SMTP mail transfer directly, but it relies on
other programs to handle other kinds of delivery. In particular,
Sendmail can be configured to use one of several local mail
delivery agents, such as
The Sendmail configuration file, generally named
With very few exceptions, all of these components of the original Sendmail configuration file are hidden by the m4-based configuration macro files (as we'll see below).
For a majority of Sendmail configurations, the m4 macros in
the Sendmail distribution package will suffice. For instance,
having mail from all local hosts ``masquerade'' as though it
Sendmail options are set in its configuration file with the
single-letter command, capital O. In versions before 8.7, all
options had single-letter names. For example, the option A
held the path name of the alias file. Beginning with version
8.7, all options can be referred to by full names. For instance,
the path name of the alias file is now specified by option
To avoid any ambiguity between the older single-letter form and the new full-name form, a space (which may not appear between the O command and the single-letter option being defined) must appear between the O command and the full name. For example, to set the name of the alias file in the old style, use:
whereas with the new style, employ:
m4 configuration file, you need not worry about defining
the alias file name. An operating system specification macro
Note in the preceding example how the arguments of the define command are quoted. First, balanced left and right single-quote marks are used. Second, non-alphabetic characters in a phrase mean that the phrase must be quoted.
It's not feasible to explain each of the many global
configuration options here that can be set within
Address rewriting rules are the essence of Sendmail's power and its complexity. They can be seen as a simple, quite specialized, text-oriented programming language. Two critical tasks that Sendmail performs--rather than being hard-coded in the Sendmail program itself--are expressed in the language of rewriting rules, making it relatively easy to configure Sendmail's behavior very flexibly, without modifying its internal code.
First, Sendmail must examine each recipient's address to determine which of several mail delivery agents should be used to send the message to--or closer to--the recipient.
Second, Sendmail may transform addresses in both the envelope and the message header to facilitate delivery or reply. (This is probably the moment to address a never-ending controversy that dogs Sendmail: RFC-821, which defines the SMTP protocol, disallows mail transfer agents from modifying message header fields, with a couple of exceptions. Sendmail violates this prohibition. However, if one considers Sendmail to be a mail gateway as well as an MTA, its ``offending'' behavior can be justified as essential to its gateway function. Case closed.)
When Sendmail is presented with a message it examines the addresses in the envelope and the header fields (``From:'', ``To:'', ``Sender:'', and so forth). Each address is placed in an area called the ``workspace'', and--depending on whether the address is for a sender or a recipient and whether it came from the envelope or a message header field--certain rule sets are applied to the address in a prescribed order. Also, once the appropriate mail delivery agent is determined for a particular message, an associated rule set is applied.
Rewriting rules are organized into rule sets. A rule set is like a small program consisting of an ordered sequence of rules. The program acts on the address in the workspace, applying each rewriting rule as long as its matching clause matches the address in the workspace. When it does not, the next rule in sequence is tried. (This flow-of-control, such as it is, can be modified very slightly, as explained below.) Viewed this way, a rule set is a function acting on an address, yielding an address.
Rule sets are identified by number, each new rule set beginning with an S followed by its identifying number. Each rule in the set follows. Rules always begin with the letter R. The rule set is terminated when a non-R command is encountered. For example:
    S17
    R$* < @ $=w >       $: $1 < @ ourco.com >
    ------------- ----- ---------------------
          |         |             |
         lhs   one or more       rhs
                  tabs
Rewriting rules appear cryptic, but they are actually conceptually simple (as well as being cryptic!). A rule contains a ``left-hand side'' (lhs), a ``right-hand side'' (rhs) and, optionally, a ``comment,'' separated from each other by one or more tabs. Note that space characters (which can be used to separate tokens for readability) are not valid rule-part separators.
When a rule is applied to the address in the workspace, the left-hand side is compared to the address as a pattern. If the pattern matches, the address in the workspace is replaced by the rule's right-hand side.
The pattern-matching proceeds simply. Ordinary words are
matched literally. Operators, which begin with a dollar sign
    $*     Match zero or more tokens
    $+     Match one or more tokens
    $-     Match exactly one token
    $=x    Match any phrase in class x
    $~x    Match any word not in class x
If an operator matches part of the address in the workspace,
then the matched token(s) are assigned to the positional operator
When a left-hand side pattern match succeeds, the workspace is replaced with the contents of the rule's right-hand side. Analogous to matching, the replacement process copies literal tokens from the right-hand side to the workspace and gives a special interpretation to operators. Some of the recognized right-hand side operators include:
    $n            Substitute the nth matched token from the lhs
    $>n           Call rule set n
    $#mailer      Specify delivery agent, mailer
    $@host        Specify host
    $:user        Specify user
    $( token $)   Look up token in a database
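The match-then-substitute cycle can be imitated with a toy Python model. It is a deliberately reduced sketch, not Sendmail's algorithm: the tokenizer only splits on ``@'', ``.'', ``<'' and ``>'', only the operators `$*`, `$+`, `$-`, `$=x` and `$n` are understood, control prefixes such as `$:` are not handled, and the class contents and the names `tokenize`, `match` and `rewrite` are assumptions made for the example.

```python
import re

# Operators first, then ordinary words, then single-character separators.
TOKEN_RE = re.compile(r"\$[*+\-]|\$=\w|\$\d|[^\s@.<>$]+|[@.<>]")


def tokenize(text):
    """Split an address or a rule side into a list of tokens."""
    return TOKEN_RE.findall(text)


def match(pattern, tokens, classes):
    """Return the token groups matched by each lhs operator, or None."""
    if not pattern:
        return [] if not tokens else None
    head, rest = pattern[0], pattern[1:]
    if head in ("$*", "$+", "$-"):
        low = 0 if head == "$*" else 1
        high = 1 if head == "$-" else len(tokens)
        lengths = range(low, high + 1)
    elif head.startswith("$="):
        members = classes.get(head[2], set())
        lengths = [n for n in range(1, len(tokens) + 1)
                   if "".join(tokens[:n]).lower() in members]
    else:
        # A literal token must match the next workspace token exactly.
        if tokens and tokens[0].lower() == head.lower():
            return match(rest, tokens[1:], classes)
        return None
    for n in lengths:  # try the shortest possible match first
        if n > len(tokens):
            break
        tail = match(rest, tokens[n:], classes)
        if tail is not None:
            return [tokens[:n]] + tail
    return None


def rewrite(lhs, rhs, address, classes):
    """Apply one rule once; return the (possibly unchanged) workspace."""
    groups = match(tokenize(lhs), tokenize(address), classes)
    if groups is None:
        return address  # the rule does not apply
    out = []
    for token in tokenize(rhs):
        if re.fullmatch(r"\$\d", token):
            out.extend(groups[int(token[1]) - 1])  # positional operator $n
        else:
            out.append(token)
    return "".join(out)


# Hypothetical contents of class w (names this host answers to).
classes = {"w": {"zippy", "zippy.ourco.com"}}
print(rewrite("$* < @ $=w >", "$1 < @ ourco.com >",
              "richard<@zippy.ourco.com>", classes))
```

With these assumed class contents, the address ``richard<@zippy.ourco.com>'' comes out as ``richard<@ourco.com>'', while an address whose host is not in class w is left untouched.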
Continuing with our example, if the workspace address is ``email@example.com'' and the current rule is:
then the workspace will be rewritten as ``firstname.lastname@example.org''.
R$* $: $>3 $1
(The $: at the beginning of the right-hand side in this
example is the
``one-time only'' prefix. (See below, too). It
stops Sendmail from applying the rule over and over, which it
would do if not restrained. The left-hand
The mailer, host, and user specification symbols are used to resolve envelope-recipient addresses. These constructs appear only in rule set 0 (or rule sets called by rule set 0), which uses rewrite rules to parse and resolve recipient addresses. For example, after some involved application of rule set 0, Sendmail will at last decide that an address is local and resolve the host (this one), the user (the addressee) and the mailer (whatever local mailer has been configured) with this rule:
R$+ $#local $: $1
In this example the address in the workspace, which
consists of one or more tokens (
The complex token-lookup function (
In addition to the substitution operators, there are two other
operators that have special meanings when they appear as the
first token on the right-hand side. The
The first and most important step in developing a Sendmail configuration is deciding upon a mail-handling strategy. For local networks of reasonable size, a single mail hub system offers centralized administration, coherent e-mail address structure and high levels of reliability, integrity and performance. Even for networks consisting of as few as three or four systems, a mail hub approach makes sense. (In fact, a network consisting of just a single workstation can be viewed as a mail hub and client system rolled into one.)
The mail hub we will configure processes all outgoing messages and acts as the ``post office'' for mail coming in from outside the network. (It acts as a post office for local, intra-network mail as well.) Every user who wants to receive mail has a mailbox on the mailhub machine. Enforcement of acceptable use is centralized, as is technical administration of such tasks as backup and mail system/DNS coordination. With a combination of aliases and rewriting of sender addresses on outgoing mail at the hub, all users in our example network have Internet e-mail addresses of the form ``Firstname_Lastname@ourco.com''. Using ``guessable'' names is desirable (though some disagree), especially in an environment where security or privacy concerns may prohibit open directory services (for example, finger). Using ``domain addressing'' (``ourco.com'' instead of ``zippy.research.ourco.com'') not only hides internal domain structure, but it's simply more handsome. These policies demand some administrative effort, but without a mail hub the fragmented administrative effort required could be greater still.
Client systems (that is, the non-hub systems on the local network) can be configured in a few different ways, each consistent with the overall mail strategy. A ``smart'' client can run a Sendmail daemon that handles idiosyncratic alias processing as well as dispatching mail to the hub. A ``null'' client starts Sendmail locally on a per-message basis, using it simply to pass each message to the Sendmail hub with no local processing. Some hosts, such as those running Eudora on a Macintosh or Windows system, rely on establishing an SMTP connection from the mail user agent (MUA) directly to the mail hub, or use a POP (Post Office Protocol; see Reference 7) connection with a user's mailbox on the hub machine.
The m4 macro processor can be thought of as a translator from
a simple Sendmail configuration language to the opaque native
configuration used in its configuration file
If you work with a Sendmail older than version 8.7, the m4
configuration file you write (or adapt from prototypes and
samples) should be kept in the
Compiling a m4 Sendmail configuration is very simple. Just invoke m4 with the m4-format configuration file as its argument. The standard output, which can be redirected to a disk file, will be the desired Sendmail-style configuration file.
$ m4 mailhub.mc > mailhub.cf
After successful compilation, the ``.cf'' configuration can be
tested using the
As of Sendmail version 8.7, you
can place your configuration
file anywhere and use parameters on the m4 command line to
specify a base include directory. You can also omit the first
    $ m4 -I /usr/src/sendmail/cf /usr/src/sendmail/cf/m4/cf.m4 \
        mailhub.mc > mailhub.cf
Daemon Mode. As a mail hub, Sendmail must be available to handle incoming mail (via SMTP connections) at all times. Sendmail can be invoked at system startup time (or any time) in ``daemon mode.'' It will listen for and process all incoming SMTP connections, creating subprocesses as necessary to complete the mail transfer work.
To start Sendmail in daemon mode, lines like the following are placed conventionally in a system's network or multiuser server startup script, which is found in various places with various names depending on the Unix implementation. Here's an example:
    # Start the Sendmail daemon...
    if [ -x /usr/sbin/sendmail ]; then
        echo "Starting sendmail daemon..."
        /usr/sbin/sendmail -bd -q 15m
    fi
The mailhub.mc File. This configuration file for our mail hub system (mailhub.ourco.com) is not too difficult to understand, yet it fully specifies the behavior of our somewhat customized, powerful mail hub. Let's examine it line-by-line:
    include(`../m4/cf.m4')
    VERSIONID(`mailhub.mc Richard Reich 11 AUG 95')
    OSTYPE(linux)
    FEATURE(nouucp)
    FEATURE(use_cw_file)
    MASQUERADE_AS(ourco.com)
    MAILER(local)
    MAILER(smtp)
    LOCAL_CONFIG
    Kuserdb btree -o /etc/userdb.db
    LOCAL_RULE_1
    R$* < @ ourco.com. > $*	$: $( userdb $1 $) < @ ourco.com. > $2
The first line (
Sendmail must know the names of all hosts or domains that may
receive mail on this system. Otherwise, Sendmail will assume the
mail should be routed to another destination system. The names
can appear on certain command lines in the configuration file, or
they can be read from a separate file. The
The last four lines of our mailhub configuration file have a potent impact on sender addresses in outgoing mail:
    LOCAL_CONFIG
    Kuserdb btree -o /etc/userdb.db
    LOCAL_RULE_1
    R$* < @ ourco.com. > $*	$: $( userdb $1 $) < @ ourco.com. > $2
A complete introduction to address rewriting is beyond the
scope of this article, but this simple example may show that the
subject is not completely incomprehensible. The macros
Sendmail configuration statements usually begin with a single
upper case letter that specifies a particular Sendmail command.
In this case, the ``K'' command directs Sendmail to open a keyed
mapping file (
Rewriting rules begin with the command letter ``R''. The rule
in this example means: if a sender address is of the form
For example, if the userdb file has this record in it:
then the actual local address ``email@example.com'' in outgoing mail will be rewritten as ``Richard_Reich@ourco.com''.
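The effect of this lookup-and-rewrite rule can be mimicked in Python with an ordinary dictionary standing in for the keyed userdb file. This is a sketch under assumptions: the function name `masquerade_sender` is invented, the mapping from the login name ``richard'' to ``Richard_Reich'' is assumed for illustration, and an address whose user has no entry is simply left unchanged.

```python
def masquerade_sender(address, userdb, domain="ourco.com"):
    """Rewrite user@domain using a keyed lookup on the local part."""
    local, at_sign, host = address.partition("@")
    if not at_sign or host.lower() != domain:
        return address  # not an address in our domain: leave it alone
    # A failed lookup returns the key itself, so unknown users pass through.
    return userdb.get(local, local) + "@" + domain


# Hypothetical userdb contents: login name -> public mail name.
userdb = {"richard": "Richard_Reich"}

print(masquerade_sender("richard@ourco.com", userdb))
print(masquerade_sender("guest@ourco.com", userdb))
print(masquerade_sender("pete@maclean.com", userdb))
```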
The last step: POP or NFS. The function of the mail hub is to deliver all mail for the entire local network into recipients' mailboxes resident on the mail hub system. The last step is to get the mail to the recipients.
One very popular solution consists of a Post Office Protocol (POP) Server on the mail hub system, which retrieves mail when asked by a mail user agent (for example, Eudora, Z-Mail). Some POP servers and clients can negotiate the sending of outgoing mail as well.
Another common way to get users and their hub mailboxes together is to allow client systems to mount the mailbox's directory via NFS (Network File System). A mail user agent, via soft links or environment pointers, sees its mailbox file as though it were local to its own system. Care must be taken whenever NFS is used, however, to maintain user mailbox privacy and system security.
A null client m4 configuration file consists of the macro
Sendmail is freely distributed. You may be able to find precompiled versions for your Unix version. For instance, a Linux version is available via anonymous FTP from Sunsite at the University of North Carolina at Chapel Hill. However, the authoritative source version is available via anonymous FTP from U.C. Berkeley. It's usually little or no trouble to compile and to get running.
Freely available m4 can be obtained from GNU's anonymous FTP archive. The current version is 1.4, which does not change frequently.
The Sendmail Usenet newsgroup is comp.mail.sendmail. The discussions are lively and participants offer timely help to those with problems. Eric Allman--the author of Sendmail--has frequently contributed answers and announcements of new versions. (Eric, who has left Berkeley for a new job, may not be able to commit as much time and energy to Sendmail as he has in the past.)