The Technical Development of Internet Email

Craig Partridge
BBN Technologies

                                       Development and evolution of the technologies and standards for
                                       Internet email took more than 20 years, and arguably is still under
                                       way. The protocols to move email between systems and the rules for
                                       formatting messages have evolved, and been largely replaced at least
                                       once. This article traces that evolution, with a focus on why things
                                       look as they do today.

The explosive development of networked electronic mail (email) has been one of the major technical and sociological developments of the past 40 years. A number of authors have already looked at the development of email from various perspectives.1 The goal of this article is to explore a perspective that, surprisingly, has not been thoroughly examined: namely, how the details of the technology that implements email in the Internet have evolved.

This is a detailed history of email’s plumbing. One might imagine, therefore, that it is only of interest to a plumber. It turns out, however, that much of how email has evolved has depended on seemingly obscure decisions. Writing this article has been a reminder of how little decisions have big consequences, and I have sought to highlight those decisions in the narrative.

Architecture of email

In telling the story of how email came to look as it does today, we start by describing (in broad strokes) today’s world, so that the steps in the evolution can be marked more clearly.

Today’s email system can be divided into two distinct subsystems. One subsystem, the message handling system (MHS), is responsible for moving email messages from sending users to receiving users, and is built on a set of servers called message transfer agents (MTAs). The other subsystem, which we will call the user agent (UA), works with the user to receive, manage (e.g., delete, archive, or print), and create email messages, and interacts with the MHS to cause messages to be delivered. Readers may recognize this terminology as being roughly that developed by the X.400 email standardization process.

Each subsystem internally has a rich set of protocols and services to perform its job. For instance, the UA typically includes network protocols to manage mailboxes kept on remote storage at a user’s Internet service provider or place of work. The MHS includes protocols to reliably move email messages from one MTA to another, and to determine how to route a message through the MTAs to its recipients.

The UA and MHS must also have some standards in common. In particular, they need to agree on the format of email messages and the format of the metadata (the so-called envelope) that accompanies each message on its path through the network.

The focus of this article is how these different pieces incrementally came into being, why each one emerged, and how its emergence affected the larger email system. In the interests of space, this survey stops around the end of 1991. That termination date leaves out at least four stories: (1) the development of graphics-based user interfaces for personal computers and the incorporation of those interfaces into web browsers; (2) the rise of UA protocols such as the Post Office Protocol (POP)2 and IMAP3 (these protocols existed prior to 1991, but much of their evolution occurred later); (3) the continuing efforts to further internationalize email (e.g., allowing non-ASCII characters in email addresses); and (4) the rise of unwanted email (dubbed “spam”) and tools that sought to diminish it. Furthermore, in the interests of space, I do not consider the development of technical standards for the support of email lists.

First steps

Electronic mail existed before networks did. In the 1960s, time-shared operating systems

IEEE Annals of the History of Computing      Published by the IEEE Computer Society    1058-6180/08/$25.00 © 2008 IEEE

developed local email systems delivering mail between users on a single system.4 The importance of this work is that email requires a certain amount of local infrastructure. There needs to be a place to put each user’s email. There needs to be a way for a user to discover that he or she has new email. By the early 1970s, many operating systems had these facilities.

In July 1971, Dick Watson of SRI International published an Internet Request for Comments5 (RFC-196) describing what he called “A Mail Box Protocol.” The idea was to provide a mechanism where the new Network Information Center (NIC) could distribute documents to sites on the Arpanet. Watson described a way to send files (documents) to a teletype printer, with different mailboxes for different types of printers. Mailbox 0 was a teletype

    assumed to have a print line 72 characters wide, and a page of 66 lines. The new line convention will be carriage return (X'0D') followed by line feed (X'0A') … The standard printer will accept form feed (X'0C') as meaning move paper to the top of a new page.6

Ray Tomlinson of Bolt Beranek and Newman (now BBN Technologies or BBN) read Watson’s memo and reacted that “it was overly complicated because it tried to deal with printing ink on paper with a line printer and delivered the paper to numbered mailboxes.”7 In Tomlinson’s view, the correct approach was to send documents to a user’s electronic mailbox and let the user decide if the document merited printing.8 So Tomlinson set out to see if he could send email this way between two TENEX systems9 over the Arpanet. His approach was simple.

TENEX already had a local email program called SNDMSG,10 which, given a message, appended that message to a file called MAILBOX in a user’s directory. TENEX also had a homegrown file transfer service called CPYnet (written by Tomlinson). In a passive mode, CPYnet listened at a particular address for requests to read, write, or append to a particular local file. Email was achieved by incorporating CPYnet into SNDMSG. If SNDMSG was given a message addressed to a user at a remote host, it opened a CPYnet connection to the remote host and instructed CPYnet to append the message to the user’s mailbox on that host.

Users learned that they had received network email the same way they learned they had received local email. In TENEX, they got a “You have mail” message when they logged in. Mail was read by viewing or printing the mailbox file, usually with the TYPE command. (Almost immediately, TYPE MAILBOX was replaced with a TENEX macro, READMAIL.) Messages were deleted by deleting the relevant lines with a text editor.

Tomlinson made two important contributions. First, he found a way to express the networked email address. He chose to use the “@” sign to divide the user’s account name from the name of the host where the account resided, resulting in the now ubiquitous user@remote format.11 Second, SNDMSG was the first MTA: it took a message and delivered it (using the CPYnet protocol) to a remote user’s mailbox.

Observe that the last contribution is a surprise. We might imagine that the first program was more of a user agent (UA) than a message transfer agent (MTA). But SNDMSG could only deliver mail; it could not receive mail, and it delivered the email all the way to the recipient’s mailbox. Therefore, SNDMSG was much closer in spirit to an MTA (and, indeed, as we shall see, was used as an MTA for a number of years). At the same time, SNDMSG was primitive. If there were multiple email recipients on the same host, it copied the message once for each recipient. If the remote host was down, SNDMSG simply returned a failure message; it made no effort to retransmit.

Despite its primitive nature, Tomlinson’s creation took off. The next few years saw it mature from a fun idea to a central feature of the Arpanet (and later the Internet).

From primitive to production

By late 1973, email was widely used on the Arpanet. What happened after Tomlinson’s experiment to make this happen? Obviously, email met a need. But there were also technical steps: standardization of the transfer protocol and the development of user interfaces.

A standard transfer protocol

First, the community replaced CPYnet with a standardized file transfer service, the first generation of the File Transfer Protocol (FTP). This process took a while. In 1971, FTP was simply a set of rather complex ideas written up in a set of RFCs by a team led by Abhay Bhushan of the Massachusetts Institute of Technology (MIT).12 The goal behind these ideas was to create a general tool to manage files (including deleting and renaming files) on

remote machines and to do it in a way that met the needs of any envisioned application.13

At the same time, Dick Watson’s mailbox idea was continuing to mature. In November 1971, a team including Watson proposed a way to enhance (the still nascent) FTP with an explicit MAIL command to support appending a file to a mailbox. They further proposed that email be simply ASCII strings of text (no binary images) and that mailbox numbers be replaced with text user identifiers. The identifiers were “NIC handles.” NIC handles were given out by the Network Information Center to authorized network users (and were used as login IDs on Arpanet terminal servers, called TIPs). This idea, of course, meant that every host would need to maintain a table mapping NIC handles of local users to the location of their mailbox file. Retaining Watson’s original idea of accessing a printer, the MAIL command could be given the name “Printer” instead of a NIC handle and the file would be printed.

Concurrently, Tomlinson distributed SNDMSG to other TENEX systems and people began to get hands-on experience with email. TENEX was the most common operating system on the Arpanet at the time, and so probably at least half the Arpanet users had access to SNDMSG.

In April 1972, most of the interested parties, including both Tomlinson and Watson, met at MIT to discuss revisions to the File Transfer Protocol. The meeting made several decisions, at least one of which proved to have a long-term impact: the group agreed to use text (ASCII) commands and replies (previous versions of FTP had used binary commands) to aid interactive use.14 To this day, the Internet uses text commands to transfer email (and the tradition lives on in much later protocols, such as the Web’s transfer protocol, HTTP). A new version of the FTP specification, based on these ideas and written by Bhushan, came out in July 1972.15

The new specification envisioned that email would be delivered via the APPEND command, which appended data to a file. Discussions about FTP and email continued, however, and a month later, Bhushan issued a revision to the FTP specification16 to include a new command, MLFL (Mail File). It is said Bhushan came up with MLFL because, one evening while he was writing the revision, a fellow graduate student at MIT stopped by to suggest that a better solution was required for email.17

MLFL took one argument, a user id, which could either be a NIC handle or a local user name (local to the remote host). The user id could also be left out, in which case the mail was to be delivered to a printer. After the MLFL command was accepted, the email file was transmitted over an FTP data channel (with the end of the file indicating the end of the message). The file was required to be in ASCII. A separate copy of the file was sent for each recipient at a host.

MLFL was an important step. A key flaw in Tomlinson’s prototype email was that you had to know where in the receiving host’s file system a user’s mailbox was located, so that you could append to it.18 This limitation probably explains why most of the email activity in 1971 and 1972 appears to have taken place between TENEX systems, where the file name for the mailbox was consistent. MLFL adopted Watson’s notion that mailboxes are symbolic names that the receiving system translates into an appropriate user mailbox file and thereby freed email from system-specific limitations.

An interactive command, MAIL, was also defined, so that users logged into a TIP could type in an email message using only FTP’s control connection. In this case, a line with a single dot (“.”) on it marked the end of the message. Ending a message with a single dot is still how email is moved over the Internet today.

The MAIL and, more important, MLFL commands remained the way email was delivered between systems for several years.

In the fall of 1972, Bob Clements of BBN updated SNDMSG to use the new commands. Several other email-cognizant FTP implementations appeared. The most notable is probably the system for MIT’s Multics. Ken Pogran wrote the FTP implementation and Mike Padlipsky wrote the NETML program that handled email.19 Multics was exceptional for the time because it had good security, including user file privileges, so Padlipsky had to invent a special user (ANONYMOUS) to receive email and distribute it to users.20 The concept of an anonymous login account caught on as a way to permit FTP access to users who did not have an account and remains a central feature of FTP to this day.

First user agents

The second development of 1972 and 1973 was the creation of tools to create and manage email. Here the center of innovation was within the Advanced Research Projects Agency (ARPA) itself. Larry Roberts, head of the ARPA office funding Arpanet, was an early and aggressive user of email. Early in 1972, Stephen

                                                                                                         April–June 2008   5

Lukasik, the head of ARPA, also began using email and that induced a number of others, including the ARPA department heads, to use email too.21

Soon Lukasik became frustrated with READMAIL, which forced him to read through all the messages in his mailbox in order. Lukasik liked to keep copies of email he received, which made the problem worse. He appealed to Roberts for something better.

One night in July, Roberts wrote a tool using macros for the TECO (Text Editor and COrrector22) text editor to manage a mailbox.23 The tool was dubbed RD. RD made it possible to list the messages in the mailbox, to pick which message to read next, and to print individual messages.

Roberts’ colleague at ARPA, Barry Wessler, promptly rewrote RD as a standalone program in the programming language SAIL and added additional features for usability. Improvements in Wessler’s “New RD” or NRD included the ability to manage more than one file of messages, and mechanisms to file, retrieve, and delete messages. RD and NRD were the first mailbox management tools, the first true user agents.

Wessler’s NRD was not distributed outside ARPA. (RD was.) In early 1973, Martin Yonke was a graduate student intern at the University of Southern California’s Information Sciences Institute (ISI) and looking for something to do. Steve Crocker of ARPA gave Yonke a copy of Wessler’s code (which ran on TENEX) and suggested Yonke look at improving it. Yonke added command completion (type the first letter or two of a command and the rest of the name would be filled in) and a help interface. A user could type a question mark in most places in a command to learn what the choices were. The revised NRD was dubbed BANANARD.24 (At the time, “banana” was technical slang for “cool” or “better.”) Yonke distributed and maintained BANANARD for a bit less than a year, although it remained in use for several years more.

Among the amusing stories from that year, one concerned mailbox sizes: BANANARD kept an index of messages in a file, so Yonke had to estimate how big the index (which was read into memory) might be. Yonke estimated the largest possible mailbox size, doubled that, and concluded that assuming a mailbox was never larger than 5,000 messages was safe. Within a few months, Steve Crocker exceeded the limit. So did John Vittal.25

One challenge in RD and NRD was the lack of a standard format for email messages. Headers varied. It was hard to find where one message ended and the next one started. Wessler remembers trying to get NRD to find the start of headers, but it was too hard because messages routinely had other messages embedded in them. Therefore, NRD (and RD and BANANARD) relied on the receiving system to place a start-of-message delimiter before each message in the mailbox.26 The delimiter had four SOH (Start of Heading, also known as Control-A) bytes followed by information about the message (initially just a byte count, later somewhat more information).27 In one of those odd quirks, part of the start-of-message delimiter has lived on. While some present-day email systems parse for a header, others still expect messages separated by a line with four consecutive SOH bytes.

Transitions

In March 1973, another meeting of people working on FTP was held, to try to clarify issues lingering from the April 1972 meeting. It marked a subtle transition.

Originally, clarifying and improving the support for email in FTP was part of the agenda.28 Yet the meeting was ambivalent about the relationship between FTP and email. Prodded by a late-in-the-meeting arrival of ARPA’s Steve Crocker, who asked how they were doing on email support, the group decided to formally incorporate the MLFL and MAIL commands into the new specification29 (recall that the commands had previously been in a separate addendum). Between the meeting and the issuance of the new FTP specification, it was decided that email should really be a separate, auxiliary protocol.30 Email had become important (or complex) enough to merit distinction.
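The interactive MAIL command that the group folded into the specification ended a typed message with a line containing only a single dot. As a minimal sketch of that convention (not a reconstruction of any historical implementation), the following Python function reads text lines until it meets the terminating dot line. The function name, the sample message, and the decision to accept either CR LF or bare LF line endings are assumptions made for illustration, and the sketch makes no attempt to handle a message whose own text contains a line consisting of a single dot.

```python
import io


def read_dot_terminated(stream):
    """Collect text lines until a line holding only "." is seen."""
    lines = []
    for raw in stream:
        line = raw.rstrip("\r\n")  # accept CR LF or bare LF line endings
        if line == ".":
            return "\n".join(lines)
        lines.append(line)
    raise ValueError("input ended before the terminating dot line")


# Hypothetical sample input: two message lines, the dot line, then text
# that arrives after the message has already ended.
typed = io.StringIO(
    "Hello Ray,\r\nThe link is up.\r\n.\r\nnot part of the message\r\n"
)
body = read_dot_terminated(typed)
```

Using an in-band marker means the sender needs no second connection and no advance byte count, which suits a user typing at a terminal; the cost is that the marker itself cannot appear as a line of the message without some further convention.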

Second, the community was shifting. Although both meetings had over 20 attendees, they were different sets of people. Only five people31 attended both meetings.32 Abhay Bhushan, who had been driving the development of and writing the specifications for FTP, would soon move on to other things. Nancy Neigus of BBN wrote the new FTP specification.

The research focus was also changing. By year’s end, Larry Roberts (probably email’s most important early adopter) would leave ARPA, and under his successor, Bob Kahn, ARPA’s networking focus would change to developing networks over media other than telephone wires (e.g., satellites and radios) and the problems of interconnecting those networks.

Finally, at least from a standards perspective, the protocol for delivering email entered a kind of limbo. The auxiliary protocol specification for email envisioned in the new FTP specification never appeared. After three years, Jon Postel wrote a two-page memo that never appeared online, documenting the by-then well-established practice of using MAIL and MLFL. The memo suggests some sites had not bothered to update their FTP from before the 1973 FTP meeting.33 There were multiple attempts to allow FTP to send a single copy of a message to multiple recipients. All of them apparently failed.34 It would take seven years from the FTP meeting before the community seriously returned to the problems of a new email protocol.35 Innovation over the next few years would come from user agents and a long-running debate over the format of email messages, especially email headers.

Rise of the user agent

In early 1974, John Vittal worked in the office next door to Martin Yonke’s office at ISI. Vittal had helped Yonke with BANANARD, and about the time Yonke stopped working on BANANARD so he could finish his graduate degree, Vittal took a copy of the code and began to think about building an improved user agent.

MSG

Vittal called his new program MSG. In it he sought to write a user agent that was simple yet did all the things a user needed it to do. It had roughly the same functionality as BANANARD, but the structure of its commands reflected feedback Vittal sought out from users about how they wanted to manage their email. MSG was a personal effort by Vittal (writing code on nights and weekends), and when he left ISI for BBN in 1976, he took MSG with him.

MSG was, in fact, surprisingly simple. It was a stand-alone program with its own set of commands. There were just 30 commands, named such that their first letter uniquely identified all but six. Combined with a command-completion scheme, this usually-unique-on-first-letter approach permitted concise typing by experienced users. (Many early computer users were hunt-and-peck typists, so keeping commands to a letter or two in length was a big time-saver.)

Of these 30 commands, several were new from BANANARD. Some were minor, such as a command to toggle the user interface between a concise and a verbose mode. However, three commands reflect important changes:

• Move reflected Vittal’s attention to user behavior. He noticed that one of the most common activities was to save a message in a file and then delete the message from the inbound mailbox. Vittal created the combined Save/Delete command, Move.
• Answer (now usually called “reply”) is widely held to be Vittal’s most insightful and important invention. Answer examined a received message to determine to whom a reply should be sent, then placed these addresses, along with a copy of the original SUBJECT field, in a responding message. Among the challenges Vittal had to solve were the varying email-addressing standards and what options to give a user (reply to everyone? reply only to the sender of the note?). It took three implementations to get right.36
  The wonder of Answer is that it suddenly made replying to email easy. Rather than manually copying the addresses, the user could just type Answer and reply. Users at the time remember the creation of Answer as transforming: it converted email from a system of receiving memos into a system for conversation. (There are anecdotal reports that email traffic grew sharply shortly after Answer appeared.37)
• Forward provided the mechanism to send an email message to a person who was not already a recipient. How much of an innovation Forward was is unclear. Barry Wessler had to struggle with messages embedded in messages in NRD. But the formalization of the idea was new.

MSG became the Arpanet’s most popular user agent and remained so for several years.
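The command-completion scheme described for BANANARD and MSG amounts to prefix matching against the command table: a typed prefix is accepted once it selects exactly one command. The Python sketch below illustrates the idea only and is not MSG’s actual code. Move, Answer, and Forward come from the text; Delete and Date are hypothetical entries added to show a first letter that is not unique, and the function names are likewise assumptions.

```python
# Move, Answer, and Forward are named in the text; Delete and Date are
# hypothetical additions used only to show an ambiguous first letter.
COMMANDS = ["Answer", "Date", "Delete", "Forward", "Move"]


def complete(prefix, commands=COMMANDS):
    """Return the one command starting with prefix, or None if the
    prefix is empty, matches nothing, or matches several commands."""
    wanted = prefix.lower()
    matches = [name for name in commands if name.lower().startswith(wanted)]
    if wanted and len(matches) == 1:
        return matches[0]
    return None


def shortest_unique_prefix(name, commands=COMMANDS):
    """Return the fewest leading letters that select name unambiguously."""
    for length in range(1, len(name) + 1):
        if complete(name[:length], commands) == name:
            return name[:length]
    return None  # name is absent, or is itself a prefix of another command
```

With this hypothetical table, a single letter is enough for Answer, Forward, and Move, while Delete and Date each need a second letter, which mirrors the situation the article describes of a few commands not being unique on their first letter.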


                       Hermes and MH                                       MS was a user agent for the Unix operating
                          About the same time Vittal was starting          system (apparently the first Unix user agent).
                       work on MSG, Steve Walker at ARPA created a         MS was funded by Steve Walker at ARPA and
                       new committee called the “Message Services          was created by William Crosby, Steven Tepper,
                       Committee,” charged with thinking about             and Dave Crocker.42 MS’s defining
                       email issues. Its focus was on user agents (Al      characteristic appears to have been that it supported
                       Vezza of MIT remembers a push to get user           multiple user interfaces, including one that
                       agents to support command completion) and           sought to mimic a Unix command shell and
                       email headers. In the summer of 1975, Walker        another that mimicked MSG.
                       also created the MsgGroup mailing list, to              Soon after MS was working in 1977, Stock
                       encourage greater discussion.38                     Gaines and Norm Shapiro of RAND wrote an
                          Motivating these efforts was an ARPA             internal memo suggesting that MS was incon-
                       program called the Military Message Experi-         sistent with the style of other Unix pro-
                       ment (MME) to make email into a useful              grams.43 Unix encouraged the use of many
                       service to the military. As part of this program,   small programs, each of which did something
                       between 1975 and 1979, ISI, BBN, and MIT (in        well, and the creation of metaprograms by combining
                       an advisory role) sought to create user agents      the small programs together using a mecha-
                       designed for the needs of the military. The         nism called ‘‘pipes.’’44 Gaines and Shapiro
                       initial goal was a system for personnel at the      suggested the same approach for email: a set
                       office of the Navy Commander in Chief for the       of small programs that managed email, where
                       Pacific (CINCPAC).39 In a related effort, RAND      email messages were stored as separate files in
                       Corporation was funded to develop a Unix            a user’s directory.
                       email user agent.40                                     Two years after the memo, a new RAND
                          Hermes (a BBN project) and MH (at RAND)          employee, Bruce Bordon, was assigned to
                       were products of this program. Another sys-         upgrade MS. He recommended to his manage-
                       tem, called SIGMA, was developed by ISI for         ment that rather than upgrade MS, he should
                       CINCPAC but never used elsewhere. They illus-       implement Gaines and Shapiro’s idea. The
                       trate some of the diversity of user agents of the   result was MH.
                       time. (An interesting side note is that John            The virtue of MH is that it makes email part
                       Vittal worked on both SIGMA and Hermes,             of the user’s larger environment.45 Output of
                       while continuing his work on MSG. So Vittal’s       email display programs can be filtered through
                       personal project was competing with the in-         search programs such as grep or simply sent to
                       house official product. At both ISI and BBN,        the printing program. MH, in some ways,
                       MSG won.)                                           anticipated today’s world, where clicking on
                          Hermes was designed for an office (or            an attachment opens the correct program.
                       command) environment where much of the              Culturally, in Unix, rather than clicking on an
                       email received was kept for reference. It           attachment, one pipes data from one program
                       contained a sophisticated set of mechanisms         to the next to produce the desired result.
                       for filing and searching for messages, including        Because MH puts every message in a
                       a database that recorded key fields from each       separate file in a folder (directory), it is easy
                       message to make searches fast. Hermes also          to manipulate both individual messages and
                       provided a high degree of customization.            folders. Accordingly, MH (unlike MS46) has
                       Readers could create a template of how              powerful tools to sort folders and to search,
                       messages should be displayed, how they should       mark, and label messages.
                       be printed, and even how they should be                 Through most of the 1980s, MH was
                       created (what fields a user should be prompted      maintained by Marshall Rose, with help from
                       for). To support this customization, Hermes         a number of people, most notably John
                       had a per-user configuration file (called a         Romine, Jerry Sweet, and Van Jacobson.47
                       profile) remembered as having been large and        Others have picked up the task since and MH
                       complex, though documentation suggests it           (much evolved in its code, but still recogniz-
                       was far simpler than the MH profile file became     able as Bordon’s suite of programs) continues
                       by the mid-1980s.41 Initially known as the          to be widely used today.
                       MAILSYS project, the Hermes team at various
                       times included Jerry Burchfiel, Ted Meyer,          Message formats and headers
                       Austin Henderson, Doug Dodds, Debbie                   When Ray Tomlinson sent his email be-
                       Deutsch, Charlotte Mooers, and John Vittal.         tween TENEX systems, he used a format similar
                          MH (‘‘Mail Handler’’) was the successor and      to a business memo. But there was no standard
                       response to an earlier RAND system, called MS.      format for email messages, and creating and

8   IEEE Annals of the History of Computing
revising standards for email message formats         line)? Or was it ‘‘PDL@MIT-DMS’’ (picking up
would consume a tremendous amount of                 the host from the ‘‘From: JFH@MIT-DMS’’
effort over the next several years.                  elsewhere in the header)?
                                                        Various mail programs adopted different
                                                     such ‘‘abbreviations’’ which drove me crazy.
First message format standard
                                                     … To handle all of this protocol chaos, I wrote
       Abhay Bhushan, Ken Pogran, Ray Tom-           (and rewrote, and tweaked) a sizable (for a
linson, and Jim White (of SRI) took the first        LISPish world) chunk of code to try to deduce
step to standardize email headers in RFC-561,        the precise meaning of each message header
published in September 1973.48 Their proposal        contents and semantics based on where the
was mild. Every email message should have            message came from. Different mail programs
three fields (FROM, SUBJECT, and DATE) at the        had different ideas about the interpretation of
start. Additional fields were permitted, one per     fields in the headers.
line, with each line starting with a single word        That code first tried to figure out where an
                                                     incoming message had come from. This was
(no spaces) followed by a colon (:). The end of
                                                     not so obvious as it might seem because of
this header section was marked by a single           redistribution and forwarding of messages,
blank line, after which came the contents of         and differences in behavior of various versions
the message.                                         of the other guy’s software. So it wasn’t
    The proposed standard was forward looking        enough to just look to see if you were talking
even as it lacked some basic features. The           to MIT-MULTICS. I remember having condi-
ability to make any word into a header field         tional clauses that in essence said ‘‘If I see a
was progressive and left plenty of room for          pattern like such-and-such in the headers, this
experimentation. The date field was surpris-         is probably a message from version xx.yy of
                                                     Ken Pogran’s Multics mailer.’’ With enough
ingly precise, specifying the time to the
                                                     such tests, it formed an opinion about which
minute and the time zone. The blank line
                                                     mail daemon it was talking with, and which
after the header remains a feature of email          mail UI program had created a message.
today. Yet there was no TO field, so a recipient        Having hopefully figured out the other
wouldn’t necessarily know who else was to            guy’s genealogy (and therefore protocol dia-
receive the message, and, while use of the @         lect), the code then acted based on a painfully
sign was already common, the address format          collected set of observations about how that
required using the word ‘‘at,’’ as in TOMLIN-        system behaved.52
SON AT BBN-TENEX, with the odd conse-
quence that for several years, people would           RFC-680 is notable for documenting the
send emails using ‘‘at’’ in the FROM (and soon,    increase in header fields that had taken place
TO) field and yet within the message itself list   over two years. It defined a number of widely
their email address with an ‘‘@.’’                 used but not standardized header fields,
                                                   including, most notably, the TO field, but also
Partial progress                                   CC (carbon copy), BCC (blind carbon copy), IN-
   In 1975, a team of people working on email      REPLY-TO, SENDER, and MESSAGE-ID. Introduction
systems at BBN sought to update RFC-561 with       of the TO field meant a format needed to be
RFC-680.49 The work was produced under the         chosen for sending to multiple recipients. The
auspices of ARPA’s Message Services Commit-        proposal called for multiple email addresses in
tee.50 The RFC authors were Ted Meyer and          a field separated by commas. The RFC also
Austin Henderson, but email on the                 documented the use of @ instead of ‘‘at.’’
MsgGroup mailing list suggests Charlotte               RFC-680 was a clear step forward from RFC-
Mooers51 also played a major role. RFC-680         561. Still, RFC-680 had limitations. It was
set out to document a large number of fields,      based on practices on TENEX systems, which
many of which were already in widespread but       were not always representative of the Arpanet
informal use, and to standardize their formats     community as a whole. (For example, the
in a way that computer programs (e.g., user        decision to separate addresses in the TO field
agents) could easily parse.                        with commas was a TENEX convention.) Its
   That the header standard needed updating        syntax had bugs (it unintentionally permitted
was becoming increasingly clear. Jack Haverty      ‘‘@’’ and comma in mailbox names). Further-
offered the following example from his time        more, pragmatically, RFC-680, while intended
maintaining the MIT-ITS mailer.                    to become a standard, was never officially
                                                   issued as a standard.53
  [A] field like ‘‘To: PDL, Cerf@ISIA’’ was            In addition, RFC-680 revealed a philosoph-
  ambiguous: was ‘‘PDL’’ really ‘‘PDL@ISIA’’       ical split between members of the Message
  (picking up the host from the end of the         Services Committee. The MIT members (Vezza


                       and Haverty) felt email headers were primarily         The MsgGroup discussion raised two issues.
                       of use to the email handling programs and          First, was the new RFC going to cause much
                       should be designed to be machine-readable.         longer message headers that users would have
                       Others felt that headers should focus on being     to see? Second, wasn’t the major issue simply a
                       human readable. RFC-680 tried to strike a          desire to embed users’ real names into TO and
                       compromise, which apparently pleased nei-          FROM fields and, in that light, were all the other
                       ther side.54                                       header fields necessary? The conclusion was
                          The result was confusion. Some sites up-        that extra header information simply reflected
                       dated their mailers to conform to RFC-680          the reality of what had already happened, and
                       while others continued to follow RFC-561.          the desire not to see them pointed to a need for
                                                                          user agents to edit header information, and
                       A new standard                                     that yes, adding names mattered.
                          Sometime in 1976, the Message Services              The Header-People debate was rooted in
                       Committee was replaced by the ARPA Com-            specification details. The best example of the
                       mittee on Human-Aided Communication.55             tenor of discussion is a multiday argument
                       One of the new committee’s early actions was       (rich with ad hominem remarks) about wheth-
                       to seek to clarify the state of standards for      er to use 12-hour or 24-hour times in the DATE
                       email message formats. A vigorous email            field, with much debate about whether
                       discussion on the Header-People mailing list       ‘‘12am’’, ‘‘12pm’’, or ‘‘12m’’ was the correct
                       in the fall of 1976 led to a new proposed          abbreviation for midnight. The upshot was to
                       standard in RFC-724 (‘‘Proposed Standard for       eliminate support for 12-hour times.58
                       Message Format’’) written by Ken Pogran                The result was RFC-733, a revision (by the
                       (MIT), John Vittal (now at BBN), Dave Crocker,     same authors) of RFC-724. The major improve-
                       and Austin Henderson.56 It came out in early       ment in the revision (beyond the date field)
                       1977.                                              was a clear statement of how to include names
                          The RFC-724 authors, like the RFC-680           with email addresses. The format was to put
                       authors, sought mostly to document current         the email address in angle brackets (<, >) as in
                       practice. Vittal nicely summarized the goals as:   ‘‘David H. Crocker’’ <crocker@rand-unix>,
                                                                          and if the text before the brackets contained
                         to take RFC680 plus what we felt were things     any special characters such as punctuation or
                         which people were already doing that were        control characters, it had to be in quotes. The
                         useful to most, take out some things that        RFC also made clear that mailing lists looked
                         weren’t terribly useful and probably shouldn’t   like any other mailbox.59 Issued in November
                         have been in 680 in the first place, and come    1977, RFC-733 was the official standard for
                         up with a new specification. There were          message formats for five years, and a de facto
                         several things that some systems were already    standard well into the mid-1980s.
                         doing: comments (e.g. the day of week in
                         parentheses), association of people names        Today’s standard
                         with user names (like at places like Stanford,
                                                                             In 1982, as the email community was
                         CMU and MIT, also using parenthesization),
                         random date format preferences (Multics vs
                                                                          preparing to transition to the Internet, the
                         Tenex, etc.), and so on. Elements of 680 which   authors of RFC-733 were asked to update it.
                         were not perceived as necessary were mostly      The authors of 733 had several conversations
                         the military-like field names such as prece-     about what the changes should be, but only
                         dence, as well as syntactic inconsistencies      Dave Crocker (who had become a graduate
                         (bugs), and syntactic limitations. These could   student at the University of Delaware) had the
                         all be accomplished by using the notion of       time to undertake the revisions. Several fea-
                         user-defined fields.57                           tures of RFC-733 that had failed to win popular
                                                                          acceptance were deleted, and three new fields,
                       RFC-724 defined a text-only message format.        FORWARDED, RESENT-FROM, and RESENT-TO, were
                       The message header and contents were ASCII.        added (to support the common practice of
                       The authors observed that, at some point in        forwarding an email message to someone else).
                       the future, clearly email would use richer            A more startling feature (in retrospect) was
                       binary formats, but that was beyond the            the addition of the RECEIVED field. RECEIVED is
                       immediate need.                                    odd because it, alone of all the fields in the
                          The new RFC provoked a tremendous               message header, was created by MTAs rather
                       amount of debate on Header-People and a            than UAs. Every MTA was required to insert a
                       more focused (and very distinct) discussion on     RECEIVED field into the message, to track the
                       MsgGroup.                                          message’s path through the network. Looking

back, this is an odd and subtle architectural        and they used it.64 By the mid-1970s, imple-
change that made MTAs responsible for                menting an MTA was getting harder, not
understanding the format of messages, which          because email had become more difficult, but
previously (ignoring the practical problem of        because the profusion of slightly different
address rewriting; see the next section) MTAs        MTAs meant that everyone’s MTA had to be
had not needed to understand.                        programmed to deal with the differences.
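
The RECEIVED mechanism described above can be illustrated with a short Python sketch. It assumes each MTA adds its RECEIVED line at the top of the header, so the newest hop appears first. The host names, time stamps, and the helper names add_received and trace are invented for illustration and are not taken from RFC-822.

```python
from email import message_from_string


def add_received(message_text, from_host, by_host, stamp):
    """Return the message with one more RECEIVED line at the top.

    Hypothetical helper standing in for what an MTA does on each hop.
    """
    return f"Received: from {from_host} by {by_host}; {stamp}\n" + message_text


def trace(message_text):
    """List (from_host, by_host) pairs in the order the message travelled."""
    msg = message_from_string(message_text)
    hops = []
    for value in msg.get_all("Received", []):
        # value looks like "from host-a by relay-b; 12 Aug 82 10:02 EDT"
        words = value.split(";")[0].split()
        hops.append((words[1], words[3]))
    hops.reverse()  # the newest line is first in the header
    return hops


# Invented example message and hosts
original = "From: sender@host-a\nTo: reader@host-c\nSubject: test\n\nHello.\n"
at_relay = add_received(original, "host-a", "relay-b", "12 Aug 82 10:02 EDT")
at_dest = add_received(at_relay, "relay-b", "host-c", "12 Aug 82 10:07 EDT")

print(trace(at_dest))  # [('host-a', 'relay-b'), ('relay-b', 'host-c')]
```

The user-written fields (FROM, TO, SUBJECT) are untouched; only the MTAs add lines, which is why each MTA now has to understand at least the outline of the header format.
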
    The result, written by Crocker and pub-             For example, there was considerable dis-
lished in August 1982, was RFC-822. RFC-822,         agreement about whether one had to login to
or more commonly, simply 822 format,                 the remote system (FTP had a login command
remains the basic standard a quarter century         called User) before trying to deliver email with
later. (An updated version appeared as RFC-          MLFL. Multics required a login. TENEX did not.
2822 in 2001, but the basic format is un-            So MTAs had to include code to recognize
changed.)60                                          when they were talking to Multics and when
    Before we leave the discussion of the            to TENEX and adapt their behavior accordingly.
evolution of message formats, a few observa-            SMTP, because it was well-specified, even-
tions are in order. First, developing a message      tually solved this problem (see the ‘‘SMTP and
format was a difficult intellectual problem.         avoiding second system syndrome’’ section).
RFC-822 is 47 pages long and combines an             Unfortunately, by this point, a new problem
augmented Backus-Naur notation that defines          had arisen: multiple email networks.
each field’s format with a brief statement of
each field’s semantics. It is comparable in          Bitnet, CSnet, and UUCP
complexity to the computer language specifi-            Between 1978 and 1981, three major email
cations of the time. Second, it is hard to           networks were created. Although the Internet
overstate the importance of RFC-733. RFC-            remained the largest network throughout the
733 came out early enough to become the de           1980s, these three networks (UUCP, CSnet,
facto standard for email message formats             and Bitnet) would grow big enough to influ-
throughout much of the world. The UUCP               ence email standards. The UUCP network was
network, the Computer Science Network                comparable to the Internet in size. And, almost
(CSnet) and Bitnet all ended up using RFC-           from the start, the four networks were inter-
733 format for their email messages.61               connected,65 creating massive challenges for
                                                     MTAs of routing between four networks (not
Evolving the MTA                                     counting the smaller networks that appeared)
   SNDMSG was the earliest MTA. It simply            with different address formats.
delivered the message or returned an immedi-
ate error message saying it had failed. After
about a year, Bob Clements enhanced SNDMSG              UUCP network. The UUCP network
to retransmit messages if the remote host was        (named for the Unix-to-Unix CoPy program
down.62 About two years later, SNDMSG was            over which it was built) began inside AT&T in
updated to place each message in a file in the       1978.66 It used dial-up telephone links to
user’s directory (one file per email) and a new      exchange files and within a few months was
program, called MAILER, would periodically           moving email. AT&T soon distributed the
pick up and deliver email files in the user’s        software and the UUCP network, made up of
directory.63 (Observe that this change convert-      cooperating sites, was off and running. Over
ed SNDMSG to a user agent, with MAILER taking        the next decade it grew at a prodigious rate,
on the role of MTA.)                                 such that by 1990, its population was estimat-
   In a nutshell, that incremental evolution         ed at a million users—comparable to the
describes the experience of developing MTAs          Internet’s population.67
in the 1970s. Each operating system would               The UUCP network was a multihop net-
implement an MTA, which was then refined             work. To reach machine V, an email from
over the years to deal with environmental            machine M might have to pass through
conditions.                                          intermediate systems Q and T. The motivation
   Unfortunately, the different MTAs evolved         for this approach was to minimize phone bills.
differently. The underlying problem was that         In the 1970s and early 1980s, long distance
email via FTP was underspecified. (It is useful to   calls were expensive, and the rates differed by
observe that the specification for email delivery    hour (with evening and night rates being
with FTP was two pages long, while the SMTP          sharply lower). Modems were slow (a couple
specification, when it appeared, was 68 pages        hundred bytes per second was considered
long.) Implementers had considerable latitude,       good) and files were (relatively speaking) large.


                       So the typical operating mode at any UUCP
                       site was to save up all email until 5 p.m., then
                       call a nearby UUCP site to forward email along
                       and receive inbound email. Indeed, over the
                       course of the night, several phone calls would
                       be made to push outbound mail and receive
                       inbound mail. Depending on the calling
                       schedules and the connectivity of the ma-
                       chines, email could travel a few or several hops
                       before the nightly calling frenzy ended.
                           Initially, the person composing the email
                       had to spell out the entire path a piece of email
                       needed to take through the network. In the
                       UUCP network, the hops were separated by
                       exclamation points (‘‘!,’’ pronounced as
                       ‘‘bang’’). So, someone mailing the author via           CSnet was designed to become self-support-
                       UUCP from UC Berkeley in the 1980s would            ing. The ARPA and NSF funding was only to
                       send it to ucbvax!ihnp4!harvard!bbn!craig (in       provide start-up capital and an initial operations
                       which each text string followed by a ‘‘!’’ is       budget. For the first two years, CSnet operations
                       known as a hop; this example has four hops).        were distributed between the University of
                           In 1982, Steve Bellovin wrote pathalias, a      Wisconsin and the University of Delaware, with
                       tool designed to compute paths from a               help from RAND (which ran a gateway on the
                       network map. He refined it with Peter Honey-        West Coast). Beginning in 1983, the network
                       man.68 Pathalias was distributed widely. Now,
                                                                           was operated by BBN, where a team of roughly
                       by keeping a map of regional connectivity, it
                                                                           10 people provided technical support (includ-
                       became possible to email via landmark sites
                                                                           ing writing or maintaining much of the email
                       and have them fill in the missing hops. So, for
                                                                           software used by CSnet members), user services,
                       instance, the author’s address could be re-
                                                                            and marketing and sales. By 1988, CSnet was
                       duced to ihnp4!bbn!craig and the harvard hop
                                                                           self-supporting and had approximately 180
                       would be dynamically inserted.
                                                                           members, most of them computer science
                           In 1984, Mark Horton began an effort to
                                                                           departments in North America.
                       create a complete UUCP network map, which
                                                                               Technologically, CSnet did everything pos-
                       reached fruition about 1986. After that, UUCP
                                                                           sible to make its members feel part of the
                       users could simply type sitename!user, and
                                                                           Internet community. Initially, connectivity
                       pathalias would compute a path to sitename
                       for them. An even fancier trick was to add a        was almost entirely email only, using dial-up
                       network domain to the sitename, such as             phone service. Over time, direct access via IP
                       bbn.arpa!craig, and pathalias would compute a       was also supported over a variety of media,
                       path to an email gateway between the UUCP           including IP over X.2571 and the first dial-up IP
                       network and the Internet.                           network.72
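
The bang-path addressing and map-based route completion described for the UUCP network can be sketched in Python. This is only an illustration: the connectivity map and the site name mysite are hypothetical (the links simply mirror the example path in the text), the function names are invented, and plain breadth-first search stands in for whatever route computation pathalias actually performed.

```python
from collections import deque


def split_bang_path(address):
    """Split 'a!b!c!user' into (['a', 'b', 'c'], 'user')."""
    *hops, user = address.split("!")
    return hops, user


def find_route(links, start, target):
    """Breadth-first search for a list of sites from start to target."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for neighbor in links.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None  # target is not reachable from start


def expand(address, local_site, links):
    """Fill in the hops needed to reach the first site named in the address."""
    hops, user = split_bang_path(address)
    if not hops:
        raise ValueError("address has no site name")
    route = find_route(links, local_site, hops[0])
    if route is None:
        raise ValueError(f"no known route to {hops[0]}")
    # route[0] is the local site itself, so it is left out of the path
    return "!".join(route[1:] + hops[1:] + [user])


# Hypothetical map: which sites each site calls directly
LINKS = {
    "mysite": ["ucbvax"],
    "ucbvax": ["ihnp4"],
    "ihnp4": ["harvard"],
    "harvard": ["bbn"],
}

print(split_bang_path("ucbvax!ihnp4!harvard!bbn!craig"))
# (['ucbvax', 'ihnp4', 'harvard', 'bbn'], 'craig')
print(expand("bbn!craig", "mysite", LINKS))
# ucbvax!ihnp4!harvard!bbn!craig
```

With a complete map, the user types only the short form and the software supplies the intermediate hops; with a partial map, the same search can stop at a landmark site and leave the rest of the path to that site.
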
                                                                               After 1983, email in CSnet all went through
                          CSnet. By the late 1970s, the computer           a single email gateway, CSNET-RELAY, which sat
                       science research community realized that the        on both CSnet and the Internet. Email was
                       Arpanet was changing how people did re-             routed by addressing it to the relay, with the
                       search. Researchers who had access to a             user address being the target address on the
                       network got information more quickly, and           other network. The syntax used a percent sign
                       could collaborate and share work more easily.       (%) to divide the next hop user name from
                       Thus was identified the first ‘‘digital divide’’—   relay address. So, to get from the Internet to a
                       between computer science departments that           CSnet host, one emailed to user%host.csnet@
                       had access to Arpanet and those that did            csnet-relay.arpa. From CSnet, one emailed
                       not.69                                              user%host.arpa@csnet-relay.csnet. Email was for-
                          The goal of the Computer Science Network         matted according to RFC-733 and 822 stan-
                       (CSnet) was to bridge that gap. Created in 1981     dards.
                       by the National Science Foundation in coop-
                       eration with ARPA, CSnet linked computer               Bitnet. Bitnet was established in the
                       science departments and industrial research         same year as CSnet, but with a different
                       laboratories to the Arpanet (and then the           driving force. Bitnet (‘‘Because It’s There’’ or,
                       Internet).70                                        later, ‘‘Because It’s Time’’) was created by

university computer centers (now information         instance, if someone told you he was bob@
technology offices) to interconnect their com-       princeton, you had to immediately ask ‘‘which
puting facilities with email and file transfer.      network’’ because princeton.bitnet and princeton.
Because the centers typically used IBM main-         csnet were different machines and were not
frames running the VM operating system,              interconnected. If a user forgot, or her email
Bitnet was constructed from low-speed leased         software removed the network appellation
lines running IBM networking software, on            (e.g., .csnet), the email would be delivered to
which email was overlaid.                            the bob@princeton in whichever network the
    Like CSnet, Bitnet used Internet email           sender was in.
standards (with the %-hack in the email                 The second problem was that, even if one
address for gatewaying). Unlike CSnet, Bitnet        knew which network an email address was in,
did not have a central management or support         getting it there was not easy. To take a
center. Instead, most functions were volunteer       relatively common example, consider the
activities, with coordination provided by            following four addresses:
Educom (Interuniversity Communications
                                                        ihnp4!ucbvax!bob%princeton.csnet@
Council). In mid-1988, Bitnet had nearly 400
                                                           csnet-relay.arpa
member sites.
                                                        bob%princeton.csnet%csnet-relay.arpa@
    The boards of Bitnet and CSnet overlapped
                                                           wiscvm
and the two networks eventually merged, so
                                                        bob%princeton.csnet@csnet-relay.arpa
one may wonder why they were distinct in the
                                                        bob@princeton
first place. The distinction lies in the relation-
ship, often contentious, between computer            These represent the four likely addresses for
science departments and computing centers in         reaching bob at Princeton’s CSnet host, from
the 1970s and 1980s. Computer science depart-        the UUCP network, Bitnet, the Internet, and
ments typically maintained their own comput-         CSnet respectively. If the examples are not
ing facilities, to enable research by computer       painful enough, consider the first address and
science faculty. Computing centers were uni-         how it would be handled in transit.
versity-wide resources that sought to provide           It starts in the UUCP network and is passed
stable computing environments for researchers        to ihnp4 (a key UUCP relay at Bell Labs in
in other disciplines. The stereotype was that        Naperville, Illinois). Ihnp4 must puzzle out
computer science departments ran cutting-edge        ucbvax!bob%princeton.csnet@csnet-relay.arpa and
operating systems on minicomputers and               decide if the email address is to the left of
workstations while computing centers ran             the @ sign (Internet style) or to the right of
established commercial operating systems on          the bang (UUCP style). As ihnp4 is a UUCP-
mainframes. More important, from an institu-         only system, it knows to use UUCP ad-
tional perspective, the computer science de-         dressing and passes the message to ucbvax
partment typically provided a haven for those        at the University of California at Berkeley.
on campus who were (for whatever reason)             Ucbvax is a gateway on both the Internet
disgruntled with the computing center. Neither       and UUCP networks so it must puzzle out
party particularly wanted to rely on the other for   bob%princeton.csnet@csnet-relay.arpa. Thank-
network access, with the result that there were      fully, ucbvax was not on CSnet and clearly
two networks: one for each community.                not the same system as csnet-relay.arpa, so
                                                     bob%princeton.csnet is no good. Thus the
    Email addressing across networks. The            message must be sent to the CSnet relay
four networks (including the Internet) period-       (and, because Arpanet did not strip mailing
ically viewed themselves as competitors. Yet         information, it remains bob%princeton.csnet@
the four networks were also committed to             csnet-relay.arpa). CSnet’s relay in turn extracts
making email work among them. A number of            the address to the left of the @ sign, to get
sites brought up gateways between the net-           bob%princeton.csnet and delivers the email to
works. Even more sites made a point of               Princeton.
residing on more than one network, to ensure            Observe that there’s ample chance for
ease of mailing for their users.                     confusion. Another nasty problem was that
    It is widely agreed that, by the early 1980s,    each mailer had to make sure that the FROM
email addresses were a disaster both for users       address in the email was updated (and some-
trying to email across networks, and network         times the TO and CC addresses as well) so that
administrators trying to keep the email flowing.     the recipient of the email could successfully
    The disaster had two dimensions. First, one      reply to it. Yet another challenge was that, for
had to know which network a user was on. For         a period, the United Kingdom decided to

                                                                                                         April–June 2008   13

reverse the order of labels in a domain name (so Kirstein@uk.ac.ucl.cs) with the result that some mailers had to parse names backward and forward (“bothways” mode) to see if they made sense.

It is no surprise that the people who made major contributions to email MTAs at this time were people closely affiliated with email gateways.

delivermail, sendmail, and mmdf

The appearance of new email networks transformed the complexity of the MTA. Now, at least on systems that were on multiple email networks, the MTA had to understand multiple addressing formats and routing rules and competently move messages between the various networks as appropriate. One sign that the problem of writing an MTA had gotten hard was that it became the subject of serious academic research. The major contributions were made by two graduate students: Eric Allman at UC Berkeley (delivermail and sendmail) and Dave Crocker (who had left RAND to study at the University of Delaware, where he wrote mmdf).

Both men were trying to solve essentially the same problem: supporting multiple email networks in one system. Allman needed an MTA for UC Berkeley’s main email system, which served as the university’s email gateway between the UUCP network and the Arpanet and local email delivery. Crocker needed an MTA to support local email, Arpanet email, and a new phone-based delivery system which eventually became CSnet’s PhoneNet protocol. The two men solved the problem very differently.

delivermail. Allman’s delivermail, the simplest of these MTAs, was written for Berkeley’s BSD Unix operating system in 1979 and was a basic program73 not greatly more complex in its workings than Bob Clements’ 1973-vintage SNDMSG. When invoked by a user agent (or the inbound FTP server), delivermail expected to be given a message, and would either deliver it or return an error message. The big difference was that delivermail implemented a layer of indirection. Rather than delivering the message to a mailbox or a remote system, delivermail looked at the destination address and then picked a program to deliver the message to. So, for instance, to deliver Arpanet mail via FTP, delivermail called an auxiliary program called arpa and passed the mail to the arpa program and waited for a (real-time) response regarding delivery. If, by some mischance, the message had to be queued, arpa (not delivermail) would queue it.

To parse the address, delivermail used the simple expedient of assuming that an at-sign meant Arpanet mail, an exclamation point in the address meant UUCP, and a colon meant the local BERKNET protocols. For each address type, delivermail could be configured either to call a program to deliver the mail, or call a program to relay the mail to the appropriate gateway (one email gateway per type).

The delivermail MTA had a powerful aliases feature, in which a destination address could be expanded to a list of email addresses. It also had a first class logging system (a way to record what delivermail did) called syslog. Email systems were developing increasingly sophisticated logging mechanisms; syslog was so good that it eventually became a standard part of BSD Unix and is now used by a wide range of applications.

One surprising feature of delivermail was that part of its configuration was compiled into the program. That is, for each machine, one compiled a custom version of delivermail. So, for instance, if the machine was connected to Arpanet, one compiled delivermail with the -DHAS_ARPA flag to the C compiler.

mmdf. About the same time that Allman was creating delivermail, Dave Crocker was writing the first version of mmdf (the Multichannel Memo Distribution Facility).74 Rather than seek to process each message immediately, as delivermail did, Crocker sought to decompose the process into multiple stages.

When a message arrived (via the network or from a user agent), the message was given to a program called submit, which checked that the message format was correct (here the common use of 733 format was a big win) and then looked at the address to decide what network the message was to go out on. The message was assigned to a “channel.” Each channel had its own queue: a directory where messages and their “envelopes” (control information) were stored. Simply, submit placed the message in the right queue.

Another program, called deliver, was regularly scanning the queues for messages. When a new message appeared, deliver called on a channel-specific program (e.g., mmdf’s equivalent of delivermail’s arpa program for Arpanet email) to deliver the message. If message delivery failed, submit was called to send the message back to its sender. If there was a transient error (e.g., the remote host was

down), the message was left in the queue and deliver would try it again later.

The mmdf MTA also supported aliases and had a fine logging system.

An important contribution of mmdf was achieving an effective split of the message delivery process. Diagnosing email problems (whether configuration problems or problems with particular messages) was cleanly compartmentalized. Similarly, submit prevented junk from entering the system; deliver handled problems in delivery. An operator knew where the problem was by seeing which program was complaining in the logs.

Another contribution was restriction of privileges. One of the key problems in any mail system is that whatever program delivers mail to the user’s mailbox needs special privileges. In mmdf, that was one small program, the local channel delivery process. All the other processes could be run as a regular user (usually called “mmdf”).

The channel model also proved flexible. A message could go through multiple channels before leaving a system. Soon, mmdf developed a “list” channel to handle mailing lists. A message was placed in the list channel to have its destination address expanded. It exited the list channel by being placed in one or more channels to be delivered to members of the mailing list. Later, when MX resource records were introduced (see the “Email routing with domain names” section), they introduced a new error: a domain name that (because of DNS problems) could not currently be looked up. In mmdf this was trivially handled by creating a new channel, where submit placed messages whose addresses could not be resolved at the moment.

A downside of mmdf was that rather than one configuration file, there were several, scattered in different places. While each configuration file was simple (a list of attribute: value pairs), the sheer number of them could prove frustrating.

sendmail. Based on experience with delivermail, Eric Allman decided to write a new MTA for release with the 4.2 version of BSD Unix. The new MTA was called sendmail.

Culturally, sendmail was similar to delivermail. But from a practical perspective, it was quite different. Major differences included the following:75

- Configuration was determined by a file, called sendmail.cf, rather than being compiled in.
- The address parsing rules and message delivery rules were defined by a grammar in the configuration file.
- sendmail now maintained its own message queue.
- Certain delivery programs (most notably email delivery via SMTP) were compiled into sendmail instead of client programs (e.g., arpa).

But this list understates the transformation from delivermail to sendmail: sendmail was almost an order of magnitude more complex (measured in lines of code) and tremendously more flexible.

The changes had an interesting mix of consequences. Probably the most important consequence was flexibility. Placing address parsing and configuration rules in a grammar made it possible to dynamically configure sendmail for arbitrarily complex email environments.

Another consequence was a reinforcement of delivermail’s approach of putting all the email expertise into one program. SMTP was now embedded in sendmail. So too was queue management. It made sendmail a complex program and hard to change. Allman later noted that sendmail should have been better decomposed into constituent functions, even if only internally.76

An unexpected consequence was that crafting and debugging sendmail’s single configuration file (sendmail.cf) became a central preoccupation (some would say headache) for system administrators over the next several years. A properly working email system required the configuration file be right. And sendmail’s grammar (with a fondness for single-letter tokens, which made mnemonic naming impossible) gave administrators many opportunities to make a mistake.

Evolution and perspective

Over the 1980s, both sendmail and mmdf prospered: mmdf was substantially reworked by Crocker, Doug Kingston (of the Army’s Ballistic Research Laboratory), Steve Kille (of University College London), and Dan Long and me (of BBN) into a new release called mmdf2, which was used at a number of major email centers in the mid- and late 1980s.

Also, mmdf inspired PMDF, a rewrite of mmdf in Pascal for the VMS operating system. The initial implementation was done by Ira Winston at the University of Pennsylvania. It was then maintained and substantially revised by Mark Vassol and Ned Freed (then at Oklahoma State University). PMDF became a
