by Barbara Beeton and Richard Palais
1. Introduction
Until the early 1960s, most published mathematics was typeset professionally by skilled compositors working on Monotype machines. As this form of “hot-metal” composition became less readily available, because of its cost and because skilled compositors were retiring and not being replaced, “enhanced” typewriters began to be used to prepare less prestigious publications. Phototypesetting (“cold type”) appeared gradually, although it was more expensive than typewriter-based composition and generally less attractive than professionally prepared Monotype copy.
By the mid-1970s, Monotype composition was essentially dead. Donald Knuth, a professor of computer science at Stanford University, was writing a projected seven-volume survey entitled The Art of Computer Programming (TAOCP); volume 3 was published in 1973, composed with Monotype. By then, computer science had advanced to the point where a revised edition of volume 2 was in order, but Monotype composition was no longer possible; the galleys returned to Knuth by his publisher were photocomposed. Knuth was distressed: the results looked so awful that they discouraged him from wanting to write any more. But an opportunity presented itself in the form of the emerging digital output devices: images of letters could be constructed of zeros and ones.1 This was something that he, as a computer scientist, understood. Thus began the development of \( \mathrm{\TeX} \).
2. The problem
Mathematics as a discipline depends on its own arcane language for communication. Before personal computers became ubiquitous, the options for communicating mathematical knowledge were limited to face-to-face contact (preferably with a writing surface handy), telephone discussion (for which conventions developed to keep it intelligible), personal letters (at least bits of which required handwritten notation), or formal publication. The last mode required a highly skilled compositor, working either with traditional hand-set type or with a hot-metal typecaster, or a combination of the two.
The gold standard for typeset mathematics in the mid-twentieth century was the Monotype typecaster [e1], [e3]. The audience was relatively small, and the work exacting. Since mathematical notation is essentially multi-level (see Figure 1), the Linotype, the linear-type workhorse for newspapers and most book publishing, was not up to the task. Only a few suppliers would take on such work, and mathematical composition was always considered “penalty copy”.2
Quadratic formula
\[
x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
\]
\[
x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
\]
Maxwell's equations
\begin{align*}
\vec{\nabla} \cdot \vec{B} &= 0 \\
\vec{\nabla} \times \vec{E} + \frac{\partial B}{\partial t} &= 0 \\
\vec{\nabla} \cdot \vec{E} &= \frac{\rho}{\epsilon_0} \\
\vec{\nabla} \times \vec{B}
- \frac{1}{c^2} \, \frac{\partial E}{\partial t} &= \mu_0 \vec{J}
\end{align*}
\begin{align*}
\vec{\nabla} \cdot \vec{B} &= 0 \\
\vec{\nabla} \times \vec{E} + \frac{\partial B}{\partial t} &= 0 \\
\vec{\nabla} \cdot \vec{E} &= \frac{\rho}{\epsilon_0} \\
\vec{\nabla} \times \vec{B}
- \frac{1}{c^2} \, \frac{\partial E}{\partial t} &= \mu_0 \vec{J}
\end{align*}
Another system of equations
\newcommand{\gammaurad}[1]{%
\frac{\gamma u_{\text{rad}}^{} \bar{\lambda} a_{\text{eff}}^2}{2I_1 {#1}}\,}
\begin{align*}
\frac{d\phi}{dt} &= \gammaurad{\omega \sin \xi} G(\xi, \phi)
- \Omega_{\mathrm{B}} \, , \\
\frac{d\xi}{dt} &= \gammaurad{\omega} F(\xi, \phi)
- \frac{\sin \xi \cos \xi}{\tau_{\text{DG}}^{}} , \\
\frac{d\omega}{dt} &= \gammaurad{}
\bigl[ \gamma H(\xi, \phi)
+ (1 - \gamma) \langle Q_\Gamma^{\text{iso}} \rangle \bigr] \\
&\phantom{{}={}} - \frac{\omega \sin^2 \xi}{\tau_{\text{DG}}^{}}
+ \frac{\omega \sin^2 \xi}{\tau_{\text{drag}}^{}}
- \frac{\omega}{\tau_{\text{drag}}^{}}
\end{align*}
\begin{align*}
\frac{d\phi}{dt} &= \frac{\gamma u_{\text{rad}}^{} \bar{\lambda} a_{\text{eff}}^2}{2I_1 {\omega \sin \xi}}\, G(\xi, \phi)
- \Omega_{\mathrm{B}} \, , \\
\frac{d\xi}{dt} &= \frac{\gamma u_{\text{rad}}^{} \bar{\lambda} a_{\text{eff}}^2}{2I_1 {\omega}}\, F(\xi, \phi)
- \frac{\sin \xi \cos \xi}{\tau_{\text{DG}}^{}} , \\
\frac{d\omega}{dt} &= \frac{\gamma u_{\text{rad}}^{} \bar{\lambda} a_{\text{eff}}^2}{2I_1 {}}\,
\bigl[ \gamma H(\xi, \phi)
+ (1 - \gamma) \langle Q_\Gamma^{\text{iso}} \rangle \bigr] \\
&\phantom{{}={}} - \frac{\omega \sin^2 \xi}{\tau_{\text{DG}}^{}}
+ \frac{\omega \sin^2 \xi}{\tau_{\text{drag}}^{}}
- \frac{\omega}{\tau_{\text{drag}}^{}}
\end{align*}
Figure 1. Samples of display math using \( \TeX \) input and output.
3. Analysis of the problem
What Knuth did next is described nicely in his lecture on the occasion of his receiving the Kyoto Prize in 1996 [e6]. Publication of the photoset volume 2 was halted, and Knuth sought out the best examples he could find of the mathematical typesetter’s art. He chose three: Addison-Wesley books, in particular the original TAOCP; the Swedish journal Acta Mathematica, from about 1910; and the Dutch journal Indagationes Mathematicae, from about 1950.
Of his effort to develop rules for proper spacing in mathematics, he writes ([e7], pp. 364–365):
I looked at all of the mathematics formulas closely. I measured them, using the TV cameras at Stanford, to find out how far they dropped the subscripts and raised the superscripts, what styles of type they used, how they balanced fractions, and everything. I made detailed measurements, and I asked myself, “What is the smallest number of rules that I need to do what they were doing?” I learned that I could boil it down into a recursive construction that uses only seven types of objects in the formulas.
4. Growing pains
The initial implementation of \( \mathrm{\TeX} \) began in October 1977 and was complete in May 1978. This tool was at first intended just for use by Knuth and his secretary to produce future volumes of TAOCP of which he could be proud. As a trained mathematician, he designed the input so that it would be meaningful in its raw form to another mathematician, but would also be easy for a secretary to type. Symbols would be input by name, e.g., \verb|\gamma|, as would the structural components of a document, e.g., \verb|\chapter| or \verb|\section|, as opposed to the prevailing compositor’s approach of marking changes by font and type size. (The latter approach is still evident in the design of many word processing programs, although it’s usually hidden from the person entering the text.) \( \mathrm{\TeX} \) was designed to be used as a batch process, although interactive entry is possible, so the output isn’t seen until the file has been processed; it is decidedly not “WYSIWYG”. It was not contemplated that \( \mathrm{\TeX} \) would become a commercial product; instead, it would be made freely available.3
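The contrast between naming an element and specifying its appearance can be sketched as follows. This is an illustration in present-day \( \mathrm{\LaTeX} \) syntax, which is an assumption: the 1978 input conventions differed in detail, and the heading text and font choices here are invented for the example. The first fragment says what each element is; the second imitates the compositor’s approach by dictating font and size directly.

\begin{verbatim}
\documentclass{article}
\begin{document}

% Descriptive markup: name the element and let the format decide its look.
\section{Growth rates}
The rate $\gamma$ satisfies $\gamma^2 < 1$.

% Visual markup: dictate size, weight, and spacing by hand.
\bigskip\noindent{\Large\bfseries 2\quad Growth rates}\par\medskip
\noindent The rate $\gamma$ satisfies $\gamma^2 < 1$.

\end{document}
\end{verbatim}

Both fragments produce similar pages, but only the first lets the numbering and the heading style be changed in one place.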
In January 1978, Knuth delivered the Josiah Willard Gibbs lecture to the annual meeting of the American Mathematical Society (AMS). The lecture, entitled “Mathematical Typography” [e2], began “Mathematical books and journals do not look as beautiful as they used to.” Armed with copious examples, both good and bad, and a firm sense of how best to present mathematical notation so that it is intelligible (at least to those who are familiar with its use), Knuth presented a view of how computers can serve to replace the vanishing expertise of traditional compositors and restore the appearance of technical publications to their former glory. In addition to the discussion of proper presentation of mathematical notation, the lecture introduced a companion tool, Metafont, for production of the needed fonts.
The chair of the AMS Board of Trustees, Richard Palais, was in the audience. Since the AMS was one of the publishers suffering from the technological transition, \( \mathrm{\TeX} \) sounded like the solution to many problems. An arrangement was set up for a group of AMS representatives to spend a month at Stanford and learn \( \mathrm{\TeX} \), “bring it back and make it work”. This group consisted of one staff member from each of the AMS offices (Barbara Beeton from headquarters and Rilla Thedford from Mathematical Reviews) and three mathematicians: the aforementioned Richard Palais; Robert Morris from the University of Massachusetts, Boston, who had extensive computer experience; and Michael Spivak, who had a proven ability to write cogent textbooks. The charge was to develop methods for dealing with the typical publication cycle and to write an interface and instruction manual for end users as well as production staff.
As one of the AMS representatives, Beeton gathered a number of “good bad examples” that she knew would be encountered in production because they already had been. This turned out to be good preparation: several of these examples turned up later in The \( \TeX \)book ([e5], see vol. A) and as new features added to the program itself.4
The \( \mathrm{\TeX} \) program was duly brought back to the Providence office of the AMS, installed, and initial implementation of useful procedures was undertaken.5 The first applications were light on mathematical content; polishing of the extended instruction set for use by mathematicians (AMS-\( \mathrm{\TeX} \)) and writing of its user manual [e4] were still underway. Also, in the interim, extensive changes were made in the program to provide features not in the first iteration (known now as \( \mathrm{\TeX} \)78). These changes included (i) enhanced manipulation of “boxes” (the containers for printed characters) and surrounding spaces and (ii) an increase in the number of fonts that could be used as well as improved methods for manipulating them. The resulting version, known as \( \mathrm{\TeX} \)82, is the basis for today’s program. At the same time, the language in which \( \mathrm{\TeX} \) was written was changed, from one that was in limited use to one with a solid history of use in teaching programming.6 As it had been from day one, the software remained free to use and adapt. Having achieved his goal of a system that met his needs, Knuth returned to his work on TAOCP.
Contributing to \( \mathrm{\TeX} \)’s growing popularity was the emergence, starting in the mid-1980s, of personal computer systems and their rapid adoption by technically minded individuals. This was \( \mathrm{\TeX} \)’s natural audience, and implementations of \( \mathrm{\TeX} \) on these personal machines proliferated.
By the end of the 1980s, a growing user population in Europe was becoming increasingly frustrated with the difficulties in handling non-English texts. \( \mathrm{\TeX} \) required arcane combinations of characters to represent accented letters rather than the single pre-accented forms provided by European keyboards. Also, the compound input forms could not be properly hyphenated. A persuasive group of German users sat down with Knuth at the 1989 \( \mathrm{\TeX} \) Users Group meeting to discuss this lack. This meeting resulted in the extension of \( \mathrm{\TeX} \) to accommodate natively accented letters on input and proper hyphenation in processing.7
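The two input styles at issue can be illustrated with a short sketch. It assumes a current \( \mathrm{\LaTeX} \) installation that reads UTF-8 by default (a later development than the extension described above) together with the babel package for German hyphenation patterns; the sample words are chosen only for illustration.

\begin{verbatim}
\documentclass{article}
\usepackage[T1]{fontenc}     % font encoding that contains pre-accented glyphs
\usepackage[ngerman]{babel}  % German hyphenation patterns
\begin{document}

% Compound input: an accent command followed by the base letter.
Gr\"o\ss e, \"Ubung, M\"adchen

% Direct input: pre-accented letters as typed on a German keyboard.
Größe, Übung, Mädchen

\end{document}
\end{verbatim}

Both lines should print the same words; the second is what a German typist would naturally enter.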
5. Communicating mathematics
The basic \( \mathrm{\TeX} \) system comes with a functional toolkit of typographic functions and one (quite extensive) family of fonts. This is necessary for the typesetting of mathematics and other technical material, but many users did not find it sufficient. Development has occurred in several areas, not all involving \( \mathrm{\TeX} \).
Document structuring
While AMS-\( \mathrm{\TeX} \) formatted complicated math displays admirably using descriptive commands, it lacked the ability to automatically number equations and sections of a document and the means for cross-referencing. Another user instruction set, \( \mathrm{\LaTeX} \) (devised by Leslie Lamport,8 a former student of Palais), did provide those features, although it lacked the mathematical refinements of AMS-\( \mathrm{\TeX} \). The AMS, responding to pressure from authors, arranged to have the math-formatting facilities of AMS-\( \mathrm{\TeX} \) rewritten to operate within the \( \mathrm{\LaTeX} \) paradigm; the result was called AMS-\( \mathrm{\LaTeX} \), comprising two parts, amsmath and the AMS document classes.9
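A minimal sketch of the combination, assuming the standard article class and the amsmath package, shows automatic numbering and cross-referencing side by side with a displayed formula. The label names are arbitrary choices for this example.

\begin{verbatim}
\documentclass{article}
\usepackage{amsmath}
\begin{document}

\section{Roots}\label{sec:roots}

The roots of $ax^2 + bx + c = 0$ are
\begin{equation}\label{eq:quadratic}
  x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.
\end{equation}
% \eqref prints the equation number in parentheses; \ref prints the section number.
By \eqref{eq:quadratic} in Section~\ref{sec:roots}, the roots are real
when $b^2 - 4ac \ge 0$ (for real coefficients with $a \ne 0$).

\end{document}
\end{verbatim}

The file must be processed twice so that the numbers recorded on the first pass can be filled into the references on the second.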
Fonts
Font development has been driven by the availability of personal computers and laser printers and the growth of the World Wide Web, as well as by the desire for variation in type styles available for \( \mathrm{\TeX} \).
One font family that originated in the need for robust output from low-resolution laser printers is Lucida by Kris Holmes and Charles Bigelow. Bigelow was on the Stanford faculty during part of the \( \mathrm{\TeX} \) project development, and Lucida has, from the very beginning, included a large complement of math symbols as needed by \( \mathrm{\TeX} \) users.
Desire to give mathematicians the ability to communicate on the Web was the driving force behind the STIX project.10 In the first phase of this project, a comprehensive list of math symbols was compiled from lists submitted by the STIpub member organizations and submitted for addition to Unicode. The bulk of additions became available with Unicode 4.0 in 2003, comprising several thousand symbols, including several variant alphabets (e.g., Fraktur and script) needed to discriminate between different variables as defined in mathematical contexts.
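Why variant alphabets matter can be seen in a one-line sketch, assuming the amssymb package for the Fraktur alphabet; the mathematical statement is a standard textbook convention used here only as an illustration.

\begin{verbatim}
\documentclass{article}
\usepackage{amssymb}  % provides \mathfrak
\begin{document}

% Three distinct objects that share the letter "g":
% a group element, the group itself, and its Lie algebra.
Let $g \in G$, and let $\mathfrak{g}$ denote the Lie algebra of $G$.

% Script capitals play the same role, e.g. a family of sets.
Let $\mathcal{F}$ be a family of subsets of $F$.

\end{document}
\end{verbatim}

If the Fraktur and script forms were flattened to ordinary italic letters, the sentences would no longer distinguish the objects they name.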
Version 1 of the STIX fonts (based on Times) was released in 2012, and final polishing of version 2 is underway.
Possibly influenced by the STIX work with Unicode,\textsuperscript{11} Microsoft added mathematics support to Word 2007\textsuperscript{12} along with the newly designed Cambria font [e10]. Cambria is the first OpenType font (OTF) to make use of the OTF Math table. Indeed, the OTF Math table was created specifically for Cambria, and many of its parameters are recognizable as parallel to the \( \mathrm{\TeX} \) font paradigm.
The Web
XML was developed as a Web-aware application of SGML. Even for SGML, there had been an effort to standardize the names of math symbols as a “public entity set”, and this drew heavily on the names assigned for \( \mathrm{\TeX} \) and AMS-\( \mathrm{\TeX} \). This vocabulary was taken into XML and its technical daughter MathML. Work has continued in this area to maintain parallel naming, insofar as possible, between the two “languages”.
Since MathML is not as easily comprehended by humans as \( \mathrm{\TeX} \), translation conventions and software have sprung up to allow input using \( \mathrm{\TeX} \) notation, which is familiar to mathematicians. Another Web presentation tool, MathJax, has emerged to allow in-line math to be delivered natively on-screen (that is, without the use of bitmap inclusions, which are not scalable, or PDF); again, the input notation is essentially \( \mathrm{\TeX} \) although it is rarely entered directly by a human author.
Non-technical applications
Since \( \mathrm{\TeX} \) was designed as a hardware-independent batch process, it is capable of being used in repetitive contexts to prepare personalized form letters, invoices, bank statements, train schedules, catalogs, …; the list goes on and on. The original output format records only the identity of each glyph and its location on the page, so the output can be archived compactly (along with one copy of each needed font and other repetitive content such as logos), an important feature for complying with legal requirements that apply to some documents. Most such uses are “invisible” to those not familiar with the relevant workflow, but they are extensive, especially in Europe.
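A small sketch of such repetitive use, written in \( \mathrm{\LaTeX} \), follows. The macro name \verb|\reminder|, the recipients, and the amounts are all hypothetical; in a production workflow the calls would presumably be generated from a database rather than typed by hand.

\begin{verbatim}
\documentclass{article}
\pagestyle{empty}
% Hypothetical macro: one call per recipient, one page per letter.
% #1 = recipient name, #2 = amount due.
\newcommand{\reminder}[2]{%
  \noindent Dear #1,\par\medskip
  \noindent Our records show an outstanding balance of \$#2.
  \par\bigskip\noindent Sincerely,\par\noindent The Billing Office
  \newpage}
\begin{document}
\reminder{Ms.\ Adams}{120.00}
\reminder{Mr.\ Baker}{75.50}
\end{document}
\end{verbatim}

Because the run is a batch process, the same file can be reprocessed unattended whenever the list of recipients changes.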
Remaining limitations
One area that has not yet seen a satisfactory method of presentation is accessibility: the ability to translate \( \mathrm{\TeX} \) input to an audio output that is readily understandable by a trained mathematician with visual limitations. Part of the problem is that, for best results, an author must think ahead about such use and restrict the way that notation is used; most authors can’t be bothered, even if they are aware of the problem. Someone may find a credible and easily applied solution, but to date it remains a hard problem.
6. Conclusion
The most lasting effect of \( \mathrm{\TeX} \) is separate from the software itself: \( \mathrm{\TeX} \)’s vocabulary has become the lingua franca of mathematics. Knuth’s design of a linearly coded stream for representing math has withstood the test of time and has been adopted into other software without any substantial redesign. And \( \mathrm{\TeX} \) itself is one of the few pieces of software from that period still in wide use.
Since the input is plain text, it is not affected by (most) upgrades to the processing system, and it is hardware independent; the same input will yield the same output, modulo the availability of identical fonts. Knuth’s original goal of creating a system that would enable him to typeset his life’s work, TAOCP, with the same high quality shown by the first edition of volume 1 and remain consistent regardless of how many years have elapsed has been achieved admirably.
Unless something totally unforeseen materializes that is simpler to use and produces results of equally high quality without the need to unlearn the basics of mathematical discourse itself, the situation is likely to remain very much the same in the coming decades.
Authors
Barbara Beeton is a long-time employee of the American Mathematical Society, where she has been involved in technical support of typesetting ever since installation of the first computer. She is a founding member of the \( \mathrm{\TeX} \) Users Group (TUG) and editor of their journal, TUGboat. She has been a representative to U.S. and international standards working groups with a focus on document processing, and she represented STIpub to the Unicode Technical Committee in the effort to expand Unicode to accommodate mathematical notation.
Richard Palais was the Founding Chair of the \( \mathrm{\TeX} \) Users Group. He was a member of the AMS Board of Trustees from 1972 to 1981 and its chair from 1977 to 1979. He is professor of mathematics emeritus at Brandeis University, and since 2004 he has been on the faculty at the University of California, Irvine.