some common (la)tex errors

introduction

TeX (and LaTeX, a set of TeX macros) can produce extremely high-quality output. It is particularly strong in typesetting mathematical expressions. Its interpretation of most plain text is consistent with the expectations of an old-fashioned typist. Some differences, however, lead many writers into errors. Further, to take care of fine points, it incorporates distinctions that a typist never needed to contemplate.

All the following points are discussed in Donald E. Knuth's TeXbook (Addison-Wesley, Reading, Mass., 1984) and other books on TeX, but are overlooked more often than not by the authors of material typeset using TeX that crosses my desk. Most of them are, in a sense, pretty minor. But once you begin to notice the "errors" they will stick out like a sore thumb.

The following discussion applies to the conventions for typesetting English in Canada and the US. As far as I know, British conventions are the same. Conventions for typesetting other languages differ. For example, French conventions for spacing are entirely different.

periods

TeX usually assumes that a period (the character ".") ends a sentence if it is followed by a space, or by a right parenthesis and then a space, or by other similar strings. Consequently it puts more space between a period (or the immediately following right parenthesis or similar character) and the following word than it does between one word and the next. (You can control exactly how much more—see pp. 75–76 of the TeXbook.)

A consequence is that if you type i.e. a word, TeX thinks the period after the e ends a sentence, and thus puts too much space after it. The solution is to type i.e.\ a word. The backslash followed by a space tells TeX that you want a normal interword space after "i.e.", not a sentence-separating space.

There is one exception in TeX's interpretation of a period (hence the earlier "usually"). If the preceding character is an uppercase letter, TeX assumes that the period does not end a sentence. The idea is that "most likely" (in Knuth's words (p. 74)) the uppercase letter is someone's initial, and thus the period is not sentence-ending. Unfortunately, for someone who ends sentences with acronyms more often than she types people's initials, Knuth's assumption is not correct. If you write, for example, ... like the axiom PAR. Now consider ..., TeX will treat the period after PAR as not sentence-ending. You tell LaTeX that it is sentence-ending by typing \@. instead of simply the period. It's a bit tricky to remember to do this.

Summary

  • Type a period alone if the period ends a sentence and does not follow an uppercase letter.
  • Type .\ [period, backslash, space] if the period does not end a sentence.
  • Type \@. [backslash, at-sign, period, space] if the period ends a sentence and follows an uppercase letter.

The same rules apply to exclamation marks and question marks, so if you type one of these that doesn't end a sentence, put \ [backslash, space] after it.
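Here is a small file that exercises each of the cases above. The sentences are made up purely for illustration; only the spacing commands matter.

```latex
\documentclass{article}
\begin{document}
% Period that does not end a sentence: ask for a normal interword space.
Some games, i.e.\ a small subclass, have this property.

% Period that ends a sentence and follows an uppercase letter: use \@.
This follows from the axiom PAR\@. Now consider the next axiom.

% Initials: the default is already right, since the periods follow
% uppercase letters and so are not treated as sentence-ending.
The rules are described by D. E. Knuth.

% Question mark that does not end a sentence: backslash-space again.
The question ``why?''\ comes up repeatedly.
\end{document}
```

Compare the output with a version in which the backslashes and the \@ are removed; the differences are small but visible once you look for them.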

(In my opinion these complicated rules are one of the very few examples of poor design in TeX. If every period that followed an uppercase letter did not end a sentence, the rules would be fine. But in my experience many such periods do end sentences, and I would much prefer all periods followed by spaces to be interpreted as sentence-ending. Probably TeX macros can be written to change TeX's default, but I hesitate to try my hand.)
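For what it's worth, here is a sketch of one way the default might be changed. It rests on the fact that the exception comes from the "space factor code" of the uppercase letters, which is set to 999 rather than the usual 1000 (see p. 76 of the TeXbook). Resetting those codes to 1000 should make a period after an uppercase letter behave like any other period. I have not tested it with every class and package, and the use of the scratch register \count255 is my own choice, so treat it as an experiment rather than a recommendation.

```latex
\documentclass{article}
% Give every uppercase letter A..Z the ordinary space factor code 1000,
% so that a following period is treated as sentence-ending.
\count255=`\A
\loop
  \sfcode\count255=1000
  \ifnum\count255<`\Z
    \advance\count255 by 1
\repeat
\begin{document}
... like the axiom PAR. Now consider ...   % sentence space, no \@ needed

D.\ E.\ Knuth   % initials now need backslash-space
\end{document}
```

The price is that every initial now has to be followed by \ [backslash, space], which is the trade-off I said I would prefer.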

Note to users of Scientific Word: TeX treats n spaces, for n ≥ 1, as equivalent to 1 space. So if you're used to typing two spaces after a period at the end of a sentence, no problem. But if you type a period followed by two spaces in Scientific Word you get . \ in the file, which causes TeX to add an extra space after its sentence-separating space after the period. So don't! (Given the extreme rareness with which one wants extra space between sentences, Scientific Word's action doesn't seem sensible. Note that if you type two spaces between words in Scientific Word you also get two spaces, while in TeX you get only one no matter how many you type.)

ellipses

Three periods in a row make an ellipsis (used to indicate omitted material), but to get an ellipsis, do not type three periods in a row! If you do, the periods will be too tightly spaced (TeX treats them as three periods, one after the other). Instead, type \ldots, which produces three nicely spaced dots. (According to Matt Swift, the behavior of \ldots is not perfect. His style file, lips.sty (available on CTAN), defines \lips, which he says is better-behaved. I haven't tried it, though I have noticed some unfortunate line breaks after \ldots.)

If your ellipsis is in some mathematics, then in some cases you want the three dots centered vertically on the line, rather than sitting on the baseline. To do so, use \cdots. (If you're using AMS-TeX, or the amsmath package in LaTeX, \dots decides where to put the dots, and you don't have to worry about this point.)
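The following sketch puts the variants side by side. It assumes the amsmath package is loaded for the last line; the text and formulas are placeholders.

```latex
\documentclass{article}
\usepackage{amsmath}  % needed only for the context-sensitive \dots below
\begin{document}
% Ellipsis in text: \ldots, followed by backslash-space to keep the word space.
She paused \ldots\ and then went on.

% Ellipsis in math between commas: dots on the baseline.
$x_1, \ldots, x_n$

% Ellipsis in math between operators: dots centered vertically.
$x_1 + \cdots + x_n$

% With amsmath, \dots picks the position from the context.
$x_1, \dots, x_n$ and $x_1 + \dots + x_n$
\end{document}
```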

hyphens, en-dashes, em-dashes, and minus signs

Hyphens
A hyphen separates the parts of a compound word, like "daughter-in-law". To get one, type -.
En-dashes
An en-dash is a bit longer than a hyphen, and is used to separate the elements of a range ("see pages 5–7"). To get one, type -- (two hyphens in a row). (If you really want two hyphens in a row, type {-}{-}.) Don't use an en-dash to join the parts of a compound word, but remember to use en-dashes, not hyphens, for the page ranges in references.
Em-dashes
An em-dash is a punctuation mark. (As in: "A specter is haunting Europe—the specter of Communism.") To get one, type --- (three hyphens in a row); the typographical convention is that there is no space between an em-dash and the surrounding text.
Minus signs
To get a minus sign, which is longer than a hyphen, you need to go into math mode. If you type -5 in text, the - is typeset as a hyphen; you need to type $-5$ (or, if you're using Scientific Word, put the "-5" in math mode).
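The four marks, one per line, using the examples given above:

```latex
\documentclass{article}
\begin{document}
daughter-in-law   % hyphen: one -

see pages 5--7    % en-dash: two hyphens

A specter is haunting Europe---the specter of Communism.  % em-dash: three hyphens, no spaces

the value is $-5$ % minus sign: math mode

{-}{-}            % two separate hyphens, not an en-dash
\end{document}
```

Typesetting this file and looking at the four lengths next to each other is the quickest way to see why the distinction matters.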

quotes

If you type " (or '') you get closing quotation marks. To get opening ones, type ``.
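A two-line comparison makes the difference visible. The quoted word is arbitrary.

```latex
\documentclass{article}
\begin{document}
``quoted''  % two backquotes open, two apostrophes close

"quoted"    % wrong: both marks come out as closing quotation marks
\end{document}
```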

math and nonmath

The spacing rules within math mode are different from those within text. So to get the spacing exactly right, you need to put math in math mode and nonmath in text mode. For example, if you type Let $x, y \in X$, thinking that a comma is a comma is a comma, so whether it's in math or not doesn't matter, the spacing is not correct. TeX will put the space required after a comma in math after your comma, while you want the space appropriate for text. You need to type Let $x$, $y \in X$. (Actually, to be precise you should write Let $x \in X$ and $y \in X$.)

You're also in trouble if you type ... is $x.$ Thus ..., where the period following the "x" is supposed to be sentence-ending. Because it's in math mode, TeX treats it as a period in math, not as sentence-ending. Type ... is $x$. Thus ... instead, with the period outside the math.

In a few cases, the math-status of a character may be unclear. For example, are digits math or not? In some Roman fonts, there is no difference between the output of 1 and $1$, so you may not need to answer the question. But if your digits are in italic text you need to make a decision; 1 creates an italic digit, whereas $1$ creates a Roman one. Probably you want to type in the year 2007, but the value of $f$ is $1$. (You definitely need to use math mode if the number is negative (to get the minus sign).)

The general rule is simple: if it's math, put it in math mode; if it's not math, don't put it in math mode.
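The cases in this section, collected in one file. The surrounding words are filler; the placement of the dollar signs is the point.

```latex
\documentclass{article}
\begin{document}
% The comma belongs to the sentence, so it stays outside math mode.
Let $x$, $y \in X$.            % not: Let $x, y \in X$

% A sentence-ending period goes outside math mode.
The maximizer is $x$. Thus the claim holds.   % not: ... is $x.$ Thus ...

% Digits: a year is text, a value is math. The difference shows in italics.
\textit{In the year 2007, but the value of $f$ is $1$.}

% A negative number needs math mode to get a minus sign.
The payoff is $-1$.
\end{document}
```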

text within math

All text within math must be in a text box, because TeX's rules for spaces between characters are different in math mode and text mode. TeX treats $The$ as T times h times e, and spaces the characters accordingly. So don't type the action pair $(Top, Left)$; you need the action pair $(\textit{Top},\textit{Left\/})$. (Don't be tempted to type the action pair $($\textit{Top, Left}$)$---the spacing and possibility of a linebreak after the comma should be determined by the rules for math, not for text.)

Again the rule is simple: if it's text within math, put it in a text box within the math.
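The three versions of the action pair, for comparison:

```latex
\documentclass{article}
\begin{document}
% Wrong: the letters are spaced as products of variables.
the action pair $(Top, Left)$

% Right: each word is in its own text box inside the math;
% \/ adds the italic correction before the upright parenthesis.
the action pair $(\textit{Top},\textit{Left\/})$

% Tempting but wrong: the comma is now spaced by the rules for text.
the action pair $($\textit{Top, Left}$)$
\end{document}
```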

switching from sloping font to upright one

How is the space between characters determined? Each character is associated with a virtual rectangle, and when two characters are typeset beside each other their rectangles are aligned. (Think of fitting together blocks of lead type.) If the rectangle for every character were just large enough to contain the character, this method would result in inappropriate spaces between adjacent characters. For example, an "o" and an "e" would touch each other, whereas a significant space would exist between an italic "f" and an italic "l" (because the top of the "f" would be aligned with the bottom of the "l"). This problem is addressed by specifying the rectangle associated with each character to be different, in general, from the smallest rectangle that contains the character. For some characters, like "o" and "e", the rectangle is wider than the characters, and for other characters, like italic "f" and "l", it is narrower. The distances between the edges of the rectangle and the edges of the character are called "sidebearings". Thus for "o" and "e", the sidebearings are positive; for italic "f" and "l", they are negative.

Adding sidebearings to the characters does not allow for perfect spacing between characters. In a font with n characters, there are n² pairs of characters, and potentially the optimal space between the members of any given pair is unrelated to the space between the members of any other pair. However, it isn't possible to generate n² arbitrary spaces from the 2n sidebearings specified by the font. (Even for n = 2, when n² = 2n, generating optimal spacing from a sidebearing specification may not be possible. Try specifying appropriate sidebearings for a font consisting only of "/" and "\" so that "//" and "\\" are close together and "/\" and "\/" do not bump, for example.)
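To see why the case n = 2 can fail, here is a short calculation under a simplifying assumption of mine: that the gap between a character a and a following character b is just the right sidebearing of a plus the left sidebearing of b. The symbols r, l, and g are my notation, not the font's.

```latex
\documentclass{article}
\begin{document}
Let $l_a$ and $r_a$ be the left and right sidebearings of character $a$,
and suppose that the gap when $a$ is followed by $b$ is $g_{ab} = r_a + l_b$.
For a font with the two characters $a$ and $b$,
\[
  g_{aa} + g_{bb} = (r_a + l_a) + (r_b + l_b)
                  = (r_a + l_b) + (r_b + l_a)
                  = g_{ab} + g_{ba}.
\]
So the four gaps cannot be chosen independently: making $g_{aa}$ and
$g_{bb}$ small forces $g_{ab} + g_{ba}$ to be equally small.
\end{document}
```

With a = "/" and b = "\", this is the conflict described above: tightening "//" and "\\" necessarily tightens "/\" and "\/" by the same total amount.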

Nevertheless, the method works fairly well most of the time. In only a few cases is intervention required. An example is the case of a sloping character followed by an upright character (or by a space and then an upright character). In this case, the space between the characters is typically too small. (Try typesetting {\em f}I.) To deal with this problem, each character in a font comes with an italic correction, obtained by typing \/ after the character. The idea is that when switching from a sloping character to an upright one, you add this correction. For example, instead of typing {\em f}I, you type {\em f\/}I. Or, more realistically, instead of typing {\em equilibrium} is, you type {\em equilibrium\/} is.
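The two examples, with and without the correction:

```latex
\documentclass{article}
\begin{document}
{\em f}I     % no italic correction: the f leans into the I

{\em f\/}I   % with italic correction

{\em equilibrium} is    % no correction before the space

{\em equilibrium\/} is  % with correction
\end{document}
```

The effect is easiest to see in the first pair; in the second it is subtle, and depends on the font.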

The good news is that if you use \textit to produce italics, you don't have to worry much about the correction, because \textit takes care of it most of the time. The bad news is that \textit doesn't take care of it all the time, and in some situations you may not realize that you are switching from a sloping font to an upright one.

The italic correction is omitted by \textit before a comma or a period, as it should be. But what about \textit{If} $f$? In this case \textit adds the correction, even though doing so seems inappropriate; a correction can be avoided by typing \textit{If\nocorr} $f$. In other similar cases you may need to specify \nocorr.
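A sketch of the cases just described. The words after each \textit are filler.

```latex
\documentclass{article}
\begin{document}
\textit{equilibrium} is         % correction added automatically

\textit{equilibrium}, which is  % no correction before a comma

\textit{If} $f$ is continuous   % correction added, perhaps unwanted

\textit{If\nocorr} $f$ is continuous  % correction suppressed
\end{document}
```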

Here's a situation in which you may not realize that you are switching from a sloping font to an upright one: suppose your theorem environment sets the text in italics and you type

\begin{theorem}If $\Gamma$ is ... \end{theorem}

Then you need more space after "If", and adding an italic correction (If\/ $\Gamma$ is) improves the appearance. However, that's not quite the end of the story, as this discussion shows.
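A minimal file showing the suggestion, assuming the default theorem style of the article class, which sets the body in italics. The statement of the theorem is a placeholder of mine.

```latex
\documentclass{article}
\newtheorem{theorem}{Theorem}  % default style: italic body
\begin{document}
\begin{theorem}
If $\Gamma$ is finite then it is compact.    % without the correction
\end{theorem}

\begin{theorem}
If\/ $\Gamma$ is finite then it is compact.  % with the correction
\end{theorem}
\end{document}
```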

The bottom line: use \textit to produce italics, and be aware that in a few cases you may need to suppress the italic correction; in theorems, consider adding an italic correction between text and math, although there may still be room for improvement in the resulting spacing.

thanks

Thanks to Kiefer Hicks for pointing out an error in an earlier version.