eqn: Formatting Equations in groff
I have written in the past about groff, a modern implementation of troff/nroff and an old-school, lightweight alternative to LaTeX. The underlying typesetting system is incredibly powerful, but it is also ugly and difficult to work with. Because of this, most groff documents are written using a combination of macro packages and pre-processing programs, which provide a more convenient interface for specifying document formatting.
Lately, I have taken a special interest in pre-processors. Many take the form of so-called "little languages": simple domain-specific languages (DSLs) that are designed and optimized for a particular purpose, such as describing diagrams or tables. The pre-processor converts this DSL into the underlying typesetting language before groff is run.
One such pre-processor is eqn(1). It is designed for describing
equations, but has some formatting options that are
useful in other contexts as well. At its heart, eqn consists of a simple language
for describing text layout and alignment, and a few
built-in definitions for Unicode characters commonly
used in equations.
eqn is quite different from
the mathematical language used in LaTeX, although the
two systems are designed to accomplish the same basic
task. eqn is small, simple,
and internally consistent. It is designed without
any special keyword delimiters (like the backslash in
LaTeX); instead it tokenizes the input on white-space
and interprets each token as a potential keyword. The
keywords and syntax are chosen to resemble the way in
which an equation is spoken aloud, with a few special
features for removing ambiguity. For example, a valid
eqn expression might be,
.EQ
sum from i=0 to k 2 sup i - i over sqrt 3
.EN
which will render as,
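Read with eqn's precedence rules (sup and sqrt bind more tightly than over, and over binds more tightly than the surrounding subtraction), the expression corresponds to the following LaTeX. This is a comparison sketch for readers who know LaTeX, not output copied from groff:

```latex
\[
  \sum_{i=0}^{k} 2^{i} - \frac{i}{\sqrt{3}}
\]
```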
Compiling a Document with eqn
Before we start adding equations to our groff documents,
we'll need to adjust our document compilation pipeline
to call the eqn(1) pre-processor. Because it is a pre-processor,
we can call it manually in a pipeline, before we get to
groff(1) itself. For example, to compile a document
written with the ms macro package to a PDF, we could
use the following command,
eqn -Tpdf mydoc.ms | groff -ms -Tpdf > mydoc.pdf
noting that we need to pass the -Tpdf
flag to eqn as well as
groff. It's generally easier to
use groff alone, though. It provides
flags for calling a number of standard pre-processors:
-e will run eqn. So
we can compile our document with,
groff -ms -e -Tpdf mydoc.ms > mydoc.pdf
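A small test document is a quick way to check that the pipeline works. The sketch below assumes the ms macro package and the file name mydoc.ms from the commands above; the title and the equation are placeholders chosen for illustration.

```groff
.\" mydoc.ms: minimal ms document with one displayed equation
.TL
An eqn Test
.LP
The quadratic formula is
.EQ
x = {-b +- sqrt {b sup 2 - 4ac}} over 2a
.EN
```

If either of the commands above produces a PDF with a typeset fraction and square root, eqn is being run correctly.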
Including Equations in a Document
When using groff pre-processors, it is necessary
to delimit the portions of the document which
the pre-processor should evaluate. For eqn, this is done by using the .EQ
and .EN macros. I'll call this pair an equation
fence, and its contents an equation block. As
an example,
.EQ
2x + 1 = 5
.EN
This will render the equation on its own line. You can
also render equations directly in sentences, without
placing them on their own line. These are called inline
equations. To use inline equations, you must first
define a pair of delimiters: one that starts an inline
equation and one that ends it. This is done with the
delim keyword, followed by the starting and
ending delimiter. I like to use $ for this purpose,
but you can use any symbol you like, such as @.
.EQ
delim @@
.EN
You need to place this physically before any attempts to
add an inline equation to the document; I tend to place
it at the very top. Following these lines, you can add an
inline equation like so: @2x + 1 = 5@.
The start and end delimiters don't need to be the same symbol. For example,
.EQ
delim @#
.EN
This allows inline equations to be set using: @x2 +
5 = 5#. This might be useful if you want to use
an @ in the equation, for example--though you could
just use # on both ends in that case, and so I've yet
to find a good reason for using a different starting
and ending delimiter.
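Putting the pieces together, a document that uses inline equations might look like the sketch below. It assumes the ms macros and uses $ as the delimiter; the sentences are placeholders. The delim off command, described in eqn(1), disables inline processing again, which is handy when a later paragraph needs a literal delimiter character.

```groff
.\" Set the inline delimiters once, near the top of the document.
.EQ
delim $$
.EN
.LP
The line $y = mx + b$ has slope $m$ and intercept $b$.
.\" Turn inline processing off again when a literal dollar sign is needed.
.EQ
delim off
.EN
.LP
The book costs $5.
```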
Basic Concepts
While eqn has a number of
keywords built into it, as well as the capacity to define
your own, these aren't necessary for the most basic of
equations. It has a knowledge of basic mathematical
operators, and will automatically add an appropriate
amount of space between these and variables/numbers. It
will also identify variables and render them in italic
text. White-space will be ignored. For example,
.EQ
y=mx+b
.EN
will be rendered like so,
and
.EQ
y = mx+ b
.EN
will result in exactly the same render.
Most of the time, eqn's automatic
handling of white-spacing will be appropriate, but it does
let you take control and specify your own spacing. The
first way to do this is to place text within quotation
marks; eqn will render such
text verbatim, without any processing. So,
.EQ
"hello there" + 5c - 1 = 9
.EN
will preserve the space in "hello
there" and render like so,
Note that the text is still set in an italic typeface--we'll see how to set text in roman later.
You can also add spacing using the ~
and ^ symbols. A ~ represents
a full space, and so,
.EQ
hello~there + 5c - 1 = 9
.EN
results in the same formatting as the above example. The
^ is a half-width space, and is useful for
adding spacing around binary operators that you define
(more on this later). For example, if we wanted to use
& as an operator we might write,
.EQ
5^&^1
.EN
which will result in,
Using a ~ here will result in an awkwardly
large amount of space (in my opinion; aesthetic tastes
vary), and not using either of these characters will result
in no spacing between the operator and operand, which
also looks awkward.
eqn also supports Greek
letters within equations, represented using their name.
The lowercase version is specified by using all lowercase
letters, and the capital version by capitalizing the
first letter. So,
.EQ
pi = 3.14
.EN
.EQ
Delta x = v Delta t
.EN
The use of Greek letters forces us to contend
with an important fact of life when using eqn: all tokens must
be space delimited. This is something that I mess up
continuously. eqn tokenizes
its input on words, rather than using a special symbol
(like LaTeX's backslash) to separate keywords/commands
from text to be rendered. Each word is first checked for
any special meaning or definition, and then displayed
verbatim if one is not found. One annoying consequence
of this is that white-space is quite important to how eqn processes input. For example,
.EQ
(pi + 1)
.EN
will not render correctly. eqn
will not see pi (a keyword), but
rather (pi (not a keyword). The correct
way to enter this expression would be,
.EQ
( pi + 1 )
.EN
with extra space around the parentheses. This even
applies to operators, as in $pi+1$
or $Delta= 7$. Neither of these will
render correctly, because there isn't white-space (or
~ or ^) on both sides of the
keywords. For someone coming from LaTeX, where the input
is mostly tokenized by character, this can take a lot
of getting used to. [1]
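The wrong and right forms can be compared side by side in a short test file. This sketch assumes the ms macros and $ as the inline delimiter; the last expression is an extra illustration of using ^ as a separator, not something taken from the examples above.

```groff
.EQ
delim $$
.EN
.LP
.\" Keywords glued to other characters are not recognized as keywords.
Wrong: $(pi+1)$ and $Delta= 7$.
.\" Spaces, or the spacing characters ~ and ^, separate them properly.
Right: $( pi + 1 )$, $Delta = 7$ and $2^pi^r$.
```

In the first line the words pi and Delta should come out as literal text, while the second line should show the Greek letters.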
Speaking of parentheses, you can use the
left and right keywords to
tell eqn to scale the size
of the parentheses with their contents, which will be
handy with some of the layout options that we'll
discuss later. You can use a single left
without a matched right, but not vice
versa. You must leave a space between the word and
the symbol, like so,
.EQ
left ( x + 1 right )
.EN
You can use these keywords with a variety of separators,
such as [ and {. In fact, braces
are a reserved symbol in the eqn
language, so pairing them with right and
left is one of a few techniques for getting
the brace symbols to render in your equations.
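As a sketch of the other delimiters, the block below scales square brackets, braces, and vertical bars around their contents. The expressions themselves are arbitrary placeholders.

```groff
.EQ
left [ x over 2 right ] ~=~ left { a sup 2 + b sup 2 right }
.EN
.EQ
left | {x - 1} over {x + 1} right |
.EN
```

Note that the braces following left and right are printed, while the braces around the numerator and denominator in the second equation are only grouping and do not appear in the output.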
Keyword Operators
eqn provides keywords for a
number of important operators and symbols, although not
nearly as many as you may be used to from LaTeX. [2]
The lack of operators isn't as severe as you may at
first think, and is easily remedied (we'll do this in a
later section). Most of these are pretty straightforward
mappings to a symbol, such as del for ∇, times for ×, or partial
for ∂. But a few involve more complex formatting,
such as sum and int for
summations and integrals respectively; these symbols
will automatically be set in a larger font size. A full
list of supported operators is,
| Keyword | Symbol | Description | LaTeX Equivalent |
|---|---|---|---|
| inter | ∩ | set intersection | \cap |
| union | ∪ | set union | \cup |
| prod | ∏ | product | \prod |
| int | ∫ | integral | \int |
| sum | ∑ | summation | \sum |
| grad | ∇ | gradient | \nabla |
| del | ∇ | same as grad | \nabla |
| times | × | multiplication cross | \times |
| cdot | ⋅ | centered dot | \cdot |
| approx | ≈ | approximately equal to | \approx |
| prime | ′ | prime symbol | \prime |
| partial | ∂ | partial differential | \partial |
| inf | ∞ | infinity | \infty |
| >> | ≫ | much greater | \gg |
| << | ≪ | much less | \ll |
| <- | ← | left arrow | \leftarrow |
| -> | → | right arrow | \rightarrow |
| +- | ± | plus or minus | \pm |
| != | ≠ | not equal | \neq |
| <= | ≤ | less than or equal to | \leq |
| >= | ≥ | greater than or equal to | \geq |
| == | ≡ | equivalent | \equiv |
A number of other common mathematical functions,
such as cos, min, and
lim all get special treatment as well: eqn will automatically set these in
a roman typeface, rather than italic, and add operator
spacing before and after them. If you want to force some
other text to be roman, use the roman
keyword. Note that, unlike \text in
LaTeX, roman will not preserve spaces in its
argument, and so you should put the text in quotation marks
if you want to preserve spacing, i.e. roman {"you
must use quotes"}.
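A short sketch of roman in use follows. The equations are made-up examples; the point is only that the quoted words keep their internal spaces and are set upright, while the variables stay italic.

```groff
.EQ
v = d over t ~~ roman "for all" ~~ t > 0
.EN
.EQ
roman { "area" } = pi r sup 2
.EN
```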
The sqrt keyword is used to specify square
roots. This one is a little unique in that it requires
braces to indicate the full length of its argument,
as the over-line must stretch over everything inside of
it. For example,
.EQ
sqrt {2x + 3}
.EN
Note that the standard eqn
square root operator is widely regarded as ugly when
used with large arguments, particularly fractions,
such as in this example,
In such cases, the general guidance is to use a fractional power notation instead.
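The two notations can be compared with a sketch like the following, which uses an arbitrary fraction as the argument. The second form relies only on left, right, and sup, all described elsewhere in this post.

```groff
.\" A square root stretched over a tall fraction
.EQ
sqrt { {x + 1} over {y - 1} }
.EN
.\" The same quantity written with a fractional power
.EQ
left ( {x + 1} over {y - 1} right ) sup {1/2}
.EN
```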
Simple Layout Commands
eqn handles a lot more than
just mapping words onto special characters and changing
font size and typeface. One of its most useful features
is controlling layout: superscripts, subscripts,
fractions, summation bounds, etc. These are all very
tricky to accomplish in groff without using eqn.
Fractions
First, we'll cover fractions. You write a fraction much like how you might speak it aloud,
.EQ
x over 4
.EN
where the numerator comes before the over
keyword, and the denominator after. If the numerator or
denominator include multiple tokens, you can use braces
to group them,
.EQ
{ sqrt {x + 4}} over {x - 1}
.EN
groff's version of eqn
also supports a more compact fraction, using
smallover. This is particularly well suited
to inline equations, as it requires less vertical space on
the page. It can be used anywhere over can.
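To see the difference in running text, a sketch like this one can be compiled with groff's eqn. It assumes the ms macros and $ as the inline delimiter, and the sentence is a placeholder.

```groff
.EQ
delim $$
.EN
.LP
With over, the fraction $1 over 2$ pushes the surrounding lines apart;
with smallover, $1 smallover 2$ sits more comfortably in running text.
```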
Text Positioning
eqn provides four keywords for
handling text positioning: subscripts, superscripts,
and placing text above or below a symbol. This contrasts
with LaTeX, which uses only the _ and
^ operators for positioning, with the
interpretation depending upon the context.
[3]
The sub and sup operators
indicate subscripts and superscripts respectively,
and the from and to operators
place text below or above a symbol. As an example,
you might express a summation as,
.EQ
sum from i=0 to n 3i sup 2 + 1
.EN
Keep in mind the tokenization rules here too. If you
need multiple tokens in a subscript, summation range,
etc., they must be enclosed in braces. And, you must
use spaces to separate tokens you don't want
included. For example, writing $(3x+y sup
2)$ places the closing parenthesis in the
superscript,
which is almost certainly not your intent.
Generally speaking, you can apply these positional
adjustments to anything. This is most dramatic in the
case of the from and to
keywords. Beyond sums and integrals, they are used
in limits,
.EQ
lim from {x -> inf} f(x)
.EN
or for other arbitrary purposes,
.EQ
"hello there" from {"this is under"} to {"this is above"}
.EN
These four keywords provide a great deal of flexibility in
text positioning, and aren't tied to particular keywords
like sum or lim in the way
that LaTeX's positioning system is; I find that I prefer
eqn's approach. It's true that
LaTeX can perform these same layout tasks, but it isn't
as natural; I use LaTeX every day and couldn't tell you
offhand how to recreate the previous example.
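Two more sketches that combine all four positioning keywords: a definite integral and a limit. Both are standard textbook identities chosen for illustration. Braces group the multi-token exponent and limit, and spaces keep each keyword separate.

```groff
.EQ
int from 0 to inf e sup { - x sup 2 } ~ dx = { sqrt pi } over 2
.EN
.EQ
lim from {n -> inf} left ( 1 + 1 over n right ) sup n = e
.EN
```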
Complex Layout Commands
eqn provides more complex
layout features as well to support equations that span
multiple lines. These include using mark
and lineup to align text across multiple
equation blocks, using tab-stops to place multiple
alignment markers (like & in a LaTeX
align environment, but with more control) or using
matrix to specify neatly formatted tables
of values. For my purposes, I have barely ever used any
of these, though. The layout feature that I use the most
is the pile.
Piles
The pile (and also lpile
and rpile for the left and right aligned
versions, respectively) keyword is used to stack multiple
expressions on top of each other. The syntax for a pile
is quite simple,
.EQ
pile {
  { first line } above
  { second line } above
  { third line } above
  ....
  { last line}
}
.EN
with each line being enclosed in braces (if it has more
than one token, anyway) and the above
keyword being used as a delimiter between lines.
In practice, I usually use the left aligned version,
lpile. pile is center-aligned,
which I feel tends to look weird.
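The three alignments are easiest to judge side by side. This sketch sets the same three numbers with lpile, pile, and rpile in a single equation; the numbers are arbitrary and chosen to have different widths so that the alignment is visible.

```groff
.EQ
lpile { 1 above 100 above 10000 } ~~~
pile { 1 above 100 above 10000 } ~~~
rpile { 1 above 100 above 10000 }
.EN
```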
I most recently used a pile in a homework assignment answer key to state a linear program (an optimization problem subject to a set of linear constraints). An example of this is,
.EQ
lpile {
  { max left { 2x sub 1 - x sub 2 + x sub 3 right } roman " subject to" } above
  { x sub 1 >= 3 } above
  { x sub 1 + x sub 2 <= 10 } above
  { x sub 2 >= 0 } above
  { x sub 3 <= 4 } above
  { x sub 3 >= 1 }
}
.EN
Piles are also useful for defining piecewise functions. I
can never remember how to do this in LaTeX. In eqn, it's easy,
.EQ
f(x) = left { lpile {
  { x sup 2 ,~~~x >= 0 } above
  { 0, ~~~~ x < 0 }
}
.EN
Matrices
The next complex formatting command I want to talk about
is matrix, for laying out matrices. Matrices
build on piles, using the same above
keyword to specify rows, but adding a col
(also rcol, lcol) keyword as a
column delimiter. Note that the difference between these
three columns is in text alignment, not column position.
You can have more than three columns.
Matrices are defined in column-major form. The basic syntax is,
.EQ
matrix {
  col {
    { row 1 } above
    { row 2 } above
    ...
    {row n}
  }
  col {
    { row 1 } above
    { row 2 } above
    ...
    {row n}
  }
  ...
}
.EN
So you can write a five by five identity matrix using,
.EQ
matrix {
  col {1 above 0 above 0 above 0 above 0}
  col {0 above 1 above 0 above 0 above 0}
  col {0 above 0 above 1 above 0 above 0}
  col {0 above 0 above 0 above 1 above 0}
  col {0 above 0 above 0 above 0 above 1}
}
.EN
Yes, this is a bit of a wordy representation for a
matrix. When you're working with typesetting large
matrices, it's probably better to use tbl
if possible. But this technique does work well for simple
matrices, or for any other simple grid-layouts.
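For a small matrix the notation stays manageable. The sketch below combines matrix with left and right to draw scaled square brackets around a two by two matrix; the entry names are placeholders. Because the layout is column-major, the first col holds the entries of the first column.

```groff
.EQ
A = left [ matrix {
  col { a sub 11 above a sub 21 }
  col { a sub 12 above a sub 22 }
} right ]
.EN
```

Swapping col for rcol in a column of numbers of differing widths should right-align them instead of centering them.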
Mark and Lineup
eqn doesn't have a direct
analogue to LaTeX's align environment,
which is used to define multiple equations with
matching alignment. However, a simple version of this
(with only one alignment point) can be achieved using the
mark and lineup keywords. Unlike
align, this system spans multiple equation
blocks, and so you can align equations that have text in
between them too! The general syntax is simple (I hope
you're seeing the trend here), use mark
to indicate the alignment point, and lineup
in future equations to align the following token to the
mark, like so
.EQ
y + 1 mark = ax + b
.EN

.EQ
z lineup = v sup 2 + sqrt {4 over 5}
.EN
The blank line between the two equation blocks is
important here. Otherwise, groff will set
them both on the same line.
More alignment using tabs
If you need more than one alignment marker, you can achieve this using tabs.
.EQ
x sub 1 \t+ x sub 2 \t+ 2x sub 3 \t= y
.EN
.EQ
z sub 1 \t+ z sub 2 \t+ 3z sub 3 \t= u
.EN
where the \t character represents a
tab. You need to use a physical tab character, which
may require disabling tab expansion in your editor.
The .ta request allows you to specify the
custom locations for tabstops if needed, but I haven't
tinkered with this enough to get it really figured out.
Custom Operators
One limitation of eqn is that
it has a limited set of built-in operators. This is
not as big a deal as it may seem, though, because
eqn makes it very easy
to define your own. eqn
uses standard groff character escapes,
and so most mathematical symbols you need can be
easily accessed this way. You can get a list of the
available symbols, and their escape sequences, by reading
groff_char(7) in the manual. As an example,
the set membership operator, ∈, is available
as \[mo]. You can write x ∈ X as,
.EQ
x^\[mo]^X
.EN
These character escapes are a lot to remember though,
and I find them and their accompanying spacing characters
ugly. Instead, I usually define my own operator keywords
and map them to the symbol. You can do this using eqn's define keyword,
.EQ
define in %^\[mo]^%
.EN
The % signs here are an arbitrary delimiter
around the value; you can use whatever symbol you
want. The purpose of this is to allow you to use quotation
marks, braces, etc., within the value of the alias.
Just pick a delimiter that doesn't conflict with
the characters you want to use in the value itself.
Once eqn processes the above
define, any instances of in
will be replaced by ^\[mo]^ within equation
blocks and inline equations.
.EQ
x in X
.EN
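The same approach scales to a small personal library of operators. In the sketch below the keyword names (notin, subseteq, forall, emptyset) are hypothetical choices of mine, while the escapes are glyph names listed in groff_char(7). Binary relations get ^ on both sides for spacing; the other symbols do not need it.

```groff
.\" Hypothetical keyword names; the \[..] escapes are listed in groff_char(7).
.EQ
define notin %^\[nm]^%
define subseteq %^\[ib]^%
define forall %\[fa]%
define emptyset %\[es]%
.EN
.EQ
forall A : ~ emptyset subseteq A
.EN
.EQ
x notin emptyset
.EN
```

A block that contains only define lines produces no equation of its own, so the definitions can sit at the top of the document next to the delim line.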
There are some mathematical symbols that you still
can't get this way, such as blackboard bold or
calligraphic letters. These are possible to use with
eqn by adding additional fonts,
but that is a story for another day.
Conclusion
There's a bit more to using eqn,
but not much. This post covers most of what you need to
know. That's the benefit of a simple system like this. It
isn't as powerful as LaTeX's mathematical typesetting,
but it's faster to learn and can still do the majority
of what LaTeX can. I find that I like the syntax of
eqn quite a bit more than
LaTeX, and have been using it, and groff,
a fair bit lately to prepare documents for classes
that I'm teaching. LaTeX is still king when it comes to
conference/journal papers, but I'm having a lot of fun
with groff for lighter use cases.
References
There are a few important and useful resources for eqn. Here are the ones that I've used
to learn it,
- The man page: eqn(1). Also, groff_char(7) is very useful.
- A Guide to Typesetting Mathematics with GNU eqn
- UNIX Text Processing, Chapter 9