An Introduction to groff

A surprisingly handy, old-school formatting system

roff, short for “run off”, is an old-school document formatting language (note 1) that is readily available for Linux systems in the form of GNU roff (groff), troff, and nroff. It’s significantly lighter-weight than something like LaTeX, but still allows for fairly complex document formatting tasks. It is also used to render man pages, so it’s worth learning for that alone if you’re interested in Linux. Beyond manuals, it can be used for formatting books, papers, etc. You can render your document in a variety of different formats, including HTML, PostScript (note 2), and pdf.

As a classic Unix utility, roff is implemented as several distinct programs that are called one after another in a pipeline to produce the desired output document. I’m going to address roff initially in terms of these individual programs and pipelining; however, we will then transition to the more modern approach of using groff to do much of the work in one command.

The roff Pipeline

roff comes in many flavours. Historically, the major two were troff(1) and nroff(1). troff(1) was used to produce output for typesetters, and nroff(1) for computer terminals. These programs both accept roff code as input, and output a standardized intermediate format for further processing. In modern contexts, both are usually run through the groff(1) front-end rather than invoked directly. We’ll start out using troff(1) standalone, though.

A general pipeline for creating a document using troff(1) is,

% cat input.tr | preprocessor | troff | postprocessor > output.ext

In this example, input.tr is the source file describing the document. Preprocessors are programs that handle more complex formatting tasks like equations and tables, which are not actually part of the roff engine itself. The common preprocessors are,

  • eqn (for equations)
  • grn (for pictures)
  • pic (for simple block-diagrams [think mermaid])
  • chem (for chemical diagrams)
  • refer (for bibliographies and references)
  • tbl (for tables)

We won’t be discussing these in this article, but information on them may come later. These preprocessors generate roff code, which is then converted into an intermediate output by the troff(1) program. Finally, this intermediate output is fed into a postprocessor, to produce output of the desired format.

Getting Started with troff

troff itself is a very simple program to use. It will accept text on stdin and will write the intermediate output to stdout. For example, we could simply run the program like so,

% troff
hello, world!
^D

and we get the following as output,

x T ps
x res 72000 1 1
x init
p1
x font 5 TR
f5
s10000
V12000
H72000
md
DFd
thello,
wh2500
tw
H104120
torld!
n12000 0
x trailer
V792000
x stop

Of course, why manually type the data into standard input when you can use a file? troff(1) will accept content on standard input, or you can specify a filename as an argument,

% echo hello, world! > hello.tr
% troff hello.tr

Postprocessing: Creating a pdf

Now, we could easily save this output to a file using redirection, but if our purpose is to produce a document for distribution, this probably isn’t what we want to do. The roff intermediate output isn’t exactly suitable for end-user consumption! We’ll need to run our file through a postprocessor first, to create a readable document.

We’ll create a pdf file here, and so we will use the gropdf(1) postprocessor. gropdf(1) accepts troff(1) output on stdin, and writes a pdf to stdout. If you run,

% troff hello.tr | gropdf

you will get a viable pdf file; however, gropdf(1) will also display the following error message,

Expecting a pdf pipe (got ps)

The problem is that we need to give troff(1) a heads up that we intend to target our document to a pdf file. This is done using the -Tpdf argument. If we don’t do this, troff(1) will assume that we’re targeting PostScript, and so the file won’t be set up appropriately for gropdf(1) to do its thing. Thus, our final command is,

% troff -Tpdf hello.tr | gropdf > output.pdf

I’ve uploaded the output file from this command here, so you can see what the output should be. Opening it up in Evince yields,

[Figure: output.pdf opened in Evince]

Admittedly, it isn’t the most impressive of documents. But it is a start!

Skipping the Pipelines with groff

One common complaint about the roff system is how long the processing pipelines can get. It’s already a fair bit of typing to make our pdf above. Now imagine if you wanted equations, pictures, and tables in your document too. That’s three more commands we need to add to the pipeline!
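To make that concrete, here’s a rough sketch of what such a pipeline might look like. The file name pipeline-demo.tr and its contents are made up purely for illustration, and I’m assuming the GNU versions of pic, tbl, and eqn that ship with groff. The conventional order is pic, then tbl, then eqn, with troff(1) and the postprocessor at the end.

```shell
# Hypothetical input with one picture (pic), one table (tbl), and one equation (eqn).
cat > pipeline-demo.tr <<'EOF'
.PS
box "roff"; arrow; box "pdf"
.PE
.TS
tab(;);
l l.
program;purpose
tbl;tables
eqn;equations
.TE
.EQ
x sup 2 + y sup 2 = r sup 2
.EN
EOF

# Each stage reads stdin and writes stdout, so the stages simply chain together.
# The rendering step is skipped when the groff tools are not installed.
if command -v gropdf >/dev/null 2>&1; then
    pic pipeline-demo.tr | tbl | eqn -Tpdf | troff -Tpdf | gropdf > pipeline-demo.pdf
fi
```

That is five programs for one small document, and every one of them has to be typed in the right order.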

As a result, the modern convention is to bypass all this pipelining by using the groff(1) front-end to roff. This program lets us specify options telling it to run certain preprocessors or postprocessors, allowing us to skip writing the long pipelines.

To create our output pdf using groff(1), we need only execute the following command,

% groff -Tpdf hello.tr > output.pdf

When the -T option is specified, groff(1) will automatically call the relevant postprocessor, so we don’t need to! If no -T option is provided, groff(1) will default to PostScript output.
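The preprocessors get the same treatment. As a sketch, the -p, -t, and -e options ask groff(1) to run pic, tbl, and eqn, and -V prints the pipeline that groff(1) would run instead of running it, which is a handy way to see what it is saving you from typing. I’m reusing hello.tr here; it doesn’t contain any pictures, tables, or equations, so the extra options are harmless. The fallback message is my own addition, in case the pdf device isn’t installed.

```shell
# Recreate the sample file from earlier.
printf '%s\n' 'hello, world!' > hello.tr

if command -v groff >/dev/null 2>&1; then
    # -V only prints the pipeline; drop it to actually render the pdf.
    groff -V -p -t -e -Tpdf hello.tr ||
        echo "groff could not set up a pdf pipeline on this system"
fi
```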

In recent groff(1) references, I have seen a different process listed for creating a pdf: using the ps2pdf(1) command. This command takes PostScript input and produces a pdf from it. By taking advantage of the fact that groff(1) defaults to producing PostScript, a pdf can be created using,

% groff hello.tr | ps2pdf - output.pdf

I’m not entirely sure why these authors use this approach, rather than the postprocessor for creating a pdf directly, but I want to mention it here as you are sure to encounter it in your searches.

Requests and Macros

roff code consists of a sequence of lines, which can be classified as either text lines or control lines. A text line, unsurprisingly, contains text that should appear in the document. A control line represents a command, and will always start with either a . or a ', followed by the command name and any arguments that it might have. These control lines are what format the document.

The standard format of a control line is,

.command_name arg1 arg2

The most common kind of command is a request, which is a low-level formatting or control directive. roff requests allow you to do a variety of things, such as controlling the kerning, adding space, coloring glyphs, etc.

Generally speaking, these requests are too low level to be useful to somebody just trying to write a paper for their English class (sure, you can do that in roff, why not?), and so roff also includes macros. These are a bit like functions, and allow you to do simple things like “make that bold” or “indent this line” or “center that text”, without having to worry about super low-level details of typesetting.
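To give a feel for the “function” comparison, here’s a minimal sketch of a home-made macro built from plain requests. The macro name CB and the file name macro-demo.tr are made up for this example. The .de request starts a definition and a line containing only .. ends it; inside the definition, \\$1 stands for the first argument, .ce centers the next line, and \fB and \fP switch to bold and back.

```shell
# Write a small roff file that defines and then calls a home-made macro.
# CB is a made-up name; nothing else in this article depends on it.
cat > macro-demo.tr <<'EOF'
.\" CB: center the first argument and set it in bold.
.de CB
.ce
\fB\\$1\fP
..
.CB "hello, world!"
EOF

# No macro package is loaded here; the file only uses requests and its own macro.
if command -v gropdf >/dev/null 2>&1; then
    groff -Tpdf macro-demo.tr > macro-demo.pdf
fi
```

Note that the argument is quoted in the call, so the whole phrase arrives as \\$1.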

Technically, you could write these yourself out of roff requests. If you’d like to give it a shot, the groff(7) man page includes a reference for all the supported requests (make sure to run man 7 groff to get the right document). However, groff(1) also comes with a number of pre-packaged macro packages that you can use:

  • man
  • mandoc
  • mdoc
  • me
  • mm
  • ms
  • www

Others are available too, such as mom.

While each is a little different, many share some common elements when it comes to basic formatting. I’ll use the me package for the rest of this article (note 3). To load the package, pass the -me argument to groff(1).

Formatting hello.tr

As a first application of formatting, let’s modify our hello.tr file to include bold and italics. We’ll render “hello,” in bold, and “world!” in italics. The macros for bold and italics are, unsurprisingly, .b and .i respectively. So, we have,

% cat formatted.me
.b hello,
.i world!
% groff -me -Tpdf formatted.me > formatted.pdf

To center the text, we can use the .(c and .)c macros. These should surround the text to be centered, as below,

% cat centered.me
.(c
.b hello,
.i world!
.)c
% groff -me -Tpdf centered.me > centered.pdf

A listing of many common me macros can be found by reading the groff_me(7) man page. I’ll introduce a few more as we go, but you should look there for a more complete list.

Formatting a Simple Paper

Applying some simple formatting directives to a couple of words is a good start, but applying them to a larger paper is a bit more complex. So let’s give that a shot. Let’s grab the first bit of text from This Side of Paradise, courtesy of Project Gutenberg (note 4). In raw form, we have

      BOOK ONE—The Romantic Egotist

      CHAPTER 1. Amory, Son of Beatrice


      Amory Blaine inherited from his mother every trait, except the
      stray inexpressible few, that made him worth while. His father,
      an ineffectual, inarticulate man with a taste for Byron and a
      habit of drowsing over the Encyclopedia Britannica, grew wealthy
      at thirty through the death of two elder brothers, successful
      Chicago brokers, and in the first flush of feeling that the world
      was his, went to Bar Harbor and met Beatrice O’Hara. In
      consequence, Stephen Blaine handed down to posterity his height
      of just under six feet and his tendency to waver at crucial
      moments, these two abstractions appearing in his son Amory. For
      many years he hovered in the background of his family’s life, an
      unassertive figure with a face half-obliterated by lifeless,
      silky hair, continually occupied in “taking care” of his wife,
      continually harassed by the idea that he didn’t and couldn’t
      understand her.

      But Beatrice Blaine! There was a woman! Early pictures taken on
      her father’s estate at Lake Geneva, Wisconsin, or in Rome at the
      Sacred Heart Convent—an educational extravagance that in her
      youth was only for the daughters of the exceptionally
      wealthy—showed the exquisite delicacy of her features, the
      consummate art and simplicity of her clothes. A brilliant
      education she had—her youth passed in renaissance glory, she was
      versed in the latest gossip of the Older Roman Families; known by
      name as a fabulously wealthy American girl to Cardinal Vitori and
      Queen Margherita and more subtle celebrities that one must have
      had some culture even to have heard of. She learned in England to
      prefer whiskey and soda to wine, and her small talk was broadened
      in two senses during a winter in Vienna. All in all Beatrice
      O’Hara absorbed the sort of education that will be quite
      impossible ever again; a tutelage measured by the number of
      things and people one could be contemptuous of and charming
      about; a culture rich in all arts and traditions, barren of all
      ideas, in the last of those days when the great gardener clipped
      the inferior roses to produce one perfect bud.

Let’s start by bolding and centering the BOOK ONE line.

.(c
.b BOOK ONE - The Romantic Egotist
.)c

The above roff seems reasonable. However, when we run it and take a look, we see something rather unexpected.

The text got centered, but of all the words we typed only two of them are present, and of those only the first is bold! What gives?

Well, it has to do with the way that arguments to requests/macros work. Just like arguments on the command line, arguments to requests are separated by spaces. The bold macro makes its first argument bold and appends the second to it (not bold). So, the output that we got should make sense. The other four arguments to the macro were ignored.

Just like on the shell, if we want to have spaces in an argument, we’ll need to wrap the whole thing in quotes. So this,

.(c
.b "BOOK ONE - The Romantic Egotist"
.)c

is actually what we want.
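The second argument isn’t useless, by the way. Going by the behaviour described above (first argument bold, second appended in the normal face), it is a tidy way to attach trailing punctuation that shouldn’t be bold. A small sketch, with a made-up file name:

```shell
# Hypothetical file contrasting one-argument and two-argument calls to .b
cat > bold-args.me <<'EOF'
.lp
.\" One quoted argument: the whole phrase is set in bold.
.b "hello, world"
.\" Two arguments: the first is bold, the second is appended in the normal face.
.b "hello, world" !
EOF

# Render with the me package if a full groff installation is available.
if command -v gropdf >/dev/null 2>&1; then
    groff -me -Tpdf bold-args.me > bold-args.pdf
fi
```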

Bearing this lesson in mind, let’s place the chapter title next. We’ll bold the words CHAPTER 1, but leave the name in normal face.

.(c
.b "BOOK ONE - The Romantic Egotist"
.)c
.b "CHAPTER 1."
Amory, Son of Beatrice

Notice that roff automatically did a line break after we ended the centering with .)c. Generally, roff will handle line breaks, paragraphs, etc., for us, as long as we tell it where to put them (we’ll see this in just a moment). In fact, the manual advises against leaving any blank lines in your input file.

That will result in a rather ugly and hard to read file, though. If you’ve programmed before, you know that spacing things out is quite useful. So there is a way to do this. Simply start the line with a ., and don’t give a command. So we can space our book and chapter titles out a little bit like this,

.(c
.b "BOOK ONE - The Romantic Egotist"
.)c
.
.b "CHAPTER 1."
Amory, Son of Beatrice

This change is purely cosmetic. It won’t affect the actual output file.

What if we actually wanted to add some blank space in our output file? Those two lines are awfully close together, and it doesn’t look all that great. For this we will use a roff request, .sp. This isn’t a macro, but just looking at the file, you’d never be able to tell the difference.

.(c
.b "BOOK ONE - The Romantic Egotist"
.)c
.
.sp 2
.
.b "CHAPTER 1."
Amory, Son of Beatrice

This request accepts a numerical argument that states how many lines to skip, so here we are adding two blank lines between the book and chapter title. However, although this works, it does go against the manual for groff_me(7), which states that the .sp request can only be safely used with me after the first call to .pp (the paragraph macro).

The trouble is that a .pp will result in the title getting indented, which we don’t want. However, there is also an .lp macro, which creates a paragraph with no indent. The documentation doesn’t explicitly say that this is safe, but if things work the way I think they do, it should be just as good. Just be aware that this is a “your mileage may vary” moment. I’ve never had it cause an issue, but we are going against the direct advice of the manual here.

.lp
.(c
.b "BOOK ONE - The Romantic Egotist"
.)c
.
.sp 2
.
.lp
.b "CHAPTER 1."
Amory, Son of Beatrice

While I was at it, I threw another .lp before the chapter title, just to be explicit. If you haven’t gathered already, the .lp and .pp macros are what we will be using to add paragraph breaks in general. They do add a small amount of extra blank space as well. If it isn’t obvious from this example of their use, you’ll be able to see it once we start adding extra paragraphs.

Okay, let’s bring in the first paragraph of text. We’ll use the .lp macro to start this one off, as this will be the first paragraph in the chapter, and I prefer to leave this unindented. The .pp macro will work much the same, except it will add an indent. We’ll see this one in action for the second paragraph.

.lp
.(c
.b "BOOK ONE - The Romantic Egotist"
.)c
.
.sp 2
.
.lp
.b "CHAPTER 1."
Amory, Son of Beatrice
.
.lp
Amory Blaine inherited from his mother every trait, except the
stray inexpressible few, that made him worth while. His father,
an ineffectual, inarticulate man with a taste for Byron and a
habit of drowsing over the Encyclopedia Britannica, grew wealthy
at thirty through the death of two elder brothers, successful
Chicago brokers, and in the first flush of feeling that the world
was his, went to Bar Harbor and met Beatrice O’Hara. In
consequence, Stephen Blaine handed down to posterity his height
of just under six feet and his tendency to waver at crucial
moments, these two abstractions appearing in his son Amory. For
many years he hovered in the background of his family’s life, an
unassertive figure with a face half-obliterated by lifeless,
silky hair, continually occupied in “taking care” of his wife,
continually harassed by the idea that he didn’t and couldn’t
understand her.

A cursory glance would seem to indicate that this all looks okay; however, if you look carefully, you’ll see that there are a few issues. For example, we see the word didnât instead of didn’t.

This is a really common problem when dealing with systems like this when you copy and paste text into them, instead of typing it. If you look closely, all of the issues appear where there is either a single quote or a double quote in the input. It’s a simple encoding problem.

When you press the quotation mark key on your keyboard, it corresponds to the character,

"

however, if you look closely, you’ll see that in the input we actually have the character

“

The difference is subtle, but it is enough to confuse roff. Let’s change all the “fancy” quotation marks for normal ones and try again.

And that one worked. In fact, if you look at it, all of the single quotes are rendered as the curly ones anyway. Groff will automatically handle translating from the standard “straight” single quotes into curly ones. However, the double quotes are still straight and (some might say) boring.

Luckily, we can fix this too. If we replace the opening quotation mark with `` and the closing one with '' (that is, two backticks for the first and two straight single quotes for the second), groff(1) will give us the curly quotes there too! It just needs to know which one is the opening quote (noted by the backticks) and which is the closing one (noted by the single quotes), so it knows what the quotes should look like.
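Making these substitutions by hand gets tedious for anything longer than a paragraph. Here’s a small sketch of how sed could do it instead: curly double quotes become `` and '', and curly single quotes become straight ones. The file names pasted.txt and plain.txt are placeholders, and the sample sentence is abbreviated from the excerpt.

```shell
# Sample input containing typographic ("curly") quotes, as pasted text often does.
printf '%s\n' 'occupied in “taking care” of his wife, he didn’t understand' > pasted.txt

# One substitution per character: opening double, closing double, then both singles.
sed -e 's/“/``/g' \
    -e "s/”/''/g" \
    -e "s/‘/'/g" \
    -e "s/’/'/g" \
    pasted.txt > plain.txt

cat plain.txt
```

Each curly character gets its own expression rather than a bracket list, so the command doesn’t depend on sed understanding multi-byte characters in the current locale.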

While we are making edits, this document actually does not meet the recommendations in the manual for groff(7). Just like the manual recommends that we avoid adding blank lines to the file, it also recommends that we start each new sentence on its own line. This won’t actually change the output (again, it’s just cosmetic), but it might help us stay organized. So let’s make that edit too,

.lp
.(c
.b "BOOK ONE - The Romantic Egotist"
.)c
.
.sp 2
.
.lp
.b "CHAPTER 1."
Amory, Son of Beatrice
.
.lp
Amory Blaine inherited from his mother every trait, except the
stray inexpressible few, that made him worth while.
.
His father,
an ineffectual, inarticulate man with a taste for Byron and a
habit of drowsing over the Encyclopedia Britannica, grew wealthy
at thirty through the death of two elder brothers, successful
Chicago brokers, and in the first flush of feeling that the world
was his, went to Bar Harbor and met Beatrice O'Hara.
.
In consequence, Stephen Blaine handed down to posterity his height of
just under six feet and his tendency to waver at crucial moments, these
two abstractions appearing in his son Amory.
.
For many years he hovered in the background of his family's life, an
unassertive figure with a face half-obliterated by lifeless, silky hair,
continually occupied in ``taking care'' of his wife, continually harassed
by the idea that he didn't and couldn't understand her.
.pp

Okay, so far so good! Now, let’s add the next paragraph of text. I’ve already added the .pp macro in the code above in preparation. So let’s continue. I’ll go ahead and replace all the quotes, and put a line break between sentences, in advance this time.

.lp
.(c
.b "BOOK ONE - The Romantic Egotist"
.)c
.
.sp 2
.
.lp
.b "CHAPTER 1."
Amory, Son of Beatrice
.
.lp
Amory Blaine inherited from his mother every trait, except the
stray inexpressible few, that made him worth while.
.
His father,
an ineffectual, inarticulate man with a taste for Byron and a
habit of drowsing over the Encyclopedia Britannica, grew wealthy
at thirty through the death of two elder brothers, successful
Chicago brokers, and in the first flush of feeling that the world
was his, went to Bar Harbor and met Beatrice O'Hara.
.
In consequence, Stephen Blaine handed down to posterity his height of
just under six feet and his tendency to waver at crucial moments, these
two abstractions appearing in his son Amory.
.
For many years he hovered in the background of his family's life, an
unassertive figure with a face half-obliterated by lifeless, silky hair,
continually occupied in ``taking care'' of his wife, continually harassed
by the idea that he didn't and couldn't understand her.
.pp
But Beatrice Blaine!
.
There was a woman!
.
Early pictures taken on her father's estate at Lake Geneva, Wisconsin, or in
Rome at the Sacred Heart Convent--an educational extravagance that in her youth
was only for the daughters of the exceptionally wealthy--showed the exquisite
delicacy of her features, the consummate art and simplicity of her clothes.
.
A brilliant education she had--her youth passed in renaissance glory, she was
versed in the latest gossip of the Older Roman Families; known by name as a
fabulously wealthy American girl to Cardinal Vitori and Queen Margherita and
more subtle celebrities that one must have had some culture even to have heard
of.
.
She learned in England to prefer whiskey and soda to wine, and her small talk
was broadened in two senses during a winter in Vienna.
.
All in all Beatrice O'Hara absorbed the sort of education that will be quite
impossible ever again; a tutelage measured by the number of things and people
one could be contemptuous of and charming about; a culture rich in all arts and
traditions, barren of all ideas, in the last of those days when the great
gardener clipped the inferior roses to produce one perfect bud.
.pp

Conclusion

And there we have it! Obviously, we’ve barely scratched the surface of roff, and I intend to write a few more articles about it to further explore its features, but this should be enough to get you started! You now know enough roff to handle most non-technical writing tasks, like simple school papers. You also know how to find the manual pages for the different macro packages, which should give you most of what you need to know to use them.

Other References

While the man pages are fairly detailed for groff(1) and its component parts, they are still man pages, and may not be terribly approachable to somebody without any background. They are also a bit of a tangled mess, with information scattered across the man pages for several different programs, and thus tricky to navigate.

Unfortunately, time has not been kind to roff, and there aren’t a lot of resources that I’ve been able to track down on it outside of those man pages. However, here are a few other sources that I did track down that might be of interest.

  1. Hall, J. (2018). How to format academic papers on Linux with groff -me. Open Source. https://opensource.com/article/18/2/how-format-academic-papers-linux-groff-me

  2. Arora, H. (2012). Linux Groff Command Examples to Create Formatted Document. The Geek Stuff. https://www.thegeekstuff.com/2012/09/linux-groff-command-examples/

  3. Kernighan, B. W., & Pike, R. (1984). The UNIX Programming Environment. Prentice Hall.

  4. Shotts, W. (2019). The Linux Command Line (2nd ed.). No Starch Press.

The UNIX Programming Environment has a rather comprehensive coverage of several macro packages, as well as some of the preprocessors. It is dated, but most of the content is just as applicable today as it was in the ’80s.

The Linux Command Line covers many aspects of command line Linux, including groff(1). However, the coverage specifically on groff(1) is minimal; it feels like it was just added to “check a box”, so to speak. Don’t buy this book for its groff(1) content. That said, it is a pretty good book to have around if you’re just learning Linux.


Notes

  1. If you’re interested in more background information on roff, you can readily find it in the man page, of all places. Run

    % man roff

    and read over the first section or two.

  2. PostScript is a programming language created at Adobe for describing documents, with commands for drawing lines and such. Dr. Brailsford has a pretty good video on this language, published via Computerphile on YouTube, if you’re interested. For our purposes here, we don’t care about it beyond knowing it is the default output of groff.

  3. If you are interested in reading about a different package, they have man pages. Simply run,

    % man groff_*

    where * is replaced by the name of the macro package (for example, man groff_me), to access the documentation for that package.

  4. Fitzgerald, F. S. (1920). This Side of Paradise. Project Gutenberg. https://www.gutenberg.org/ebooks/805