An Introduction to roff
A surprisingly handy, old-school formatting system
roff, short for “run off”, is an old-school document formatting language[1] that is readily available for Linux systems in the form of GNU roff (groff). It is very old school, and can feel a bit clunky, but I find that it is a useful tool. It is used to format man pages, so that in and of itself is a good reason to learn a little about it. Beyond just that, though, it can be used for formatting books, papers, etc. You can render your document in a variety of different formats, including HTML, PostScript[2], and pdf.
As an old-school Unix utility, roff itself is implemented through several distinct programs that are called one after another in a pipeline to produce the desired output document. I’m going to address roff initially in terms of these individual programs and pipelining; however, we will transition to the more modern approach of using groff to do much of the work in one command.
The roff Pipeline
The bulk of the work in a roff system is done by one of two programs: troff and nroff. These programs accept roff code as input, and output a standardized intermediate format for further processing. In modern contexts, troff is used to create documents, and nroff is used to produce output suitable for display on a terminal (like a man page). We’ll focus on creating documents with troff, though content on man pages will follow.
A general pipeline for creating a document using troff is,
% cat input.tr | preprocessor | troff | postprocessor > output.ext
In this example,
input.tr is the source file describing the document.
Preprocessors are programs that handle more complex formatting tasks like
equations and tables, which are not actually part of the roff engine itself.
The common preprocessors are,
- eqn (for equations)
- grn (for pictures)
- pic (for simple block-diagrams [think mermaid])
- chem (for chemical diagrams)
- refer (for bibliographies and references)
- tbl (for tables)
We won’t be discussing these in this article, but information on them may come later. These preprocessors generate roff code, which is then converted into an intermediate output by the troff program. Finally, this intermediate output is fed into a postprocessor, to produce output of the desired format.
Getting Started with troff
troff itself is a very simple program to use. It will accept text on stdin and will write the intermediate output to stdout. For example, we could simply run the program like so,
% troff
hello, world!
^D
and we get the following as output,
x T ps
x res 72000 1 1
x init
p1
x font 5 TR
f5
s10000
V12000
H72000
md
DFd
thello,
wh2500
tw
H104120
torld!
n12000 0
x trailer
V792000
x stop
Of course, why manually type the data into standard input when you can use input redirection or piping to get it flowing? Conventionally, cat is used to initiate a pipeline, so let’s do that,[3]
% echo hello, world! > hello.tr
% cat hello.tr | troff
Postprocessing: Creating a pdf
Now, we could pretty easily save this output to a file using redirection, but if our purpose is to produce a document for distribution, this probably isn’t what we want to do. The roff intermediate output isn’t exactly suitable for end-user consumption! We’ll need to run our file through a postprocessor first, to create a readable document.
We’ll create a pdf file here, and so we will use the gropdf postprocessor. gropdf accepts troff output on stdin, and writes a pdf to stdout. If you run,
% cat hello.tr | troff | gropdf
you will get a viable pdf file; however, gropdf will also display the following error message,
Expecting a pdf pipe (got ps)
The problem is that we need to give troff a heads up that we intend to target our document to a pdf file. This is done using the -Tpdf argument to troff. If we don’t do this, troff will assume that we’re targeting PostScript (for printing), and so the file won’t be set up appropriately for gropdf to do its thing. Thus, our final command is,
% cat hello.tr | troff -Tpdf | gropdf > output.pdf
I’ve uploaded the output file from this command here, so you can see what the output should be. Opening it up in Evince yields,
Admittedly, it isn’t the most impressive of documents. But it is a start!
Skipping the Pipelines with groff
One common complaint about the roff system is how long the processing pipelines can get. It’s already a fair bit of typing to make our pdf above. Now imagine if you wanted equations, pictures, and tables in your document too. That’s three more commands we need to add to the pipeline!
As a result, the modern convention is to bypass all this pipelining by using the groff front-end to roff. This program lets us specify options telling it to run certain preprocessors or postprocessors, thereby allowing us to skip writing the long pipelines.
To create our output pdf using groff, we need only execute the following command,
% groff -Tpdf hello.tr > output.pdf
When the -T option is specified, groff will automatically call the relevant postprocessor, so we don’t need to! If no -T option is provided, groff will default to PostScript output.
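The same front-end options cover the preprocessors. As a rough sketch (the file name table.tr and its contents are made up for illustration), the -t option asks groff to run tbl before troff, so the two commands inside the `if` block should be roughly equivalent ways of producing a pdf from a file that contains a table:

```shell
# Hypothetical input: a tiny table in tbl syntax (.TS starts it, .TE ends it).
cat > table.tr <<'EOF'
.TS
tab(:);
l l.
program:purpose
tbl:tables
eqn:equations
.TE
EOF

# Only attempt the render when the tools are actually installed.
if command -v groff >/dev/null 2>&1 && command -v gropdf >/dev/null 2>&1; then
    # Manual pipeline: preprocessor, then troff, then the postprocessor.
    tbl table.tr | troff -Tpdf | gropdf > table-pipeline.pdf || echo "pipeline failed" >&2
    # Front-end: -t inserts tbl, -Tpdf selects the postprocessor.
    groff -t -Tpdf table.tr > table-groff.pdf || echo "groff failed" >&2
fi
```

Similar single-letter options exist for other preprocessors, for example -e for eqn and -p for pic; the groff man page lists them.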
In recent groff references, I have seen a different process listed for creating a pdf: using the ps2pdf command. This command takes PostScript input and produces a pdf from it. By taking advantage of the fact that groff defaults to producing PostScript, a pdf can be created using,
% groff hello.tr | ps2pdf - output.pdf
I’m not entirely sure why these authors use this approach, rather than roff’s postprocessor for creating a pdf directly, but I want to mention it here as you are sure to encounter it in your searches.
Requests and Macros
roff code consists of a sequence of lines, which can be classified as either text lines or control lines. A text line, unsurprisingly, contains text that should appear in the document. A control line represents a command, and will always start with either a . or a ' , followed by a command and any arguments that it might have. These control lines are what are used to format the document.
The standard format of a control line is,
.command_name arg1 arg2
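To make the distinction concrete, here is a small sketch (the file name lines.tr is arbitrary). It uses .br, the request that forces a line break, as its only control line; everything else is a text line.

```shell
# Write a three-line roff file: two text lines around one control line.
# (lines.tr is a placeholder name chosen for this example.)
cat > lines.tr <<'EOF'
This is a text line, so it shows up in the document.
.br
This is another text line, and the control line above starts it on a new output line.
EOF

# Count the control lines: the ones that begin with a dot or an apostrophe.
grep -c "^[.']" lines.tr
```

Without the .br line, roff would fill both sentences into one continuous paragraph, since a newline in the input is not by itself a line break in the output.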
The most common kind of command is a request, which is a low-level formatting or control directive. roff requests allow you to do a variety of things, such as controlling the kerning, adding space, coloring glyphs, etc.
Generally speaking, these requests are too low level to be useful to somebody just trying to write a paper for their English class (sure, you can do that in roff, why not?), and so roff also includes macros. These are a bit like functions, and allow you to do simple things like “make that bold” or “indent this line” or “center that text”, without having to worry about super low-level details of typesetting.
Technically, you could write these yourself out of roff requests. And if you’d like to give it a shot, the groff man page does include a reference for all the supported requests (make sure to run man 7 groff to get the right document), but groff also comes with a number of macro packages that you can use. Others are available too, such as mom.
While each is a little different, many share some common elements when it comes to basic formatting. I’ll use the me package for the rest of this article.[4] To load the package, pass the -me argument to groff.
As a first application of formatting, let’s modify our hello.tr file to include bold and italics. We’ll render “hello,” in bold, and “world!” in italics. The macros for bold and italics are, unsurprisingly, .b and .i respectively. So, we have,
% cat formatted.me
.b hello,
.i world!
% groff -me -Tpdf formatted.me > formatted.pdf
To center the text, we can use the .(c and .)c macros. These should surround the text to be centered, as below,
% cat centered.me
.(c
.b hello,
.i world!
.)c
% groff -me -Tpdf centered.me > centered.pdf
A listing of many common me macros can be found by reading the groff_me man page. I’ll introduce a few more as we go, but you should look there for a more complete list.
Formatting a Simple Paper
Applying some simple formatting directives to a couple of words is a good start, but applying it to a larger paper is a bit more complex. So let’s give that a shot. Let’s grab the first bit of text from This Side of Paradise, courtesy of Project Gutenberg[5]. In raw form, we have
BOOK ONE—The Romantic Egotist
CHAPTER 1. Amory, Son of Beatrice
Amory Blaine inherited from his mother every trait, except the stray inexpressible few, that made him worth while. His father, an ineffectual, inarticulate man with a taste for Byron and a habit of drowsing over the Encyclopedia Britannica, grew wealthy at thirty through the death of two elder brothers, successful Chicago brokers, and in the first flush of feeling that the world was his, went to Bar Harbor and met Beatrice O’Hara. In consequence, Stephen Blaine handed down to posterity his height of just under six feet and his tendency to waver at crucial moments, these two abstractions appearing in his son Amory. For many years he hovered in the background of his family’s life, an unassertive figure with a face half-obliterated by lifeless, silky hair, continually occupied in “taking care” of his wife, continually harassed by the idea that he didn’t and couldn’t understand her. But Beatrice Blaine! There was a woman! Early pictures taken on her father’s estate at Lake Geneva, Wisconsin, or in Rome at the Sacred Heart Convent—an educational extravagance that in her youth was only for the daughters of the exceptionally wealthy—showed the exquisite delicacy of her features, the consummate art and simplicity of her clothes. A brilliant education she had—her youth passed in renaissance glory, she was versed in the latest gossip of the Older Roman Families; known by name as a fabulously wealthy American girl to Cardinal Vitori and Queen Margherita and more subtle celebrities that one must have had some culture even to have heard of. She learned in England to prefer whiskey and soda to wine, and her small talk was broadened in two senses during a winter in Vienna.
All in all Beatrice O’Hara absorbed the sort of education that will be quite impossible ever again; a tutelage measured by the number of things and people one could be contemptuous of and charming about; a culture rich in all arts and traditions, barren of all ideas, in the last of those days when the great gardener clipped the inferior roses to produce one perfect bud.
Let’s start by bolding and centering the BOOK ONE line.
.(c
.b BOOK ONE - The Romantic Egotist
.)c
The above roff seems reasonable. However, when we run it and take a look we see something rather unexpected,
The text got centered, but of all the words we typed only two of them are present, and of those only the first is bold! What gives?
Well, it has to do with the way that arguments to requests/macros work. Just like arguments on the command line, arguments to requests are separated by spaces. The bold macro makes its first argument bold and appends the second to it (not bold). So, the output that we got should make sense. The other four arguments to the macro were ignored.
Just like on the shell, if we want to have spaces in an argument, we’ll need to wrap the whole thing in quotes. So this,
.(c
.b "BOOK ONE - The Romantic Egotist"
.)c
is actually what we want.
Bearing this lesson in mind, let’s place the chapter title next. We’ll bold the words CHAPTER 1, but leave the name in normal face.
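One way this might look, as a sketch rather than the exact file used here: the second argument to .b carries the chapter name, so it is appended in the normal face. The file name paper.me and the choice to give the chapter title its own centered block are assumptions.

```shell
# Sketch of the input so far (paper.me is a placeholder name).
# The second .b argument is appended right after the bold text in the
# previous face; the leading space inside its quotes keeps the words apart.
cat > paper.me <<'EOF'
.(c
.b "BOOK ONE - The Romantic Egotist"
.)c
.(c
.b "CHAPTER 1." " Amory, Son of Beatrice"
.)c
EOF
```

Rendering works the same way as before: groff -me -Tpdf paper.me > paper.pdf.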
Notice that roff automatically did a line break after we ended the centering on line 3. Generally, roff will handle line breaks, paragraphs, etc., for us, as long as we tell it where to put them (we’ll see this in just a moment). In fact, the manual advises against leaving any blank lines in your input file.
That will result in a rather ugly and hard to read file, though. If you’ve programmed before, you know that spacing things out is quite useful. So there is a way to do this. Simply start the line with a ., and don’t give a command. So we can space our book and chapter titles out a little bit like this,
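A sketch of that spacing, assuming a file named paper.me in which the book title and the chapter title each sit in their own centered block (both the name and the layout are assumptions):

```shell
# The lone dot on line 4 is an empty control line: roff ignores it, so it
# only separates the two blocks visually in the source.
cat > paper.me <<'EOF'
.(c
.b "BOOK ONE - The Romantic Egotist"
.)c
.
.(c
.b "CHAPTER 1." " Amory, Son of Beatrice"
.)c
EOF
```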
This change is purely cosmetic. It won’t affect the actual output file.
What if we actually wanted to add some blank space in our output file? Those two lines are awfully close together, and it doesn’t look all that great. For this we will use a roff request, .sp. This isn’t a macro, but just looking at the file, you’d never be able to tell the difference.
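As a sketch, again assuming a paper.me with the two titles in separate centered blocks, the request goes between them:

```shell
# .sp 2 asks for two blank lines of vertical space in the output.
# The lone dots around it are still just cosmetic source spacing.
cat > paper.me <<'EOF'
.(c
.b "BOOK ONE - The Romantic Egotist"
.)c
.
.sp 2
.
.(c
.b "CHAPTER 1." " Amory, Son of Beatrice"
.)c
EOF
```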
This request accepts a numerical argument that states how many lines to skip, so here we are adding two blank lines between the book and chapter title. However, although this works, it does go against the manual for groff_me, which states that the .sp request can only be safely used with me after the first call to .pp (the paragraph macro).
The trouble is that a .pp will result in the title getting indented, which we don’t want. However there is also an .lp macro, which creates a paragraph with no indent. The documentation doesn’t explicitly say that this is safe, but if things work the way I think they do, it should be just as good. Just be aware that this is a “your mileage may vary” moment. I’ve never had it cause an issue, but we are going against the direct advice of the manual here.
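Under that assumption, a sketch of the adjusted file might look like the following. The exact placement of the .lp lines is a guess; the only firm requirement from the manual is that a paragraph macro comes before the first .sp.

```shell
# Assumed layout: an .lp ahead of each title, so that the .sp request
# only appears after a paragraph macro has already run.
cat > paper.me <<'EOF'
.lp
.(c
.b "BOOK ONE - The Romantic Egotist"
.)c
.
.sp 2
.
.lp
.(c
.b "CHAPTER 1." " Amory, Son of Beatrice"
.)c
EOF
```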
While I was at it, I threw another .lp before the chapter title, just to be explicit. If you haven’t gathered already, the .lp and .pp macros are what we will be using to add paragraph breaks in general. They do add a small amount of extra blank space as well. If it isn’t obvious from this example of their use, you’ll be able to see it once we start adding extra paragraphs.
Okay, let’s bring in the first paragraph of text. We’ll use the .lp macro to start this one off, as this will be the first paragraph in the chapter, and I prefer to leave this unindented. The .pp macro will work much the same, except it will add an indent. We’ll see this one in action for the second paragraph.
A cursory glance would seem to indicate that this all looks okay; however, if you look carefully, you’ll see that there are a few issues. For example, we see the word didnât, instead of didn’t.
This is a really common problem when dealing with systems like this when you copy and paste text into them, instead of typing it. If you look closely, all of the issues appear where there is either a single quote or a double quote in the input. It’s a simple encoding problem.
When you press the quotation mark key on your keyboard, it produces a straight character (' or "); however, if you look closely, you’ll see that in the input we actually have curly characters (’, “, and ”).
The difference is subtle, but it is enough to confuse roff. Let’s change all the “fancy” quotation marks for normal ones and try again.
And that one worked. In fact, if you look at it, all of the single quotes are rendered as the curly ones anyway. Groff will automatically handle translating from the standard “straight” single quotes into curly ones. However, the double quotes are still straight and (some might say) boring.
Luckily, we can fix this too. If we replace the first quotation mark with `` and the second with '' (that is, two backticks for the first and two single quotes for the second), groff will give us the curly quotes there too! It just needs to know which one is the opening quote (noted by the backticks) and which is the closing one (noted by the single quotes), so it knows what the quotes should look like.
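If there is a lot of pasted text, the substitutions can be scripted. The following is a minimal sketch using sed; the function name fix_quotes is made up here, and mapping a curly opening single quote to a backtick is an extra assumption that follows the same opening/closing logic.

```shell
# Hypothetical helper (not part of groff): rewrite pasted curly quotes into
# the plain-ASCII forms that groff turns back into curly quotes itself.
fix_quotes() {
    sed -e 's/‘/`/g' \
        -e "s/’/'/g" \
        -e 's/“/``/g' \
        -e "s/”/''/g"
}

# Example: filter one pasted line and print the cleaned-up version.
printf '%s\n' 'occupied in “taking care” of his wife, he didn’t understand' | fix_quotes
```

To clean a whole file, the same function can be used as a filter, for example fix_quotes < pasted.txt > clean.txt (both file names are placeholders).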
While we are making edits, this document actually does not meet the recommendations in the manual for groff. Just like the manual recommends that we avoid adding blank lines to the file, it also recommends that we start each new sentence on its own line. This won’t actually change the output (again, it’s just cosmetic), but it might help us stay organized. So let’s make that edit too,
Okay, so far so good! Now, let’s add the next paragraph of text. I’ve already added the .pp macro in the code above in preparation. So let’s continue. I’ll go ahead and replace all the quotes, and put a line break between sentences, in advance this time.
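Putting the pieces together, the file might now look something like the sketch below. The title layout and the file name paper.me are assumptions, and the first paragraph is taken to end at "couldn't understand her."; the text itself is the passage quoted earlier, with straight quotes and one sentence per line.

```shell
# Sketch of paper.me at this stage (layout and file name are assumptions).
# Every sentence of the first paragraph starts on its own input line.
cat > paper.me <<'EOF'
.lp
.(c
.b "BOOK ONE - The Romantic Egotist"
.)c
.
.sp 2
.
.lp
.(c
.b "CHAPTER 1." " Amory, Son of Beatrice"
.)c
.
.lp
Amory Blaine inherited from his mother every trait, except the stray inexpressible few, that made him worth while.
His father, an ineffectual, inarticulate man with a taste for Byron and a habit of drowsing over the Encyclopedia Britannica, grew wealthy at thirty through the death of two elder brothers, successful Chicago brokers, and in the first flush of feeling that the world was his, went to Bar Harbor and met Beatrice O'Hara.
In consequence, Stephen Blaine handed down to posterity his height of just under six feet and his tendency to waver at crucial moments, these two abstractions appearing in his son Amory.
For many years he hovered in the background of his family's life, an unassertive figure with a face half-obliterated by lifeless, silky hair, continually occupied in ``taking care'' of his wife, continually harassed by the idea that he didn't and couldn't understand her.
.
.pp
EOF
```

The trailing .pp is where the next, indented paragraph would begin.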
And there we have it! Obviously, we’ve barely scratched the surface of roff, and I intend to write a few more articles about it to further explore its features, but this should be enough to get you started! You now know enough roff to handle most non-technical writing tasks, like simple school papers. In the next article on this topic, we’ll expand that idea, and develop an APA formatted paper using nothing but groff, so stay tuned for that if you’re interested!
While the man pages are fairly detailed for groff and its component parts, they are still man pages, and may not be terribly approachable to somebody without any background. They are also a bit of a tangled mess, with information scattered across the man pages for several different programs, and thus tricky to navigate.
Unfortunately, time has not been kind to roff, and there aren’t a lot of resources that I’ve been able to track down on it outside of those man pages. However, here are a few other sources that I did track down that might be of interest.
Hall, J. (2018). How to format academic papers on Linux with groff -me. Open Source. https://opensource.com/article/18/2/how-format-academic-papers-linux-groff-me
Arora, H. (2012). Linux Groff Command Examples to Create Formatted Document. The Geek Stuff. https://www.thegeekstuff.com/2012/09/linux-groff-command-examples/
Kernighan, B. W., & Pike, R. (1984). The UNIX Programming Environment. Prentice Hall.
Shotts, W. (2019). The Linux Command Line (2nd ed.). No Starch Press.
The UNIX Programming Environment has rather comprehensive coverage of several macro packages, as well as some of the preprocessors. It is dated, but most of the content is just as applicable today as it was in the ’80s.
The Linux Command Line covers many aspects of command line Linux, including groff. However, the coverage specifically on groff is minimal; it feels like it was just added to “check a box”, so to speak. Don’t buy this book for its groff content. That said, it is a pretty good book to have around if you’re just learning Linux.
[1] If you’re interested in more background information on roff, you can readily find it in the man page, of all places. Run
% man roff
and read over the first section or two.
[2] PostScript is a programming language created at Adobe for describing documents, with commands for drawing lines and such. Dr. Brailsford has a pretty good video on this language, published via Computerphile on YouTube, if you’re interested. For our purposes here, we don’t care about it beyond knowing it is the output of groff.
[3] Or, if you’re boring, you can provide the filename directly as an argument to troff and get the same effect,
% troff hello.tr
[4] If you are interested in reading about a different package, they have man pages. Simply run,
% man groff_*
where * is replaced by the name of the macro package, to access the documentation for that package.