Text processing vs word processors

When you use a word processor, your screen persistently displays an updated image of the finished document. Word for word, line for line, What You See Is What You Get.

WYSIWYG eases the creation of simple documents by combining two related but discrete activities: writing and formatting. Longer, more complex documents reveal the disadvantages: awkward, menu-driven formatting; typography that’s cumbersome to tweak; defaults that can’t be changed; repetitive edits required for document-wide style changes; poor document and page navigation, and so on.

Unfortunately, word processors are so pervasive that many people aren’t aware of the alternative for preparing documents: text processing.

Text processing: the right tool

In keeping with the computing maxims “the right tool for the job” and “small tools that do one thing and do it well,” text processing keeps text and processing separate. Documents are written in a text editor, processed by a formatting program and previewed, if necessary, in an application like Okular or Evince.

There are a number of advantages to text processing. First and foremost is that you get to use a text editor for the writing itself. Mature text editors are powerful programs whose function is to ease the manipulation of and navigation through text at all stages of the writing process, whether the final result will be a piece of software or a novel. To this end, they provide sophisticated editing tools far beyond the basic search-and-replace, cut-and-paste of most word processors.

Focus on what really matters: the words

Another advantage of text processing is that it lets you focus exclusively on content during the writing phase of document preparation. Formatting commands (macros) can be added afterwards. What’s more, if you take the time to learn the sed program that’s part of every flavour of Unix and every GNU/Linux distribution, you don’t have to introduce formatting commands into your original document at all, since sed can be made to do the work.
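To make the idea concrete, here is a minimal sketch of adding formatting commands after the fact. It is written in Python rather than sed, purely for illustration, and it assumes that paragraphs in the original file are separated by blank lines and that each one should be introduced by mom's .PP paragraph macro. The function name is hypothetical.

```python
def add_paragraph_macros(plain_text: str) -> str:
    """Return the text with a .PP line in front of every paragraph.

    Assumes paragraphs are separated by one or more blank lines.
    """
    # Split on blank lines and drop empty chunks left by extra spacing.
    paragraphs = [p.strip() for p in plain_text.split("\n\n") if p.strip()]
    # Put the macro on its own line ahead of each paragraph.
    return "\n".join(".PP\n" + p for p in paragraphs) + "\n"


if __name__ == "__main__":
    draft = "The words come first.\n\nThe formatting comes later.\n"
    print(add_paragraph_macros(draft), end="")
```

The original draft stays untouched; the formatted version is generated from it whenever it is needed.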

Global changes

(sorry, no effect on warming)

One of the key reasons to prefer text processing over word processors is that if you need to make global changes to the formatting, it can be done without repetitive edits. Two mechanisms make this possible.

First, your source file is plain text, including the macros used to format it. Thus, your text editor’s search-and-replace function can be used to make formatting changes.

Imagine this situation: you want to change all instances of italic to bold. With a word processor, you’d have to find each occurrence, highlight it, and change the italics to bold one occurrence at a time. Editing a plain text file, one search-and-replace command is all it takes. (And since most modern text editors support regular expressions in search-and-replace, you can even change just some of the italics and leave others intact.) Depending on the size of the document, the time saved could be anywhere from a few minutes to a few hours.
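The italic-to-bold scenario can be sketched in a few lines. The example below uses Python's re module in place of an editor's search-and-replace, and assumes the source marks fonts with mom-style inline escapes (\*[IT] for italic, \*[BD] for bold, \*[PREV] to return to the previous font). The sample sentence and the rule for which italics count as "emphasis" are made up for illustration.

```python
import re

# Hypothetical source line with two italic passages.
source = r"She read \*[IT]Middlemarch\*[PREV] twice, and \*[IT]never\*[PREV] regretted it."

# Global change: every italic becomes bold in one replace.
all_bold = source.replace(r"\*[IT]", r"\*[BD]")

# Selective change: only italicised single lowercase words (emphasis)
# become bold; capitalised titles stay italic.
emphasis = re.compile(r"\\\*\[IT\]([a-z]+)\\\*\[PREV\]")
some_bold = emphasis.sub(
    lambda m: r"\*[BD]" + m.group(1) + r"\*[PREV]",
    source,
)

if __name__ == "__main__":
    print(all_bold)
    print(some_bold)
```

In the second result the book title keeps its italics while the emphasised word turns bold, which is the kind of partial change that is tedious to do by hand in a word processor.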

The second efficient mechanism for making global changes is that with a macro set like mom, you can define the style for every element of a document separately from the element itself. In other words, create a stylesheet. Changes to a document’s design can thus be done in one place and are applied throughout the whole document automatically. The flexibility this offers with respect to document design far exceeds what can be accomplished with a word processor.
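The principle behind a stylesheet can be shown with a toy example. This Python sketch is not mom syntax and does not reflect how mom works internally; the element names, style keys and render function are all hypothetical. It only illustrates the separation: the document says what each element is, and a single table says how each kind of element looks.

```python
# Hypothetical stylesheet: one entry per kind of document element.
STYLE = {
    "heading": {"upper": True, "indent": 0},
    "paragraph": {"upper": False, "indent": 4},
}

# The document names its elements but carries no formatting of its own.
DOCUMENT = [
    ("heading", "Global changes"),
    ("paragraph", "Your source file is plain text."),
]


def render(document, style):
    """Apply the style rules for each element and return the formatted text."""
    lines = []
    for element, text in document:
        rules = style[element]
        if rules["upper"]:
            text = text.upper()
        lines.append(" " * rules["indent"] + text)
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(DOCUMENT, STYLE))
```

Changing the indent for "paragraph" in STYLE changes every paragraph in the output at once, with no edits to the document itself.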

Peace of mind

Underlying the text processing model is the notion of a single source file from which differently formatted versions can be generated.

As a novelist, I frequently need updated copies of my writing available in different styles: typewritten, double-spaced (for submissions); plain text (to include in emails); typeset hardcopies (for critiquing); PDF versions (specially formatted for reading at a monitor); press-ready files with crop and register marks.

My work is always in a state of flux, undergoing countless revisions. If I used a word processor, every time I made a change I’d have to make it in a batch of files in order to keep the differently formatted versions in synch. With text processing, I need only ever make the changes in one file. The peace of mind this brings is even more valuable than the time it saves.
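A small sketch of the single-source model, in Python and under simplifying assumptions: the source is a list of paragraphs, and the two output functions (both hypothetical) stand in for what a formatting program would do far more thoroughly. Real typewritten or typeset output would come from the formatter, not from code like this.

```python
import textwrap

# Hypothetical single source: one paragraph per list item.
SOURCE = [
    "My work is always in a state of flux.",
    "With text processing, I need only ever make the changes in one file.",
]


def as_email(paragraphs):
    """Plain text version: paragraphs separated by one blank line."""
    return "\n\n".join(paragraphs) + "\n"


def as_submission(paragraphs, width=40):
    """Typewritten look: wrapped lines, double-spaced, indented first lines."""
    lines = []
    for paragraph in paragraphs:
        lines.extend(textwrap.wrap(paragraph, width=width, initial_indent="     "))
    # A blank line between every output line gives double spacing.
    return "\n\n".join(lines) + "\n"


if __name__ == "__main__":
    print(as_email(SOURCE))
    print(as_submission(SOURCE))
```

A revision is made once, in SOURCE, and every version regenerated from it picks up the change.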

See here for examples of multiple formatted versions produced from the same source file.

Do you really want to risk your Pulitzer hopeful?

The most compelling advantage of text processing over word processors is the assured readability of a document’s content—the words themselves—in perpetuity, on all computing platforms.

Almost everyone who’s used word processors for any length of time knows the horror of discovering that a document they created a couple of years ago in Microsoft Word won’t open in the current version, or that something they wrote in WordPerfect won’t open in OpenOffice.

The cause of the problem is that most word processors save their files in formats that are described as binary and proprietary.

Binary means the file is stored as encoded data that only the right software can interpret, rather than as readable text, with the content and the appearance of the document all jumbled together.

Proprietary means that someone actually owns the coding-decoding routines, and unless those routines are released to the public, you’re locked into using the owner’s software (sometimes only former versions of the software) to open files reliably—and then only on supported platforms.

Putting this in perspective: the first thirty chapters of the Pulitzer hopeful you started in Microsoft Word under Windows 98 may very well be irretrievable now, even if you’re still running Windows and using Microsoft Word.

Losing the formatting of your work would be an annoyance; losing the content would be a tragedy.

Since text processing begins with plain text files, the content of your work never gets messed up in unreadable code. Your words, as you wrote them, remain inviolable. And since plain text is impervious to the passage of time, platform considerations and the predations of software vendors, the content of your documents—the part that really matters—will never be lost.