Unexpectedly Simple

This is a story of the pursuit of user experience simplicity, confounded by delusions and over-engineering. It's also about text formatters.

The first computer text formatter, RUNOFF, was written in 1964 in assembly language for the CTSS operating system. If you've never used RUNOFF or one its descendants like the UNIX utility troff, imagine HTML where each tag appears on a line by itself and is identified with a leading period. Want to embolden a word in the middle of a sentence? That's one line to turn bold on, one line for the word, then a third line to turn bold off. This led to elongated documents where the formatter commands gave little visual indication of what the final output would look like.

The RUNOFF command style, of the first character on a line indicating a formatting instruction, carried over to early word processors like WordStar (first released in 1978). But in WordStar that scheme was only used for general settings like double-spacing. You could use control codes mid-line for bold, italics, and underline. This word is in bold: ^Bword^B. (And to be fair you could do this in later versions of troff, too, but it was even more awkward.)

WordPerfect 4.2 for MS-DOS (1986), hid the formatting instructions so the text looked clean, but they could be displayed with a "reveal codes" toggle. I clearly remember thinking this was a terrible system, having to lift the curtain and manually fiddle with command codes. After all, MacWrite and the preceding Bravo for the Xerox Star had already shown that WYSIWYG editing was possible, and clearly it was the simplest possible solution for the user. But I was wrong.

WYSIWYG had drawbacks that weren't apparent until you dove in and worked with it, rather than writing a sentence about unmotivated canines on MacWrite at the local computer shop and trying to justify a $2495 purchase. If you position the cursor at the end of an italicized word and start typing, will the new characters be in italicized or not? It depends. They might not even be in the same font. If you paste a paragraph, and all of a sudden there's excess trailing space below it even though there isn't a carriage return in the text, how do you remove it?

More fundamentally, low-level presentation issues--font families, font sizes, boldness, italics, colors--were now intermingled with the text itself. You don't want to manually change the formatting of all the text serving as section headers; you want them each to be formatted the way a section header should be formatted. That's fixable by adding another layer, one of user-defined paragraph styles. Now there's more to learn, and some of the simplicity of WYSIWYG is lost.

Let's back up a bit to the initial problem of RUNOFF: the marked-up text bears little resemblance to the structure of the formatted output. What if instead of drawing attention to a rigid command structure, the goal is to make it invisible. Instead of .PP on its own line to indicate a paragraph, assume all text is in paragraphs separated by blank lines. An asterisk as the first character means a line is an element of an unordered list.

The MediaWiki markup language takes some steps in this direction. The REBOL MakeDoc tool goes further. John Gruber's Markdown is perhaps the cleanest and most complete system for translating visually formatted text to HTML. (Had I known about Markdown, I wouldn't have developed the minimal mark-up notation I use for the articles on this site.)

That's my incomplete and non-chronological history of text formatters, through ugly and over-engineered to a previously overlooked simplicity. You might say "What about SGML and HTML? What about TeX?" which I'll pretend I didn't hear, and say that the real question is "What other application types have grown convoluted and are there unexpectedly simple solutions that are being ignored?"

(If you liked this, you might enjoy Documenting the Undocumentable.)