You Don't Read Code, You Explore It

(I wrote this in 2012 and rediscovered it in January of this year. I didn't feel comfortable posting it so close to Peter Seibel's excellent Code is Not Literature, so I held off for a few months.)

I used to study the program listings in magazines like Dr. Dobb's, back when they printed the source code to substantial programs. While I learned a few isolated tricks and techniques, I never felt like I was able to comprehend the entirety of how the code worked, even after putting in significant effort.

It wasn't anything like sitting down and reading a book for enjoyment; it took work. I marked up the listings and kept notes as I went. I re-read sections multiple times, uncovering missed details. But it was easy to build-up incorrect assumptions in my head, and without any way of proving them right or wrong I'd keep seeing what I wanted to instead of the true purpose of one particular section. Even if the code was readable in the software engineering sense, boundary cases and implicit knowledge lived between the lines. I'd understand 90% of this function and 90% of that function and all those extra ten percents would keep accumulating until I was fooling myself if I thought I had the true meaning in my grasp.

That experience made me realize that read isn't a good verb to apply to a program.

It's fine for hunting down particular details ("let's see how many buffers are allocated when a file is loaded"), but not for understanding the architecture and flow of a non-trivial code base.

I've worked through tutorials in the J language--called "labs" in the J world--where the material would have been opaque and frustrating had it not been interactive. The presentation style was unnervingly minimal: here's a concept with some sentences of high-level explanation, and here are some lines of code that demonstrate it. Through experimentation and trial and error, and simply because I typed new statements myself, I learned about the topic at hand.

Of particular note are Ken Iverson's interactive texts on what sound like dry, mathematical subjects, but they take on new life when presented in exploratory snippets. That's even though they are reliant on J, the most mind-melting and nothing-at-all-like-C language in existence.

I think that's the only way to truly understand arbitrary source code. To load it up, to experiment, to interactively see how weird cases are handled, then keep expanding that knowledge until it encompasses the entire program. I know, that's harder to do with C++ than with Erlang and Haskell (and more specifically, it's harder to do with languages where functions can have wide-ranging side effects that can change the state of the system in hidden ways), and that's part of why interactive, mostly-functional languages can be more pleasant than C++ or Java.

(If you liked this, you might enjoy, Don't Be Distracted by Superior Technology.)