Micro-Build Systems and the Death of a Prominent DSL

Normally I don't think about how to rebuild an Erlang project. I just compile a file after editing it--via the c(Filename) shell command--and that's that. With hot code loading there no need for a linking step. Occasionally, such as after upgrading to a new Erlang version, I do this:

erlc *.erl

which compiles all the .erl files in the current directory.

But wait a minute. What about checking if the corresponding .beam file has a more recent date than the source and skipping the compilation step for that file? Surely that's gong to be a performance win? Here's the result of fully compiling a mid-sized project consisting of fifteen files:

$ time erlc *.erl

real    0m1.912s
user    0m0.945s
sys     0m0.108s

That's less than two seconds to rebuild everything. (Immediately rebuilding again takes less than one second, showing that disk I/O is a major factor.)

Performance is clearly not an issue. Not yet anyway. Mid-sized projects have a way of growing into large-sized projects, and those 15 files could one day be 50. Hmmm...linearly interpolating based on the current project size still gives a time of under six-and-a half-seconds, so no need to panic. But projects get more complex in other ways: custom tools written in different languages, dynamically loaded drivers, data files that need to be preprocessed, Erlang modules generated from data, source code in multiple directories.

A good start is to move the basic compilation step into pure Erlang:

erlang_files() -> [
   "util.erl",
   "http.erl",
   "sandwich.erl",
   "optimizer.erl"
].

build() ->
   c:lc(erlang_files()).

where c:lc() is the Erlang shell function for compiling a list of files.

If you stop and think, this first step is actually a huge step. We've now got a symbolic representation of the project in a form that can be manipulated by Erlang code. erlang_files() could be replaced by searching through the current directory for all files with an .erl extension. We could even do things like skip all files with _old preceding the extension, such as util_old.erl. And all of this is trivially, almost mindlessly, easy.

There's a handful of things that traditional build systems do. They call shell commands. They manipulate filenames. They compare dates. The fancy ones go through source files and look for included files. These things are a small subset of what you can do in Perl, Ruby, Python, or Erlang. So why not do them in Perl, Ruby, Python, or Erlang?

I'm pretty sure there's a standard old build system to do this kind of thing, but in a clunky way where you have to be careful whether you use spaces or tabs, remember arcane bits of syntax, remember what rules and macros are built-in, remember tricks involved in building nested projects, remember the differences between the versions that have gone down their own evolutionary paths. I use it rarely enough that I forget all of these details. There are modern variants, too, that trade all of that 1970s-era fiddling for different lists of things to remember. But there's no need.

It's easier and faster to put together custom, micro-build systems in the high-level language of your choice.

permalink September 27, 2009

Micro-Build Systems and the Death of a Prominent DSL

previously

archives