What to do About Erlang's Records?

The second most common complaint about Erlang, right after confusion about commas and semicolons as separators, is about records. Gotta give those complainers some credit, because they've got taste. Statically defined records are out of place in a highly dynamic language. There have been various proposals over the years, including Richard O'Keefe's abstract syntax patterns and Joe Armstrong's structs. Getting one of those implemented needs the solid support of the Erlang system maintainers, and it's understandably difficult to commit to such a sweeping change to the language. So what are the alternatives to records that can be used right now?

To clarify, I'm really talking about smallish, purely functional dictionaries. For large amounts of data there's already the gb_trees module, plus several others with similar purposes.

In Python, a technique I often use is to return a small dictionary with a couple of named values in it. I could use a tuple, but a dictionary removes the need to worry about order. This is straightforward in Erlang, too:

fun(length) -> 46;
   (width)  -> 17;
   (color)  -> sea_green
end.

Getting the value corresponding to a key is easy enough:

Result(color)

This is handy, but only in certain situations. One shortcoming is that there's no way to iterate through the keys. Well, there's this idea:

fun(keys)   -> [length, width, color];
   (length) -> 46;
   (width)  -> 17;
   (color)  -> sea_green
end.

Now there's a way to get a list of keys, but there's room for error: each key appears twice in the code. The second issue is there's no simple way to take one dictionary and create a new one with a value added or removed. This road is becoming messy to go down, so here's more data-driven representation:

[{length, 46}, {width, 17}, {color, sea_green}]

That's just a list of key/value pairs, which is searchable via the fast, written-in-C function lists:keyfind. New values can be appended to the head of the list, and there are other functions in the lists module for deleting and replacing values. Iteration is also easy: it's just a list.

We still haven't bettered records in all ways. A big win for records, and this is something few purely functional data structures handle well, is the ability to create a new version where multiple keys get different values. For example, start with the above list and create this:

[{length, 200}, {width, 1400}, {color, sea_green}]

If we knew that only those three keys were allowed, fine, but that's cheating. The whole point of dictionaries is that we can put all sorts of stuff in there, and it doesn't change how the dictionary is manipulated. The general solution is to delete all the keys that should have new values, then insert the new key/value pairs at the head of the list. Or step through the list and see if the current key is one that has a new value and replace it. These are not linear algorithms, unfortunately. And you've got the same problem if you want to change multiple values in a gb_tree at the same time.

What I've been using, and I admit that this isn't perfect, is the key/value list approach, but forcing the lists to be sorted.This allows the original list and a list of changes to be merged together in linear time. The downside is that I have to remember to keep a literal list in sorted order (or write a parse transform to do this for me).

There's still one more feature of records that can't be emulated: extracting / comparing values using Erlang's standard pattern matching capabilities. It's not a terrible omission, but there's no way to dodge this one: it needs compiler and runtime system support.

permalink January 30, 2010

What to do About Erlang's Records?

previously

archives