Let's Take a Trivial Problem and Make it Hard

Here's a simple problem: Given a block of binary data, count the frequency of the bytes within it. In C, this could be a homework assignment for an introductory class. Just zero out an array of 256 elements, then for each byte increment the appropriate array index. Easy.

Now write this in a purely functional way, with an efficiency close to that of the C implementation.

It's easy to do a straightforward translation to Erlang, using tail recursion instead of a for loop, like this:

freq(B) when is_binary(B) ->
   freq(B, erlang:make_tuple(256, 0)).

freq(<<X, Rest/binary>>, Totals) ->
   I = X + 1,
   N = element(I, Totals),
   freq(Rest, setelement(I, Totals, N + 1));
freq(<<>>, Totals) ->
   Totals.

But of course in the name of purity and simplicity, setelement copies the entire Totals tuple, so if there are fifty million bytes, then the 256 element Totals is copied 50 million times. It's simple, but it's not the right approach.

"Blame the complier" is another easy option. If it could be determined that the Totals tuple can be destructively updated, then we're good. Note that the garbage collector in the Erlang runtime is based on the assumption that pointers in the heap always point toward older data, an assumption that could break if a tuple was destructively updated with, say, a list value. So not only would the compiler have to deduce that that the tuple is only used locally, but it would also have to verify that only non-pointer values (like integers and atoms) were being passed in as the third parameter of setelement. This is all possible, but it doesn't currently work that way, so this line of reasoning is a dead end for now.

Totals could be switched from a tuple to a tree, which might or might not be better than the setelement code, but there's no way it's in the same ballpark as the C version.

What about a different algorithm? Sort the block of bytes, then count runs of identical values. Again, just the suggestion of sorting means we're already off track.

Honestly, I don't know the right answer. In Erlang, I'd go for one of the imperative efficiency hacks, like ets tables, but let's back up a bit. The key issue here is that there are some fundamental assumptions about what "purely functional" means and the expected features in functional languages.

In array languages, like J, this type of problem is less awkward, as it's closer to what they were designed for. If nothing else, reference counted arrays make it easier to tell when destructive updates are safe. And there's usually some kind of classification operator, one that would group the bytes by value for easy counting. That's still not going to be as efficient as C, but it's clearly higher-level than the literal Erlang translation.

A more basic question is this: "Is destructively updating a local array a violation of purely functionalness?" OCaml allows destructive array updates and C-like control structures. If a local array is updated inside of an OCaml function, then the result copied to a non-mutable array at the end, is there really anything wrong with that? It's not the same as randomly sticking your finger inside a global array somewhere, causing a week's worth of debugging. In fact, it looks exactly the same as the purely functional version from the caller's point of view.

Perhaps the sweeping negativity about destructive updates is misplaced.

permalink May 4, 2009

Let's Take a Trivial Problem and Make it Hard

previously

archives