I'm James Hague, a recovering programmer who has been designing video games since the 1980s. This is Why You Spent All that Time Learning to Program and The Pure Tech Side is the Dark Side are good places to start.
Where are the comments?
Erlang vs. Unintentionally Purely Functional PythonHere's a little Python function that should be easy to figure out, even if you don't know Python:
The first interesting part is that the
lower()method creates an entirely new string. If
pathcontains a hundred characters, then all hundred of those are copied to a new string in the process of being converted to lowercase.
The second point of note is that the append operation--the plus--is doing another copy. The entire lowercased string is moved to a new location, then the four character extension is tacked on to the end. The original
pathhas now been copied twice.
Those previous two paragraphs gloss over some key details. Where is the memory for the new strings coming from? It's returned by a call to the Python memory allocator. As with all generic heap management functions, the execution time of the Python memory allocator is difficult to predict. There are various checks and potential fast paths and manipulations of linked lists. In the worst case, the code falls through into C's
mallocand the party continues there. Remember, too, that objects in Python have headers, which include reference counts, so there's more overhead that I've ignored.
Also, the result of
lowergets thrown out after the subsequent concatenation, so I could peek into "release memory" routine and see what's going on down in that neck of the woods, but I'd rather not. Just realize there's a lot of work going on inside the simple
make_filenamefunction, even if the end result still manages to be surprisingly fast.
A popular criticism of functional languages is that a lack of mutable variables means that data is copied around, and of course that just has to be slow, right?
A literal Erlang translation of
make_filenamebehaves about the same as the Python version. The string still gets copied twice, though in Erlang it's a linked list which uses eight bytes per characters given a 32-bit build of the language. If the string in Python is Unicode (the default in Python 3), then it uses either two or four bytes per character, depending. The big difference is that memory allocation in Erlang is just a pointer increment and bounds check, and not a heavyweight call to a heap manager.
I'm not definitively stating which language is faster for this specific code, nor does it matter to me. I suspect the Erlang version ends up running slightly longer, because the lowercase function is itself written in Erlang, while Python's is in C. But all that "slow" copying of memory isn't even part of the performance discussion.
(If you liked this, you might like Functional Programming Went Mainstream Years Ago.)