So, everybody knows about the Fibonacci pissing contest by now. From the
moment you mention two programming languages, this is likely to happen, and
the
blog post that triggered all this
provided a small push sufficient to make the discussion degenerate that way*1. The chosen
micro-benchmark also played a very negative role, as it gives too many
opportunities to get sidetracked: better algorithms (recursion vs. iteration,
memoization...), overflowable ints vs. arbitrary precision integers, etc.
You're probably expecting me to throw some code at you (OCaml since I've been
talking about it a bit as of late) in spite of everything I wrote above and
give enough figures to be able to claim something like "Holy Shmoly, OCaml
smokes {Haskell, SBCL, C++} away". (That's truly a blogging classic, deploring
the silliness of the latest blog trend and adding to it the second after.)
Instead, I'm going to use this micro-benchmark the way it was meant to: not to
compare language implementations in general, but to measure an isolated
characteristic. Micro-benchmarks are exceedingly useful for language
implementors and for those who want to assess the cost of some particular
operation.
What I want to do is build some intuition about the cost of function calls and
argument passing in Haskell. I shall look at the generated assembly code in
order to understand where my CPU cycles are going (this is where I'll be
needing your help, because, as you will see, Haskell's asm isn't exactly easy
to read).
Before I plunge into Haskell, I'll take a look at the code generated by GCC
and OCaml, which will serve as the control group of sorts and help me verify
that what I think I know about them is correct. If I can't understand what's going on with C and OCaml, I have no chance with Haskell. (Yes, I will also give you
execution times, but I promise there will be no Holy Shmoly.)
Some familiar code first
This is what I expect in my first runs:
argument passing should be faster in OCaml than in C with a non-static function
GCC should be able to generate code about as fast (or faster) given enough inlining (OCaml never inlines recursive functions, recent GCC versions do)
when using a static function (so that arguments are passed in registers instead of via the stack as required by the standard ABI), GCC's code should be as fast or faster than OCaml's
You can jump to the next section if you're not interested in the details of
the code generated by GCC and OCaml. In few words, my intuition regarding C and
OCaml was basically correct, and I now want to see what causes the Haskell
figures in this table (this is where I'll need some help):
implementation
execution time(s)
GCC -O2
0.444
GCC -O3
0.405
OCaml
0.413
Haskell (Int)
1.03
Here goes the OCaml code and the corresponding assembly, which can hardly get
more readable: