[Haskell-cafe] Can Haskell outperform C++?
rrnewton at gmail.com
Mon May 14 16:26:35 CEST 2012
> Well, if it's "in many ways the same as C", then again it's probably not
> idiomatic Haskell.
It's just the recursive equation for the Mandelbrot fractal. I should have
been more precise: I didn't mean that the source is literally the *same* as
the C source (i.e. there's no for loop, no mutable variables), rather that it
presents the compiler backend with the same advantages as the C backend
would have with the equivalent loop. That is, there's no control-flow
obfuscation (dictionaries, calls to unknown targets), no problems with
laziness, and the data representations are completely known.
{-# LANGUAGE BangPatterns #-}
import Data.Complex (Complex, magnitude)

mandel :: Int -> Complex Double -> Int
mandel max_depth c = loop 0 0
  where
    loop i !z
      | i == max_depth     = i
      | magnitude z >= 2.0 = i
      | otherwise          = loop (i+1) (z*z + c)
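For concreteness, here is a self-contained harness that exercises the loop;
the grid size and bounds are my own illustrative choices, not necessarily
those of the benchmark discussed below:

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.Complex

-- The loop from above, unchanged.
mandel :: Int -> Complex Double -> Int
mandel max_depth c = loop 0 0
  where
    loop i !z
      | i == max_depth     = i
      | magnitude z >= 2.0 = i
      | otherwise          = loop (i+1) (z*z + c)

-- Sum escape counts over an n-by-n grid covering [-2,2] x [-2,2].
-- (Illustrative driver; the actual benchmark harness may differ.)
main :: IO ()
main = print (sum [ mandel 256 (x :+ y)
                  | j <- [0 .. n-1]
                  , k <- [0 .. n-1]
                  , let x = -2 + 4 * fromIntegral j / fromIntegral n
                  , let y = -2 + 4 * fromIntegral k / fromIntegral n
                  ])
  where
    n = 64 :: Int
```

Compiling with `ghc -O2` (optionally `-fllvm`) reproduces the kind of
measurement shown in the tables below.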
It's also a loop that is readily recognized as a loop. Now, to my
knowledge, GHC doesn't explicitly recognize loops in any stage of the
compiler (so as to perform loop optimizations), something which other
functional compilers sometimes do.
But, anyway, it turns out that my example above *is easily transformed from
a bad GHC performance story into a good one*. If you'll bear with me, I'll
show how below.
First, Manuel makes a good point about the LLVM backend. My "6X" anecdote
was from a while ago and I didn't use LLVM. I redid it just now with
7.4.1+LLVM; results below. (The table should read correctly in a
fixed-width font, and you can also see the data in the spreadsheet.)
                          Time (ms)   Compiled file size   Compile+Runtime (ms)
    GHC 7.4.1 O0          2444        1241K
    GHC 7.4.1 O2           925        1132K                1561
    GHC 7.4.1 O2 llvm      931        1133K
    GHC 7.0.4 O2 via-C     684         974K
So LLVM didn't help. And in fact the deprecated via-C backend did the
best! Compare with GCC:
    G++ O0                 300           9K                 531
    G++ O3                 110           7K                 347
    G++ O3 recursive       116           9K
Uh oh, the "6X" gap I mentioned is now closer to 9X. And, in fact,
performing a mini "language shootout" on the above code reveals that GHC
is doing worse than not only OCaml, but also Chez Scheme, in spite of the
latter's dynamic type checks and necessarily boxed representation of
complex numbers:
    Chez Scheme 8.4        284         2.7K (not standalone)  372
    OCaml                  166         180K                   301
At least Python does worse!
    Python 2.6            1973         NA                    1973
*So here's the catch:* If you check the Core and STG, GHC 7.4 is actually
compiling the above loop very well. This microbenchmark really just turns
into a "magnitude" microbenchmark. The implementation of Data.Complex uses
an EXTREMELY expensive method to avoid overflow.
Since OCaml did well above, I took a look at their standard library's
implementation, at line 51. They use a nice little math trick (the extra
division) that is also mentioned on Wikipedia. By implementing the same
trick in Haskell, replacing the standard "magnitude" function, we get the
following results.
    GHC 7.4.1, no overflow avoidance             39         1127K      674
    GHC 7.4.1, OCaml-style overflow avoidance    74         1127K
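The trick in question can be sketched in Haskell as follows. The name
`magnitudeFast` is mine, not the library's; the structure mirrors OCaml's
`Complex.norm`, which divides by the larger absolute component so the
squared ratio never exceeds 1:

```haskell
import Data.Complex (Complex((:+)))

-- OCaml-style magnitude: one extra division instead of
-- Data.Complex's exponent-scaling dance. Dividing by the larger
-- absolute component keeps the ratio q in [0,1], so q*q cannot
-- overflow. (magnitudeFast is an illustrative name, not the
-- patched library function.)
magnitudeFast :: Complex Double -> Double
magnitudeFast (x :+ y)
  | r == 0.0  = i
  | i == 0.0  = r
  | r >= i    = let q = i / r in r * sqrt (1.0 + q*q)
  | otherwise = let q = r / i in i * sqrt (1.0 + q*q)
  where
    r = abs x
    i = abs y
```

In the mandel loop itself one could go further and compare the squared
magnitude against 4.0, skipping the square root entirely, but that would
sidestep rather than fix the library routine.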
Wow, now not only is the Haskell version faster than OCaml/Scheme, *but it
is 48% faster than C++*, which is appropriate to the title of this email!
Of course, this probably means that C++'s abs is also doing something
overly expensive for overflow avoidance (or that its representation of
complex numbers is not fully unpacked and stored in registers); I haven't
tracked it down yet.
But in any case, I'll be submitting a library patch. *The moral, I think,
is that community members could do a great deal to help "Haskell"
performance by simply microbenchmarking and optimizing library routines.*
P.S. You can check out the above benchmarks from here:
P.P.S. Most concerning to me about Haskell/C++ comparisons are David
Peixotto's findings that LLVM optimizations are not very effective on
Haskell-generated LLVM IR compared with typical clang-generated LLVM IR.
P.P.P.S. It turns out there was already a ticket
(http://hackage.haskell.org/trac/ghc/ticket/2450) regarding magnitude's
performance, but it still performs badly even after a small refactoring
was performed.