[Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...

Thu Nov 29 20:15:42 CET 2012

Hi Felix,

On Thu, Nov 29, 2012 at 10:09 AM, Fixie Fixie
<fixie.fixie at rocketmail.com> wrote:
> The problem seems to be connected to lazy loading, which makes my programs
> so slow that I really can not show them to anyone. I have tried all tricks
> in the books, like !, seq, non-lazy datatypes...

My advice usually goes like this:

 1. Use standard, high-performance libraries (I've made a list of high
quality libraries at
https://github.com/tibbe/haskell-docs/blob/master/libraries-directory.md).
 2. Make your data type fields strict.
 3. Unpack primitive types (e.g. Int, Word, Float, Double).
 4. Reduce allocation in tight loops (e.g. avoid creating lots of
intermediate lists).

I always do 1-3, but only do 4 when it's really necessary (e.g. in the
inner loop of some machine learning algorithm).
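
A minimal sketch of points 2-4, in case it helps. The `Vec2` type and the
function names are hypothetical and not taken from the code under
discussion. The fields are strict and unpacked, and the sum uses a strict
left fold so no chain of thunks builds up:

```haskell
import Data.List (foldl')

-- Hypothetical example type (an assumption for illustration).
-- The bang makes each field strict; UNPACK stores the Double inline
-- in the constructor instead of behind a pointer.
data Vec2 = Vec2 {-# UNPACK #-} !Double {-# UNPACK #-} !Double
  deriving (Eq, Show)

-- Component-wise addition. Because the fields are strict, building
-- the result evaluates both sums.
addVec :: Vec2 -> Vec2 -> Vec2
addVec (Vec2 a b) (Vec2 c d) = Vec2 (a + c) (b + d)

-- foldl' forces the accumulator at every step; together with the
-- strict fields this keeps the running sum fully evaluated.
sumVecs :: [Vec2] -> Vec2
sumVecs = foldl' addVec (Vec2 0 0)
```

In GHCi, `sumVecs [Vec2 1 2, Vec2 3 4]` sums the two vectors component-wise.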

Performance issues are rarely due to missing bang patterns on functions
(although there are cases, e.g. when writing recursive functions with
accumulators, where you need one).
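
As a sketch of the accumulator case (the `mean` function is a made-up
example, not code from this thread), the bangs on `s` and `n` force the
running sum and count on every iteration instead of letting them grow
into unevaluated thunks:

```haskell
{-# LANGUAGE BangPatterns #-}

-- Illustrative only: arithmetic mean of a list, computed in one pass.
-- By convention here, the mean of an empty list is 0.
mean :: [Double] -> Double
mean = go 0 0
  where
    go :: Double -> Int -> [Double] -> Double
    -- Both accumulators are forced before each recursive call.
    go !s !n []     = if n == 0 then 0 else s / fromIntegral n
    go !s !n (x:xs) = go (s + x) (n + 1) xs
```

GHC's strictness analysis often works this out by itself when optimizing,
but the bangs make the intent explicit and do not depend on the
optimization level.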

> I was poking around to see if this had changed, then I ran into this forum
> post:
> http://stackoverflow.com/questions/9409634/is-indexing-of-data-vector-unboxed-mutable-mvector-really-this-slow
>
> The last solution was a haskell program which was in the 3x range to C,
> which I think is ok. This was in the days of ghc 7.0
>
> I then tried compiling the programs myself (ghc 7.4.1), but found that the
> C program was now more than 100x faster. The ghc code was compiled with both
> O2 and O3, giving only small differences on my 64-bit Linux box.
>
> So it seems something has changed - and even small examples are still not
> safe when it comes to the lazy-monster. It reminds me of some code I read a
> couple of years ago where one of the Simons actually fired off a new thread,
> to make sure a variable was realized.

Note that the issues in the Stack Overflow post are not due to laziness
(i.e. there are no space leaks), but due to the code being more
polymorphic than the C code, causing extra allocation and indirection.
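
To illustrate the general point about polymorphism (this is a made-up
function, not the code from the linked question): a class-polymorphic
loop may be compiled to pass a dictionary and work on boxed values
unless GHC inlines or specializes it. A SPECIALIZE pragma, or simply a
monomorphic type signature, asks for a dedicated version at the type
you actually use:

```haskell
{-# LANGUAGE BangPatterns #-}

-- Hypothetical example: sum of 1..n for any ordered numeric type.
-- Without specialization, calls may go through Num/Ord dictionaries.
sumTo :: (Num a, Ord a) => a -> a
sumTo n = go 0 1
  where
    go !acc i
      | i > n     = acc
      | otherwise = go (acc + i) (i + 1)

-- Ask GHC to also generate a copy of sumTo fixed at Int, which it can
-- compile to a loop over unboxed machine integers.
{-# SPECIALIZE sumTo :: Int -> Int #-}
```

Whether this matters in a given program is best checked by looking at
the generated Core (e.g. with -ddump-simpl) or by profiling.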

> A sad thing, since I am more than willing to go for Haskell if it proves to be
> usable. If anyone can see what is wrong with the code (there are two haskell
> versions on the page, I have tried the last and fastest one) it would also
> be interesting.
>
> What is your experience, dear haskellers? To me it seems this beautiful
> language is useless without a better lazy/eager-analyzer.

It's definitely possible to write fast Haskell code (some Haskell
programmers manage to do so consistently), but I appreciate that it's
harder than it should be. In my opinion the major things missing are a
good text on how to write fast Haskell code and some tweaks to the
compiler (e.g. unboxing strict primitive fields like Int by default).

Hope this helps.

Cheers,
Johan