[Haskell-cafe] Re: blas bindings, why are they so much slower than C?
dons at galois.com
Thu Jul 24 19:26:36 EDT 2008
> Last month Anatoly Yakovenko published some disturbing numbers about
> the Haskell BLAS bindings I wrote being significantly slower than
> using plain C. I wanted to let everyone know that I've closed the
> performance gap, and now for doing ten million dot products, the
> overhead for using Haskell instead of C is about 0.6 seconds on my
> machine, regardless of the size of the vectors. The next version will
> incorporate the changes. If you can't wait for a formal release, the
> darcs repository is at http://www-stat.stanford.edu/~patperry/code/blas/
> Anyone interested in more details can check out my blog:
> Thanks everyone for the input on this (especially Anatoly). If anyone
> else finds any performance discrepancies, please let me know and I
> will do whatever I can to fix them.
Great work, Patrick!
So if I read correctly, the main change was to flatten the
representation (and thus in loops the vector's structure will be
unpacked and kept in registers, which isn't possible for sum types).
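
To make the flattening point concrete, here is a minimal sketch of the two
shapes of representation. The type and function names (VecSum, VecFlat,
fromListFlat, dotFlat) are hypothetical and written only for illustration;
they are not the actual types or API of the blas bindings. With a single
constructor and strict, unpacked fields, GHC can typically pass the pointer,
length and stride as separate unboxed values in the inner loop, whereas the
multi-constructor version has to be scrutinised to find out which
alternative it is.

```haskell
{-# LANGUAGE BangPatterns #-}

import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrArray, withForeignPtr)
import Foreign.Marshal.Array (pokeArray)
import Foreign.Storable (peekElemOff)

-- Hypothetical sum-type representation: code consuming it must branch
-- on the constructor before it can reach the fields.
data VecSum
  = Dense   !(ForeignPtr Double) !Int        -- pointer, length
  | Strided !(ForeignPtr Double) !Int !Int   -- pointer, length, stride

-- Hypothetical flattened representation: one constructor, strict
-- unpacked fields. A dense vector is simply the case stride == 1.
data VecFlat = VecFlat
  { vfPtr    :: {-# UNPACK #-} !(ForeignPtr Double)
  , vfLen    :: {-# UNPACK #-} !Int
  , vfStride :: {-# UNPACK #-} !Int
  }

-- Build a dense (stride 1) flat vector from a list.
fromListFlat :: [Double] -> IO VecFlat
fromListFlat xs = do
  let n = length xs
  fp <- mallocForeignPtrArray n
  withForeignPtr fp $ \p -> pokeArray p xs
  pure (VecFlat fp n 1)

-- Dot product over the flat representation. The loop carries only an
-- Int index and a Double accumulator, both forced on every iteration.
dotFlat :: VecFlat -> VecFlat -> IO Double
dotFlat (VecFlat fx n sx) (VecFlat fy m sy) =
  withForeignPtr fx $ \px ->
  withForeignPtr fy $ \py ->
    let len = min n m
        go !i !acc
          | i >= len  = pure acc
          | otherwise = do
              x <- peekElemOff px (i * sx)
              y <- peekElemOff py (i * sy)
              go (i + 1) (acc + x * y)
    in go 0 0
```

Whether the fields really stay in registers depends on the optimiser, so
compile with -O2 and inspect the Core (for example with -ddump-simpl)
rather than taking the sketch on faith.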