[Haskell-cafe] Re: blas bindings, why are they so much slower than C?

Patrick Perry patperry at stanford.edu
Thu Jul 24 19:41:16 EDT 2008


Yeah, I think that's where most of the performance gains came from. I
also added a rewrite rule for unsafeGet dot (since it doesn't matter
whether the arguments are conjugated when the vectors are real), which
shaved off about a tenth of a second.
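
For anyone who hasn't used GHC rewrite rules, here is a minimal sketch of
the idea. It is not the code from the blas package: the names Elem,
dotConj and dotPlain are made up for illustration, and lists stand in for
the real vector type. For Double, conjugation is the identity, so a rule
can replace the conjugating dot product with the plain one.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
module Main where

-- Hypothetical element class (not the blas package API):
-- conj is complex conjugation, which is the identity for real types.
class Num a => Elem a where
  conj :: a -> a

instance Elem Double where
  conj = id

-- General dot product that conjugates its first argument.
-- Inlining is delayed so the rule below has a chance to match first.
dotConj :: Elem a => [a] -> [a] -> a
dotConj xs ys = sum (zipWith (\x y -> conj x * y) xs ys)
{-# NOINLINE [1] dotConj #-}

-- Plain dot product with no conjugation step.
dotPlain :: Num a => [a] -> [a] -> a
dotPlain xs ys = sum (zipWith (*) xs ys)

-- For real vectors both functions agree, so the rewrite is safe.
-- Rules are only applied when compiling with optimisation (-O).
{-# RULES
"dotConj/Double" forall (xs :: [Double]) ys.
    dotConj xs ys = dotPlain xs ys
  #-}

main :: IO ()
main = print (dotConj [1, 2, 3] [4, 5, 6 :: Double])
```

The result is the same whether or not the rule fires; the rule only
removes work. Compiling with -ddump-rule-firings shows whether GHC
actually applied it.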


Patrick

On Jul 24, 2008, at 4:26 PM, Don Stewart wrote:

> patperry:
>> Last month Anatoly Yakovenko published some disturbing numbers about
>> the Haskell BLAS bindings I wrote being significantly slower than
>> using plain C.  I wanted to let everyone know that I've closed the
>> performance gap, and now for doing ten million dot products, the
>> overhead for using Haskell instead of C is about 0.6 seconds on my
>> machine, regardless of the size of the vectors.  The next version will
>> incorporate the changes.  If you can't wait for a formal release, the
>> darcs repository is at http://www-stat.stanford.edu/~patperry/code/blas/
>>
>> Anyone interested in more details can check out my blog:
>> http://quantile95.com/2008/07/24/addressing-haskell-blas-performance-issues/
>>
>> Thanks everyone for the input on this (especially Anatoly).  If anyone
>> else finds any performance discrepancies, please let me know and I
>> will do whatever I can to fix them.
>>
>
> Great work, Patrick!
>
> So if I read correctly, the main change was to flatten the
> representation (and thus in loops the vector's structure will be
> unpacked and kept in registers, which isn't possible for sum types).
>
> -- Don
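
To illustrate the flattening Don describes, here is a rough sketch. The
types VecSum and VecFlat and their fields are hypothetical, not the
library's actual definitions, and the loop only sums storage indices
rather than reading real data. The point is that a single-constructor
type with strict, unpacked fields lets GHC's worker/wrapper pass the
fields directly into the loop, whereas a sum type needs a case on the
constructor first.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

-- Hypothetical sum-type view: one constructor per conjugation state.
data VecSum
  = Plain !Int !Int !Int   -- offset, length, stride
  | Conj  !Int !Int !Int

-- Hypothetical flattened view: one constructor, conjugation kept as a flag.
data VecFlat = VecFlat
  { vOffset :: {-# UNPACK #-} !Int
  , vLen    :: {-# UNPACK #-} !Int
  , vStride :: {-# UNPACK #-} !Int
  , vIsConj :: !Bool
  }

-- Convert the sum-type view into the flat one.
flatten :: VecSum -> VecFlat
flatten (Plain o n s) = VecFlat o n s False
flatten (Conj  o n s) = VecFlat o n s True

-- Sum of the storage indices the view touches: offset + i * stride
-- for i from 0 to length - 1. The loop works on plain Ints only.
sumIndices :: VecFlat -> Int
sumIndices (VecFlat off len stride _) = go 0 0
  where
    go !i !acc
      | i >= len  = acc
      | otherwise = go (i + 1) (acc + off + i * stride)

main :: IO ()
main = print (sumIndices (flatten (Plain 2 4 3)))
```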


