[Haskell-cafe] repa parallelization results
Thomas Miedema
thomasmiedema at gmail.com
Thu Jan 14 19:19:01 UTC 2016
Anatoly: I also ran your benchmark, and cannot reproduce your findings.
Note that GHC does not make effective use of hyperthreads (
https://ghc.haskell.org/trac/ghc/ticket/9221#comment:12). So don't use -N4
when you have only a dual-core machine. Maybe that's why you were getting
bad results? I also notice a `NaN` in one of your timing results. I don't
know how that is possible, or if it affected your results. Could you try
running your benchmark again, but this time with -N2?
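To check what the runtime actually sees, something like the following minimal sketch may help. It uses getNumCapabilities and getNumProcessors from GHC.Conc; note that getNumProcessors counts logical processors, so a hyperthreaded dual-core machine would typically report 4. The helper `oversubscribed` is hypothetical, and the physical core count has to be supplied by hand, since base does not expose it as far as I know.

```haskell
import GHC.Conc (getNumCapabilities, getNumProcessors)

-- Hypothetical helper (not part of any library): True when the RTS was
-- given more capabilities than the machine has physical cores.
-- The physical core count is an assumption the caller must supply.
oversubscribed :: Int -> Int -> Bool
oversubscribed physicalCores caps = caps > physicalCores

main :: IO ()
main = do
  caps  <- getNumCapabilities  -- value set by +RTS -N<n> (1 without it)
  procs <- getNumProcessors    -- logical processors, hyperthreads included
  putStrLn ("capabilities (-N):  " ++ show caps)
  putStrLn ("logical processors: " ++ show procs)
  -- Assumption: a dual-core machine, as discussed above.
  putStrLn ("oversubscribed:     " ++ show (oversubscribed 2 caps))
```

The program needs to be built with -threaded for +RTS -N to have any effect.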
On Sat, Mar 14, 2015 at 5:21 PM, Carter Schonwald <
carter.schonwald at gmail.com> wrote:
> dense matrix product is not an algorithm that makes sense in repa's
> execution model,
>
Matrix multiplication is the first example in the first repa paper:
http://benl.ouroborus.net/papers/repa/repa-icfp2010.pdf. Look at figures 2
and 7.
"we measured very good absolute speedup, ×7.2 for 8 cores, on multicore
hardware"
Doing a quick experiment with 2 threads (my laptop doesn't have more cores):
$ cabal install repa-examples # I did not bother with `-fllvm`
...
$ ~/.cabal/bin/repa-mmult -random 1024 1024 -random 1024 1204
elapsedTimeMS = 6491
$ ~/.cabal/bin/repa-mmult -random 1024 1024 -random 1024 1204 +RTS -N2
elapsedTimeMS = 3393
This is with GHC 7.10.3 and repa-3.4.0.1 (and dependencies from
http://www.stackage.org/snapshot/lts-3.22)
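For reference, the two timings above work out to a speedup of roughly 1.91 on 2 threads, i.e. a parallel efficiency of about 96%. A minimal sketch of the arithmetic, where `speedup` and `efficiency` are illustrative helper names and not part of repa or repa-examples:

```haskell
-- Hypothetical helpers; the names are illustrative only.

-- Ratio of sequential to parallel wall-clock time.
speedup :: Double -> Double -> Double
speedup tSeq tPar = tSeq / tPar

-- Speedup divided by the number of threads used.
efficiency :: Int -> Double -> Double -> Double
efficiency n tSeq tPar = speedup tSeq tPar / fromIntegral n

main :: IO ()
main = do
  let tSeq = 6491  -- elapsedTimeMS without +RTS -N2
      tPar = 3393  -- elapsedTimeMS with +RTS -N2
  putStrLn ("speedup:    " ++ show (speedup tSeq tPar))
  putStrLn ("efficiency: " ++ show (efficiency 2 tSeq tPar))
```

This is a single run per configuration, so the numbers are only indicative.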