[Haskell-cafe] repa parallelization results
thomasmiedema at gmail.com
Thu Jan 14 19:22:19 UTC 2016
To avoid any confusion, this was a reply to the following email:
On Fri, Mar 13, 2015 at 6:23 PM, Anatoly Yakovenko <aeyakovenko at gmail.com> wrote:
> so i am seeing basically results with N4 that are as good as using
> sequential computation on my macbook for the matrix multiply
> algorithm. any idea why?
On Thu, Jan 14, 2016 at 8:19 PM, Thomas Miedema <thomasmiedema at gmail.com> wrote:
> Anatoly: I also ran your benchmark, and cannot reproduce your findings.
> Note that GHC does not make effective use of hyperthreads (
> https://ghc.haskell.org/trac/ghc/ticket/9221#comment:12). So don't use
> -N4 when you have only a dual core machine. Maybe that's why you were
> getting bad results? I also notice a `NaN` in one of your timing results. I
> don't know how that is possible, or if it affected your results. Could you
> try running your benchmark again, but this time with -N2?
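
A minimal sketch of the advice above, in case it is useful: cap the number of capabilities from inside the program instead of relying on whatever `-N` was passed. `getNumProcessors`, `getNumCapabilities` and `setNumCapabilities` come from `GHC.Conc`. `physicalCoreGuess` is a hypothetical helper, and it assumes 2-way hyperthreading, because `getNumProcessors` reports logical processors and the RTS does not tell you how many are physical cores.

```haskell
import GHC.Conc (getNumCapabilities, getNumProcessors, setNumCapabilities)

-- | Hypothetical helper: guess the number of physical cores from the
-- number of logical processors, ASSUMING 2-way hyperthreading.
-- On a machine without hyperthreading this undercounts.
physicalCoreGuess :: Int -> Int
physicalCoreGuess logical = max 1 (logical `div` 2)

main :: IO ()
main = do
  logical <- getNumProcessors            -- logical processors seen by the OS
  let target = physicalCoreGuess logical
  setNumCapabilities target              -- only meaningful when built with -threaded
  caps <- getNumCapabilities
  putStrLn ("logical processors: " ++ show logical)
  putStrLn ("capabilities in use: " ++ show caps)
```

Build with `ghc -threaded`; without the threaded runtime the program stays on a single capability. On a dual-core laptop with hyperthreading this should settle on 2 capabilities, which matches running with `+RTS -N2`.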
> On Sat, Mar 14, 2015 at 5:21 PM, Carter Schonwald <
> carter.schonwald at gmail.com> wrote:
>> dense matrix product is not an algorithm that makes sense in repa's
>> execution model,
> Matrix multiplication is the first example in the first repa paper:
> http://benl.ouroborus.net/papers/repa/repa-icfp2010.pdf. Look at figures
> 2 and 7.
> "we measured very good absolute speedup, ×7.2 for 8 cores, on
> multicore hardware"
> Doing a quick experiment with 2 threads (my laptop doesn't have more
> $ cabal install repa-examples # I did not bother with `-fllvm`
> $ ~/.cabal/bin/repa-mmult -random 1024 1024 -random 1024 1204
> elapsedTimeMS = 6491
> $ ~/.cabal/bin/repa-mmult -random 1024 1024 -random 1024 1204 +RTS -N2
> elapsedTimeMS = 3393
> This is with GHC 7.10.3 and repa-188.8.131.52 (and dependencies from
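
To put the two timings above in the same terms as the paper's figure, here is a small sketch that turns them into a speedup and a per-core efficiency. The function names `speedup` and `efficiency` are my own, not part of repa or repa-examples. With 6491 ms sequential and 3393 ms on 2 threads this comes to roughly ×1.91, or about 96% efficiency; the paper's ×7.2 on 8 cores is 90%.

```haskell
-- | Speedup of a parallel run relative to a sequential run
-- (both times in the same unit, here milliseconds).
speedup :: Double -> Double -> Double
speedup tSeq tPar = tSeq / tPar

-- | Parallel efficiency: speedup divided by the number of threads used.
efficiency :: Int -> Double -> Double -> Double
efficiency threads tSeq tPar = speedup tSeq tPar / fromIntegral threads

main :: IO ()
main = do
  let tSeq = 6491   -- elapsedTimeMS without +RTS -N2, from the run above
      tPar = 3393   -- elapsedTimeMS with +RTS -N2, from the run above
  putStrLn ("speedup:    " ++ show (speedup tSeq tPar))
  putStrLn ("efficiency: " ++ show (efficiency 2 tSeq tPar))
```

These are single runs, so treat the numbers as indicative only; something like criterion would give proper error bars.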