[Haskell-cafe] repa parallelization results

Sun Mar 15 20:40:51 UTC 2015

The likely issue is cache thrash.
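
To make the cache point a bit more concrete, here is a rough sketch in plain
Haskell (no repa). It assumes row-major storage and a formulation where each
output entry is computed independently as a dot product; I have not checked
that this matches the code in the gist. The names flatIndex, readsForEntry and
cacheLines are made up for illustration, and "8 elements per cache line"
assumes 64-byte lines holding 8-byte doubles.

```haskell
import Data.List (nub)

-- Flat, row-major offset of element (i, j) in an n x n matrix.
flatIndex :: Int -> Int -> Int -> Int
flatIndex n i j = i * n + j

-- Offsets read from A and from B to produce output entry (i, j) when
-- every output entry is computed as its own dot product.
readsForEntry :: Int -> Int -> Int -> ([Int], [Int])
readsForEntry n i j =
  ( [flatIndex n i k | k <- [0 .. n - 1]]  -- row i of A: stride 1
  , [flatIndex n k j | k <- [0 .. n - 1]]  -- column j of B: stride n
  )

-- Number of distinct cache lines covered by a list of element offsets,
-- assuming lineElems elements fit in one cache line (an assumption,
-- not a measured value).
cacheLines :: Int -> [Int] -> Int
cacheLines lineElems = length . nub . map (`div` lineElems)

main :: IO ()
main = do
  let n = 64
      (as, bs) = readsForEntry n 3 5
  putStrLn ("values read per entry: " ++ show (length as + length bs))
  putStrLn ("cache lines for row of A: " ++ show (cacheLines 8 as))
  putStrLn ("cache lines for column of B: " ++ show (cacheLines 8 bs))
```

For n = 64 this reports 128 values read per entry (the 2N from earlier in the
thread), 8 cache lines for the row of A, and 64 cache lines for the column of
B. Under these assumptions every element of the column sits on its own cache
line, so each output entry drags in far more memory than it uses, and several
threads doing this at once compete for the same cache.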
On Mar 15, 2015 4:21 PM, "Anatoly Yakovenko" <aeyakovenko at gmail.com> wrote:

> Ok, so what's the difference between the sequential and parallel
> versions? Does the parallel one contain a thunk for every element in
> the output?
>
> On Sun, Mar 15, 2015 at 12:44 PM, Carter Schonwald
> <carter.schonwald at gmail.com> wrote:
> > Read what I linked.
> > You are benchmarking repa for exactly the pessimal workload that it is
> > bad at.
> >
> > Repa is for pointwise parallel and local convolution parallel programs.
> > The way repa can express matrix multiplication is exactly the worst way to
> > implement a parallel matrix multiply: pretty much pessimal with respect to
> > a memory traffic / communication complexity metric of performance.
> >
> > Benchmark something like image blur algorithms and repa will really
> > shine.
> >
> > Right now your benchmark is the repa equivalent of noticing that random
> > access on singly linked lists is slow :)
> >
> > On Mar 15, 2015 2:44 PM, "Anatoly Yakovenko" <aeyakovenko at gmail.com>
> > wrote:
> >>
> >> I am not really focusing on matrix multiply specifically.  So the real
> >> problem is that the implementation using parallelized functions is
> >> slower than the sequential one, and adding more threads makes it
> >> barely as fast as the sequential one.
> >>
> >> So why would I ever use the parallelized versions?
> >>
> >>
> >> On Sat, Mar 14, 2015 at 9:24 AM, Carter Schonwald
> >> <carter.schonwald at gmail.com> wrote:
> >> > http://www.cs.utexas.edu/users/flame/pubs/blis3_ipdps14.pdf
> >> > This paper (among many others by the BLIS project) articulates some of
> >> > the ideas I allude to pretty well (with pictures!).
> >> >
> >> > On Sat, Mar 14, 2015 at 12:21 PM, Carter Schonwald
> >> > <carter.schonwald at gmail.com> wrote:
> >> >>
> >> >> Dense matrix product is not an algorithm that makes sense in repa's
> >> >> execution model. In a square matrix multiply of two N x N matrices,
> >> >> each result entry depends on 2N values total across the two input
> >> >> matrices. Even then, that's actually the wrong way to parallelize
> >> >> dense matrix product! It's worth reading the papers about GotoBLAS
> >> >> and the more recent BLIS project. A high-performance dense matrix
> >> >> multiply winds up needing to do some nested array parallelism with
> >> >> mutable updates to have efficient sharing of subcomputations!
> >> >>
> >> >>
> >> >>
> >> >> On Fri, Mar 13, 2015 at 9:03 PM, Anatoly Yakovenko
> >> >> <aeyakovenko at gmail.com>
> >> >> wrote:
> >> >>>
> >> >>> Do you think the backend would make any difference?  This seems like
> >> >>> a runtime issue to me. How are the threads scheduled by the GHC
> >> >>> runtime?
> >> >>>
> >> >>> On Fri, Mar 13, 2015 at 4:58 PM, KC <kc1956 at gmail.com> wrote:
> >> >>> > How is the LLVM?
> >> >>> >
> >> >>> > --
> >> >>> > --
> >> >>> >
> >> >>> > Sent from an expensive device which will be obsolete in a few
> >> >>> > months!
> >> >>> > :D
> >> >>> >
> >> >>> > Casey
> >> >>> >
> >> >>> >
> >> >>> > On Mar 13, 2015 10:24 AM, "Anatoly Yakovenko"
> >> >>> > <aeyakovenko at gmail.com>
> >> >>> > wrote:
> >> >>> >>
> >> >>> >> https://gist.github.com/aeyakovenko/bf558697a0b3f377f9e8
> >> >>> >>
> >> >>> >>
> >> >>> >> So I am seeing results with N4 that are basically only as good as
> >> >>> >> using sequential computation on my MacBook for the matrix multiply
> >> >>> >> algorithm.  Any idea why?
> >> >>> >>
> >> >>> >> Thanks,
> >> >>> >> Anatoly
> >> >>> >> _______________________________________________
> >> >>> >> Haskell-Cafe mailing list
> >> >>> >> Haskell-Cafe at haskell.org
> >> >>> >> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> >> >>
> >> >>
> >> >
>