[Haskell-cafe] I killed performance of my code with Eval and Strategies

Janek S. fremenzone at poczta.onet.pl
Sat Nov 17 11:20:07 CET 2012


> Just another definition of calculateSeq:
>
> calculateSeq = zipWith ($) (cycle [sin,cos]) . map sqrt
This is just slightly slower than my implementation.

I came up with a better implementation of parallel function:

calculatePar2 :: [Double] -> [Double]
calculatePar2 xss = runEval $
    concat `fmap` parList (rdeepseq . calculateSeq) (chunk 2048 xss)

chunk :: Int -> [a] -> [[a]]
chunk _ [] = []
chunk n xs = as : chunk n bs where !(as, bs) = splitAt n xs


This mimics behaviour of parListChunk method. It is faster than my original calculatePar, but 
still 3-4 times slower than sequential version.

Janek

>
>
> 2012/11/15 Janek S. <fremenzone at poczta.onet.pl>
>
> > > Do you really mean to calculate the 'sin . sqrt' of just the head of
> > > the
> >
> > list, or do you mean:
> > > calculateSeq = map (sin . sqrt)   ?
> >
> > Argh.. of course not! That's what you get when you code in the middle of
> > a night. But in my code I
> > will not be able to use map because elements will be processed in pairs,
> > so let's say that my
> > sequential function looks like this:
> >
> > calculateSeq :: [Double] -> [Double]
> > calculateSeq [] = []
> > calculateSeq [x] = [sin . sqrt $ x]
> > calculateSeq (x:y:xs) = (sin . sqrt $ x) : (cos . sqrt $ y) :
> > calculateSeq xs
> >
> > > I don't think there's a memory leak. It looks more like you're just
> > > allocating much more than is sane for such a simple function.
> > > On a recent processor, sin . sqrt is two instructions. Meanwhile, you
> >
> > have
> >
> > > a list of (boxed?) integers being split up, then recombined. That's
> > > bound to hurt the GC.
> >
> > I am not entirely convinced that my idea of using eval+strategies is
> > bound to be slow, because
> > there are functions like parListChunk that do exactly this: split the
> > list into chunks, process
> > them in parallel and then concatenate the result. Functions in
> > Control.Parallel.Strategies were
> > designed to deal with list so I assume it is possible to process lists in
> > parallel without GC
> > problems. However I do not see a way to apply these functions in my
> > setting where elements of
> > lists are processed in pairs, not one at a time (parList and parMap will
> > not do). Also, working
> > on a list of tuples will not do.
> >
> > > Also, you might want to configure criterion to GC between
> > > runs. That might help.
> >
> > The -g flag passed to criterion executable does that.
> >
> > > What I'd suggest doing instead, is breaking the input into chucks of,
> >
> > say,
> >
> > > 1024, and representing it with a [Vector]. Then, run your sin.sqrt's on
> > > each vector in parallel. Finally, use Data.Vector.concat to combine
> > > your result.
> >
> > As stated in my post scriptum I am aware of that solution :) Here I'm
> > trying to figure what am I
> > doing wrong with Eval.
> >
> > Thanks!
> > Janek
> >
> > > Hope that helps,
> > >   - Clark
> > >
> > > On Wed, Nov 14, 2012 at 4:43 PM, Janek S. <fremenzone at poczta.onet.pl>
> >
> > wrote:
> > > > Dear Haskellers,
> > > >
> > > > I am reading Simon Marlow's tutorial on parallelism and I have
> > > > problems with correctly using Eval
> > > > monad and Strategies. I *thought* I understand them but after writing
> > > > some code it turns out that
> > > > obviously I don't because parallelized code is about 20 times slower.
> > > > Here's a short example
> > > > (code + criterion benchmarks):
> > > >
> > > > {-# LANGUAGE BangPatterns #-}
> > > > module Main where
> > > >
> > > > import Control.Parallel.Strategies
> > > > import Criterion.Main
> > > >
> > > > main :: IO ()
> > > > main = defaultMain [
> > > >         bench "Seq" $ nf calculateSeq xs
> > > >       , bench "Par" $ nf calculatePar xs ]
> > > >     where xs = [1..16384]
> > > >
> > > > calculateSeq :: [Double] -> [Double]
> > > > calculateSeq [] = []
> > > > calculateSeq (x:xs) = (sin . sqrt $ x) : xs
> > > >
> > > > calculatePar :: [Double] -> [Double]
> > > > calculatePar xss = runEval $ go xss
> > > >     where
> > > >       go :: Strategy [Double]
> > > >       go [] = return []
> > > >       go xs = do
> > > >           lsh <- (rpar `dot` rdeepseq) $ calculateSeq as
> > > >           lst <- go bs
> > > >           return (lsh ++ lst)
> > > >           where
> > > >             !(as, bs) = splitAt 8192 xs
> > > >
> > > > Compiling and running with:
> > > >
> > > > ghc -O2 -Wall -threaded -rtsopts -fforce-recomp -eventlog evalleak.hs
> > > > ./evalleak -oreport.html -g +RTS -N2 -ls -s
> > > >
> > > > I get:
> > > >
> > > > benchmarking Seq
> > > > mean: 100.5990 us, lb 100.1937 us, ub 101.1521 us, ci 0.950
> > > > std dev: 2.395003 us, lb 1.860923 us, ub 3.169562 us, ci 0.950
> > > >
> > > > benchmarking Par
> > > > mean: 2.233127 ms, lb 2.169669 ms, ub 2.296155 ms, ci 0.950
> > > > std dev: 323.5201 us, lb 310.2844 us, ub 344.8252 us, ci 0.950
> > > >
> > > > That's a hopeless result. Looking at the spark allocation everything
> > > > looks fine:
> > > >
> > > > SPARKS: 202 (202 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)
> > > >
> > > > But analyzing eventlog with ThreadScope I see that parallel function
> > > > spends most of the time doing
> > > > garbage collection, which suggests that I have a memory leak
> >
> > somewhere. I
> >
> > > > suspected that problem
> > > > might be caused by appending two lists together in the parallel
> > > > implementation, but replacing
> > > > this with difference lists doesn't help. Changing granularity (e.g.
> > > > splitAt 512) also brings no
> > > > improvement. Can anyone point me to what am I doing wrong?
> > > >
> > > > Janek
> > > >
> > > > PS. This is of course not a real world code - I know that I'd be
> > > > better of using unboxed data
> > > > structures for doing computations on Doubles.
> > > >
> > > > _______________________________________________
> > > > Haskell-Cafe mailing list
> > > > Haskell-Cafe at haskell.org
> > > > http://www.haskell.org/mailman/listinfo/haskell-cafe
> >
> > _______________________________________________
> > Haskell-Cafe mailing list
> > Haskell-Cafe at haskell.org
> > http://www.haskell.org/mailman/listinfo/haskell-cafe





More information about the Haskell-Cafe mailing list