[Haskell-cafe] Re: Parallel foldl doesn't work correctly

Tue Dec 15 04:19:36 EST 2009

On 15/12/09 00:37, Philip Beadling wrote:
>
>> If you still have trouble, then try using ThreadScope
>>
>>    http://code.haskell.org/ThreadScope/
>>
>> with GHC 6.12.1.  You can use ThreadScope directly from the darcs
>> repository on code.haskell.org, and we hope to do a proper release soon.
>>
>> Cheers,
>> 	Simon
>
> Thanks for the advice, just downloaded ThreadScope and it's pretty
> useful (before I was using Ubuntu's System Monitor which isn't ideal).
>
> I've moved onto 6.12 and I now have my program working nicely over 2
> cores - the problem was at least in part my own design - I was
> generating large thunks in my parallel version which was killing
> performance.  With this solved 2 cores gives me ~50% performance
> increase.
>
> What I'm doing now is taking a list I am going to fold over and
> splitting it up so I have a list of lists, where each parent list
> element represents work for 1 core.  I then fold lazily and only
> parallelise on the final sum operation which (as far as I can see) sends
> each chunk of folds to a different core and sums the results.
>
> Can I confirm - what you are suggesting is that although I can't
> parallelise fold itself, I could force evaluation on the list I am about
> to fold in parallel and then merely accumulate the result at the end --
> thus most of the donkey work is done in parallel?

Yes.  If it turns out that the list elements are too small to be worth
sparking individually, then you may want to split the list into chunks and
evaluate/sum the chunks in parallel, before summing the partial results.
This is a typical map/reduce pattern.
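
A minimal sketch of that chunked pattern, for illustration only.  The names
chunksOf and parSumMap, the Int result type and the chunk-size parameter are
assumptions made for this example, not anything from the original program.
It uses par and pseq from GHC.Conc in base (Control.Parallel in the parallel
package exports the same two functions) and a strict foldl' inside each
chunk so that no large thunk builds up:

```haskell
import Data.List (foldl')
import GHC.Conc (par, pseq)

-- Split a list into pieces of at most n elements.
-- A non-positive n is treated as 1 so that the recursion always terminates.
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = let (h, t) = splitAt (max 1 n) xs
                in h : chunksOf n t

-- Map f over the list and sum the results, one spark per chunk.
-- (Hypothetical helper; the chunk size needs tuning for the real workload.)
parSumMap :: Int -> (a -> Int) -> [a] -> Int
parSumMap n f xs = sparked `pseq` foldl' (+) 0 sparked
  where
    -- "Map" step: one strict partial sum per chunk, still unevaluated here.
    partials = map (foldl' (\acc x -> acc + f x) 0) (chunksOf n xs)
    -- Spark each partial sum, then hand back the same list.
    sparked  = foldr par partials partials
```

For example, parSumMap 1000 expensive inputs would create one spark per 1000
elements.  The sparks only run on separate cores if the program is built
with -threaded and run with something like +RTS -N2; otherwise the result is
the same but computed sequentially.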

Cheers,
	Simon