[Haskell-cafe] Re: Parallel foldl doesn't work correctly
phil at beadling.co.uk
Sun Dec 13 21:59:29 EST 2009
> -- Prepare to share work to be
> -- done across available cores
> chunkOnCpu :: [a] -> [[a]]
> chunkOnCpu xs = chunk (length xs `div` numCapabilities) xs
> -- Spark a fold of each chunk and
> -- combine the results. Only works
> -- for associative folds.
> foldChunks :: ([a] -> a) -> (a -> b -> a) -> a -> [[b]] -> a
> foldChunks combineFunc foldFunc acc =
>   combineFunc . (parMap rwhnf $ foldl' foldFunc acc)
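For reference, chunk is not defined in the quoted snippet. A minimal sketch, assuming it simply splits a list into consecutive pieces of at most n elements, might look like the following. The guard for a non-positive size is my own addition: length xs `div` numCapabilities is 0 whenever the list is shorter than the number of cores.

```haskell
-- Assumed definition: the quoted code does not show chunk.
-- Split a list into consecutive pieces of at most n elements.
chunk :: Int -> [a] -> [[a]]
chunk n xs
  | null xs   = []
  | n <= 0    = [xs]   -- avoid looping forever on a non-positive size
  | otherwise = piece : chunk n rest
  where
    (piece, rest) = splitAt n xs
```

Note that this version may return more pieces than there are cores when the length does not divide evenly, since the remainder ends up in an extra, shorter chunk.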
I should probably point out that using chunk as above isn't a good idea in
anything beyond a toy example. If you have used a list comprehension to
create your input, then splitting it like this leaves you with thunks
that grow with the list size, because length and chunk force generation
of the whole list before the first chunk can be handed to a spark. This
rapidly negates any advantage gained from processing across more than
one core!
This is easily solved - just alter the generating function to create a
*list* of list comprehensions equal in length to the number of cores you
wish to process across, rather than create one list that is split across
the cores later.
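
As a rough illustration of the idea, here is a sketch that builds one list comprehension per core directly from the bounds of an integer range. The name blockChunks and the choice of contiguous blocks are my own assumptions, not something from the original code. Contiguous blocks keep the elements in order, so the fold still only needs to be associative.

```haskell
import Data.List (foldl')

-- Hypothetical helper: cover [lo .. hi] with k contiguous sublists,
-- each one its own list comprehension. Nothing here calls length or
-- walks a shared list, so each sublist can be generated lazily by
-- whichever spark ends up folding it. Assumes k >= 1.
blockChunks :: Int -> Int -> Int -> [[Int]]
blockChunks k lo hi =
  [ [ x | x <- [start .. stop] ]
  | i <- [0 .. k - 1]
  , let start = lo + i * size
  , let stop  = min hi (start + size - 1)
  ]
  where
    n    = hi - lo + 1
    size = (n + k - 1) `div` k   -- ceiling of n / k

-- Sequential check that the pieces still cover the whole range.
main :: IO ()
main = print (sum (map (foldl' (+) 0) (blockChunks 4 1 100)))
```

The result has the same shape as the output of chunkOnCpu, so it should be usable as the last argument of foldChunks, for example foldChunks sum (+) 0 (blockChunks numCapabilities 1 1000000), with numCapabilities imported from GHC.Conc.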