[Haskell-cafe] transparent parallelization
Tim Harris (RESEARCH)
tharris at microsoft.com
Wed Sep 19 07:27:37 EDT 2007
> The problem with Haskell is not finding opportunities to parallelize,
> they are legion. Actually, quite the opposite, there's so much that your
> code ends up slower than a sequential realization. The hard part is
> making a good cost-model and a good way to create coarser chunks of
> work. It's not worthwhile to spawn a thread (even a very lightweight
> one) for virtually every subexpression.
> Automatic parallelization is easy, efficient parallelization is hard.
We had another crack at this recently in an upcoming ICFP paper (draft online at http://research.microsoft.com/~tharris/papers/2007-fdip.pdf).
In that paper we start out by collecting profiles of the thunk creation, entry, and update operations for a set of benchmarks and conduct a limit study of how fast these dependencies would allow them to be evaluated (e.g. ignoring interactions through the GC, cache effects, the costs of sparking thunks etc.).
We then make the system increasingly more practical, (i) setting a lower bound on the compute-time of thunks that we spark, (ii) making predictions of a thunk's compute-time when it's allocated, and then (iii) building a real implementation based on GHC 6.6.
The parallel speedup dwindles at each stage: this kind of automated approach looks plausible for using "2n" cores instead of using "n", but I'm sceptical about it being able to efficiently exploit "n" cores instead of "1".
More information about the Haskell-Cafe