[Haskell-cafe] Dynamic thread management?

Wed Aug 22 10:39:27 EDT 2007

On Wed, Aug 22, 2007 at 04:07:22AM +0100, Hugh Perkins wrote:
> On 8/21/07, Andrew Coppin <andrewcoppin at btinternet.com> wrote:
> > I highly doubt that automatic threading will happen any time this decade
> > - but I just learned something worth while from reading this email. ;-)
> 
> That's an interesting observation.  I cant say I dont believe it, but
> I'm interested to know why (but it could be just a feeling, or an
> observation in time-to-market lead times?).  Are you saying this
> because multicores arent sufficiently widespread or powerful enough
> yet (4-cores doesnt really even make up for the overhead of using
> automatic threading, at least in initial implementations)? or are you
> saying this because you think the technical implementations are not
> sufficiently advanced?

Automatic threading is inherently limited by data dependencies.
You can't run a function that branches on an argument in parallel
with the computation producing that argument. Even with arbitrarily
many processors and no communication overhead there is a limit to
how much parallelism you can squeeze from any given program.

You should read
"Feedback Directed Implicit Parallelism"
http://research.microsoft.com/~tharris/papers/2007-fdip.pdf
and perhaps
"Limits to Implicit Parallelism in Functional Applications"
http://www.detreville.org/papers/Limits.pdf

In short, with zero overhead and an oracle for scheduling you can
get a factor of at most 2x to 32x by implicitly parallelizing
existing Haskell code. In practice, with execution overhead it's a
gain of perhaps 10% to 80% on a 4-core system. The experiments in
the first paper are based on a fairly sophisticated implementation
that reduces overhead by using profiling results at compile time
to decide which thunks might be worth evaluating in parallel. For a
fixed benchmark there's probably not much lost by using canned
profiling results instead of adapting at runtime, and in any case
the hard bounds from data dependencies still apply.

You can do a lot better if you expect people to rewrite code,
but "automatic threading" suggests something completely effortless.
I think you can get much better results if you work on the programming
style in connection with a better runtime.  You can think of data parallel
Haskell as a new programming style with more implicit parallelims,
and the runtime support to exploit it.

> I kindof think automatic threading is like 3d graphics: as soon as the
> hardware became sufficiently powerful, 3d graphics became trivial.
> Enough money was thrown at the problem in a very short time by a few
> powerful companies that it was a non-issue.

If you have cores to waste, you might try rewrites like

f x 
=>
case x of
  C1 a1 a2 -> f (C1 a1 a2)
  C2 b -> f (C2 b)
  C3 -> f C3

and then speculatively execute several of the case branches.
If you don't throw away too much type information you should
even be able to do it at runtime.

Brandon