[Haskell-cafe] optimising for vector units
Jan-Willem Maessen - Sun Labs East
Janwillem.Maessen at Sun.COM
Mon Jul 26 15:20:08 EDT 2004
Ketil Malde wrote:
> I'm sure somebody, somewhere, is working on speculative execution of
> Haskell code.
I think we both graduated (myself from MIT, Robert Ennals from
Cambridge; I'm building compilers for supercomputers at Sun). If
anyone else is working seriously on speculative evaluation of Haskell
at the moment, please drop me a line; I'd love to hear of your existence.
> Now that Sun is building Niagara with IIRC 8 cores on a
> chip, Intel is rumoured to put up to 16 cores on the post-Montecito
> Tukwila, Sony/IBM's Cell is said to be some kind of parallel design,
> and every self-respecting chip vendor is putting at least two cores on
> a chip and/or adding multithreaded cores.
>
> One would expect a lazy and pure language to be excellent for
> parallelization, since the programmer is generally removed from the
> actual flow of execution anyway.
As I recall, this was one of the original goals of the GRIP project
(which built the original incarnation of GHC, if I have my history
straight).
Indeed, if you read any early paper on strictness analysis, you'd
think that it was all about parallelizing lazy code. I'm now of the
opinion that strictness analysis tells us where to *serialize* our
code, because parallelism just won't be worth it.
There are, I believe, a few major challenges:
* It's easy to identify very small pieces of parallel work, but much
harder to identify large, yet finite, pieces of work. Only the
latter are really worth parallelizing.
* If you don't compute speculatively, you'll never find enough work
to do.
* If you compute speculatively, you need some way to *stop* working
on useless, yet infinite computations. Rob and I had two rather
different takes on the same basic idea: run with some sort of
resource limits, and stop working when you exhaust them (a rough
sketch of the idea follows this list). Roughly speaking, Rob then
made sure you never speculated that bit of code again, while I just
kept relying on the bounds to kick in. On pretty eager code, I did
all right, but on code that actually needed the laziness (this was
after a good bit of compiler munging to wring out most of the
unnecessary laziness) I got killed; Rob's approach fixed the mistake
by patching the code, and did much better thereafter.
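A rough analogy for the resource-limit idea, expressed in library
code rather than in the runtime system where it really belongs: force
a value speculatively under a time budget, and abandon the attempt
when the budget runs out. This only sketches the flavor of the thing
(using timeout, evaluate, and force from today's base and deepseq
libraries); it is not how either of our implementations worked:

  import Control.DeepSeq (NFData, force)
  import Control.Exception (evaluate)
  import System.Timeout (timeout)

  -- Speculatively evaluate x to normal form, giving up after
  -- 'budget' microseconds. Nothing means the speculation was
  -- abandoned; the value can still be demanded (lazily) later.
  speculate :: NFData a => Int -> a -> IO (Maybe a)
  speculate budget x = timeout budget (evaluate (force x))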
My ultimate goal was parallelization. However, I'm now convinced it
would take about 2-3 programmer-years more of effort to realize that
goal. Parallel garbage collection (which is *not* optional on a
parallel Haskell implementation, as our experience with pH
demonstrated) and very fine-grained thunk-update protocols are both
tricky technical challenges. And that leaves aside the question of
whether there's enough coarse-grained work buried among all that
fine-grained work to make it all worthwhile. Anyone want to pay me to
do this? :-)
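For the curious, the thunk-update problem in a nutshell: two
processors may demand the same shared thunk at the same moment, and
the implementation must make sure the result is computed and written
back without races, ideally without a heavyweight lock on every
thunk. Here's a toy model in library code (real implementations do
this on raw heap words with far more care; none of GHC, pH, or Eager
Haskell looks like this):

  import Data.IORef
  import Control.Concurrent (yield)

  -- A toy shared thunk: unevaluated, claimed by some thread
  -- ("blackholed"), or already a value.
  data ThunkState a = Unevaluated (IO a) | BlackHole | Value a

  type Thunk a = IORef (ThunkState a)

  mkThunk :: IO a -> IO (Thunk a)
  mkThunk comp = newIORef (Unevaluated comp)

  -- Demand a thunk: atomically claim it, run the computation, and
  -- publish the value. A thread that finds a blackhole yields and
  -- retries, i.e. it waits for the other thread's update.
  forceThunk :: Thunk a -> IO a
  forceThunk ref = do
      st <- atomicModifyIORef' ref claim
      case st of
        Unevaluated comp -> do
          v <- comp
          atomicModifyIORef' ref (\_ -> (Value v, ()))
          return v
        BlackHole -> yield >> forceThunk ref
        Value v   -> return v
    where
      claim s@(Unevaluated _) = (BlackHole, s)
      claim s                 = (s, s)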
> At some point (for some n), being
> able to spawn n threads will gain you more than a factor c constant
> overhead, and Haskell programs, with a run-time system that can
> evaluate expressions in parallel, will outperform single-threaded C
> code.
Only if you can keep the granularity of the work large. This hasn't
been so easy.
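To put the granularity point in code: with today's
Control.Parallel.Strategies combinators, sparking one tiny task per
list element buries the useful work under spark overhead, while
sparking one task per sizable chunk gives each spark something worth
doing. The chunk size and the 'expensive' function below are made-up
placeholders, purely for illustration:

  import Control.Parallel.Strategies (parListChunk, rseq, using)

  -- A stand-in for real per-element work.
  expensive :: Double -> Double
  expensive x = iterate sqrt (x + 1) !! 1000

  -- Too fine-grained: one spark per element; overhead swamps the work.
  sumFine :: [Double] -> Double
  sumFine xs = sum (map expensive xs `using` parListChunk 1 rseq)

  -- Coarser: one spark per 10,000-element chunk, so each spark is a
  -- real unit of work and the parallelism has a chance to pay off.
  sumCoarse :: [Double] -> Double
  sumCoarse xs = sum (map expensive xs `using` parListChunk 10000 rseq)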
-Jan-Willem Maessen
Eager Haskell Implementor