[Haskell-cafe] optimising for vector units

Ketil Malde ketil+haskell at ii.uib.no
Tue Jul 27 02:56:59 EDT 2004

Jan-Willem Maessen - Sun Labs East <Janwillem.Maessen at Sun.COM> writes:

> There are, I believe, a couple of major challenges:
>    * It's easy to identify very small pieces of parallel work, but much
>      harder to identify large, yet finite, pieces of work.  Only the
>      latter are really worth parallelizing.

By the former, are you thinking of work so small-grained that it is
already handled by the out-of-order execution units in the CPU?
And/or by the C compiler?

>    * If you don't compute speculatively, you'll never find enough work
>      to do.

Although I'm not familiar with the issues, my point is that the number
of CPUs available, even in common household PCs, is already more
than one (P4 hyper-threading), and could be something like eight in
the not-so-distant future.  It no longer matters (much) if you waste
cycles; cycles are cheap.  (The next-but-one IA64, Montecito, is 1.7
billion transistors, including 24 MB of on-chip cache.  The P4 is big,
but you could fit thirty of them in that space.  There is no way
Montecito will have anywhere near 30x the performance.)

So speculative execution, even if you end up throwing away 50% of the
work you do, could in theory make your program faster anyway.  This is
a headache for C programs; my hope would be that a functional language
would make it easier.
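For illustration, here is a minimal sketch of what "cheap" speculation
looks like in GHC's Control.Parallel: `par` merely *sparks* the
evaluation of its first argument, so a spare CPU may pick it up, or the
runtime may silently discard it if the result turns out to be unneeded.
(The naive `fib` and the `speculate` wrapper are my own hypothetical
examples, not anything from the discussion above.)

```haskell
import Control.Parallel (par, pseq)

-- Deliberately naive Fibonacci, just to have some work to speculate on.
fib :: Integer -> Integer
fib m | m < 2     = m
      | otherwise = fib (m - 1) + fib (m - 2)

-- Spark `a` speculatively while the current thread evaluates `b`.
-- If no core is free, the spark "fizzles" and `a` is computed on
-- demand -- wasted sparks cost almost nothing.
speculate :: Integer -> Integer
speculate n = a `par` (b `pseq` (a + b))
  where
    a = fib (n - 1)
    b = fib (n - 2)

main :: IO ()
main = print (speculate 20)  -- same answer as fib 20, namely 6765
```

The point is exactly the one made above: the program's meaning does not
depend on whether the speculative work is ever used, which is what makes
this painless compared to C.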

>    * If you compute speculatively, you need some way to *stop* working
>      on useless, yet infinite computations.

And you need to choose which computations to start working on, I guess.
Predicting the future never was easy :-)
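One blunt way to "stop working on useless, yet infinite computations"
in GHC today is to race the speculative work against a clock: a sketch
using System.Timeout, where the divergent computation is abandoned
after the deadline (the names `cutOff` and the 0.1 s budget are my own
hypothetical choices).

```haskell
import Control.Exception (evaluate)
import System.Timeout (timeout)

-- Give a (here, deliberately divergent) speculative computation a
-- 100 ms budget.  `timeout` kills the work with an asynchronous
-- exception, so the abandoned loop is simply garbage-collected.
cutOff :: IO (Maybe Int)
cutOff = timeout 100000 (evaluate (length (repeat ())))

main :: IO ()
main = cutOff >>= print  -- the infinite length never finishes: Nothing
```

Choosing *which* speculations deserve a budget is, of course, the hard
prediction problem above.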

[perhaps getting off-topic, but hey, this is -cafe]

If I haven't seen further, it is by standing in the footprints of giants
