[Haskell-cafe] optimising for vector units

MR K P SCHUPKE k.schupke at imperial.ac.uk
Wed Jul 28 12:05:00 EDT 2004

Erm, when I said no overhead I meant there is no overhead to choosing
an instruction from a different thread compared to choosing an instruction
from the same thread... Obviously the overall scheduling overhead will 

>The real killer, of course, is memory latency.  The cache resources
>required to hold pending work can't be used to hold heap data

I would have thought cache thrashing caused by two CPUs accessing the
same address repeatedly would be the real killer... Of course this
is not so much of a problem with hyper-threading where both
'virtual' CPUs access the same cache.

>* All your functions must be strict in their arguments,

I don't see that I am making that assumption. Obviously parts of
the program that are in the IO monad are strict, so we start with
the demand for a value required strictly by an IO function. This
value will be the result of a function, so we take the graph
representing the function, and carry on as suggested. This is
definitely lazy as parameters not affecting the result will never
be evaluated.
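
As a minimal sketch of that last point (the names `pick` and `expensive` are made up for illustration, not taken from any proposal in this thread): the only strict consumer below is `print` in the IO monad, and demand flows backwards from it through the function body. The second argument does not affect the result, so it is never evaluated, even though evaluating it would raise an error.

```haskell
-- Hypothetical function: the second parameter does not affect the result.
pick :: Int -> Int -> Int
pick x _unused = x + 1

-- Hypothetical argument whose evaluation would be immediately visible,
-- because forcing it raises an error.
expensive :: Int
expensive = error "expensive was evaluated"

main :: IO ()
main =
  -- print is the strict IO consumer: it demands the value of the call.
  -- That demand reaches (x + 1) and then 41, but never `expensive`.
  print (pick 41 expensive)
```

Run as written this prints 42 and exits normally; the error in `expensive` would only fire if `pick` were changed to inspect its second argument.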

I appreciate there are some unanswered hard questions, but most
of these are to do with efficiency - not whether it is possible...

Having an inefficient implementation now rather than nothing
might just be what is needed to get people looking at the
efficiency issues.

Thanks for the references by the way - although I am quite
familiar with the Monsoon architecture - and the Pentium
architecture (which since the PPro has been internally dataflow...
a superscalar processor is really a dataflow computer emulating a
traditional processor, and interpreting its instructions).

