[Haskell-cafe] Two-iteration optimisation (was: GHC Predictability)

Wed May 14 14:53:58 EDT 2008

Albert Y. C. Lai wrote:
> If you worry that the sum thread and the length thread are not 
> synchronized and therefore there is still no bound on the list prefix 
> kept in memory, I'm sure you can improve it by one of the chunking 
> strategies.

I'm more worried that data now has to go through tiny CPU <--> RAM 
busses rather than zipping along at the CPU's core frequency as it would 
if only 1 CPU were involved. I would think the cache issues make the 
sequential version a fair bit faster. (Not to mention that no GC is 
required.)

> As our hardware grows more cores but not more gigahertz, it may become 
> detrimal to fuse. Fusion implies entangling, an anti-thesis to 
> parallelism. One day we may call this an optimisation: unfuse code 
> hand-fused by programmers, then parallelize. Optimisation people on 
> the imperative side are already doing this (they have more hand-fused 
> code than we do), though targetting at older hardware like vector 
> machines and SIMD, not yet multiple cores.

The key to parallel processing is splitting a task into *independent* 
tasks so you can do them in parallel. If they're still interdependent 
[like the above], I suspect you're not gaining much.