Control.Parallel.Strategies.parMap CPU usage

Fri Mar 13 11:02:27 EDT 2009

Simon Marlow wrote:
> Christian Hoener zu Siederdissen wrote:
> 
>> when using parMap (or parList and demanding) I see a curious pattern 
>> in CPU usage.
>> Running "parMap rnf fib [1..100]" gives the following pattern of used 
>> CPUs:
>> 4,3,2,1,4,3,2,1,...
> 
> How did you find out which CPU is being used?

Sorry for the misunderstanding: the "pattern of used CPUs" is the _number_ of active cores, not
which cores are in use. I am cycling through 4 down to 1 active CPUs while there is definitely work
that an idle core could be doing. Essentially, parMap seems to divide the list of thunks into blocks
of 4 (or n with -Nn) and finishes each block before moving on to the next one.

This is easy to see by running the program and watching the number of active threads in htop / top.
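
To make the arithmetic behind that pattern concrete, here is a small model. It is only a sketch
under two assumptions: each task costs twice as much as the previous one, and all tasks of a block
start together with nothing else scheduled until the block is done. The names cost, phases and
utilisation are made up for this illustration; nothing here measures GHC itself.

```haskell
import Data.List (sort)

-- Idealised cost model (assumption): fib (n+1) takes twice as long as fib n.
cost :: Int -> Integer
cost n = 2 ^ n

-- For one block whose tasks all start at time 0, list the phases as
-- (duration, number of active cores) until the last task finishes.
phases :: [Integer] -> [(Integer, Int)]
phases costs = go 0 (sort costs)
  where
    go _ [] = []
    go t cs@(c : rest)
      | c == t    = go t rest                        -- finished at the same instant
      | otherwise = (c - t, length cs) : go c rest   -- shortest remaining task ends at c

-- Fraction of the available core time that is spent on useful work.
utilisation :: [Integer] -> Double
utilisation costs =
  fromIntegral (sum costs)
    / fromIntegral (toInteger (length costs) * maximum costs)

main :: IO ()
main = do
  let block = map cost [1 .. 4]
  mapM_ print (phases block)
  print (utilisation block)
```

In this model a block of four goes through phases with 4, 3, 2 and 1 active cores, and less than
half of the available core time is used, because the whole block waits for its most expensive
element.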

> 
>> The fib function requires roughly two times the time if we go from 
>> fib(n) to fib(n+1), meaning that calculating the next element in the 
>> list always takes longer than the current. What I would like is a 
>> version of parMap that directly takes a free CPU and lets it calculate 
>> the next result, giving the usage pattern 4,4,4,4,...
> 
> In GHC you don't have any control over which CPU is used to execute a 
> spark.  We use dynamic load-balancing, which means the work distribution 
> is essentially random, and will change from run to run.
> 
> If you want more explicit control over your work distribution, try using 
> GHC.Conc.forkOnIO.
> 
> Also note that the implementation of much of this stuff is changing 
> rapidly, so you might want to try a recent snapshot.  Take a look at our 
> paper, if you haven't already:
> 
> http://www.haskell.org/~simonmar/papers/multicore-ghc.pdf
> 
> Cheers,
>     Simon

Hopefully I will find the time to try the latest HEAD and see whether the idle pattern (better
name?) persists.
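
In the meantime, the 4,4,4,4 behaviour I am after could be approximated by hand, along the lines of
the forkOnIO suggestion: one worker thread per capability, each taking the next pending element as
soon as it is free. The following is a rough sketch and not a replacement for parMap. pooledMap is
a made-up name, results are only forced to weak head normal form, and it uses forkOn, which is what
forkOnIO is called in current versions of base.

```haskell
import Control.Concurrent (forkOn, getNumCapabilities)
import Control.Concurrent.MVar
import Control.Exception (evaluate)
import Control.Monad (forM, forM_)

-- Naive Fibonacci, used only as a CPU-bound workload.
fib :: Int -> Integer
fib n
  | n < 2     = toInteger n
  | otherwise = fib (n - 1) + fib (n - 2)

-- Hypothetical helper: apply f to every element with one worker per
-- capability. A worker that becomes free takes the next pending element.
pooledMap :: (a -> b) -> [a] -> IO [b]
pooledMap f xs = do
  n <- getNumCapabilities
  -- Pair every input with an empty slot for its result.
  slots <- forM xs $ \x -> do
    out <- newEmptyMVar
    return (x, out)
  queue <- newMVar slots
  let worker = do
        next <- modifyMVar queue $ \q -> case q of
          []       -> return ([], Nothing)
          (j : js) -> return (js, Just j)
        case next of
          Nothing       -> return ()          -- queue is empty, worker exits
          Just (x, out) -> do
            y <- evaluate (f x)               -- forces to WHNF only
            putMVar out y
            worker
  -- Pin one worker to each capability.
  forM_ [0 .. n - 1] $ \cpu -> forkOn cpu worker
  -- Collect the results in input order.
  mapM (takeMVar . snd) slots

main :: IO ()
main = do
  rs <- pooledMap fib [1 .. 30]
  print (sum rs)
```

This needs to be compiled with -threaded and run with +RTS -N4 (or whatever -Nn applies) to use
more than one core. If f throws an exception the corresponding slot is never filled and the caller
blocks, so a real version would need error handling.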

Regards,
Christian