parallel garbage collection performance

Tue Jun 19 02:49:59 CEST 2012

On 19/06/2012, at 24:48 , Tyson Whitehead wrote:

> On June 18, 2012 04:20:51 John Lato wrote:
>> Given this, can anyone suggest any likely causes of this issue, or
>> anything I might want to look for?  Also, should I be concerned about
>> the much larger gc_alloc_block_sync level for the slow run?  Does that
>> indicate the allocator waiting to alloc a new block, or is it
>> something else?  Am I on completely the wrong track?
> 
> A total shot in the dark here, but wasn't there something about really bad 
> performance when you used all the CPUs on your machine under Linux?
> 
> Presumably very tight coupling that is causing all the threads to stall 
> every time the OS needs to do something or something?

This can be a problem for data parallel computations (like in Repa). In Repa all threads in the gang are supposed to run for the same length of time, but if one gets descheduled by the OS then the whole gang is stalled.

I tend to get the best results using -N7 on an 8-core machine, which leaves one core free for the OS and other processes.

It is also important to enable thread affinity (with the -qa flag).

For a Repa program on an 8-core machine I use +RTS -N7 -qa -qg, where -qg turns off the parallel garbage collector.
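
As a rough sketch of how the flags above might be put together for a machine with a given core count: the program name Main is a placeholder, not something from this thread, and the binary is assumed to have been built with GHC's -threaded and -rtsopts options so that the RTS accepts these flags. The "one core fewer than the machine has" rule is just the -N7-on-8-cores advice generalised, so treat it as a starting point to measure against rather than a fixed rule.

```shell
# Hypothetical example: Main is a placeholder program name, assumed to be
# built with -threaded and -rtsopts.
cores=8                      # number of cores on the machine (8 in this post)
caps=$((cores - 1))          # one capability fewer than the number of cores
rts_flags="+RTS -N${caps} -qa -qg -RTS"
echo "./Main ${rts_flags}"   # only prints the command line; does not run it
```

With cores=8 this prints ./Main +RTS -N7 -qa -qg -RTS. Adding -s to the RTS flags makes the runtime print GC statistics on exit, which is handy for comparing runs with different -N values.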

Ben.