parallel garbage collection performance

Simon Marlow marlowsd at
Wed Jun 27 14:20:09 CEST 2012

On 26/06/2012 00:42, Ryan Newton wrote:
>     However, the parallel GC will be a problem if one or more of your
>     cores is being used by other process(es) on the machine.  In that
>     case, the GC synchronisation will stall and performance will go down
>     the drain.  You can often see this on a ThreadScope profile as a big
>     delay during GC while the other cores wait for the delayed core.
>       Make sure your machine is quiet and/or use one fewer cores than
>     the total available.  It's not usually a good idea to use
>     hyperthreaded cores either.
> Does it ever help to set the number of GC threads greater than
> numCapabilities to over-partition the GC work?  The idea would be to
> enable some load balancing in the face of perturbation from external
> load on the machine...
> It looks like GHC 6.10 had a "-g" flag for this that.... later went away?

The GC threads map one-to-one onto mutator threads now (since 6.12).
This change was crucial for performance: before that, we hardly ever got
any speedup from parallel GC because there was no guarantee of locality.
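
Since there is one GC thread per mutator thread, the practical way to act
on the "one fewer cores" advice quoted above is to limit the number of
capabilities (typically with +RTS -N<n>). Below is a minimal sketch of
doing the same from inside a program. It assumes the program is linked with
-threaded and a base library that provides getNumProcessors and
setNumCapabilities; capabilitiesFor is a hypothetical helper, not part of
GHC. Note that getNumProcessors may count hyperthreaded logical cores too,
so this is only a rough approximation of "physical cores minus one".

```haskell
module Main (main) where

import Control.Concurrent (getNumCapabilities, setNumCapabilities)
import GHC.Conc (getNumProcessors)

-- | Hypothetical helper (not part of GHC): leave one processor free for
-- other processes on the machine, but never go below one capability.
capabilitiesFor :: Int -> Int
capabilitiesFor processors = max 1 (processors - 1)

main :: IO ()
main = do
  processors <- getNumProcessors
  -- Assumption: linked with -threaded; the non-threaded RTS cannot run
  -- more than one capability.
  setNumCapabilities (capabilitiesFor processors)
  caps <- getNumCapabilities
  putStrLn ("processors: " ++ show processors
            ++ ", capabilities: " ++ show caps)
```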

I don't think it would help to have more threads.  Load balancing is
already done with work stealing; the work isn't statically partitioned.
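
To illustrate the difference, here is a toy model, not GHC's actual GC
code. It compares the finish time of a fixed up-front split of work items
with a dynamic scheme in which each item goes to whichever worker is free
first, which is a rough stand-in for idle threads stealing work. The
function names and the cost list are made up for the example, and the
worker count is assumed to be at least 1.

```haskell
module Main (main) where

import Data.List (delete)

-- | Finish time when item i is fixed to worker (i `mod` n) up front.
staticMakespan :: Int -> [Int] -> Int
staticMakespan n costs =
  maximum
    [ sum [c | (i, c) <- zip [0 ..] costs, i `mod` n == w]
    | w <- [0 .. n - 1]
    ]

-- | Finish time when each item is taken by the least-loaded worker,
-- approximating idle workers picking up outstanding work.
dynamicMakespan :: Int -> [Int] -> Int
dynamicMakespan n costs = maximum (foldl assign (replicate n 0) costs)
  where
    assign loads c =
      let lightest = minimum loads
      in (lightest + c) : delete lightest loads

main :: IO ()
main = do
  -- One expensive item followed by seven cheap ones (made-up costs).
  let costs = [8, 1, 1, 1, 1, 1, 1, 1]
  putStrLn ("static:  " ++ show (staticMakespan 2 costs))
  putStrLn ("dynamic: " ++ show (dynamicMakespan 2 costs))
```

With these costs and two workers, the static split finishes at 11 and the
dynamic one at 8, without needing more workers than cores.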


More information about the Glasgow-haskell-users mailing list