[Haskell-cafe] Re: Real-time garbage collection for Haskell
Simon Marlow
marlowsd at gmail.com
Sat Mar 6 07:42:39 EST 2010
On 06/03/10 06:56, Simon Cranshaw wrote:
> For settings we are using -N7 -A8m -qg.
I'd be surprised if turning off parallel GC (-qg) improves things, unless
you really aren't using all the cores (ThreadScope will tell you that).
Do these flags give you an improvement in throughput, or just in pause times?
> I don't know if they are really the optimal values but I haven't found a
> significant improvement on these yet. I tried -qb but that was slow.
Interesting, I often find that -qb improves things.
> I tried larger values of A but that didn't seem to make a big
> difference.
-A8m is close to the size of your L2 caches, right? That will certainly
be better than the default of -A512k.
> Also -N6 didn't make much difference. Specifying H values didn't seem
> to make much difference.
-H is certainly a mixed bag when it comes to parallel programs.
> I have to admit I don't fully understand the
> implications of the values and was just experimenting to see what worked
> best.
So the heap size is trading off locality (cache hits) against GC time.
The larger the heap, the fewer GCs you do, but the worse the locality.
Usually I find keeping the nursery size (-A) close to the L2 cache size
works best, although sometimes making it really big can be even better.
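To put rough numbers on the "fewer GCs" side of that trade-off, here is a
back-of-the-envelope sketch. It assumes a single nursery that is collected
each time it fills, and it ignores major collections and the per-core
nurseries you get with -N, so treat the results as orders of magnitude only.
The name minorGcEstimate and the 4GB allocation figure are made up for
illustration.

```haskell
-- Rough estimate of the number of minor collections: one collection
-- each time the nursery fills. Assumes a single nursery and ignores
-- major GCs, so this is an approximation, not what the RTS computes.
minorGcEstimate :: Integer -> Integer -> Integer
minorGcEstimate allocatedBytes nurseryBytes = allocatedBytes `div` nurseryBytes

main :: IO ()
main = do
  let allocated = 4 * 1024 ^ (3 :: Int)  -- hypothetical 4GB allocated in total
      kib = 1024
      mib = 1024 * 1024
  putStrLn ("-A512k: " ++ show (minorGcEstimate allocated (512 * kib)))
  putStrLn ("-A8m:   " ++ show (minorGcEstimate allocated (8 * mib)))
```

Under those assumptions, going from -A512k to -A8m cuts the collection count
by a factor of 16; whether that wins overall depends on how much locality
you lose.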
-qg disables parallel GC completely. This is usually terrible for
locality, because every GC will move all the recently allocated data
from each CPU's L2 cache into the cache of the CPU doing GC, where it
will have to be fetched out again after GC.
-qb disables load-balancing in the parallel GC, which improves locality
at the expense of parallelism. I usually find it is an improvement in
parallel programs.
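To tell throughput apart from pause times when comparing -qg, -qb and the
default, something along these lines might help. It is only a sketch: it
assumes a GHC recent enough to ship GHC.Stats with getRTSStats (base 4.10 or
later), the workload function is a hypothetical stand-in for your real
program, and the mean time per collection is just a proxy for pause time,
not the worst case. The program has to be run with +RTS -T (or -s) so that
the RTS collects statistics.

```haskell
import Data.Int (Int64)
import Data.Word (Word32)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- | Given total GC time (ns), total mutator time (ns) and the number of
-- collections, return the fraction of elapsed time spent in GC (a rough
-- inverse of throughput) and the mean elapsed time per collection in
-- milliseconds (an average, not the maximum pause).
gcSummary :: Int64 -> Int64 -> Word32 -> (Double, Double)
gcSummary gcNs mutNs nGcs = (gcFraction, meanPauseMs)
  where
    total = fromIntegral (gcNs + mutNs) :: Double
    gcFraction
      | total <= 0 = 0
      | otherwise = fromIntegral gcNs / total
    meanPauseMs
      | nGcs == 0 = 0
      | otherwise = fromIntegral gcNs / fromIntegral nGcs / 1.0e6

-- Hypothetical stand-in workload that simply allocates a lot;
-- replace it with the real computation.
workload :: Int -> Int
workload n = length (show (product [1 .. fromIntegral n :: Integer]))

main :: IO ()
main = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "Run with +RTS -T (or -s) to enable RTS statistics."
    else do
      print (workload 20000)
      s <- getRTSStats
      let (frac, pauseMs) =
            gcSummary (gc_elapsed_ns s) (mutator_elapsed_ns s) (gcs s)
      putStrLn ("collections:        " ++ show (gcs s))
      putStrLn ("GC time fraction:   " ++ show frac)
      putStrLn ("mean GC time (ms):  " ++ show pauseMs)
```

Build with -threaded -rtsopts and run the same binary with each set of
flags, e.g. +RTS -N7 -A8m -qg -T and then +RTS -N7 -A8m -qb -T. If the GC
time fraction drops but the mean time per collection does not, the flags
are helping throughput rather than pauses.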
Cheers,
Simon