Removing latency spikes. Garbage collector related?

Gregory Collins greg at
Tue Sep 29 15:33:06 UTC 2015

On Tue, Sep 29, 2015 at 2:03 AM, Will Sewell <me at> wrote:

> * I then tried a value of -A2048k because he also said "using a very
> large young generation size might outweigh the cache benefits". I
> don't exactly know what he meant by "a very large young generation
> size", so I guessed at this value. Is it in the right ballpark?

I usually use 2-8M for this value, depending on the chip. Most values in
the young generation are going to be garbage, and collection is
O(num_live_objects), so as long as you can keep this buffer and your
working set (i.e. the long-lived stuff that doesn't get GC'ed) in L3 cache,
higher values are better. I expect there is another such phase transition
as you set -A around the L2 cache size, but everything depends on what your
program is actually doing. Keeping a smaller young generation will mean
that those cache lines are hotter than they would be if you set it larger,
which increases L2 cache pressure and can evict working set. So with a
smaller -A you might shorten the average GC pause (helping with tail
latency) at the expense of doing GC more often and maybe reducing the
amount of L2 cache available to the rest of the program.
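
To make the sizing rule above concrete, here is a rough sketch of the
arithmetic: take the L3 size, subtract an estimate of the working set, and
split what is left across the capabilities (each capability gets its own
allocation area of size -A). The names nurseryBudget and toFlag are
hypothetical helpers, not part of GHC, and the cache and working-set
numbers are made-up examples; measure your own before trusting the result.

```haskell
module Main where

-- | Hypothetical helper (not a GHC API). All sizes are in bytes.
-- Returns the largest per-capability allocation area such that all
-- nurseries plus the working set still fit in the L3 cache, or Nothing
-- when no nursery size can satisfy that rule.
nurseryBudget :: Int -> Int -> Int -> Maybe Int
nurseryBudget l3Bytes workingSetBytes capabilities
  | capabilities <= 0 = Nothing
  | spare <= 0        = Nothing
  | otherwise         = Just (spare `div` capabilities)
  where
    spare = l3Bytes - workingSetBytes

-- | Render a byte count as an -A flag, rounding down to whole kilobytes.
toFlag :: Int -> String
toFlag bytes = "-A" ++ show (bytes `div` 1024) ++ "k"

main :: IO ()
main = do
  let mib = 1024 * 1024
  -- Example numbers only: 8 MiB of L3, a 3 MiB working set, one capability.
  case nurseryBudget (8 * mib) (3 * mib) 1 of
    Just b  -> putStrLn (toFlag b)
    Nothing -> putStrLn "working set does not fit in L3"
```

With these example numbers the program prints -A5120k, which sits inside
the 2-8M range mentioned above. Treat it as a starting point for
benchmarking rather than an answer.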

> * With -H, I tried values of -H8m, -H32m, -H128m, -H512m, -H1024m.
> But all led to worse performance than the defaults (and -H didn't
> really have much effect at all).

What you should expect to see as you increase -H is that major GC pauses
become less frequent, but average GC times go up. Posting the output of
+RTS -S will also help us understand your GC behaviour, since I wouldn't
expect to see 1s pauses on any but the largest heaps. Are you using large

Gregory Collins <greg at>

More information about the Glasgow-haskell-users mailing list