Removing latency spikes. Garbage collector related?

John Lato jwlato at gmail.com
Tue Sep 29 15:47:20 UTC 2015


By dumping metrics, I mean essentially the same thing as the
ghc-events-analyze annotations, plus any additional information that is
useful for the investigation.  In particular, if you have a message id,
include that. You may also want to label threads with GHC.Conc.labelThread,
and to add more annotations to drill down if you uncover a problem area.
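
As a rough sketch of what such annotations could look like: the helper
below wraps an action in "START <label>" / "STOP <label>" user events (the
convention ghc-events-analyze recognises) and labels the current thread.
The names withMessageEvent and handleMessage, the "msg-<id>" label format
and the thread label "worker-main" are made up for illustration. It assumes
the program is linked with eventlog support and run with +RTS -l.

```haskell
import Control.Concurrent (myThreadId)
import Control.Exception (bracket_)
import Debug.Trace (traceEventIO)
import GHC.Conc (labelThread)

-- | Wrap an action in START/STOP user events that carry a message id.
-- The "msg-<id>" label format is an assumption chosen for this sketch.
withMessageEvent :: Int -> IO a -> IO a
withMessageEvent msgId =
  bracket_ (traceEventIO ("START " ++ label))
           (traceEventIO ("STOP " ++ label))
  where
    label = "msg-" ++ show msgId

-- | Hypothetical stand-in for the real per-message work.
handleMessage :: Int -> IO ()
handleMessage i = putStrLn ("handled " ++ show i)

main :: IO ()
main = do
  -- Give the thread a readable name in the eventlog.
  tid <- myThreadId
  labelThread tid "worker-main"
  mapM_ (\i -> withMessageEvent i (handleMessage i)) [1 .. 3]
```

bracket_ emits the STOP event even if the action throws, so an interval is
never left open. A distinct label per message is mainly useful when reading
the text eventlog; for ghc-events-analyze summaries a fixed label per kind
of work is probably the better fit.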

If I were investigating, I would take e.g. the five largest outliers, then
look in the (text) eventlog for those message ids, and see what happened
between the start and stop.  You'll likely want to track the thread states
(which is why I suggested you annotate the thread names).
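
A small sketch of that workflow, under stated assumptions: latencies have
been collected as (message id, latency) pairs, the eventlog has been dumped
to text (for example with "ghc-events show"), and each annotation line ends
with "START msg-<id>" or "STOP msg-<id>". That label format and the names
largestOutliers and eventsFor are hypothetical, not part of any tool.

```haskell
import Data.List (isSuffixOf, sortBy)
import Data.Ord (Down (..), comparing)

type MessageId = Int

-- | The n samples with the highest latency, largest first.
largestOutliers :: Int -> [(MessageId, Double)] -> [(MessageId, Double)]
largestOutliers n = take n . sortBy (comparing (Down . snd))

-- | Lines of a text eventlog from the START to the STOP event of one
-- message, inclusive. Matching on the line suffix keeps "msg-4" from
-- also matching "msg-42".
eventsFor :: MessageId -> [String] -> [String]
eventsFor msgId = takeUntilStop . dropWhile (not . isStart)
  where
    label = "msg-" ++ show msgId
    isStart line = ("START " ++ label) `isSuffixOf` line
    isStop line = ("STOP " ++ label) `isSuffixOf` line
    takeUntilStop ls = case break isStop ls of
      (before, stop : _) -> before ++ [stop]
      (before, [])       -> before

main :: IO ()
main = do
  let samples = [(1, 0.8), (2, 41.0), (3, 1.1), (4, 97.5), (5, 0.9), (6, 12.3)]
  mapM_ print (largestOutliers 5 samples)
```

Everything that eventsFor returns for an outlier (GC events, thread
scheduling, other annotations) happened while that message was in flight,
which is the part worth reading closely.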

I'm not convinced it's entirely the GC: the latencies are larger than I
would expect from a GC pause (although lots of factors can affect that). I
suspect that either you have something causing abnormal GC spikes, or
there's a different cause.
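
One way to check this from inside the program is to compare the wall-clock
time of a slow operation with the GC time accumulated over the same
interval. The sketch below is only an illustration: it assumes GHC 8.2 or
later (for GHC.Stats.getRTSStats) and a program run with +RTS -T, and the
names GcShare, gcShare and measured are invented here. The RTS counters
are process-wide, so with several threads the GC time is not attributable
to a single request.

```haskell
import Data.Int (Int64)
import Data.Word (Word32)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- | Wall-clock time, GC time and number of collections over an interval.
data GcShare = GcShare
  { wallNs      :: Int64
  , gcNs        :: Int64
  , collections :: Word32
  } deriving (Show, Eq)

-- | Difference between two snapshots of the RTS statistics.
gcShare :: RTSStats -> RTSStats -> GcShare
gcShare before after = GcShare
  { wallNs      = elapsed_ns after - elapsed_ns before
  , gcNs        = gc_elapsed_ns after - gc_elapsed_ns before
  , collections = gcs after - gcs before
  }

-- | Run an action and report how much of its duration was spent in GC.
-- Returns Nothing for the statistics when the program was not started
-- with +RTS -T.
measured :: IO a -> IO (Maybe GcShare, a)
measured action = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then do
      r <- action
      return (Nothing, r)
    else do
      before <- getRTSStats
      r <- action
      after <- getRTSStats
      return (Just (gcShare before after), r)

main :: IO ()
main = do
  (share, total) <- measured (return $! sum [1 .. 1000000 :: Int])
  print total
  print share
```

If gcNs is a small fraction of wallNs for the slow messages, the GC alone
probably does not explain the spike.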

On 04:15, Tue, Sep 29, 2015 Will Sewell <me at willsewell.com> wrote:

> Thanks for the reply John. I will have a go at doing that. What do you
> mean exactly by dumping metrics, do you mean measuring the latency
> within the program, and dumping it if it exceeds a certain threshold?
>
> And from the answers I'm assuming you believe it is the GC that is
> most likely causing these spikes. I've never profiled Haskell code, so
> I'm not used to seeing what the effects of the GC actually are.
>
> On 28 September 2015 at 19:31, John Lato <jwlato at gmail.com> wrote:
> > Try Greg's recommendations first.  If you still need to do more
> > investigation, I'd recommend that you look at some samples with either
> > threadscope or dumping the eventlog to text.  I really like
> > ghc-events-analyze, but it doesn't provide quite the same level of
> > detail. You may also want to dump some of your metrics into the eventlog,
> > because then you'll be able to see exactly how high latency episodes
> > line up with GC pauses.
> >
> > On Mon, Sep 28, 2015 at 1:02 PM Gregory Collins <greg at gregorycollins.net> wrote:
> >>
> >>
> >> On Mon, Sep 28, 2015 at 9:08 AM, Will Sewell <me at willsewell.com> wrote:
> >>>
> >>> If it is the GC, then is there anything that can be done about it?
> >>
> >> - Increase value of -A (the default is too small) -- best value for
> >>   this is L3 cache size of the chip
> >> - Increase value of -H (total heap size) -- this will use more ram but
> >>   you'll run GC less often
> >> - This will sound flip, but: generate less garbage. Frequency of GC runs
> >>   is proportional to the amount of garbage being produced, so if you can
> >>   lower mutator allocation rate then you will also increase net
> >>   productivity. Built-up thunks can transparently hide a lot of
> >>   allocation so fire up the profiler and tighten those up (there's an
> >>   80-20 rule here). Reuse output buffers if you aren't already, etc.
> >>
> >> G
> >>
> >> --
> >> Gregory Collins <greg at gregorycollins.net>