Removing latency spikes. Garbage collector related?

Will Sewell me at willsewell.com
Mon Sep 28 16:08:33 UTC 2015


Hi, I was told in the #haskell IRC channel that this would be a good
place to ask this question, so here goes!

We’re writing a low-latency messaging system, but we are seeing
frequent latency spikes. This graph shows the end-to-end latency of
messages through the system (yellow to red is the 90th percentile):
http://i.imgur.com/GZ0Ek98.png

I have tried to eliminate the problem by removing parts of the system
that I suspected to be expensive, but the spikes are still there.

I now suspect the GC. In this output from ghc-events-analyze, work on
the GC thread (red) seems to block work on the main program thread
(green): http://i.imgur.com/4YO5q4U.png (the x axis is time, and the
darkness of each bucket is the percentage of CPU time).

*Note: the two graphs are from different runs, but both are typical.*

1. Do you think the GC is the most likely culprit?
2. Is there anything I can do to confirm this hypothesis? (I looked
   into turning off the GC, but this seems tricky.)
3. If it is the GC, is there anything that can be done about it?
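
On confirming the hypothesis, one thing I could try is reading the
RTS's own GC counters around each message and checking whether the slow
messages are the ones that overlap a collection. Below is a rough
sketch of that idea, not code from our system. It assumes a GHC recent
enough to provide getRTSStats in GHC.Stats (8.2 or later) and a binary
linked with -with-rtsopts=-T so the RTS collects statistics. The names
GcSample and withGcSample are made up for illustration, and the
workload in main is only a stand-in for handling one message.

```haskell
import Control.Exception (evaluate)
import Data.Int (Int64)
import Data.Word (Word32)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- | What the RTS reports for the span of one action (hypothetical type).
data GcSample = GcSample
  { wallNs   :: Int64   -- elapsed wall-clock time, in nanoseconds
  , gcWallNs :: Int64   -- part of that time spent in the GC
  , gcCount  :: Word32  -- collections (minor and major) during the action
  , majorGcs :: Word32  -- major collections during the action
  } deriving Show

-- | Run an action and report the change in the RTS counters across it.
-- Returns Nothing for the sample when RTS statistics are disabled,
-- because getRTSStats throws in that case.
withGcSample :: IO a -> IO (a, Maybe GcSample)
withGcSample act = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then do
      r <- act
      pure (r, Nothing)
    else do
      before <- getRTSStats
      r <- act
      after <- getRTSStats
      pure
        ( r
        , Just GcSample
            { wallNs   = elapsed_ns after - elapsed_ns before
            , gcWallNs = gc_elapsed_ns after - gc_elapsed_ns before
            , gcCount  = gcs after - gcs before
            , majorGcs = major_gcs after - major_gcs before
            }
        )

main :: IO ()
main = do
  -- Stand-in for processing one message: an allocation-heavy computation.
  let work = evaluate (length (show (product [1 .. 5000 :: Integer])))
  (_, sample) <- withGcSample work
  case sample of
    Nothing -> putStrLn "RTS stats are disabled; link with -with-rtsopts=-T"
    Just s  -> print s
```

If the messages with the highest wallNs are also the ones with a
non-zero majorGcs and a large gcWallNs, that would point at the GC.
The counters are process-wide, so with several threads a sample also
includes collections triggered by other threads.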

Thanks!
Will


More information about the Glasgow-haskell-users mailing list