Docs for RTS option -I

Teofil Camarasu teofilcamarasu at gmail.com
Thu Nov 2 15:17:23 UTC 2023


Hi Bryan,

Thanks for improving this documentation! I've often found these flags
to be quite confusing.

> For an interactive application, it is probably a good idea to use the idle GC, because this will allow finalizers to run and deadlocked threads to be detected in the idle time when no Haskell computation is happening. [Why is this a good thing? What happens when the idle GC is disabled?]

As far as I know, there are basically three ways to trigger a major GC:
1. Heap overflow: when we last performed a major GC, we checked how
much live data there was and set a threshold so that we do another
major GC when the heap grows to live * F (where F is the -F factor).
2. Idle GC
3. Manually triggering a GC using the interface in System.Mem
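
To make (1) and (3) a bit more concrete, here is a minimal Haskell
sketch. nextMajorGcThreshold is a hypothetical helper, not an RTS
function: it only models the "live * F" rule and ignores everything
else the RTS takes into account when sizing the heap. performMajorGC
is the real entry point from System.Mem.

```haskell
import System.Mem (performMajorGC)

-- | Simplified model of trigger (1): the heap size (in bytes) at which
-- the next heap overflow GC would fire, given the -F factor and the
-- live bytes found by the previous major GC.
-- Hypothetical helper for illustration; not part of the RTS API.
nextMajorGcThreshold :: Double -> Int -> Int
nextMajorGcThreshold factor liveBytes =
  ceiling (factor * fromIntegral liveBytes)

main :: IO ()
main = do
  -- 100 MiB live with a factor of 2 (the default for -F).
  print (nextMajorGcThreshold 2 (100 * 1024 * 1024))
  -- Trigger (3): the mutator explicitly asks for a major GC.
  performMajorGC
```

Note that both of these need the program to be running Haskell code:
the first needs allocation, the second needs an explicit call.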

When idle GC is disabled, major GCs will happen less often, although
one of the other two mechanisms may still trigger one.

A key difference is that both of those are only activated by the
mutator running code: either through allocation or by calling a GC
directly.
On the other hand, idle GC can be triggered when the mutator isn't
running. So, if you want to ensure that finalizers get called promptly
then idle GC can help, especially if your application is idle for long
periods of time.
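
Here is a small sketch of why finalizers depend on a GC actually
happening. It uses an explicit performMajorGC as a stand-in for the
idle GC, and finalizerRanAfterMajorGC is a hypothetical name chosen
for this example. Finalizers are not guaranteed to run, so treat the
result as "usually True" rather than a promise.

```haskell
import Control.Concurrent (threadDelay)
import Data.IORef (mkWeakIORef, newIORef, readIORef, writeIORef)
import System.Mem (performMajorGC)

-- | Attach a finalizer to an IORef, drop the IORef, force a major GC,
-- and report whether the finalizer was observed to run.
-- Hypothetical helper for illustration only.
finalizerRanAfterMajorGC :: IO Bool
finalizerRanAfterMajorGC = do
  ran <- newIORef False
  resource <- newIORef (0 :: Int)
  -- The finalizer can only run once a GC has found 'resource' dead.
  _ <- mkWeakIORef resource (writeIORef ran True)
  _ <- readIORef resource -- last use; 'resource' is garbage after this
  -- Stand-in for the idle GC: nothing else here would trigger a
  -- major GC, because we do not allocate much from now on.
  performMajorGC
  -- Give the finalizer thread a moment to be scheduled.
  threadDelay 100000
  readIORef ran

main :: IO ()
main = finalizerRanAfterMajorGC >>= print
```

In an application that sits idle, there is no allocation and nobody
calling performMajorGC, so without the idle GC the finalizer would
simply wait until the application becomes busy again.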

The other key benefit of the idle GC is that it can reduce the
prevalence of heap overflow GCs. These can only happen while your
application is allocating and hence running code, so such a GC is
quite likely to tank the response time for the request your
application is serving at the time. And since idle GCs free some
memory, they make it less likely that you reach the threshold that
would trigger a heap overflow GC.

With idle GCs, if you are lucky, major GCs will only run while your
application isn't meant to be responding to requests at all, which
makes them basically free.
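
If you want to check whether major GCs are happening during idle
time, one option is to compare the major_gcs counter from GHC.Stats
before and after an idle period. This is only a sketch:
majorGcsDuring is a hypothetical helper, the program has to be run
with +RTS -T for the statistics to be available, and I'm assuming it
is linked with -threaded. I have not measured what it prints under
different -I settings.

```haskell
import Control.Concurrent (threadDelay)
import Data.Word (Word32)
import GHC.Stats (getRTSStats, getRTSStatsEnabled, major_gcs)

-- | Number of major GCs that happened while the given action ran.
-- Hypothetical helper; requires RTS statistics (+RTS -T), otherwise
-- getRTSStats throws.
majorGcsDuring :: IO a -> IO Word32
majorGcsDuring action = do
  before <- major_gcs <$> getRTSStats
  _ <- action
  after <- major_gcs <$> getRTSStats
  pure (after - before)

main :: IO ()
main = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "Re-run with +RTS -T to enable RTS statistics."
    else do
      -- Stay idle for two seconds and see whether a major GC ran.
      n <- majorGcsDuring (threadDelay 2000000)
      putStrLn ("Major GCs while idle: " ++ show n)
```

Running this once with the default idle GC setting and once with
+RTS -I0 should show whether the idle GC is what is firing.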

> Also, it will mean that a GC is less likely to happen when the application is busy, so application responsiveness may be improved. However, if the amount of live data in the heap is particularly large, then the idle GC can cause a significant penalty to responsiveness. [Why? Is it because the idle GC was delayed by waiting for some idle time, and thus has more work to do?].

This can happen because the time a major GC takes is proportional to
the amount of live data in the heap. So, if the pause required by the
GC starts to overlap with time when you'd like the application to be
working on a response, then you will regress response times. For
instance, if it takes 100ms to run an idle GC and a request comes in
just after the GC has started, then processing it will have to wait
until the GC is over.
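
The arithmetic behind that example, as a tiny sketch. gcWaitMs is a
hypothetical helper that assumes a simple stop-the-world pause of
known length and a non-negative arrival time; real pauses vary.

```haskell
-- | Extra time (in ms) a request has to wait if it arrives
-- 'arrivalMs' after the start of a GC pause lasting 'pauseMs'.
-- Assumes arrivalMs >= 0. Hypothetical helper for illustration.
gcWaitMs :: Int -> Int -> Int
gcWaitMs pauseMs arrivalMs = max 0 (pauseMs - arrivalMs)

main :: IO ()
main = do
  -- Request arrives 1ms into a 100ms GC: it waits almost the whole pause.
  print (gcWaitMs 100 1)
  -- Request arrives after the GC has finished: no extra wait.
  print (gcWaitMs 100 150)
```

The larger the live heap, the longer the pause, and so the wider the
window in which an incoming request gets delayed.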

> Conversely, too small of an interval could adversely affect interactive responsiveness [How? And how is this worse than having idle GC disabled? What is the actual behavior when it's disabled, anyway?]

The smaller the interval, the more time you spend running idle GCs,
and the more likely it becomes that one will overlap with time you
want to spend doing something else. This is similar to the case above
where GCs take a long time because the heap is large.

Another reason you might not want to run it too often is that each
run is unlikely to free much memory, since little garbage will have
accumulated since the previous GC.

I think this documentation was written before the non-moving GC was
added. It would also be important to add that the savings in terms of
responsiveness don't really apply when it is enabled, since the
non-moving GC runs concurrently with the mutator anyway. So, the main
advantage would just be more prompt finalization, deadlock detection,
etc.

I hope that helps; let me know if you'd like anything clarified.

Cheers,
Teo


More information about the ghc-devs mailing list