POLL: GC options

Thomas Hallgren hallgren@cse.ogi.edu
Mon, 06 Aug 2001 15:40:50 -0700


Simon Marlow wrote:

>Folks,
>
>There is some disagreement over how the GC options should be specified
>for Haskell programs.
>
Something that I think would be very convenient, would help alleviate some of 
the problems discussed, and would still be very easy to implement, is 
support for setting run-time system options from an environment 
variable, GHCRTS say. That way, you wouldn't have to specify them on the 
command line *every time* you run a program. It would allow users to run 
a shell script at login time to set the heap size limit to some 
suitable value, in some platform-specific way, for example taking into 
account the amount of available RAM. An additional benefit is that if 
you call Haskell programs from shell scripts, and switch between 
different Haskell compilers (like I often do), you don't have to change 
your scripts to pass the right RTS options: they could automatically be 
taken from the right environment variable (GHCRTS, HBCRTS, HUGSRTS, 
NHC98RTS, etc.).
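
To make the idea concrete, here is a minimal sketch of how a run-time system might combine options from such a variable with those given on the command line. The function names (combineRtsOpts, rtsOptsFor) are hypothetical, and the sketch assumes that when an option is repeated the later occurrence wins, so the environment acts only as a default:

```haskell
import System.Environment (lookupEnv)

-- | Combine RTS options from an environment variable with those given
-- on the command line. The environment options are placed first, on the
-- ASSUMPTION that a later option overrides an earlier one.
combineRtsOpts :: Maybe String -> [String] -> [String]
combineRtsOpts envValue cmdLineOpts =
  maybe [] words envValue ++ cmdLineOpts

-- | Read the named variable (e.g. "GHCRTS") and merge it with the
-- options that were given explicitly.
rtsOptsFor :: String -> [String] -> IO [String]
rtsOptsFor var cmdLineOpts = do
  envValue <- lookupEnv var
  pure (combineRtsOpts envValue cmdLineOpts)
```

A login script would then only need to export the variable once, and each compiler's run-time system would look up its own variable name.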

In the Fudgets library, we use the following flexible scheme (*):

    The /value/ of a parameter called /name/ is taken from

       1. the command line, if -/name/ /value/ is present, else
       2. the environment variable FUD_/prog/_/name/ (where /prog/ is
          the name of the program), if set, else
       3. the environment variable FUD_/name/, if set, else
       4. a builtin default (indicated in the tables above).

This allows users to set global defaults as well as defaults for 
particular programs.
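
The lookup order above can be sketched roughly as follows. This is not the actual Fudgets code; resolveParam and getParam are hypothetical names, and the sketch assumes that all values are plain strings and that a flag is always followed by its value as a separate argument:

```haskell
import Control.Applicative ((<|>))
import Data.Maybe (fromMaybe)
import System.Environment (getArgs, getEnvironment, getProgName)

-- | Resolve a parameter in the order: command line, FUD_prog_name,
-- FUD_name, builtin default. Kept pure so it is easy to test.
resolveParam
  :: [String]            -- ^ command-line arguments
  -> [(String, String)]  -- ^ environment as (variable, value) pairs
  -> String              -- ^ program name
  -> String              -- ^ parameter name
  -> String              -- ^ builtin default
  -> String
resolveParam args env prog name def =
  fromMaybe def $
        fromArgs args
    <|> lookup ("FUD_" ++ prog ++ "_" ++ name) env
    <|> lookup ("FUD_" ++ name) env
  where
    -- Find "-name value" anywhere on the command line.
    fromArgs (flag : value : rest)
      | flag == '-' : name = Just value
      | otherwise          = fromArgs (value : rest)
    fromArgs _ = Nothing

-- | The same lookup against the real process arguments and environment.
getParam :: String -> String -> IO String
getParam name def = do
  args <- getArgs
  env  <- getEnvironment
  prog <- getProgName
  pure (resolveParam args env prog name def)
```

For a program called demo, getParam "bg" "white" would consult -bg on the command line, then FUD_demo_bg, then FUD_bg, and finally fall back to "white".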

>Issue 2: Should -M be renamed to -H, and -H renamed to something else?
>
HBC calls these flags -h and -H. I am sure you can figure out which is 
which!

>Issue 3: (suggestion from Julian S.) Perhaps there should be two options
>to specify "optimise for memory use" or "optimise for performance",
>
Clever automatic GC tuning would of course be nice. The current solution 
seems to set the limit on how much can be allocated before the next GC 
based on heap residency. This lowers the performance of programs with 
low residency and a fast allocation rate. Taking the ratio between GC time 
and mutator time into account could perhaps help?
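
As a rough illustration of what taking that ratio into account could look like, here is a purely hypothetical policy, not what any existing run-time system does. The name nextAllocLimit, the target ratio, and the doubling step are all assumptions made for the sake of the example:

```haskell
-- | Hypothetical tuning rule: if the time spent in GC relative to the
-- time spent in the mutator exceeds a target ratio, double the amount
-- that may be allocated before the next GC; otherwise leave it alone.
nextAllocLimit
  :: Double  -- ^ target GC/mutator time ratio (an assumed tunable)
  -> Double  -- ^ GC time so far, in seconds
  -> Double  -- ^ mutator time so far, in seconds
  -> Int     -- ^ current allocation limit, in bytes
  -> Int
nextAllocLimit targetRatio gcTime mutTime current
  | mutTime <= 0                   = current      -- no measurement yet
  | gcTime / mutTime > targetRatio = current * 2  -- collecting too often
  | otherwise                      = current
```

A program with low residency but a fast allocation rate would then see its allocation limit grow until GC time falls back under the target.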

Regarding the maximum heap size, to avoid letting the heap grow too 
large, you could perhaps take into account the number of page faults 
that occur during garbage collection, or the ratio between CPU time and 
real time...

Regards,
Thomas Hallgren

(*)
http://www.cs.chalmers.se/Cs/Research/Functional/Fudgets/userguide.html#parameters