POLL: GC options

Simon Marlow <simonmar@microsoft.com>
Tue, 7 Aug 2001 16:58:06 +0100

> In local.glasgow-haskell-users, you wrote:
> > Issue 1: should the maximum heap size be unbounded by default?
> > Currently the maximum heap size is bounded at 64M.  Arguments for: this
> > stops programs with a space leak eating all your swap space.  Arguments
> > against: it's annoying to have to raise the limit when you legitimately
> > need more space.
> I'm for boundless wasting of memory. If I'd cared, I'd set the
> ulimits correspondingly. You really don't want to get to work
> on Monday morning finding that your favourite prime-finder stopped
> Friday five minutes after you left :-)

The discussion is somewhat moot, since I removed the default limit
anyway, but I just thought I'd point out that there's an important
difference between GHC's max heap size and the ulimit: GHC will adjust
the sizes of the generations to try to stay within the max heap size
(thus trading performance for space), whereas it doesn't do that for the
ulimit.  Anyway, the upshot is that you'll get more failures setting the
ulimit to X than setting the max heap size to X.
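Concretely, the two limits are imposed in different places (`myprog` here is a stand-in for any GHC-compiled binary; `-M` is the RTS's max-heap-size flag):

```shell
# A ulimit caps the whole process behind the RTS's back: when the cap
# is hit, allocation simply fails.
ulimit -v 524288          # virtual memory limit, in kilobytes
./myprog

# The RTS's own limit: it will shrink the generations and collect more
# aggressively to stay under 512M before reporting heap exhaustion.
./myprog +RTS -M512m -RTS
```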

I certainly recommend setting either a ulimit or a maximum heap size,
especially when doing development.  In my experience, modern OSes that
do memory overcommit tend to behave very badly when something eats all
the swap: if you're lucky, your Haskell process will get killed, but if
you're unlucky, some other random process(es) will get killed or the
machine will crash altogether.  Besides, if you're on a multi-user box,
using up all the swap isn't very sociable :-)

An interesting idea is for GHC to read the current ulimit settings and
set the max heap size accordingly; however, this doesn't seem to be
straightforward.  There are four settings: the data seg size, the
resident set size, the virtual memory size and the locked-in memory
size.  The data seg size doesn't affect GHC-compiled programs, because
we mmap() the heap.  The resident set size similarly doesn't seem to
have any effect on Linux.  The locked-in memory size doesn't affect us
(we don't use mlock()), which leaves only the virtual memory size.  This
one sets a bound on the whole process size, including the size of the
binary itself and any memory that has been malloc()ed, so it isn't easy
to figure out how much is left for the heap.