[Haskell] Re[2]: [Haskell-cafe] Is Haskell a Good Choice for Web Applications? (ANN: Vocabulink)

Thu May 7 09:45:53 EDT 2009

On 07/05/2009 11:51, Bulat Ziganshin wrote:
> Hello Simon,
>
> Thursday, May 7, 2009, 2:04:05 PM, you wrote:
>
>>>> I've heard it's hard to contain a long-running Haskell application in
>>>> a finite amount of memory
>>> not exactly. you may alloc fixed pool of memory to application (say, 1gb)
>>> if you know that it never need more memory. but as far as you don't do
>>> it, memory usage grows with each major GC. ghc just don't provide
>>> any way to return memory to OS (there is a ticket on it, you can add
>>> yourself to CC list to vote for its resolution)
>
>> http://hackage.haskell.org/trac/ghc/ticket/698
>
>> But let's be clear: this is not a memory leak, the issue is only that
>> GHC's runtime retains as much memory as was required by the program's
>> high-water-mark memory usage.  Fixing this will never reduce the
>> high-water-mark.  So I'm not sure what you meant by "memory usage grows
>> with each major GC".
>
> 1. none said that it's memory leak, at least in haskell code

You said "memory usage grows with each major GC", which sounded like a
leak; I just wanted to clarify.

> 2. it seems that one of us doesn't know intimate details of GHC GC :)
> the following is my understanding, learned from +RTS -s/-S listings
> with ghc 6.4 or so:
>
> copying GC: let's imagine that we perform major GC when 100 mb is
> allocated, of those 10 mb are live. doing GC, ghc will alloc 10+ mb
> from OS,

Yes.

> and promote 100 mb freed to allocation area.

No.  Well, it depends on the GC settings.  With the default GC settings, 
the allocation area for each core is fixed at 512KB (the docs are a bit
out of date and say 256KB; I've just fixed that).  The old generation is
allowed to grow to 2x its previous size by default before being 
collected.  This is tunable with +RTS -F<n>.

In your example, the live data is 10MB, so the old generation will be 
allowed to grow to 20MB before being collected, meanwhile the young 
generation has 512KB for the allocation area, plus "step 2" of the young 
generation which will be at most another 512KB and usually a lot 
smaller.  So 21MB max in total.  When the old generation is next 
collected, assuming 10MB are still live, we'll need in total 21MB + 10MB 
of copied live data = 31MB.
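
The arithmetic above can be checked with a small Haskell sketch.  The
function names and the simplified model (old generation limit = F times
the live data, young generation at most two allocation-area-sized
pieces, live data unchanged at the next collection) are illustrative
assumptions, not the RTS's actual accounting.

```haskell
-- Back-of-envelope model of the copying-GC figures above (sizes in MB).
-- All names here are hypothetical; the RTS does not expose such functions.

-- | Heap size the program may reach before the old generation is collected:
-- the old generation limit (factor * live) plus the allocation area and
-- "step 2" of the young generation (assumed to be at most allocArea each).
steadyHeapMB :: Double -> Double -> Double -> Double
steadyHeapMB live factor allocArea = factor * live + 2 * allocArea

-- | Peak during a major copying collection: the heap above plus the
-- live data that gets copied (assumed to still be `live` MB).
peakCopyingMB :: Double -> Double -> Double -> Double
peakCopyingMB live factor allocArea = steadyHeapMB live factor allocArea + live

main :: IO ()
main = do
  print (steadyHeapMB 10 2 0.5)   -- 21.0
  print (peakCopyingMB 10 2 0.5)  -- 31.0
```

Here 10 is the live data in MB, 2 is the default -F factor, and 0.5 is
the 512KB allocation area.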

The problem referred to originally in this thread is that even though 
the program might be running with a steady memory usage of 31MB, if in 
the past it needed 100MB that extra memory is retained by the RTS and 
won't be freed back to the OS.
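
One way to see the gap between current and high-water-mark usage from
inside a program is sketched below.  It assumes a GHC much newer than
the one discussed in this thread (the GHC.Stats API used here appeared
in base 4.10, to the best of my knowledge), a binary built with
-rtsopts, and a run with +RTS -T; the helper name toMB is made up for
the example.

```haskell
import Data.Word (Word64)
import GHC.Stats
  ( RTSStats (..), GCDetails (..), getRTSStats, getRTSStatsEnabled )
import System.Mem (performMajorGC)

-- | Convert a byte count to megabytes for display (hypothetical helper).
toMB :: Word64 -> Double
toMB bytes = fromIntegral bytes / (1024 * 1024)

main :: IO ()
main = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS statistics are off; run with +RTS -T"
    else do
      performMajorGC  -- force a major GC so the "now" figures are fresh
      stats <- getRTSStats
      let lastGC = gc stats  -- details of the most recent collection
      putStrLn ("live data now (MB):      " ++ show (toMB (gcdetails_live_bytes lastGC)))
      putStrLn ("memory in use now (MB):  " ++ show (toMB (gcdetails_mem_in_use_bytes lastGC)))
      putStrLn ("peak live data (MB):     " ++ show (toMB (max_live_bytes stats)))
      putStrLn ("peak memory in use (MB): " ++ show (toMB (max_mem_in_use_bytes stats)))
```

Comparing the "now" figures with the peak figures shows how much of the
footprint is left over from an earlier spike rather than needed by the
current live data.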

> so from this
> point program will occupy 110mb of memory (although only 10 mb is
> really used ATM)

Yes - if 100MB has been allocated, and 10MB copied, then the total 
memory that the RTS has allocated from the OS will be 110MB.

> next major GC will occur when all these 100 mb will be allocated, i.e.
> overall memory allocated will be 110 mb.

No - see above.

> again, GHC will allocate as
> much memory as required for live data, increasing program footprint
>
>
> compacting GC: if major GC occurs when 100 mb is allocated, GHC
> increase memory footprint 2x (controlled by +RTS -F), and then perform
> GC. or it will perform GC and only then alloc more memory? i don't
> remember

Exactly as above, except that the heap is compacted in-place rather than
the live data being copied, so no additional memory is required by the 
GC.  Well, there's a little memory used for the bitmaps and the mark 
stack, but that will be a few percent at most.
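
For comparison with the copying case, the compacting figure can be
sketched the same way.  This is a rough model only: the function name is
hypothetical, and the 3% overhead passed in main is an assumed stand-in
for "a few percent", not a measured value.

```haskell
-- Back-of-envelope model of peak memory under compacting GC (sizes in MB).
-- The name and the model are illustrative assumptions, not RTS behaviour.

-- | Heap before a major collection (factor * live, plus the allocation
-- area and "step 2" of the young generation), scaled by a small overhead
-- for the bitmaps and mark stack.  No live data is copied, so nothing
-- else is added.
peakCompactingMB :: Double -> Double -> Double -> Double -> Double
peakCompactingMB live factor allocArea overhead =
  (factor * live + 2 * allocArea) * (1 + overhead)

main :: IO ()
main =
  -- 10MB live, default -F of 2, 512KB allocation area, assumed 3% overhead
  print (peakCompactingMB 10 2 0.5 0.03)
```

With these inputs the result stays a little above 21MB, against 31MB
for the copying collector in the same scenario.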

> Simon, it will be VERY USEFUL if GC description that says about these
> behaviors will be added to manual. i'm definitely not the
> authoritative source on this topic and it seems that even you forget
> some details as time goes. this topic is very important for large
> programs and this question regularly arises here. we need
> authoritative description that pays attention to all GC-related RTS
> switches

The -A and -F flags are pretty well documented in the manual:

http://www.haskell.org/ghc/docs/latest/html/users_guide/runtime-control.html#rts-options-gc

I'm happy to add more text here; what else would you like to see?

Cheers,
	Simon