What makes GHC's GC good for purely functional workloads?

Wed Nov 23 11:31:22 CET 2011

On 23/11/2011 04:09, Edward Z. Yang wrote:
> Apologies if this has been asked before.  In some ways
> GHC's garbage collectors are very traditional, so I was
> wondering what magic dust makes it do better in Haskelly
> workloads (e.g. whereas you'd need to do a bit more work
> in the JVM to make some allocation pattern play nice.)

Our GC is generally tuned towards making allocation fast.  So we use 
bump-pointer allocation and allocate several objects at once, and we use 
generational GC where the default nursery size is small (about L2 
cache-sized).  Young-generation collections are very quick - on the 
order of 10us.  So we collect a lot of the garbage before it even hits 
main memory; this is also what makes GHC's GC scale reasonably well for 
parallel programs.

On the other hand, our write barriers are more expensive than you'd 
typically find in a JVM, but they are more accurate: remembered set 
rather than card marking.  This is a good tradeoff because mutation is 
rare in Haskell.  The one exception is thunk update - there might be 
room for experimenting with a cheaper write barrier there.

One thing I'm particularly proud of in our GC is the block layer.  All 
memory is managed in units of blocks (currently 4k, but that's just a 
constant you can change).  It makes everything a lot simpler, including 
parallel GC.  We have a paper about it:

http://community.haskell.org/~simonmar/papers/parallel-gc.pdf

Cheers,
	Simon