[Haskell-cafe] Re: Real-time garbage collection for Haskell
Simon Marlow
marlowsd at gmail.com
Tue Mar 2 09:10:49 EST 2010
On 01/03/2010 14:53, Thomas Schilling wrote:
> On 28 February 2010 05:20, Luke Palmer<lrpalmer at gmail.com> wrote:
>> I have seen some proposals around here for SoC projects and other
>> things to try to improve the latency of GHC's garbage collector. I'm
>> currently developing a game in Haskell, and even 100ms pauses are
>> unacceptable for a real-time game. I'm calling out to people who have
>> seen or made such proposals, because I would be willing to contribute
>> funding and/or mentor a project that would contribute to this goal.
>> Also any ideas for reducing this latency in other ways would be very
>> appreciated.
>
> There is a SoC project suggestion to implement Immix's ideas [1] in
> GHC's GC. Both already use similar overall designs. Both split the
> heap into regions which may employ different collection strategies.
> However, Immix does not address real-time issues.
>
> The main difficulty with real-time GC is that, while first-generation
> collection is usually very fast, eventually you just have to collect
> the old generation and you have to do it all at once. Sun's new
> Garbage-First ("G1") [2] collector therefore tracks pointers between
> regions, as opposed to just pointers from older two newer generations.
> This allows collecting regions independently (and in parallel). G1
> is still stop-the-world, although marking phase is concurrent.
> Tracking pointers between all regions can result in quite substantial
> space overheads, however, so G1 uses some heuristics to discover
> "popular objects" and treats them specially. In a personal
> conversation Simon Marlow expressed to me that he intends to go
> further into this direction, but I don't know how high-priority it is.
> In general I don't think true real-time is the goal in any case, but
> rather a general effort to keep GC-pauses short.
>
> Truly concurrent garbage collection is a whole different beast.
> Concurrent marking can be implemented efficiently with a write
> barrier. I don't know of any fully concurrent GC scheme that gets by
> without a read barrier and significant space overhead, however. There
> are certainly no plans from the GC HQ to implement a fully concurrent
> GC.
A good summary of concurrent GC techniques used successfully in industry
was posted to the GC mailing list recently by Erez Petrank:
http://permalink.gmane.org/gmane.comp.programming.garbage-collection.general/388
Several of those we could do in GHC, for example mostly-concurrent
collection uses only a write barrier, and piggybacks on the generational
write-barrier we already have, but it does require two stop-the-world
phases per GC. I think you'd want to do it in conjunction with
Immix-style region sweeping in the old generation, so implementing Immix
would be a good first step.
Cheers,
Simon
More information about the Haskell-Cafe
mailing list