Finalizers: conclusion?

Mon Jan 13 09:07:22 EST 2003

Antony Courtney <antony at apocalypse.org> wrote,

> You indicated that you were somewhat unclear why we need liveness 
> dependencies.  I'll attempt to clarify by sketching some of the details 
> of the particular C library for which I am writing FFI wrappers.
> 
> I have a C library for 2D vector graphics.  Two of the abstract types 
> provided by this C library are:
>     Pixmap -- A handle to an actual buffer of raster data
>     RenderContext -- A handle that encapsulates all state associated 
> with rendering, such as the current color, current font, target pixmap, etc.
> 
> Note that it is possible to create many RenderContexts that all 
> render onto the same underlying Pixmap.
> 
> To see why we need liveness dependencies, consider the following typical 
> usage scenario in Haskell:
>     do pm <- createPixmap               -- 1
>        rc <- createRenderContext pm     -- 2
>        drawBox rc                       -- 3
>        ...
> 
> Note that, in the above, it's possible that the call to 
> createRenderContext in line 2 could be the last Haskell reference to pm, 
> making it a candidate for collection.  But we don't actually want the 
> Pixmap to be collected (and its finalizer invoked) until both the Pixmap 
> *and* all associated rendering contexts which refer to the Pixmap 
> become unreachable.
> 
> The reason we need liveness dependencies is because, internally, the 
> RenderContext maintains a pointer to the target Pixmap.  But because 
> this pointer exists only in the C heap, we need some way to inform 
> Haskell's garbage collector that whenever a particular RenderContext is 
> reachable, then its target pixmap is also reachable.

IMHO you are trying to compensate for a flaw in the whole
setup:

* Line 1: You get a pointer to a C object assuming it is the
    last reference to that C object.

* Line 2: You pass this pointer back to C without copying
    it; i.e., the only reference to the C object is in C land.

At this moment, the pointer obtained on Line 1 is no longer
the business of the Haskell system.  It is a pointer in C
land to a C object; so, memory management of that structure
should be left to the C library.  Assume the following C
function

  RenderContext *createPixmapWithContext (void)
  {
    /* The new render context holds the only reference to pm. */
    Pixmap *pm = createPixmap ();
    return createRenderContext (pm);
  }

in conjunction with

  do
    rc <- createPixmapWithContext
    drawBox rc

How is this different from your Haskell code in a way that
requires a foreign pointer dependency in one case, but not
in the other?

The only answer that I can think of is that when you passed
the reference back to C (and hence, the responsibility to
eventually free the object), you already registered a
finalizer on `pm', which will run eventually (as there is no
way of getting rid of it without running it).  Hence, you
want to delay running it.  My point is that running this
finalizer (if it deallocated the object) is wrong at any
time: As `createPixmapWithContext()' demonstrates, C land
must free `pm' when the last render context referring to
`pm' dies.  Even if you delay running the Haskell finalizer
for `pm' until after this (using `keepAlive' or so), it is
still wrong to deallocate the object twice.

IMO the only clean way to approach this problem is to add a
reference counting scheme to `pm' in C land.  Whenever `rc'
is deallocated it decrements the count on the `pm' it refers
to.  Similarly, the finalizer on `pm' calls the routine that
decrements reference counts.  As usual, the object is only
deallocated when its reference count reaches zero.  BTW,
this is exactly how this problem is solved in the GTK+ GUI
toolkit.
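
A minimal sketch of such a scheme might look as follows.  It is not
the actual library's code: the struct layouts and all function names
(pixmap_create, pixmap_ref, pixmap_unref, render_context_create,
render_context_destroy, pixmap_with_context_create) are hypothetical,
and live_pixmaps exists only so the behaviour can be observed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical layouts; the real library's structs are not shown here. */
typedef struct Pixmap {
    int refcount;            /* owners: the Haskell handle plus each render context */
    unsigned char *data;     /* raster buffer (left empty in this sketch) */
} Pixmap;

typedef struct RenderContext {
    Pixmap *target;          /* the C-heap pointer the Haskell GC cannot see */
} RenderContext;

static int live_pixmaps = 0; /* demonstration only: counts undeallocated pixmaps */

/* The creator receives the first reference. */
Pixmap *pixmap_create(void)
{
    Pixmap *pm = malloc(sizeof *pm);
    if (pm == NULL)
        return NULL;
    pm->refcount = 1;
    pm->data = NULL;
    live_pixmaps++;
    return pm;
}

void pixmap_ref(Pixmap *pm)
{
    pm->refcount++;
}

/* This is the routine the Haskell finalizer on `pm' would call. */
void pixmap_unref(Pixmap *pm)
{
    assert(pm->refcount > 0);
    if (--pm->refcount == 0) {
        free(pm->data);
        free(pm);
        live_pixmaps--;
    }
}

/* A render context takes its own reference on the target pixmap. */
RenderContext *render_context_create(Pixmap *pm)
{
    RenderContext *rc = malloc(sizeof *rc);
    if (rc == NULL)
        return NULL;
    pixmap_ref(pm);
    rc->target = pm;
    return rc;
}

/* Destroying a render context releases its reference on the pixmap. */
void render_context_destroy(RenderContext *rc)
{
    pixmap_unref(rc->target);
    free(rc);
}

/* Counterpart of createPixmapWithContext(): the creator's reference is
   dropped, so the render context becomes the sole owner.  If creating
   the context failed, this also frees the pixmap. */
RenderContext *pixmap_with_context_create(void)
{
    Pixmap *pm = pixmap_create();
    if (pm == NULL)
        return NULL;
    RenderContext *rc = render_context_create(pm);
    pixmap_unref(pm);
    return rc;
}
```

With this in place, the Haskell finalizer on `pm' would call
pixmap_unref and the finalizer on `rc' would call
render_context_destroy; the order in which the garbage collector runs
them no longer matters, and the object is freed exactly once.  GTK+
exposes the same pattern as g_object_ref() and g_object_unref().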

Manuel