the FFI and C reference counting

Peter Gammie peteg42 at gmail.com
Sun Jul 24 09:05:06 EDT 2005


Hello,

I have a question about the FFI and reference counted objects in C
land. This is a bit complicated, unfortunately.

The problem arises in trying to integrate a BDD package with GHC's  
storage
manager. One of the properties of BDD packages is that the name of a  
BDD (a
pointer or an int) is canonical (wrt boolean functions), no matter  
how it is
constructed; so for example:

True
True and True

are both given the same identifier. The user of the C library must  
increment
and decrement reference counts attached to these identifiers. (If you  
are
unfamiliar with BDDs, just think of a bunch of objects stored in a  
hashtable
where several keys map to each object, each of which is created on first
access via any key and garbage collected when it's reference count  
becomes
zero. The key property is that the same object can be returned using
different keys.)

I've had success with the CMU/Long package using code of the form  
(some of
it generated by c2hs):

newtype BDDManager = BDDManager (Ptr (BDDManager))
newtype BDD = BDD (ForeignPtr BDD)

withBDD (BDD fptr) = withForeignPtr fptr

bdd_manager :: BDDManager
bdd_manager = unsafePerformIO bdd_init

addBDDfinalizer :: Ptr BDD -> IO BDD
addBDDfinalizer bddp = liftM BDD $ newConcForeignPtr bddp bddf
     where bddf = bdd_free bdd_manager bddp
{-# INLINE addBDDfinalizer #-}

-- sample BDD operation.
bddNOT = unsafePerformIO $
       withBDD x (bdd_not bdd_manager) >>= addBDDfinalizer

where all free variables should be FFI functions if I got this excerpt
right.  (This BDD package does the refcount increment for us before
returning the BDD, which avoids the following problem.)

This approach fails to work for the CUDD library, which sometimes  
claims I
have called the dereference function too many times. It requires an  
explicit
refcount increment on each BDD returned, so "addBDDfinalizer" becomes:

addBDDfinalizer :: Ptr BDD -> IO BDD
addBDDfinalizer bddp =
     do cudd_Ref bddp
        liftM BDD $ newConcForeignPtr bddp bddf
     where bddf =
             do cudd_RecursiveDeref ddmanager bddp
                return ()
{-# INLINE addBDDfinalizer #-}

After much head scratching I can only think that the following is  
happening:

1. Create BDD, BDD.refcount++, add finalizer.
2. BDD becomes unreferenced but not yet GC'd.
3. Call BDD operation returning the same BDD as step 1. (call it BDD')
4. GHC's GC kicks in and BDD.refcount--. (*)
5. BDD'.refcount++, add finalizer.

At (*), CUDD deallocates BDD (== BDD') and BDD' becomes a dangling  
pointer.

Is this plausible?

I am wondering if anyone else has had a similar problem (e.g. with GTK's
reference counting) and has a nicer solution than writing wrappers that
increment the reference count before returning to GHC's runtime.

If I am on the right track then it seems that requiring the user to
increment the reference count on each returned object is a pretty bad
design decision.

If I'm not, some debugging hints would be most welcome.

On a related note: what impact will multi-core GHC have on code like  
this?
I get queasy thinking of how it will interact with all those
non-threadsafe C libraries bound in a style similar to the above...

thanks,
peter



More information about the Glasgow-haskell-users mailing list