the FFI and C reference counting
Peter Gammie
peteg42 at gmail.com
Sun Jul 24 09:05:06 EDT 2005
Hello,
I have a question about the FFI and reference counted objects in C
land. This is a bit complicated, unfortunately.
The problem arises in trying to integrate a BDD package with GHC's
storage
manager. One of the properties of BDD packages is that the name of a
BDD (a
pointer or an int) is canonical (wrt boolean functions), no matter
how it is
constructed; so for example:
True
True and True
are both given the same identifier. The user of the C library must
increment
and decrement reference counts attached to these identifiers. (If you
are
unfamiliar with BDDs, just think of a bunch of objects stored in a
hashtable
where several keys map to each object, each of which is created on first
access via any key and garbage collected when it's reference count
becomes
zero. The key property is that the same object can be returned using
different keys.)
I've had success with the CMU/Long package using code of the form
(some of
it generated by c2hs):
newtype BDDManager = BDDManager (Ptr (BDDManager))
newtype BDD = BDD (ForeignPtr BDD)
withBDD (BDD fptr) = withForeignPtr fptr
bdd_manager :: BDDManager
bdd_manager = unsafePerformIO bdd_init
addBDDfinalizer :: Ptr BDD -> IO BDD
addBDDfinalizer bddp = liftM BDD $ newConcForeignPtr bddp bddf
where bddf = bdd_free bdd_manager bddp
{-# INLINE addBDDfinalizer #-}
-- sample BDD operation.
bddNOT = unsafePerformIO $
withBDD x (bdd_not bdd_manager) >>= addBDDfinalizer
where all free variables should be FFI functions if I got this excerpt
right. (This BDD package does the refcount increment for us before
returning the BDD, which avoids the following problem.)
This approach fails to work for the CUDD library, which sometimes
claims I
have called the dereference function too many times. It requires an
explicit
refcount increment on each BDD returned, so "addBDDfinalizer" becomes:
addBDDfinalizer :: Ptr BDD -> IO BDD
addBDDfinalizer bddp =
do cudd_Ref bddp
liftM BDD $ newConcForeignPtr bddp bddf
where bddf =
do cudd_RecursiveDeref ddmanager bddp
return ()
{-# INLINE addBDDfinalizer #-}
After much head scratching I can only think that the following is
happening:
1. Create BDD, BDD.refcount++, add finalizer.
2. BDD becomes unreferenced but not yet GC'd.
3. Call BDD operation returning the same BDD as step 1. (call it BDD')
4. GHC's GC kicks in and BDD.refcount--. (*)
5. BDD'.refcount++, add finalizer.
At (*), CUDD deallocates BDD (== BDD') and BDD' becomes a dangling
pointer.
Is this plausible?
I am wondering if anyone else has had a similar problem (e.g. with GTK's
reference counting) and has a nicer solution than writing wrappers that
increment the reference count before returning to GHC's runtime.
If I am on the right track then it seems that requiring the user to
increment the reference count on each returned object is a pretty bad
design decision.
If I'm not, some debugging hints would be most welcome.
On a related note: what impact will multi-core GHC have on code like
this?
I get queasy thinking of how it will interact with all those
non-threadsafe C libraries bound in a style similar to the above...
thanks,
peter
More information about the Glasgow-haskell-users
mailing list