Bug in GC's ordering of ForeignPtr finalization?

Ben Gamari bgamari.foss at gmail.com
Sun Aug 28 23:27:49 CEST 2011


On Tue, 16 Aug 2011 12:32:13 -0400, Ben Gamari <bgamari.foss at gmail.com> wrote:
> It seems that the notmuch-haskell bindings (version 0.2.2 built against
> notmuch from git master; passes notmuch-test) aren't dealing with memory
> management properly. In particular, the attached test code[1] causes
> talloc to abort.  Unfortunately, while the issue is consistently
> reproducible, it only occurs with some queries (see source[1]). I have
> been unable to establish the exact criterion for failure.
> 
> It seems that the crash is caused by an invalid access to a freed Query
> object while freeing a Messages object (see Valgrind trace[3]). I've
> taken a brief look at the bindings themselves but, being only minimally
> familiar with the FFI, there's nothing obviously wrong (the finalizers
> passed to newForeignPtr look sane). I was under the impression that
> talloc was reference counted, so the Query object shouldn't have been
> freed unless if there was still a Messages object holding a
> reference. Any idea what might have gone wrong here?  Thanks!
> 
After looking into this issue in a bit more depth, I'm even more
confused. In fact, I would not be surprised if I have stumbled into a
bug in the GC. It seems that the notmuch-haskell bindings follow the
example of the python bindings in that child objects keep references to
their parents to prevent the garbage collector from releasing the
parent, which would in turn cause talloc to free the child objects,
resulting in odd behavior when the child objects were next accessed. For
instance, the Query and Messages objects are defined as follows,

    type MessagesPtr = ForeignPtr S__notmuch_messages
    type MessagePtr = ForeignPtr S__notmuch_message
    newtype Query = Query (ForeignPtr S__notmuch_query)
    data MessagesRef = QueryMessages { qmpp :: Query, msp :: MessagesPtr }
                     | ThreadMessages { tmpp :: Thread, msp :: MessagesPtr }
                     | MessageMessages { mmspp :: Message, msp :: MessagesPtr }
    data Message = MessagesMessage { msmpp :: MessagesRef, mp :: MessagePtr }
                 | Message { mp :: MessagePtr }
    type Messages = [Message]

As seen in the Valgrind dump given in my previous message, it seems that
the Query object is being freed before the Messages object. Since the
Messages object is a child of the Query object, this fails.

In my case, I'm calling queryMessages which begins by issuing a given
notmuch Query, resulting in a MessagesPtr. This is then packaged into a
QueryMessages object which is then passed off to
unpackMessages. unpackMessages iterates over this collection, creating
MessagesMessage objects which themselves refer to the QueryMessages
object. Finally, these MessagesMessage objects are packed into a list,
resulting in a Messages object. Thus we have the following chain of
references,

        MessagesMessage
              |   
              |  msmpp
              \/
        QueryMessages
              |
              |  qmpp
              \/
            Query

As we can see, each MessagesMessage object in the Messages list
resulting from queryMessages holds a reference to the Query object from
which it originated. For this reason, I fail to see how it is possible
that the RTS would attempt to free the Query before freeing the
MessagesPtr. Did I miss something in my analysis? Are there tools for
debugging issues such as this? Perhaps this is a bug in the GC?

Any help at all would be greatly appreciated.

Cheers,

- Ben



More information about the Glasgow-haskell-users mailing list