[Haskell-cafe] About pointer taken by C lib in FFI.

Mon Jan 14 14:42:51 UTC 2019

Hi, nice question.

> Does it track the "used by foreign lib"?

No, `Ptr` is a simple primitive numeric value, like `void *` in C itself.
GHC does not track what you do with it at all.

The lifetime and ownership of the pointer depend on how you created it.

For example, the `withCString` function of type
    withCString :: String -> (Ptr CChar -> IO a) -> IO a
    https://hackage.haskell.org/package/base-4.12.0.0/docs/Foreign-C-String.html#v:withCString
used e.g. like
    withCString "hello" $ \ptr -> do
       -- do something with ptr here
keeps the pointer alive exactly within the (do ...) block.
Afterwards, the memory the `ptr` points to will be freed.
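
To make that concrete, here is a minimal sketch that passes the pointer to C's `strlen`. The wrapper name `cLength` is made up for illustration, and I assume the C function only reads the buffer and does not keep the pointer around:

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CSize (..))

-- C's strlen from <string.h>. It only reads the buffer during the call
-- and does not retain the pointer afterwards.
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- The CString is valid only inside the lambda passed to withCString,
-- so we copy the result out as a plain Int before the block ends.
cLength :: String -> IO Int
cLength s = withCString s $ \ptr -> do
  n <- c_strlen ptr
  return (fromIntegral n)
```

Returning `ptr` itself from the lambda would type-check, but the pointer would dangle as soon as `withCString` returns.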

Similarly for `allocaBytes :: Int -> (Ptr a -> IO b) -> IO b`.
You might do

    allocaBytes 1000 $ \(ptr :: Ptr void) -> do
       poke (castPtr ptr) ('c' :: Char)
       poke (castPtr ptr) (1234 :: Word64)
       -- call FFI function doing something with `ptr`

and after allocaBytes itself returns, the memory is gone.

Other functions, such as
    malloc :: Storable a => IO (Ptr a)
    mallocBytes :: Int -> IO (Ptr a)
only allocate the memory and never free it, and you need to free it later yourself (you can also use C's `free()` on the C side for that).
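
A small sketch of that manual style, with made-up helper names `newCell` and `takeCell`. Note that nothing ties the lifetime of the memory to the lifetime of the `Ptr` value:

```haskell
import Foreign.C.Types (CInt)
import Foreign.Marshal.Alloc (free, malloc)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek, poke)

-- Allocate a CInt on the C heap and initialise it. The caller now owns
-- the memory; it is not freed when the Ptr value is garbage collected.
newCell :: CInt -> IO (Ptr CInt)
newCell n = do
  ptr <- malloc
  poke ptr n
  return ptr

-- Read the value back and release the memory. After this call the
-- pointer must not be used again.
takeCell :: Ptr CInt -> IO CInt
takeCell ptr = do
  n <- peek ptr
  free ptr
  return n
```

Every `newCell` needs exactly one matching `takeCell` (or a `free()` on the C side); forgetting it leaks, doing it twice is a double free.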

This may be what you want if you want the C code to take ownership of it.

In that case, you must take care that this is async-exception safe, e.g. that you don't leak the allocated memory when an async exception comes in (e.g. from the `timeout` function or the user pressing Ctrl+C and you handling it and continuing).
In general, one deals with async exceptions by using code blocks that temporarily disable them, like the `bracket :: IO a -> (a -> IO b) -> (a -> IO c) -> IO c` function does;
see its docs at https://hackage.haskell.org/package/base-4.12.0.0/docs/Control-Exception-Base.html#v:bracket.
Two examples of how non-bracketed `malloc` can go wrong:

Example A (no ownership change involved):

    ptr <- mallocBytes 1000
    -- async exception comes in here
    someOtherCodeDoingSomethingWith ptr
    free ptr

Example B:

    ptr <- mallocBytes 1000
    -- async exception comes in here
    ffiCodeThatChangesOwnershipToCLibrary ptr

In both cases the allocated bytes become unreachable and are leaked for the rest of the program's lifetime.
`bracket` can trivially solve the problem in example A, because the lifetime of `ptr` is lexically scoped.
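
For example A, a bracketed version could look roughly like this. The names `withBuffer` and `demo` are only for illustration, and `demo` stands in for `someOtherCodeDoingSomethingWith`:

```haskell
import Control.Exception (bracket)
import Data.Word (Word8)
import Foreign.Marshal.Alloc (free, mallocBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekByteOff, pokeByteOff)

-- bracket runs the allocation with async exceptions masked and
-- guarantees that `free` runs whether the body returns or throws.
withBuffer :: Int -> (Ptr Word8 -> IO a) -> IO a
withBuffer size = bracket (mallocBytes size) free

-- Stand-in for the real work: write one byte and read it back.
demo :: IO Word8
demo = withBuffer 1000 $ \ptr -> do
  pokeByteOff ptr 0 (7 :: Word8)
  peekByteOff ptr 0
```

An async exception arriving anywhere inside the body now still leads to `free` being called.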

But for the handover in example B, the lifetime is not lexically scoped.

You generally have 2 approaches to do a safe hand-over to C for non-lexically-scoped cases:

1. malloc the memory on the C side in the first place, and pass the pointer to Haskell so it can poke values in. In this case the C side has the ownership the entire time, so it both allocates and frees the memory.

2. Store the information whether Haskell still has the memory ownership somewhere, and always modify the pointer and this information together in some atomic fashion (for example using `bracket` so that it cannot be interrupted in the middle).
The pointer and a mutable Bool reference would be such an information pair.
Equivalent would be a double-pointer, where the outer pointer points to NULL to indicate that the memory is already owned by C.
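
The pointer-plus-Bool variant might be sketched like this. `withHandOver`, `fill` and `handOver` are hypothetical names; `handOver` stands for the FFI call after which the C library owns the memory, and I assume it is an ordinary (non-interruptible) foreign call:

```haskell
import Control.Exception (bracket, mask_)
import Control.Monad (when)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import Data.Word (Word8)
import Foreign.Marshal.Alloc (free, mallocBytes)
import Foreign.Ptr (Ptr)

-- Allocate a buffer, let `fill` write into it, then give it to `handOver`.
-- The IORef records whether Haskell still owns the memory.
withHandOver :: Int -> (Ptr Word8 -> IO ()) -> (Ptr Word8 -> IO ()) -> IO ()
withHandOver size fill handOver =
  bracket acquire release $ \(ptr, ownedRef) -> do
    fill ptr
    -- The hand-over and the flag update must not be separated by an
    -- async exception, so both happen under mask_.
    mask_ $ do
      handOver ptr
      writeIORef ownedRef False
  where
    acquire = do
      ptr <- mallocBytes size
      ownedRef <- newIORef True
      return (ptr, ownedRef)
    -- Free only if the hand-over never happened.
    release (ptr, ownedRef) = do
      owned <- readIORef ownedRef
      when owned (free ptr)
```

If `fill` throws, the flag is still `True` and the buffer is freed; after a successful hand-over, `release` leaves it alone.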

Below is a sketch of how to do it with the double-pointer approach:

    bracket
      acquireResource
      releaseResource
      (\ptrPtr -> do
        ptr <- peek ptrPtr
        poke (castPtr ptr) ('c' :: Char)
        poke (castPtr ptr) (1234 :: Word64)
        mask_ $ do -- we don't want to get interrupted in this block
          ffiCodeThatChangesOwnershipToCLibrary ptr
          poke ptrPtr nullPtr
        -- do some more work here
        return yourresult
      )

      where
        acquireResource :: IO (Ptr (Ptr void))
        acquireResource = do
          ptrPtr :: Ptr (Ptr void) <- malloc
          ptr :: Ptr void <- mallocBytes 1000 -- plain `malloc` would need a `Storable void` instance
          poke ptrPtr ptr
          return ptrPtr

        releaseResource :: Ptr (Ptr void) -> IO ()
        releaseResource ptrPtr = do
          -- If ptrPtr points to NULL, then the ownership change happened.
          -- In that case we don't have to free `ptr` (and we cannot, as it is NULL).
          -- Otherwise, we still own the memory, and free it.
          ptr <- peek ptrPtr
          when (ptr /= nullPtr) $ free ptr
          free ptrPtr

(It is recommended to get familiar with `bracket` and `mask_` before understanding this.)

The above works in the single-threaded case; if concurrency comes into play and you wrote code so that parts of this might be executed by different threads, you'll naturally have to put locks (e.g. `MVar`s) around the `poke ptrPtr ...` and the place where the `nullPtr` check is done.
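
One possible shape for the concurrent case is to let an `MVar` hold the ownership state itself, so the check and the update happen under the same lock. This is only a sketch; `Owned`, `newOwned`, `handOver`, `release` and `isOwned` are names I made up:

```haskell
import Control.Concurrent.MVar (MVar, modifyMVarMasked_, newMVar, readMVar)
import Control.Exception (mask_)
import Data.Maybe (isJust)
import Data.Word (Word8)
import Foreign.Marshal.Alloc (free, mallocBytes)
import Foreign.Ptr (Ptr)

-- `Just ptr` means Haskell still owns the memory, `Nothing` means it
-- was handed over to C or already freed.
type Owned = MVar (Maybe (Ptr Word8))

newOwned :: Int -> IO Owned
newOwned size = mask_ $ do
  ptr <- mallocBytes size
  newMVar (Just ptr)

-- modifyMVarMasked_ keeps async exceptions masked while the callback
-- runs, so the hand-over and the state change cannot be torn apart.
handOver :: (Ptr Word8 -> IO ()) -> Owned -> IO ()
handOver give owned = modifyMVarMasked_ owned $ \st -> case st of
  Nothing  -> return Nothing
  Just ptr -> give ptr >> return Nothing

-- Frees the memory only if it is still owned; safe to call repeatedly.
release :: Owned -> IO ()
release owned = modifyMVarMasked_ owned $ \st -> do
  mapM_ free st
  return Nothing

isOwned :: Owned -> IO Bool
isOwned owned = isJust <$> readMVar owned
```

I use `modifyMVarMasked_` rather than `modifyMVar_` here because the latter restores async exceptions while the callback runs, which would reopen the window between the hand-over and the state update.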

I hope this helps!

Niklas

PS:
I work for a Haskell consultancy. If answers like this would help move your project forward, consider us :)