[Haskell-cafe] About pointer taken by C lib in FFI.

Mon Jan 14 18:30:20 UTC 2019

Thanks.

On Mon, Jan 14, 2019 at 10:42 PM Niklas Hambüchen <mail at nh2.me> wrote:
>
> Hi, nice question.
>
> > Does it track the "used by foreign lib"?
>
> No, `Ptr` is a simple primitive numeric value, like `void *` in C itself.
> GHC does not track what you do with it at all.
>
> The lifetime and ownership of the pointer depends on how you created it.
>
> For example, the `withCString` function of type
>     withCString :: String -> (Ptr CChar -> IO a) -> IO a
>     https://hackage.haskell.org/package/base-4.12.0.0/docs/Foreign-C-String.html#v:withCString
> used e.g. like
>     withCString "hello" $ \ptr -> do
>        -- do something with ptr here
> keeps the pointer alive exactly within the (do ...) block.
> Afterwards, the memory the `ptr` points to will be freed.
>
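
A minimal sketch of the `withCString` pattern described above, assuming the standard C library's `strlen` is available to link against. The name `cStringLength` is made up for illustration. The pointer is used only inside the callback, and only a plain Haskell value escapes it.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CSize (..))

-- Binding to strlen from the standard C library.
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- Hypothetical helper: the C string exists only while the callback runs.
-- The Int result is an ordinary Haskell value, so returning it is safe;
-- returning `ptr` itself would hand out a dangling pointer.
cStringLength :: String -> IO Int
cStringLength s = withCString s $ \ptr -> do
  n <- c_strlen ptr
  return (fromIntegral n)
```

Loaded in GHCi, `cStringLength "hello"` should count the five bytes of the marshalled string.
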
> Similar for `allocaBytes :: Int -> (Ptr a -> IO b) -> IO b`.
> You might do
>
>     allocaBytes 1000 $ \(ptr :: Ptr void) -> do
>        poke (castPtr ptr) ('c' :: Char)
>        pokeByteOff ptr 8 (1234 :: Word64)  -- at offset 8, so it does not overwrite the Char
>        -- call FFI function doing something with `ptr`
>
> and after allocaBytes itself returns, the memory is gone.
>
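
A small sketch of the `allocaBytes` pattern, under the assumption that we just want to write two fields into a temporary buffer and read them back. `roundTrip` is a made-up name, and a `Word8` stands in for the character so the layout is easy to see.

```haskell
import Data.Word (Word64, Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Storable (peekByteOff, pokeByteOff)

-- Hypothetical example: write two fields at different offsets of a
-- temporary 16-byte buffer, then read them back before the buffer
-- is released when allocaBytes returns.
roundTrip :: IO (Word8, Word64)
roundTrip = allocaBytes 16 $ \ptr -> do
  pokeByteOff ptr 0 (99 :: Word8)     -- byte 0: ASCII code of 'c'
  pokeByteOff ptr 8 (1234 :: Word64)  -- bytes 8..15
  tag <- peekByteOff ptr 0
  num <- peekByteOff ptr 8
  return (tag, num)
```

Only the copied-out values leave the block; the pointer itself must not be returned.
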
> Other functions, such as
>     malloc :: Storable a => IO (Ptr a)
>     mallocBytes :: Int -> IO (Ptr a)
> only allocate the memory and never free it, and you need to free it later yourself (you can also use C's `free()` on the C side for that).
>
> This may be what you want if you want the C code to take ownership of it.
>
> In that case, you must take care that this is async-exception safe, e.g. that you don't leak the allocated memory when an async exception comes in (e.g. from the `timeout` function or the user pressing Ctrl+C and you handling it and continuing).
> In general, one deals with async exceptions by using code blocks that temporarily disable them, like the `bracket :: IO a -> (a -> IO b) -> (a -> IO c) -> IO c` function does;
> see its docs at https://hackage.haskell.org/package/base-4.12.0.0/docs/Control-Exception-Base.html#v:bracket.
> Two examples of how non-bracketed `malloc` can go wrong:
>
> Example A (no ownership change involved):
>
>     ptr <- mallocBytes 1000
>     -- async exception comes in here
>     someOtherCodeDoingSomethingWith ptr
>     free ptr
>
> Example B:
>
>     ptr <- mallocBytes 1000
>     -- async exception comes in here
>     ffiCodeThatChangesOwnershipToCLibrary ptr
>
> In both cases the exception arrives before the pointer has been freed or handed over, so the allocated bytes become unreachable and are leaked for the rest of the program's lifetime.
> `bracket` can trivially solve the problem in example A, because the lifetime of `ptr` is lexically scoped.
>
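
For Example A, a bracketed version could look roughly like the sketch below. `useBuffer` is a hypothetical stand-in for `someOtherCodeDoingSomethingWith`, and the `Word8` element type is an assumption made only to have something to poke.

```haskell
import Control.Exception (bracket)
import Data.Word (Word8)
import Foreign.Marshal.Alloc (free, mallocBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek, poke)

-- Hypothetical stand-in for `someOtherCodeDoingSomethingWith`.
useBuffer :: Ptr Word8 -> IO Word8
useBuffer ptr = do
  poke ptr 42
  peek ptr

-- bracket runs `free` whether the body returns normally or is
-- interrupted by an exception, so the buffer cannot leak.
exampleA :: IO Word8
exampleA = bracket (mallocBytes 1000) free useBuffer
```
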
> But for the handover in example B, the lifetime is not lexically scoped.
>
> You generally have 2 approaches to do a safe hand-over to C for non-lexically-scoped cases:
>
> 1. malloc the memory on the C side in the first place, and pass the pointer to Haskell so it can poke values in. In this case, the C side has the ownership the entire time, so it both allocates and frees the memory.
>
> 2. Store the information whether Haskell still has the memory ownership somewhere, and always modify the pointer and this information together in some atomic fashion (for example using `bracket` so that it cannot be interrupted in the middle).
> The pointer and a mutable Bool reference would be such an information pair.
> Equivalent would be a double-pointer, where the outer pointer points to NULL to indicate that the memory is already owned by C.
>
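
The pointer-plus-mutable-Bool variant of approach 2 might be sketched as follows. Everything named here (`Owned`, `takeOwnership`, `giveToC`, `example`) is hypothetical; in particular `takeOwnership` only simulates a C library by freeing the buffer, as the library eventually would.

```haskell
import Control.Exception (bracket, mask_)
import Control.Monad (when)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import Data.Word (Word8)
import Foreign.Marshal.Alloc (free, mallocBytes)
import Foreign.Ptr (Ptr)

-- The pointer together with the "Haskell still owns it" flag.
data Owned = Owned { buffer :: Ptr Word8, stillOurs :: IORef Bool }

-- Hypothetical stand-in for the C function that takes ownership.
takeOwnership :: Ptr Word8 -> IO ()
takeOwnership = free

acquire :: IO Owned
acquire = do
  ptr <- mallocBytes 1000
  flag <- newIORef True
  return (Owned ptr flag)

-- Free the buffer only if ownership never moved to C.
release :: Owned -> IO ()
release (Owned ptr flag) = do
  ours <- readIORef flag
  when ours (free ptr)

-- Hand-over and flag update happen with async exceptions masked,
-- so the two can never be observed out of sync.
giveToC :: Owned -> IO ()
giveToC (Owned ptr flag) = mask_ $ do
  takeOwnership ptr
  writeIORef flag False

-- Returns the flag after the hand-over; release then skips the free.
example :: IO Bool
example = bracket acquire release $ \owned -> do
  giveToC owned
  readIORef (stillOurs owned)
```
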
> Below is a sketch of how to do it with the double-pointer approach:
>
>     bracket
>       acquireResource
>       releaseResource
>       (\ptrPtr -> do
>         ptr <- peek ptrPtr
>         poke (castPtr ptr) ('c' :: Char)
>         poke (castPtr ptr) (1234 :: Word64)
>         mask_ $ do -- we don't want to get interrupted in this block
>           ffiCodeThatChangesOwnershipToCLibrary ptr
>           poke ptrPtr nullPtr
>         -- do some more work here
>         return yourresult
>       )
>
>       where
>         acquireResource :: IO (Ptr (Ptr void))
>         acquireResource = do
>           ptrPtr :: Ptr (Ptr void) <- malloc
>           ptr :: Ptr void <- mallocBytes 1000
>           poke ptrPtr ptr
>           return ptrPtr
>
>         releaseResource :: Ptr (Ptr void) -> IO ()
>         releaseResource ptrPtr = do
>           -- If ptrPtr points to NULL, then the ownership change happened.
>           -- In that case we must not free `ptr`: the C side owns the memory now.
>           -- Otherwise, we still own the memory, and free it.
>           ptr <- peek ptrPtr
>           when (ptr /= nullPtr) $ free ptr
>           free ptrPtr
>
> (It is recommended to get familiar with `bracket` and `mask_` before trying to understand this.)
>
> The above works in the single-threaded case. If concurrency comes into play and parts of this might be executed by different threads, you'll naturally have to put locks (e.g. `MVar`s) around the `poke ptrPtr ...` and around the place where the `nullPtr` check is done.
>
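
For the concurrent case, one possible shape is to keep the pointer itself inside an `MVar`, with `nullPtr` meaning "already handed over". This is only a sketch: `takeOwnership`, `handOver`, `releaseIfOwned` and `example` are made-up names, and `takeOwnership` again just frees the buffer to simulate the C library. `modifyMVarMasked_` is used so that the hand-over and the update of the stored pointer cannot be separated by an async exception.

```haskell
import Control.Concurrent.MVar (MVar, modifyMVarMasked_, newMVar, readMVar)
import Control.Monad (when)
import Data.Word (Word8)
import Foreign.Marshal.Alloc (free, mallocBytes)
import Foreign.Ptr (Ptr, nullPtr)

-- Hypothetical stand-in for the C function that takes ownership.
takeOwnership :: Ptr Word8 -> IO ()
takeOwnership = free

-- Hand the buffer to C at most once; nullPtr marks "already handed over".
handOver :: MVar (Ptr Word8) -> IO ()
handOver var = modifyMVarMasked_ var $ \ptr -> do
  when (ptr /= nullPtr) (takeOwnership ptr)
  return nullPtr

-- Free the buffer only if Haskell still owns it.
releaseIfOwned :: MVar (Ptr Word8) -> IO ()
releaseIfOwned var = modifyMVarMasked_ var $ \ptr -> do
  when (ptr /= nullPtr) (free ptr)
  return nullPtr

example :: IO Bool
example = do
  var <- newMVar =<< mallocBytes 1000
  handOver var
  releaseIfOwned var  -- no double free: the MVar now holds nullPtr
  ptr <- readMVar var
  return (ptr == nullPtr)
```
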
> I hope this helps!
>
> Niklas
>
>
> PS:
> I work for a Haskell consultancy. If answers like this would help move your project forward, consider us :)

-- 
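Dense bamboo cannot keep the flowing water from passing through;
high mountains cannot stop the wild clouds from flying over.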

And for G+, please use magiclouds#gmail.com.