FFI Report, CVS Id 1.11

Thu Aug 23 04:05:06 EDT 2001

"Marcin 'Qrczak' Kowalczyk" <qrczak at knm.org.pl> wrote,

> Sun, 19 Aug 2001 23:14:59 +1000, Manuel M. T. Chakravarty <chak at cse.unsw.edu.au> pisze:
> 
> > * I am still not convinced that we need
> >   `Storable.destruct'.  For deallocating special purpose
> >   structures that need a deep traversal, shouldn't we just
> >   use a custom function?
> 
> Without destruct code like
> 
>     with haskellValue $ \ptr -> do
>         just use ptr, no allocation nor deallocation
> 
> would be correct in 99% of cases, except that for some types it
> leaks memory (when the only reasonable implementation of poke is not
> idempotent wrt. allocated memory).
> 
> These cases include char* inside a struct, assumed to point to
> a memory private to this struct.
> 
> Yes, it can be called by hand, just as C++ destructors could be
> called by hand. But places where it should be called are automatically
> predictable, as long as it is well defined which memory is initialized
> and which is not. Since with* functions automatically call poke, they
> should automatically call something which undoes side effects of poke.

I think, the crucial question is whether you really want to
have instances of poke that allocate memory (or any other
kind of resource).  I think, having such instances is
problematic, because it means that poke is no longer
idempotent, which I would usually expect it to be.

In other words, imagine I have code like

    with haskellValue $ \ptr -> do
      ...
      poke ptr newVal
      ...

For a type where `destruct' is needed, the above code would
leak memory (or whatever resource poke allocates).  Without
`destruct' in `Storable', you would have to write

    with haskellValue $ \ptr -> do
      ...
      myDestruct ptr
      poke ptr newVal
      ...
      myDestruct ptr

I think this is good, because it makes explicit the need for
`myDestruct' before the poke.

> > * I am also not really convinced about
> >   `MarshalUtils.withMany'.  There may be situations, where
> >   such a function is handy, but should it really be in the
> >   standard libraries?
> 
> It's not so easily written by hand. Explicit recursion is needed,
> AFAIK it can't be simply written in terms of foldr, mapM etc. So it
> requires a variable definition, and thus would be written in this
> very form where needed, instead of "inlined".
> 
> It's the mapM of the continuation monad, where the monad is
>     /\a. (a -> res) -> res

Especially in this case (ie, the function has a meaning
outside of the FFI context), I prefer SimonM's proposal of
moving it to Data.List.withMany or so.

> > PtrDiff
> > ~~~~~~~
> > Maybe after all, `PtrDiff' wasn't that bad an idea.  To
> > assume that a pointer difference fits into an `Int' (what we
> > do at the moment) is pretty dodgy.  Remember that all that
> > H98 requires of an `Int' is that it has >=30 bits.  IMHO,
> > this is pretty weak for a general representation of a
> > pointer difference.
> 
> Haskell already uses Int for these kinds of lengths: array indices
> are mapped to Int, default implementations of list functions use Int
> for measuring lengths.

Array and list indicies are quite different beats from
pointer differences.  To get into a problem with array
indicies, we would need an array (and thus, heap) that is
big enough to requires such an index.  In contrast, pointers
are not constrained be point into the heap.  For, example
with mmap(), I can easily get two pointers that are much
further apart than what would fit into the real (even the
virtual) memory of the concrete hardware.  In other words,
in x86 Linux, your address space is already 3GB.  We don't
need any new fancy hardware for 30 bit pointer differences
to have the potential to get us into trouble.

> On 64-bit processors Int is 64-bit (or 63-bit when targeting OCaml
> etc.), so the need of handling objects larger than a gigabyte in 32-bit
> architectures is temporary, if at all. Haskell has the advantage over
> C that there is no temptation to express all sized integer types as
> {char,short,int,long}, so there is no reason for Int to be too small.

It is basically a question of whether we want to rely on
Haskell implementers always choosing an Int representation
that is as wide as addresses are.  Or we could also say, do
we want to force them to always make such a choice if they
want a sensible FFI.

Anyway, I am not too concerned about this point.  As I said,
already on a normal PC, you can get into problems with
pointers obtained through mmap(), but there might not be
many applications where this is an issue.  In a sense, the
main point is, do we want to constrain all Haskell
implementations that implement the FFI to choose a
representation of Int whose range is the same as what the
kernel of the OS supports as an address range.

Cheers,
Manuel