[Haskell-cafe] [ANN] Haskell FFI Tutorial

Tue Nov 11 15:53:59 UTC 2014

On Mon, Nov 10, 2014 at 11:38 PM, Donn Cave <donn at avvanta.com> wrote:
> Maybe they don't!  I guess it isn't so much about exactly what you were
> up to, but for the sake of getting to whether there's an issue here for
> the tutorial, I wrote up a little example program, with CChar and Char.
> The commented alternatives work as well, at least it looks fine to me.
...
> - that means the Storable instance in question is CChar, and it looks
>   to me like poke reliably writes exactly one byte in this case,
>   whatever value is supplied (I also tried Int.)

I think Int is probably unsafe too, in theory if not in practice.

> - one might very well manage to keep all the poking to t fields in
>   the T Storable instance - that's what I'd expect the tutorial
>   to focus on.  Not that it makes any great difference, but I'm just
>   saying that the "ypoke" function in the example is there purely
>   for the purpose of testing that Char/CChar thing you're talking
>   about, and would be somewhat outside what I see as core usage.

Yes, it would be safe to say that all Haskell data types which are
serializable to C should have only C types.  That also avoids the
problem.  However, you have to convert between Haskell and C at some
point, and that means you wind up with C and Haskell duplicates of all
records, so each one is actually expressed in 3 places: the C struct,
the Haskell "CType" record, and the Haskell type record.  To me it
seemed the logical place to do Haskell to C type conversions was in
the poke method itself, but that's because I didn't think about the
corruption thing.
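
To make the "3 places" concrete, here is a minimal sketch.  The names
T, CT, toC and fromC are hypothetical, and the layout assumes a C
struct of the form `struct t { char a; char b; char c; };`, which is a
guess, since the actual struct is not shown in this thread.  Offsets are
written by hand so it runs without hsc2hs:

```haskell
import Foreign.C.String (castCCharToChar, castCharToCChar)
import Foreign.C.Types (CChar)
import Foreign.Marshal.Utils (with)
import Foreign.Storable (Storable (..))

-- Haskell-side record, using ordinary Haskell types.
data T = T { tA, tB, tC :: Char } deriving (Eq, Show)

-- C-side mirror: only C types, so every poke has the width C expects.
data CT = CT { ctA, ctB, ctC :: CChar } deriving (Eq, Show)

-- Assumed layout: struct t { char a; char b; char c; };
instance Storable CT where
  sizeOf _ = 3
  alignment _ = 1
  peek p = CT <$> peekByteOff p 0 <*> peekByteOff p 1 <*> peekByteOff p 2
  poke p (CT a b c) = do
    pokeByteOff p 0 a
    pokeByteOff p 1 b
    pokeByteOff p 2 c

-- Conversions live outside poke, so poke itself only ever sees C types.
-- castCharToCChar truncates code points above 255.
toC :: T -> CT
toC (T a b c) = CT (castCharToCChar a) (castCharToCChar b) (castCharToCChar c)

fromC :: CT -> T
fromC (CT a b c) = T (castCCharToChar a) (castCCharToChar b) (castCCharToChar c)

main :: IO ()
main = do
  let t = T 'x' 'y' 'z'
  -- Round-trip through memory laid out like the assumed C struct.
  t' <- with (toC t) (fmap fromC . peek)
  print (t' == t)
```

The cost is exactly the duplication described above: the record exists
as the C struct, as CT, and as T.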

You might think it's obvious, but most type errors are obvious.
People do obviously dumb things all the time, and the nice thing about
a type checker is that we get a compile error, not memory corruption.

Another reason you wind up with Storable instances of non-CType
records is Data.Vector.Storable.  It's very tempting to simply reuse
that instance to pass to C, and maybe it's initially fine because the
record has Ints or Word32s or something "safe", but then one day 2
years later someone who doesn't know about that adds a Char field and
now you're in trouble.
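
One way to turn that mistake into a compile error is to route pokes
through a class that only has instances for C types.  This is just a
sketch of the idea: the class CType and the function cPokeByteOff are
hypothetical names, not part of base.

```haskell
import Foreign.C.Types (CChar, CInt)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (Storable, peekByteOff, pokeByteOff)

-- Hypothetical marker class: instances exist only for types whose
-- Storable layout matches a C type.
class Storable a => CType a
instance CType CChar
instance CType CInt

-- Like pokeByteOff, but rejects non-C types at compile time.
cPokeByteOff :: CType a => Ptr b -> Int -> a -> IO ()
cPokeByteOff = pokeByteOff

main :: IO ()
main = allocaBytes 4 $ \p -> do
  cPokeByteOff p 0 (97 :: CChar)
  -- cPokeByteOff p 0 'a'   -- would not compile: no instance CType Char
  c <- peekByteOff p 0 :: IO CChar
  print c
```

With something like this, the person adding a Char field two years
later gets a missing-instance error instead of a silent 4-byte write.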

> ypoke :: Char -> Char -> Char -> IO T
> ypoke a b c = alloca $ \ tp -> do
>         (#poke struct t, a) tp a
>         (#poke struct t, b) tp b
>         (#poke struct t, c) tp c
>         peek tp

This is corrupting memory, since sizeOf 'c' == 4: each poke here picks
the Storable instance for Char and writes 4 bytes into a 1-byte field.
Like I said, it will probably look like it works because it's usually
just overwriting adjacent fields or perhaps alignment padding, or maybe
it's "safe" if it's on the stack, but you are likely to get mysterious
crashes under load.  Try changing the order of the pokes and see what
happens.
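
Here is a small way to see the width difference without actually
corrupting anything.  It's a sketch, not Donn's program: it pokes into
an 8-byte scratch buffer pre-filled with 0xFF, and bytesAfter is a
hypothetical helper name.  The exact byte order in the Char case
depends on the machine's endianness.

```haskell
import Data.Word (Word8)
import Foreign.C.String (castCharToCChar)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekByteOff, pokeByteOff, sizeOf)

-- Fill an 8-byte scratch buffer with 0xFF, run one write at offset 0,
-- and return all 8 bytes afterwards.
bytesAfter :: (Ptr Word8 -> IO ()) -> IO [Word8]
bytesAfter write = allocaBytes 8 $ \p -> do
  mapM_ (\i -> pokeByteOff p i (0xFF :: Word8)) [0 .. 7]
  write p
  mapM (peekByteOff p) [0 .. 7]

main :: IO ()
main = do
  print (sizeOf 'c')
  -- Poking a Char: the Storable instance for Char is used.
  asChar <- bytesAfter (\p -> pokeByteOff p 0 'a')
  -- Poking a CChar: convert first, so only one byte is written.
  asCChar <- bytesAfter (\p -> pokeByteOff p 0 (castCharToCChar 'a'))
  print asChar
  print asCChar
```

In the Char case the first four bytes are no longer 0xFF; in the CChar
case only the first one changes.  In a real struct of three chars,
those extra three bytes are the neighbouring fields or whatever happens
to follow the struct.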

I have to say I'm a bit surprised to be arguing for type safety vs.
"just remember to do the right thing and you won't get memory
corruption" on a Haskell list :)