[Haskell-cafe] hsc2hs and Storable (especially) unsafe

Evan Laforge qdunkan at gmail.com
Wed Jan 2 00:17:35 CET 2013

It so happens that on 64 bit OS X, haskell's Int is 64 bits, while C's int is
32 bits.  hsc2hs's #poke does no typechecking at all, so if you have

(#poke Struct, int_field) structp int -- where int :: Int

Then you probably are going to corrupt memory, and in a particularly
pernicious way (since chances are it won't segfault).  Similarly, don't poke
haskell Chars, it has to be CChar.  And what's even worse,
Foreign.withArrayLen gives you an Int, not a CInt!  So doing the obvious thing
and poking the pointer and length directly is wrong.  And besides, shouldn't
it be a CSize?  Another trap: C++'s bool is probably one byte, haskell's Bool
Storable says it's 4.  It's probably not even correct to assume Double =
CDouble and Float = CFloat, though it seems likely to be always true.

I carefully vetted all my #pokes but I'm annoyed that hsc2hs and Foreign let
me get away with this.  If Storable is for marshalling types to and from C,
then non-C types should not be Storable!  If Storable is useful for other
things (and it is!), then Foreign should define its own CStorable, that only
includes C types, and hsc2hs should use that!

While we're at it, fromIntegral is not a safe way to convert a haskell type to
a C one.  But it sure is a tempting one, because Foreign.C doesn't provide any
conversion functions (except for Bool).  We should have explicit conversion
functions that give the choice of crashing with an error or clamping the
number to the (smaller) C type's bounds.  I'm not sure if anyone wants what
fromIntegral does.

Or maybe I shouldn't be using hsc2hs in the first place.  I feel more and more
that it gives haskell a bad name, being low level and error prone.  I used
bindings-dsl for a libgit2 binding, and at least it automates Storable
instances, so it's less error-prone that way.  I'd like to generate instances
directly from .h files though, so I should probably check out c2hs.

Back in the day when I was deciding to use hsc2hs I noticed how it lacks
typechecking, but I thought that it's just a few fields, I can be careful when
writing them.  But one of the lessons from haskell (and static analysis in
general) is that "just be careful" is going to fail you some day, probably
sooner than you think.

So anyway, I wrote a ForeignC module that defines CStorable that only includes
C types.  At first I thought I could make Storable a superclass and just
re-export functions with more restritive signatures, but then it gets tricky
because you wind up needing the Storable peek and poke to declare Storable
instances.  So I defined a completely new CStorable, and all you need to do is
import ForeignC instead of Foreign, and change 'instance Storable ...' to
'instance CStorable ...'.  Unfortunately that apparently means copy-pasting
over the Storable-using utilities like 'with' and 'newArray'.

On the plus side, I dropped it into my .hsc modules, and it found several
*more* mistakes.

I was thinking of putting it on hackage (along with the conversion functions),
since other people might have made these same mistakes, but it's unpleasant
because it duplicates so much from Foreign.  Also, I only copied over the bits
I'm using, so it's not complete either.  Is there a better way to create safe
Storable instances for C?  I'd be happier with a solution that avoids the need
to write Storable instances in the first place, but hsc2hs is here now, and
used in a lot of places.

More information about the Haskell-Cafe mailing list