[jhc] some FFI questions

John Meacham john at repetae.net
Mon Jul 9 20:59:56 EDT 2007

On Sat, Jul 07, 2007 at 02:20:49PM -0400, Samuel Bronson wrote:
> First of all, I'm wondering what the intended uses of the lookupCType
> and lookupCType' functions are. It seems like lookupCType tells you
> the type that should be used in marshalling a type, and lookupCType'
> tells you the primitive type that is used in the representation of a
> type as well as how to extract the value from it. I'm not sure how to
> go from there to actually converting to/from the correct primitive
> type to use in interfacing with C code...

A lot of this is due to history and a major rewrite of jhc internals,
switching everything from a C-based view of the world to a C-- (or
assembly) one.

It used to be that raw Haskell types (basic numeric types) were
associated with a specific C type, such as 'int', and all that entails.
Things like 'int', 'unsigned', 'wchar_t', etc. were all independent
types, treated as opaque by jhc and threaded all the way through the
system from source to code generation. The actual operations performed,
such as unsigned versus signed multiply, were determined by the type.
Also, when this system was in place, first-class unboxed values did not
exist, which required the creation of data/primitives.txt to provide a
mapping of C primitive types to Haskell types so that jhc could build
the proper instances, constructors, etc. as appropriate.

In addition, this type determined the C calling convention used, so the
then aptly named 'lookupCType' functions basically took a name and
looked up the info in the appropriate primitive tables and whatnot.

Needless to say, this situation, although nice and straightforward at
first, was quite problematic. There were lots of things like casts
from 'wchar_t' to 'int' inhibiting optimizations. You would have to cast
a value explicitly in order to do a signed operation on it instead of an
unsigned one. You could not vary the calling convention used in FFI
calls independently of the type; for instance, you might want 'Char' to
translate to a 'wchar_t' calling convention even though you don't want
'wchar_t' to be the representation used for 'Char'.

Thus C/Op.hs and friends were born. Now primitive types are fully
specified and all operations are spelled out explicitly. rawTypes no
longer correspond to C types, but rather to a well-defined internal
format such as 'bits32', and all operations are fully specified (signed
multiply and unsigned multiply are independent operations, as are
floating point vs. non-floating point, etc.).
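
As a rough illustration of what it means for signedness to live in the
operation rather than in the type, here is a minimal Haskell sketch. The
names Bits32, BinOp and evalOp are made up for this example and are not
the definitions in C/Op.hs. It uses division and comparison instead of
multiply, because those make the signed/unsigned difference visible on a
result of the same width.

```haskell
import Data.Int (Int32)
import Data.Word (Word32)

-- Hypothetical raw type: 32 bits with no inherent signedness.
newtype Bits32 = Bits32 Word32
    deriving (Eq, Show)

-- Hypothetical operation set: the operation, not the type, says
-- whether the bits are interpreted as signed or unsigned.
data BinOp = Add | SDiv | UDiv | SLt | ULt
    deriving (Eq, Show)

-- Evaluate one operation. Division by zero is not handled here.
evalOp :: BinOp -> Bits32 -> Bits32 -> Bits32
evalOp op (Bits32 a) (Bits32 b) = Bits32 $ case op of
    Add  -> a + b
    UDiv -> a `quot` b
    SDiv -> unsigned (signed a `quot` signed b)
    ULt  -> fromBool (a < b)
    SLt  -> fromBool (signed a < signed b)
  where
    signed :: Word32 -> Int32
    signed = fromIntegral
    unsigned :: Int32 -> Word32
    unsigned = fromIntegral
    fromBool :: Bool -> Word32
    fromBool c = if c then 1 else 0

main :: IO ()
main = do
    let x = Bits32 0xFFFFFFFE   -- same bits: -2 if signed, 4294967294 if unsigned
        y = Bits32 2
    print (evalOp SDiv x y)     -- bit pattern of -1
    print (evalOp UDiv x y)     -- plain unsigned halving
    print (evalOp SLt x y, evalOp ULt x y)
```

The same two bit patterns give different answers under SDiv and UDiv,
and no cast is needed to pick one interpretation over the other.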

However, rather than change the E representation of types, a rawtype now
_must_ contain a serialized version of a C-- type, not a C type as it
used to. The mapping of names used in foreign declarations to the
calling convention is now independent of the Haskell representation of
the type. This is very nice when a type is a newtype of another and
actually represents a different C value.
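
To make the separation concrete, here is a small Haskell sketch of a
table that records the internal representation and the calling-convention
type as two unrelated fields. Everything in it is hypothetical: RawTy,
FfiInfo and ffiTable are invented names, ExtType is modelled as a plain
string wrapper, and the entries (including the widths) are illustrative
guesses rather than what jhc actually uses.

```haskell
-- Hypothetical internal representations, in the spirit of 'bits32'.
data RawTy = Bits8 | Bits16 | Bits32 | Bits64
    deriving (Eq, Show)

-- C type name used only to choose the calling convention (assumed shape).
newtype ExtType = ExtType String
    deriving (Eq, Show)

data FfiInfo = FfiInfo
    { rawRep   :: RawTy    -- how the value is represented internally
    , callType :: ExtType  -- which C type is used at the FFI boundary
    } deriving (Eq, Show)

-- Illustrative entries only: two types share a representation
-- but are passed to C with different calling-convention types.
ffiTable :: [(String, FfiInfo)]
ffiTable =
    [ ("Int",  FfiInfo Bits32 (ExtType "int"))
    , ("Char", FfiInfo Bits32 (ExtType "wchar_t"))
    ]

main :: IO ()
main = do
    print (lookup "Char" ffiTable)
    -- Same representation, different calling convention:
    print (fmap rawRep (lookup "Int" ffiTable) == fmap rawRep (lookup "Char" ffiTable))
    print (fmap callType (lookup "Int" ffiTable) == fmap callType (lookup "Char" ffiTable))
```

Changing the callType of an entry leaves its rawRep untouched, which is
the independence described above.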

However, this change is very recent and there is still a lot of code
around that refers to C types, along with old tables for looking up
properties of C types that aren't used anymore. The lookupCType
functions will be replaced by nicer, more explicit querying operations.

The C type used in the calling convention is what is actually stored in
the ExtType parameter in C/Prim. It is completely independent of the E or
Grin type and only comes up again in code generation. Explicit casts are
not needed because the code generator knows how to insert them, and there
may be no analog of the C type in Grin or E (in fact, C may never come
into play if using a native code generator).

> Another question is: what sort of E type should an imperative foreign
> export use? Consider:
> foreign export hello :: Int -> IO ()
> hello n = putStrLn ("Hello, "++show n)
> foreign export foo :: IO Int
> foo :: Num a => IO a
> foo = return 1
> Should they just use IO? I guess that's what main does...

I am not sure what you mean. Foreign exports are 'metainfo' and need not
affect anything at all until code generation, when a proper alias or
exported symbol is emitted by the code generator. (gcc provides a way to
create explicit aliases, but gcc inlining should make that moot.)

For that case, since you can't export 'foo' directly (it has a different
type), a placeholder function should be generated in the compiler's
private namespace.

Something like:

F@.fMain.foo = foo Int

Then F@.fMain.foo would be annotated with the proper foreign export.
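
At the source level, the placeholder described above is roughly
equivalent to a monomorphic wrapper around the polymorphic definition.
The sketch below is only an analogy: fooExport is a made-up name, and in
jhc the placeholder would be generated internally rather than written by
hand.

```haskell
-- Polymorphic definition, as in the question.
foo :: Num a => IO a
foo = return 1

-- Hypothetical stand-in for the compiler-generated placeholder:
-- 'foo' instantiated at Int, so it has exactly the exported type.
-- A declaration such as
--     foreign export ccall "foo" fooExport :: IO Int
-- would then attach to this wrapper instead of to 'foo' itself.
fooExport :: IO Int
fooExport = foo

main :: IO ()
main = fooExport >>= print
```

The wrapper fixes the type argument once, so the exported symbol has a
single concrete type while 'foo' stays polymorphic for Haskell callers.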

I don't think IO ever needs to be treated specially other than setting
the IOLike flag properly and making sure the world argument is added in
the right spot.
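
One way to picture the world argument is the pure model below. It is a
sketch under stated assumptions, not jhc's actual IO definition: World,
IOLike and putLine are hypothetical names, the "world" is just a log of
output lines so the example stays pure, and placing the world after the
ordinary arguments is a choice made for this sketch.

```haskell
-- Hypothetical world: a log of the lines written so far.
newtype World = World [String]
    deriving (Eq, Show)

-- An IO-like action takes the world and returns the new world
-- together with its result.
type IOLike a = World -> (World, a)

-- Stand-in for putStrLn in this model.
putLine :: String -> IOLike ()
putLine s (World out) = (World (out ++ [s]), ())

-- hello :: Int -> IO ()  becomes a function with one extra argument,
-- the world, after the ordinary Int argument.
hello :: Int -> IOLike ()
hello n = putLine ("Hello, " ++ show n)

main :: IO ()
main = do
    let (World out, ()) = hello 42 (World [])
    mapM_ putStrLn out
```

In this model an exported IO function is an ordinary function with one
more parameter, so nothing else about it needs special treatment.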


John Meacham - ⑆repetae.net⑆john⑈
