Proposal - Foreign enum support

Merijn Verstraaten merijn at inconsistent.nl
Thu Apr 17 14:19:21 UTC 2014


Cross-post to haskell-prime in case there's any interest for including this into the report's FFI specification.

Proposal - Foreign enum support
===============================

At the moment the FFI does not have a convenient way with interacting enums
(whether proper enums or CPP defines) in C (like languages). Both enums and CPP
defined enums are major parts of large C APIs and they are thus crucial to
writing foreign bindings. A few examples:

SDL_image defines the following enum:

    typedef enum {
        IMG_INIT_JPG = 0x00000001,
        IMG_INIT_PNG = 0x00000002,
        IMG_INIT_TIF = 0x00000004,
        IMG_INIT_WEBP = 0x00000008
    } IMG_InitFlags;

OpenCL specifies the following typedefs + CPP defined enum:

    typedef uint32_t  cl_uint     __attribute__((aligned(4)));
    typedef cl_uint   cl_platform_info;

    /* cl_platform_info */
    #define CL_PLATFORM_PROFILE                         0x0900
    #define CL_PLATFORM_VERSION                         0x0901
    #define CL_PLATFORM_NAME                            0x0902
    #define CL_PLATFORM_VENDOR                          0x0903
    #define CL_PLATFORM_EXTENSIONS                      0x0904

OpenCL functions will return the above CPP defines as return values of type
cl_platform_info.

Current Solutions
-----------------

In many cases someone wrapping such a C library would like to expose these
enums as a simple sum type as this has several benefits: type safety, the
ability to use haskell constructors for pattern matching, exhaustiveness
checks.

Currently the GHC FFI, as specified by Haskell2010, only marshalls a small set
of foreign types and newtypes with exposed constructors of these types. As such
there seem two approaches to wrap these enums:

 1. Implement an ADT representing the enum and write a manual conversion
    function between the ADT and the corresponding C type (e.g. CInt -> Foo and
    Foo -> CInt).

 2. Use a tool like c2hs to automatically generate the ADT and conversion
    function.

In both cases the foreign functions are imported using the corresponding C type
in their signature (reducing type safety) and the user is forced write trivial
wrappers for every imported function to convert the ADT to the relevant C type
and back.

This is both tedious to write and costly in terms of code produced, in case of
c2hs one calls "toEnum . fromIntegral" and "fromIntegral . fromEnum" for every
argument/result even though this could trivially be a no-op.

Worse, since c2hs uses the Enum class for it's conversion to/from C types it
generates Enum instances like:

    instance Enum Foo where
        fromEnum Bar = 1
        fromEnum Baz = 1337

        toEnum 1 = Bar
        toEnum 1337 = Baz
        toEnum unmatched = error ("PlatformInfo.toEnum: Cannot match " ++ show unmatched)

Since succ/pred and enumFromTo's default implementations assume enums convert
to continuous sequence of Int this means the default generated enum instances
crash. This problem could be overcome by making c2hs' code generation smarter,
but this does not eliminate the tediousness of wrapping all foreign imported
functions with marshalling wrappers, NOR does it eliminate the overhead of all
this useless marshalling.

Proposal
--------

Add a new foreign construct for enums, the syntax I propose below is rather
ugly and ambiguous and thereforeopen to bikeshedding, but I prefer explaining
based on a concrete example.

    foreign enum CInt as Foo where
        Bar = 1
        Baz
        Quux = 1337
        Xyzzy = _

This would introduce a new type 'Foo' with semantics approximately equivalent
too "newtype Foo = Foo CInt" plus the pattern synonyms "pattern Bar = Foo 1;
pattern Baz = 2; pattern Quux = 1337; pattern Xyzzy = Foo _".

Explicit listing of the value corresponding to a constructor should be
optional, missing values should just increment by one from the previous (like
C), if the initial value is missing, it should assume to start from 0. Values
do not need to be contiguous.

Users should be able to use these constructors as normal in pattern match
(really, this mostly follows to semantics of the above pattern synonyms).

The foreign import/export functionality should invisibly marshall Foo to the
underlying foreign type (as is done for newtypes).

I'm unsure about the support for a wildcard constructor like Xyzzy. If there is
support for a wildcard, it should be optional. On the upside a wildcard means
the marshalling is no longer a partial function. The downside is that it makes
desugaring the use of enums in patterns harder. It seems clear that

    f Xyzzy = {- ... -}
    f Bar = {- ... -}
    f Baz = {- ... -}
    f Quux = {- ... -}

Should not have the same semantics as:

    f (Foo _) = {- ... -}
    f (Foo 1) = {- ... -}
    f (Foo 2) = {- ... -}
    f (Foo 1337) = {- ... -}

So in the presence of wildcards, the Foo enum can't trivially be desugared into
pattern synonyms after checking exhaustiveness.

Pros:
 1. Foreign imports are slightly more type safe, as one can now write:

        foreign import ccall "someFoo.h" someFoo :: Foo -> Ptr () -> IO ()

    Preventing users from passing an arbitrary CInt to an argument expecting a
    specific enum.

 2. No need to write marshalling functions to/from ADT to obtain exhaustiveness
    checks and pattern matching

 3. Cheaper as marshalling Foo to CInt is a no-op

 4. toEnum/fromEnum can simply map to contiguous sequence of Int as this Int
    mapping is no longer used for marshalling

Cons:
 1. Non-standard extension of the FFI

 2. Someone has to implement it

 3. Wildcards constructors would present difficulties desugaring pattern
    matches to a simple newtype.

 4. ??

What Would Need to be Done?
---------------------------

1. Parser needs to be extended to deal with parsing of enum declarations.
2. Pattern matches of an enum type need to be checked for exhaustiveness and
   desugared to the underlying type's representation.
3. Extend foreign imports/exports to marshall enums properly.

If there's no objections I'm willing to take a stab at implementing this,
although I'd probably need some help with GHC's internals (although I could bug
#ghc for that).

Cheers,
Merijn
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: Message signed with OpenPGP using GPGMail
URL: <http://www.haskell.org/pipermail/ghc-devs/attachments/20140417/52fb4872/attachment.sig>


More information about the ghc-devs mailing list