[C2hs] swig

Manuel M T Chakravarty chak@cse.unsw.edu.au
Thu, 08 May 2003 17:52:33 +1000 (EST)


Amanda Clare <afc@aber.ac.uk> wrote,

> Are there any future plans to go with the SWIG group 
> http://www.swig.org/ for interfacing with C code? They claim to support 
> Perl, Python, Tcl/Tk, Ruby, Guile, MzScheme, Java, OCAML, CHICKEN, and 
> C#, and as I've understood, you just run something like "swig -python" 
> on your header file to create everything you need to interface C to Python.
> 
> At the moment, I'd like to generate Haskell access to all the functions 
> and enums defined in the Oracle database C library libsqlora8 
> http://www.poitschke.de/libsqlora8/.
> Something like SWIG sounds ideal for that. C2hs looks good. But somehow, 
> if I do it in c2hs I have to read and understand all about the ffi, 
> stable pointers and foreign pointers etc. Is the problem just too 
> complicated in Haskell to automate completely?

The short answer is "yes."

The c2hs paper <http://www.cse.unsw.edu.au/~chak/papers/papers.html#c2hs>
says the following about SWIG:

  SWIG works well for untyped scripting languages, such as
  Tcl, Python, Perl, and Scheme, or C-like languages, such
  as Java, but the problem with typed functional languages
  is that the information in the C header file is usually
  not sufficient for determining the interface on the
  functional-language side.  As a result, additional
  information has to be included into the C header file,
  which leads to maintenance overhead when new versions of
  an interfaced C library appear.  This is in contrast to
  the use of pristine C header files complemented by a
  separate high-level interface specification as favoured in
  C->Haskell.

I have to admit that I didn't have a look at how SWIG
handles OCaml (it didn't have that support at the time the
paper was written), though.

To illustrate the quoted text a bit, consider for example
that many C programs just use values of type `int' to
represent a Boolean value.  They may then go on to include

  #define TRUE  1
  #define FALSE 0

to make the code a bit more readable.  Using `Int' for
Booleans is clearly not acceptable in Haskell, but presented
with a prototype of the form

  int foo (int x);

should the Haskell signature be

  foo :: Int  -> Int  ,
  foo :: Bool -> Int  ,
  foo :: Int  -> Bool , or
  foo :: Bool -> Bool ?

As a consequence, the design of a Haskell API for a C
library requires an understanding of the *semantics* of the
involved types and functions.  Hence, it requires human
intervention.
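
As a minimal sketch of why the choice cannot be made
mechanically: all four signatures above can be wrapped
around the same raw function, and each wrapper type-checks.
Here `rawFoo' is a hypothetical pure stand-in for the C
function (it is not a real binding; it is defined in Haskell
so that the example runs without a C library), and the
wrapper names are made up for illustration.

```haskell
import Foreign.C.Types (CInt)
import Foreign.Marshal.Utils (fromBool, toBool)

-- Hypothetical stand-in for `int foo (int x);'.  Here it behaves
-- like logical negation, but nothing in the type says so.
rawFoo :: CInt -> CInt
rawFoo x = if x == 0 then 1 else 0

-- Four wrappers, one per candidate signature.  All of them compile;
-- only knowledge of what foo *means* tells you which one is right.
fooIntInt :: Int -> Int
fooIntInt = fromIntegral . rawFoo . fromIntegral

fooBoolInt :: Bool -> Int
fooBoolInt = fromIntegral . rawFoo . fromBool

fooIntBool :: Int -> Bool
fooIntBool = toBool . rawFoo . fromIntegral

fooBoolBool :: Bool -> Bool
fooBoolBool = toBool . rawFoo . fromBool
```

For this particular stand-in, `fooBoolBool' is the sensible
choice, but a tool that sees only the prototype has no way
of knowing that.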

Consequently, tool support can follow either of two routes:

(1) Automatically generate a raw and ugly interface from the
    C header file (which, in particular, maps all use of a C
    `int' to a Haskell `Int', independent of whether that
    `int' represents a Boolean value).  Then, write a normal
    Haskell module that exports a nice Haskell-ised API and
    implements it by calling the functions from the raw and
    ugly interface.  I call this additional code "impedance
    matching code."

(2) Use the C header together with some extra information
    that describes the mapping of C types to Haskell types
    to directly generate a nice Haskell-ised API.
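
To make Route (1) concrete, here is a hand-written sketch
using the standard C function `isdigit' from <ctype.h>,
whose prototype is `int isdigit (int c);'.  The raw import
mirrors the C types one to one; the wrapper is the impedance
matching code.  The Haskell names `c_isdigit' and `isDigitC'
are chosen for illustration only, and declaring the import
as pure assumes that the locale does not change while the
program runs.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Data.Char (ord)
import Foreign.C.Types (CInt(..))
import Foreign.Marshal.Utils (toBool)

-- Raw and ugly layer: a direct image of the C prototype.
foreign import ccall unsafe "ctype.h isdigit"
  c_isdigit :: CInt -> CInt

-- Impedance matching code: the Haskell-ised API.
-- isdigit is only defined for unsigned char values (and EOF),
-- so anything above 255 is rejected before calling into C.
isDigitC :: Char -> Bool
isDigitC ch
  | ord ch > 255 = False
  | otherwise    = toBool (c_isdigit (fromIntegral (ord ch)))
```

With Route (2), the knowledge that the argument is a
character and the result is a Boolean would instead be
written down once in the extra information, and a wrapper
like `isDigitC' would be generated.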

SWIG follows Route (1), although it permits annotating C
headers to get some of the benefits of Route (2).[1]
C->Haskell follows Route (2).  The extra information is
exactly what is contained in the binding modules.  I prefer
this route as it leaves scope for generating some of the
repetitive patterns in the impedance matching code
automatically and hence leads to less overall effort.

The main advantage of Route (1) is that it makes it possible
to generate a raw and ugly interface really quickly and, if
you don't care about making a proper Haskell library out of
it, allows you to code your application directly on that raw
interface.  In other words, it lowers the barrier to entry,
even if it increases the overall effort.

Consequently, I am very interested in lowering the barrier
to entry for working with c2hs.  To do so, I have had writing
a tutorial on my list for quite a while - I just never seem
to get the time to actually do it :-/  In addition, it might
be worthwhile to extend the existing function hooks

  http://www.cse.unsw.edu.au/~chak/haskell/c2hs/docu/c2hs-3.html#ss3.7

such that supplying a Haskell type is optional and there is
a default mapping to Haskell for every C type.  The result
would be a function binding like the one SWIG would generate.
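
One way to picture such a default mapping is a type class
that assigns each C type a default Haskell type together
with a conversion.  This is only a sketch of the idea, not
how c2hs implements or would implement it; the class
`DefaultHs', the type family `Hs', and the method `toHs' are
hypothetical names.

```haskell
{-# LANGUAGE TypeFamilies #-}
import Foreign.C.String (castCCharToChar)
import Foreign.C.Types (CChar, CDouble, CInt)

-- Hypothetical class: every C type gets one default Haskell type.
class DefaultHs c where
  type Hs c
  toHs :: c -> Hs c

-- C `int' defaults to `Int', whether or not it encodes a Boolean.
instance DefaultHs CInt where
  type Hs CInt = Int
  toHs = fromIntegral

instance DefaultHs CDouble where
  type Hs CDouble = Double
  toHs = realToFrac

instance DefaultHs CChar where
  type Hs CChar = Char
  toHs = castCCharToChar
```

A hook that omits the Haskell type would fall back to the
default, and one that supplies a type (say `Bool' for a C
`int') would override it.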

Cheers,
Manuel

[1] IMHO annotating C headers is a big no-no.  You want to
    work from pristine headers to simplify tracking of
    successive versions of the C library.