The future of Haskell discussion

Marcin 'Qrczak' Kowalczyk qrczak@knm.org.pl
14 Sep 2001 18:52:19 GMT


Fri, 14 Sep 2001 11:51:14 +0100 (BST), D. Tweed <tweed@cs.bris.ac.uk> writes:

> As a general question (and forgive my ignorance): are the various FFIs
> implemented using something like `dlopen' or are they done by actually
> putting suitable stubs into the Haskell generated C-code which then gets
> compiled by the C compiler as part of the overall Haskell compilation?

The latter.

> (1) Firstly there's calling code where the interface is basically
> C but compiled with C++; at this level there's the issue of name
> mangling (accounting both for argument structure and namespace
> effects) and any global stuff in the C++ code (e.g., ensuring global
> objects construction/destruction happens at times that are ok).

This should be easy.
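
A minimal sketch of the name mangling part, assuming a C++ library
function that is to be reached through a C-oriented FFI stub. The
namespace `imaging` and both function names are hypothetical:

```cpp
#include <cassert>  // for the check in the usage note

namespace imaging {
// Ordinary C++ function: its linker symbol is mangled with the
// namespace and the argument types.
int brightness(int r, int g, int b) { return (r + g + b) / 3; }
}

// C linkage: the linker symbol is the plain name "imaging_brightness",
// with no namespace or argument encoding, so a stub generated for a C
// function can refer to it directly.
extern "C" int imaging_brightness(int r, int g, int b) {
    return imaging::brightness(r, g, b);
}
```

A caller sees an ordinary C function, e.g.
assert(imaging_brightness(30, 60, 90) == 60). Construction and
destruction of global C++ objects is a separate issue which this
sketch does not address.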

> (2) Then there's being able to invoke the method of an object
> without caring about moving `inside object' information back into
> Haskell (e.g., calling the colour_segment() member of an object of
> the Image class). Here the object is essentially just acting as a
> `struct with attached function pointers'.

hsc2hs can be used to do similar things in two ways:

- either a Haskell module is preprocessed by a dynamically generated
  C program which knows object layout by including a header and uses
  it to fill special syntactic constructs with the right constants,

- or C function wrappers are created which are compiled by the C
  compiler.

c->hs goes a third way:

- a Haskell program reads C headers, computes object layout using
  its own logic, and fills special syntactic constructs with the right
  constants.
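
The first hsc2hs approach above can be illustrated by a helper that
lets the compiler itself compute the layout and prints the results as
Haskell definitions. This is only a sketch of the idea, not the code
hsc2hs generates; the struct `Pixel` and the Haskell names are
hypothetical:

```cpp
#include <cassert>  // for the check in the usage note
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical struct standing in for something from a library header.
struct Pixel {
    unsigned char r, g, b;
    int weight;
};

// Produces Haskell constant definitions whose values come from the
// compiler that also compiles the library, so padding and alignment
// are correct by construction.
std::string emit_layout() {
    std::ostringstream out;
    out << "pixelSize :: Int\n"
        << "pixelSize = " << sizeof(Pixel) << "\n"
        << "pixelWeightOffset :: Int\n"
        << "pixelWeightOffset = " << offsetof(Pixel, weight) << "\n";
    return out.str();
}
```

The text returned by emit_layout() would be spliced into the Haskell
module; for example it contains the line "pixelSize = N" where N is
sizeof(Pixel) on the target platform.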

hsc2hs can be specialized for some C++ constructs, although its
convenience is limited by the simplicity of the way it works. For
example, it can't automatically translate the full contents of types
between C++ and Haskell (unless its Haskell side implements a C++
parser). It can do whatever a C++ program which includes the headers
can do with the marked parts of a Haskell source, accompanied by simple
decisions taken by its Haskell side, which sees raw C text (this now
includes extraction of the "declarative" part of a definition, which
should go to a C header).

I will see what can be done with C++-specific constructs, but before
that ghc should be changed to be able to compile *.cc files (hsc2hs
calls a Haskell compiler to compile C files, because it inserts
necessary include path options).
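
For point (2), the wrapper route might look like the following once
such a file can be compiled as C++. Only the names Image and
colour_segment come from the question; the class body and the wrapper
names are hypothetical:

```cpp
#include <cassert>  // for the check in the usage note

// Hypothetical class; its real contents are irrelevant to the caller.
class Image {
public:
    explicit Image(int width) : width_(width) {}
    int colour_segment(int x) { return x % width_; }
private:
    int width_;
};

// The foreign side only ever holds an opaque pointer and calls these
// C-linkage functions; nothing about the object layout leaks out.
extern "C" {
void *image_new(int width) { return new Image(width); }

int image_colour_segment(void *img, int x) {
    return static_cast<Image *>(img)->colour_segment(x);
}

void image_delete(void *img) { delete static_cast<Image *>(img); }
}
```

Usage from the C side: void *img = image_new(10); then
image_colour_segment(img, 23) gives 3, and image_delete(img) frees the
object.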

c->hs has more power in its hands, at the cost of having to reimplement
parts of the logic of a C++ compiler. It can perform some sophisticated
structural translation. But it can't be fed C++ headers at all
until it is prepared to understand them - its parser currently
understands only C headers.

> (3) Then there's being able to actually use objects more fully via
> a Haskell type/type-class wrapper of some sort (so that for example
> objects of C++-class X_cpp with a member function

This is surely beyond the scope of hsc2hs. IMHO it should not be an
integral part of the FFI, because there are too many design decisions
to make. Instead, tools should make it easy to implement
library-specific conventions.

> (4) Finally there's being able to propagate C++ exceptions into Haskell,
> using either Haskell exceptions or some other representation. (This is
> clearly incredibly hard, but I believe some package (forget which) manages
> to propagate C++ exceptions into Python exceptions.)

There are two approaches: teach the code generator of a Haskell
compiler how to cooperate with C++ exceptions, or design tools which
make wrapping functions for this case not too inconvenient. Both ways
are hard.

Unfortunately we need both directions. Currently Haskell exceptions
are not properly propagated through C functions. And extern "C" is
incompatible with exceptions, so until the code generator can handle
them itself, exceptions must always be translated manually to other
means.
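
One possible shape of such a manual translation, in the C++ to foreign
caller direction, is a wrapper that catches everything and reports
failure through a status code and a message buffer. The function names
and the status convention are hypothetical:

```cpp
#include <cassert>  // for the check in the usage note
#include <cstddef>
#include <cstring>
#include <exception>
#include <stdexcept>

// Hypothetical C++ function which may throw.
int parse_digit(char c) {
    if (c < '0' || c > '9') throw std::invalid_argument("not a digit");
    return c - '0';
}

// C-linkage wrapper: no exception escapes it.
// Returns 0 on success, 1 for a std::exception (message copied into
// err), 2 for anything else.
extern "C" int parse_digit_c(char c, int *result,
                             char *err, std::size_t err_len) {
    try {
        *result = parse_digit(c);
        return 0;
    } catch (const std::exception &e) {
        if (err && err_len > 0) {
            std::strncpy(err, e.what(), err_len - 1);
            err[err_len - 1] = '\0';
        }
        return 1;
    } catch (...) {
        if (err && err_len > 0) err[0] = '\0';
        return 2;
    }
}
```

The foreign caller checks the status and raises its own exception if
it wishes: parse_digit_c('7', &v, msg, sizeof msg) returns 0 and sets
v to 7, while passing 'x' returns 1 and leaves "not a digit" in msg.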

> Indeed, with a potential additional problem: many template functions are
> written as inlines in header files, so that if I try and use a template
> function I wrote years ago in a .cc file containing a C++ class I defined
> yesterday I silently get the correct code compiled into the new .o
> file.

Actually in ISO C++ there is an 'export' keyword which allows
template bodies to be kept in .cc files, but... no compiler implements
it (and almost none even tries). It's incredibly hard - it doesn't fit
the traditional C++ compilation and linking model at all.

The main problem is that the instantiated template code should be
compiled using a strange mix of the environments at the point of
template definition and at the point of instantiation. Depending on
the types involved, a name is looked up in either environment. A C++
template is neither a textual substitution nor semantically polymorphic
code, but something in between.
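
A small sketch of the two environments, assuming a compiler which
implements the standard lookup rules. All names are hypothetical:

```cpp
#include <cassert>  // for the check in the usage note

int which(double) { return 1; }   // visible where the template is defined

template <typename T>
int call_nondependent() {
    // The argument does not depend on T, so the name is bound at the
    // point of definition, where only which(double) is visible.
    return which(0);
}

int which(int) { return 2; }      // declared too late for the call above

namespace lib { struct Widget {}; }

template <typename T>
int call_dependent(T t) {
    // The argument depends on T, so lookup is postponed; the function
    // is found in the namespace of T at the point of instantiation.
    return describe(t);
}

namespace lib {
int describe(Widget) { return 3; } // declared after the template
}
```

With a conforming compiler, call_nondependent<int>() returns 1 even
though which(int) would be a better match, while
call_dependent(lib::Widget()) returns 3 although describe was declared
after the template. A purely textual substitution would give 2 for the
first call.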

It used to be more like textual substitution; at least the old
Borland compiler implemented it this way - syntax errors were not
recognized before instantiation, only braces were matched to find the
end of the function or class. Now parts which don't depend on the
template parameters are treated more "semantically", and there must be
some hints in the syntax ('typename', 'template') so the template can
be fully parsed before instantiation and partially typechecked.
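
The two hints look like this in practice. The class `Converter` and the
function `first_or` are hypothetical:

```cpp
#include <cassert>  // for the check in the usage note
#include <vector>

struct Converter {
    template <typename U>
    U to(int x) const { return static_cast<U>(x); }
};

template <typename Container, typename Conv>
typename Container::value_type   // 'typename': this dependent name is a type
first_or(const Container &c, const Conv &conv, int fallback) {
    typedef typename Container::value_type Value;
    if (!c.empty()) return c.front();
    // 'template': the '<' after 'to' opens a template argument list,
    // it is not a less-than operator.
    return conv.template to<Value>(fallback);
}
```

Without the hints the parser could not decide, before Container and
Conv are known, whether Container::value_type names a type or whether
conv.to < Value is a comparison. With an empty std::vector<long> the
call first_or(v, Converter(), 5) returns 5.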

> If I try to `glue' together a template function and a new C++
> type (where they haven't been used together otherwise) where does
> the new instantiation go; do I have to go around adding explicit
> instantiation requests in the C++ source?

The C++ compiler must at least once see the template body instantiated
with the right types. It's usually not smart enough to avoid compiling
the same instantiation repeatedly, but it should be able to remove the
duplicate code during linking.
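
If no C++ code happens to use the combination, an explicit
instantiation in some .cc file is one way to make the compiler see it.
A sketch, with a hypothetical template `twice`, type `Money` and
wrapper name:

```cpp
#include <cassert>  // for the check in the usage note

// Hypothetical template which would normally live in a header.
template <typename T>
T twice(T x) { return x + x; }

// Hypothetical new type which was never used with twice before.
struct Money {
    long cents;
    Money operator+(Money o) const {
        Money r = { cents + o.cents };
        return r;
    }
};

// Explicit instantiation: the code for twice<Money> is generated in
// this object file even if nothing else in C++ calls it.
template Money twice<Money>(Money);

// C-linkage entry point for a foreign caller.
extern "C" long twice_money_cents(long cents) {
    Money m = { cents };
    return twice(m).cents;
}
```

Here the wrapper itself already uses twice<Money>, which would trigger
an implicit instantiation anyway; the explicit request matters when the
only user is outside the C++ source. twice_money_cents(250) gives 500.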

-- 
 __("<  Marcin Kowalczyk * qrczak@knm.org.pl http://qrczak.ids.net.pl/
 \__/
  ^^                      SYGNATURA ZASTĘPCZA
QRCZAK