Library archives

Alastair Reid reid at cs.utah.edu
Mon Jun 17 08:52:43 EDT 2002


SimonPJ:
> However, I *believe* that the square bracket part is like the module
> name in a Haskell qualified name... it says where the name comes
> from.  It makes sense to be able to specify that on a call-by-call
> basis.  Two different libraries might have a procedure with the same
> name, and you might want to import both into the same FFI interface.

[Note: in all of the following I am talking about the C-specific part
of the ffi spec.  Other languages (Java, .net, etc.) treat namespaces
differently and would have different stuff in their language-specific
parts.]

This touches on the subject of static/dynamic linking namespaces which
was the cause of Warren Burton's recent Darwin problem (on the Hugs
list).

With a single namespace (such as that provided by the Unix linker ld
or the Unix dynamic linker dlopen(...,RTLD_GLOBAL|...)), there can
only be one symbol with each name.  This makes it impossible to have
two libraries with identically named symbols.  It probably doesn't
matter whether you create a Haskell binding for the multiply defined
symbols - their mere existence is enough to hose you.  Conflicts
between symbols in the library and the main program (e.g., GHCi or
Hugs) will also hose you.  I believe this is what GHC uses (because it
is what ld does) and it may also be what GHCi uses.

With a per-library namespace (such as that provided by dlopen when you
don't specify RTLD_GLOBAL), each module has its own namespace and so
different libraries can have overlapping names.  It is probably still
possible to have conflicts between symbols in the main program and
symbols in the library.  This is what Hugs/GreenCard and Hugs/FFI use
and is what you need to do to make Simon's example work.
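
A minimal sketch of the two dlopen modes described above, assuming a
POSIX system with <dlfcn.h>.  The helper names open_library and
lookup_symbol are hypothetical and are not taken from Hugs or GHC.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Open a library either in the shared global namespace (RTLD_GLOBAL)
 * or in a private, per-library namespace (RTLD_LOCAL).
 * Returns NULL on failure; dlerror() describes the problem. */
void *open_library(const char *path, int global)
{
    int scope = global ? RTLD_GLOBAL : RTLD_LOCAL;
    return dlopen(path, RTLD_NOW | scope);
}

/* Resolve `symbol` through one specific handle.  With a handle opened
 * using RTLD_LOCAL, the handle plays the role of the qualifier: it
 * says which library the name comes from. */
void *lookup_symbol(void *handle, const char *symbol)
{
    return handle != NULL ? dlsym(handle, symbol) : NULL;
}
```

With two handles opened using global == 0, the same symbol name can be
looked up separately through each handle.  With global == 1, the
symbols of every library become visible to libraries loaded later,
which is the single-namespace behaviour.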


Obviously Hugs is doing the right thing.  Well, no, maybe not...

Multiple instantiation of a module in Haskell makes little difference
at runtime, but multiple instantiation of a C library makes a huge
difference to C code.  Consider these three C files:

  A.c:  extern int C; int inc() { return C++; }
  B.c:  extern int C; int dec() { return --C; }
  C.c:  int C = 0;

And suppose we have a separate Haskell module (A.hs, B.hs, C.hs)
corresponding to each of these files.

With a single namespace, loading each of these Haskell modules results
in the correct behaviour: the variable incremented by inc is the same
variable that dec decrements and that C exports.

With separate namespaces, the only way to avoid undefined symbols is to 
build the following combinations:

  A.hs + A.c + C.c
  B.hs + B.c + C.c
  C.hs + C.c

This is absolutely not what we wanted though because now one variable
has become three separate variables and inc modifies a different
variable from dec.
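
The difference can be modelled in a single C file.  This is only a
simulation, not real linking: inc and dec are rewritten to take the
address that the name C resolved to, and the struct view and the two
driver functions are hypothetical names introduced for illustration.

```c
/* A.c and B.c, rewritten to take the address that `C` resolved to. */
int inc(int *C) { return (*C)++; }
int dec(int *C) { return --(*C); }

/* The value of C as seen from A, from B and from C after the calls. */
struct view { int a; int b; int c; };

/* Single namespace: every reference to C resolves to one variable. */
struct view single_namespace(void)
{
    int C = 0;
    inc(&C);
    inc(&C);
    dec(&C);
    return (struct view){ C, C, C };
}

/* Separate namespaces: each combination links its own copy of C.c,
 * so inc and dec update different variables. */
struct view separate_namespaces(void)
{
    int C_in_A = 0, C_in_B = 0, C_in_C = 0;
    inc(&C_in_A);
    inc(&C_in_A);
    dec(&C_in_B);
    return (struct view){ C_in_A, C_in_B, C_in_C };
}
```

After two calls to inc and one to dec, the single-namespace version
sees the value 1 everywhere, while the separate-namespace version ends
up with 2, -1 and 0 in the three copies.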

If we put all three modules into a single package, we might
conceivably avoid this problem but the problem will then come up again
between packages.  For example, the xlib package probably uses errno
but errno is in the libc package.  We certainly don't want to
duplicate errno and we certainly don't want to merge the two packages
into one.


In conclusion, C is designed to use a single global namespace and
things break if you try to change that.  Hence, I don't think the ffi
for C can allow C libraries to export overlapping names.  I think the
ffi spec should explicitly say that all C libraries are loaded into a
single global namespace.  And I don't think the square bracket part
can be treated like a Haskell qualified name.

I intend to change Hugs/FFI to match GHC's behaviour.  Fortunately,
the only changes required are to specify RTLD_GLOBAL when calling
dlopen and to rename the symbol currently called 'initModule' in the
ffi-generated code to 'init_Foo_Bar_Baz', where Foo.Bar.Baz is the
name of the Haskell module the code was generated for.  The rename is
needed because, in a single namespace, every module's init symbol must
be distinct.
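
A sketch of the renaming, assuming the rule is simply "prefix with
init_ and replace each dot with an underscore".  That rule is inferred
from the single Foo.Bar.Baz example; the function name
init_symbol_name is hypothetical, and the real Hugs code may treat
other characters differently.

```c
#include <stddef.h>
#include <string.h>

/* Write "init_" followed by the module name, with each '.' replaced
 * by '_', into `out`.  Returns 0 on success, or -1 if `out` is too
 * small to hold the result and its terminating NUL. */
int init_symbol_name(const char *module, char *out, size_t out_size)
{
    static const char prefix[] = "init_";
    size_t plen = sizeof prefix - 1;
    size_t mlen = strlen(module);

    if (plen + mlen + 1 > out_size)
        return -1;

    memcpy(out, prefix, plen);
    for (size_t i = 0; i < mlen; i++)
        out[plen + i] = (module[i] == '.') ? '_' : module[i];
    out[plen + mlen] = '\0';
    return 0;
}
```

Under this assumed rule, module names that already contain an
underscore could collide (Foo.Bar_Baz and Foo.Bar.Baz map to the same
symbol), so a real implementation might need an escaping scheme.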


-- 
Alastair Reid        reid at cs.utah.edu        http://www.cs.utah.edu/~reid/

ps Note that if anyone really, really wants a local namespace, they
can foreign-import the dlopen interface and code it up themselves
using foreign import dynamic.

pps If anyone has irreconcilable name conflicts between C libraries, I
have a handy tool for renaming symbols in ELF binaries.  It was
developed as part of a project to add module-local namespaces to C.



