Locating shared libraries

Clemens Fruhwirth clemens at endorphin.org
Tue Jun 12 09:24:07 EDT 2007


Hi,

I'm hacking on shared library support for GHC and it's coming along quite nicely.
http://hpaste.org/192

My initial hacks are available from:

http://clemens.endorphin.org/patches/ghc-20070605-initial-shared-libs.patch
(works only with x86-64 atm; on i386 the NCG dies in the register
allocator when compiling the cmm files of the RTS)

http://clemens.endorphin.org/patches/cabal-20070528-initial-shared-library.patch

libtool usually takes care of creating shared libraries on *nix
systems. It solves a few minor problems associated with:

1) creating shared libraries
2) linking programs that depend on shared libraries
3) running programs that depend on shared libraries

libtool is tailored to C compilers and the general opinion from #ghc
towards libtool seems to be: "hands off". For each item in the list
above, I will try to sketch a solution without libtool.

1) Creating shared libraries

At the moment, my second patch teaches Cabal how to build shared
libraries. Basically, this is:
  * add -fPIC to the compiler invocation (and -optc-fPIC for c-sources),
  * invoke "ld -shared -Bsymbolic -o foo.so obj1.o obj2.o ...". 
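
As a rough illustration of the two steps above (not the patch itself),
the command lines could be assembled like this. The helper names are
made up for this sketch and are not part of Cabal:

```python
# Sketch only: builds the command lines described in the two bullets.
# compile_flags and shared_link_command are hypothetical helper names.

def compile_flags(has_c_sources=False):
    """Extra flags for producing position-independent object files."""
    flags = ["-fPIC"]
    if has_c_sources:
        # -optc-fPIC forwards -fPIC to the C compiler for c-sources.
        flags.append("-optc-fPIC")
    return flags


def shared_link_command(output, objects):
    """argv for linking the given object files into a shared library."""
    return ["ld", "-shared", "-Bsymbolic", "-o", output] + list(objects)


if __name__ == "__main__":
    print(" ".join(shared_link_command("foo.so", ["obj1.o", "obj2.o"])))
```

Returning argv lists rather than a single string avoids quoting
problems when the command is handed to a process launcher.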

ATM, ld is not invoked with the inter-library dependencies for the
shared library being built. This is not problematic as the final
executable will include all dependencies due to the ghc package
dependency tracking. But DT_NEEDED on ELF influences the sequence in
which shared library initializers are run. I have not yet investigated
if this leads to any problems.

To solve this little shortcoming, the ld invocation could be delegated
to GHC, e.g. "ghc -o libHSfoo.so Foo1.o Foo2.o". We already have a similar
facility for DLLs (see MkDLL in DriverPipeline.hs). This could be
abstracted into MkShared, and platform specific knowledge could be
encapsulated in GHC. The benefit would be that we could easily access
the package information and we could create shared libraries that
contain proper DT_NEEDED sections.
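
To make the idea concrete: if the link step knows the package
dependencies, it can name them on the ld command line, and ld will
then normally record each shared library it links against as a
DT_NEEDED entry. The sketch below only builds such a command line.
The function name, the library directory and the "HSbase_dyn" library
name are assumptions for illustration, not what MkShared would
necessarily do:

```python
# Sketch only: a link command that also names the dependencies.
# deps is a list of (library_dir, library_name) pairs; in GHC these would
# come from the package database, here they are passed in by hand.

def shared_link_command_with_deps(output, objects, deps):
    argv = ["ld", "-shared", "-Bsymbolic", "-o", output] + list(objects)
    # Add each library directory once, keeping first-seen order.
    dirs = []
    for lib_dir, _name in deps:
        if lib_dir not in dirs:
            dirs.append(lib_dir)
    argv += ["-L" + d for d in dirs]
    # -lNAME makes ld look for libNAME.so in the -L directories.
    argv += ["-l" + name for _dir, name in deps]
    return argv


if __name__ == "__main__":
    cmd = shared_link_command_with_deps(
        "libHSfoo.so", ["Foo1.o", "Foo2.o"],
        [("/opt/ghc/lib", "HSbase_dyn")])  # hypothetical dependency
    print(" ".join(cmd))
```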

2) Linking programs

Linking should work out of the box:

"ghc -dynamic -o HelloWorld HelloWorld.o" creates a dynamically linked
executable.

3) Running programs

This is a typical problem:
./HelloWorld 
./HelloWorld: error while loading shared libraries: libHShaskell98-1.0_dyn.so: cannot open shared object file: No such file or directory

There are several ways to add search paths for dynamic linking: either
we do it temporarily or we encode the search paths into the
executables. On ELF platforms, this works by adding -rpath to the
linker flags. This adds entries to the .dynamic section (DT_RPATH
and/or DT_RUNPATH, depending on the linker's settings), both
responsible for signalling additional search paths to the dynamic
linker, ld.so. According to Simon Marlow, Windows has a similar
mechanism via manifest files.
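
One way to check which search paths ended up in an executable is to
look at the output of "readelf -d". The sketch below parses that
output as text. The function name is made up and the sample input is
hand-written for illustration, not taken from a real build:

```python
import re

# Matches the RPATH / RUNPATH lines printed by "readelf -d".
_ENTRY = re.compile(
    r"\((RPATH|RUNPATH)\)\s+Library (?:rpath|runpath): \[(.*)\]")

# Hand-written sample in the layout readelf uses; paths are invented.
SAMPLE = """\
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000f (RPATH)              Library rpath: [/opt/ghc/lib:/opt/foo/lib]
"""


def search_paths(readelf_output):
    """Collect the directories listed under DT_RPATH and DT_RUNPATH."""
    found = {"RPATH": [], "RUNPATH": []}
    for line in readelf_output.splitlines():
        match = _ENTRY.search(line)
        if match:
            kind, paths = match.groups()
            found[kind].extend(p for p in paths.split(":") if p)
    return found


if __name__ == "__main__":
    print(search_paths(SAMPLE))
```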

Let's see how libtool handles this situation. libtool differentiates
between installed and uninstalled libraries. When linking against
installed libraries not in the standard search path, libtool uses
-rpath to add these search paths to the created executable. When
linking against uninstalled libraries, libtool still uses -rpath, but
points it at the directory the uninstalled library is going to be
installed in. libtool derives this information from the .la
files+Makefiles.

In any case, libtool creates a wrapper in the build directory that
takes care of executing the program linked against uninstalled shared
libraries. There are two strategies for accomplishing this:
 * add the paths of the uninstalled shared libraries to LD_LIBRARY_PATH
 * relink the executable with additional -rpath's
libtool chooses the second strategy.

How do we translate these solutions to GHC? The first question is
whether we expect 

ghc -dynamic -package uninstalled-package -o Hello Hello.o
./Hello

to work or whether we require manual intervention in these cases. If
we expect this to work without intervention, we have the same options
as libtool:

* create a wrapper that takes care of locating the uninstalled shared
  libraries and sets LD_LIBRARY_PATH. 

* create a binary with rpath of the uninstalled libraries, and create
  an additional executable for deployment without these rpaths.
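
The first option, a wrapper that sets LD_LIBRARY_PATH, could look
roughly like the following. This is a sketch under assumptions: the
function names are invented, and how the wrapper learns the library
directories and the location of the real binary is left open:

```python
import os


def wrapper_environment(lib_dirs, base_env=None):
    """Copy the environment and prepend lib_dirs to LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    parts = list(lib_dirs)
    if existing:
        # Keep whatever the user already had, after our directories.
        parts.append(existing)
    env["LD_LIBRARY_PATH"] = ":".join(parts)
    return env


def run_wrapped(real_binary, lib_dirs, args):
    """Replace the current process with the real binary."""
    os.execve(real_binary, [real_binary] + list(args),
              wrapper_environment(lib_dirs))
```

Prepending rather than replacing keeps any LD_LIBRARY_PATH the user
has already set, while still letting the uninstalled libraries win.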

In any case we have to modify the installer scripts so that they
either know where to locate the real binary, ask ghc where to find
it, or delegate the installation to ghc. The last option is basically
the libtool way, "libtool --mode=install ...".

If we decide to create a deployable executable at the "-o" spot, we
need to either

* modify the invocation to manually pick up the libraries by modifying
  LD_LIBRARY_PATH, which is pretty impractical, or

* delegate invocation to ghc. Maybe "ghc --execute HelloWorld".
  libtool has a similar mechanism for executing 3rd party programs in
  the "dynamic environment" of the compiler executable. For instance,
  "gdb HelloWorld" would fail for libtool as HelloWorld is a wrapper,
  but "libtool --mode=execute gdb HelloWorld" works, as libtool
  rewrites HelloWorld to .libs/lt-HelloWorld.

And now for something completely different: create a custom ELF program
interpreter for Haskell programs. When an executable names an
interpreter in the INTERP entry of its ELF program header, the kernel
loads that interpreter and delegates control to it. Usually this is
/lib/ld-linux.so.2, the dynamic linker, but we can replace that.
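
For reference, the interpreter path can be read straight out of the
program header table. The following sketch handles only little-endian
64-bit ELF images and does no bounds checking; the function name is
made up:

```python
import struct

PT_INTERP = 3  # program header type that holds the interpreter path


def read_interp(data):
    """Return the INTERP path of a little-endian ELF64 image, or None."""
    # e_ident: magic, then EI_CLASS (2 = 64-bit) and EI_DATA (1 = LSB).
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("sketch handles little-endian ELF64 only")
    # ELF64 header: e_phoff at byte 32, e_phentsize/e_phnum at byte 54.
    (phoff,) = struct.unpack_from("<Q", data, 32)
    phentsize, phnum = struct.unpack_from("<HH", data, 54)
    for i in range(phnum):
        # p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz
        p_type, _, p_offset, _, _, p_filesz = struct.unpack_from(
            "<IIQQQQ", data, phoff + i * phentsize)
        if p_type == PT_INTERP:
            raw = data[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\0").decode()
    return None
```

Calling it on the bytes of a dynamically linked executable should
give the path of the dynamic linker that executable asks for; a
statically linked one has no INTERP entry and yields None.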

Haskell has its own idea of libraries/packages. We have package.conf
which gives us the location of the installed libraries. This is ok for
static linking, as at link time ghc is running and knows how to invoke
gcc with the correct paths. It does not matter if package.conf is
updated afterwards, as the statically linked programs contain a copy of
the library anyway. For dynamic linking this phase is delayed, and when
we encode an rpath such as "/usr/lib/network-2.0/ghc-6.6/", we cannot
update to network-2.1 without breaking these executables.

A custom program-loading stub could access the global and local
package.conf, extract the library paths for the dependencies, and execve

/lib64/ld-linux-x86-64.so.2 --library-path <paths of the dependencies> HelloWorld <args>

This certainly gives us more flexibility than encoding all these
rpaths statically into HelloWorld. To support in-place execution
directly from the build directory, we might create
.HelloWorld.package.conf in case a non-standard package.conf is used
(non-standard = different from global and local) and have the stub
loader check for this file.
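
The core of such a stub is small. The sketch below assumes the
package databases have already been read into a plain dictionary from
package name to library directories; parsing the real package.conf
format is not shown. The function name, the package names and the
paths in the example are invented. As far as I know, glibc's ld.so
expects the path list as a separate argument after --library-path,
which is what the sketch produces:

```python
import os


def loader_argv(ld_so, package_lib_dirs, dependencies, program, args):
    """Build the argv for running program through the dynamic linker.

    package_lib_dirs stands in for the parsed package.conf contents:
    a dict mapping package name to a list of library directories.
    """
    dirs = []
    for package in dependencies:
        for lib_dir in package_lib_dirs[package]:
            if lib_dir not in dirs:  # keep first-seen order, no repeats
                dirs.append(lib_dir)
    return [ld_so, "--library-path", ":".join(dirs), program] + list(args)


def exec_program(argv):
    """Hand control to the dynamic linker; does not return on success."""
    os.execv(argv[0], argv)


if __name__ == "__main__":
    conf = {"network-2.1": ["/usr/lib/network-2.1/ghc-6.6"]}  # made up
    print(loader_argv("/lib64/ld-linux-x86-64.so.2", conf,
                      ["network-2.1"], "./HelloWorld", ["--help"]))
```

Because the directories are looked up at start time, upgrading a
package only requires the package database to change, not the
executable. A per-program override file could be merged into the
dictionary before the lookup.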

I agree that the last scheme sounds a bit wild, but I argue that
that's what ELF designers had in mind when they specified the INTERP
header. Of course, this is only a solution for ELF platforms.

Opinions :) ?
--
Fruhwirth Clemens - http://clemens.endorphin.org 




More information about the Glasgow-haskell-users mailing list