[GHC] #10352: Properly link Haskell shared libs on ELF systems

GHC ghc-devs at haskell.org
Fri Apr 24 12:38:02 UTC 2015


#10352: Properly link Haskell shared libs on ELF systems
-------------------------------------+-------------------------------------
              Reporter:  duncan      |             Owner:
                  Type:  task        |            Status:  new
              Priority:  normal      |         Milestone:
             Component:  Package     |           Version:  7.11
  system                             |  Operating System:  Linux
              Keywords:              |   Type of failure:  None/Unknown
          Architecture:              |        Blocked By:
  Unknown/Multiple                   |   Related Tickets:
             Test Case:              |
              Blocking:              |
Differential Revisions:              |
-------------------------------------+-------------------------------------
 Since the early days of building Haskell shared libs on Linux we have been
 using a scheme that is really a bit of a hack. We should do it properly.

 In my blog post on this from 2009 (http://www.well-typed.com/blog/30/) I
 said:

 > If we use ldd again to look at the libfoo.so that we've made we will
 > notice that it is missing a dependency on the rts library. This is problem
 > that we've yet to sort out, so for the moment we can just add the
 > dependency ourselves:
 >
 > $ ghc --make -dynamic -shared -fPIC Foo.hs -o libfoo.so \
 >     -lHSrts-ghc6.11 -optl-Wl,-rpath,/opt/ghc/lib/ghc-6.11/
 >
 > The reason it's not linked in yet is because we need to be able to
 > switch which version of the rts we're using without having to relink every
 > library. For example we want to be able to switch between the debug,
 > threaded and normal rts versions. It's quite possible to do this and it
 > just needs a bit more rearranging in the build system to sort it out. Once
 > it's done you'll even be able to switch rts at runtime, eg:
 >
 > $ LD_PRELOAD=/opt/ghc/lib/ghc-6.11/libHSrts_debug-ghc6.11.so
 > $ ./Hello

 So in general, if a shared lib requires symbols from another shared lib
 then it should depend on it. In ELF terminology that means a NEEDED entry
 to say this lib needs that other lib. This is important to be able to link
 and load these shared libraries, otherwise they can have dangling
 dependencies.

 But we don't do this. For the specific case of the RTS we do not link
 Haskell shared libs against the RTS. So they have lots of dangling
 symbols. These libraries cannot be loaded on their own, e.g. with
 `dlopen()`. This is bad, and has other knock-on consequences.
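
 One way to see the missing dependency is to look at the dynamic section
 that `readelf -d` prints for a library. The following is only a sketch: it
 parses text in the format `readelf -d` emits, and the `SAMPLE` text and the
 names `libfoo.so`, `dynamic_entries` and `depends_on` are made up for
 illustration rather than taken from a real build.

```python
import re

# Matches dynamic-section lines as printed by `readelf -d`, for example:
#  0x0000000000000001 (NEEDED)   Shared library: [libc.so.6]
_DYN_ENTRY = re.compile(r"\((NEEDED|SONAME|RPATH|RUNPATH)\)[^\[]*\[([^\]]*)\]")


def dynamic_entries(readelf_output):
    """Collect NEEDED/SONAME/RPATH/RUNPATH values from `readelf -d` text."""
    entries = {"NEEDED": [], "SONAME": [], "RPATH": [], "RUNPATH": []}
    for line in readelf_output.splitlines():
        match = _DYN_ENTRY.search(line)
        if match:
            tag, value = match.groups()
            entries[tag].append(value)
    return entries


def depends_on(readelf_output, soname_prefix):
    """True if some NEEDED entry starts with the given prefix."""
    needed = dynamic_entries(readelf_output)["NEEDED"]
    return any(name.startswith(soname_prefix) for name in needed)


# Hypothetical output for a Haskell library built under the current scheme:
# it lists other libraries as NEEDED, but has no entry for the RTS.
SAMPLE = """\
 0x0000000000000001 (NEEDED)             Shared library: [libHSbase-4.7.0.2-ghc7.8.4.so]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000e (SONAME)             Library soname: [libfoo.so]
"""

print(depends_on(SAMPLE, "libHSrts"))  # False
```

 To check a real library, the output of `readelf -d` on that file would be
 passed in place of `SAMPLE`.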

 Why don't we link to the RTS? It's because historically (with static
 linking) GHC has had the ability to select the flavour of the RTS when
 final executables are linked, not when intermediate libraries are created.
 This works because the RTS flavours share a common ABI. This is a useful
 feature as it lets us select the SMP or debug or other RTS at final link
 time. So when we made up the first shared lib scheme on ELF we had to
 support this.

 Our initial scheme was like this: don't link Haskell library DSOs against
 the RTS, only link the final exe against the RTS. Each RTS flavour has a
 separate SONAME, e.g. `libHSrts_thr-ghc7.8.4.so` or
 `libHSrts_debug-ghc7.8.4.so`. This works because the runtime linker looks
 at the final exe first and loads the RTS, and then when other libs are
 loaded the symbols all resolve.

 Why can't we link all the libraries against the RTS? Currently each RTS
 flavour has a different SONAME, which is the key that the dynamic linker
 uses to identify each library. So if we did link all the Haskell libs
 against "the" RTS we would have to pick which one at the point at which we
 create the library, and that'd stop us from being able to choose later.

 So, can we use a better scheme? We want one that doesn't leave dangling
 undefined references in intermediate Haskell libs, and is also compatible
 with the ability to select the flavour of the RTS at final exe link time
 (or even override it at load time).

 Yes we can!

 The first thing to note is that to be interchangeable, all the RTS
 flavours (that share a compatible ABI) need to have the same SONAME. So
 for example, all the (non-profiling) RTS DSO files have to have the
 internal SONAME of `libHSrts-ghc7.8.4.so`. Once they all have the same
 SONAME, then it's ok for all the Haskell libs to specify a NEEDED
 dependency on that rts SONAME.

 But if they have the same SONAME, what do the files get called, where do
 they live and how are they found? The trick is to make use of the search
 path. Put each RTS flavour in a different directory, but otherwise with
 the same filename, e.g. `lib/rts-1.0/thr/libHSrts-ghc7.8.4.so`,
 `lib/rts-1.0/debug/libHSrts-ghc7.8.4.so` etc.

 Each library DSO and exe has its list of NEEDED entries, and it has an
 RPATH entry used to find those libraries if they're not loaded yet. The
 key is the "if they're not loaded yet" bit. Remember that the linker uses
 the SONAME as the key to decide if the lib is loaded yet or not. So the
 libraries could all have an RPATH entry to say to look for the RTS in the
 directory containing the default RTS flavour. But then the top level exe
 (or foreign/export shared lib) can also link to the RTS directly (i.e. a
 NEEDED entry) and can specify an RPATH which can be for any of the RTS
 flavours. When the linker loads the top level exe, it will load the
 selected RTS using the exe's RPATH, and then when the linker sees other
 Haskell libs that have a NEEDED entry on the RTS it will not load the RTS
 again because the RTS's SONAME is already loaded.
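
 The lookup rule can be sketched as a toy model. This is not how any real
 dynamic linker is implemented: it only models "SONAME is the key" and
 "search the requesting object's RPATH", processing dependencies
 breadth-first from the root object. The library `libHSfoo-ghc7.8.4.so`,
 the directory `lib/foo` and the function `load` are hypothetical.

```python
def load(root, objects, files_by_dir):
    """Simulate SONAME-keyed loading, breadth-first from `root`.

    objects:      name -> {"needed": [sonames], "rpath": [dirs]}
    files_by_dir: dir -> {soname: name of the object found in that dir}
    Returns {soname: object name} for every library that got loaded.
    """
    loaded = {}
    queue = [root]
    while queue:
        current = objects[queue.pop(0)]
        for soname in current["needed"]:
            if soname in loaded:
                continue  # SONAME already loaded: this RPATH is not consulted
            for directory in current["rpath"]:
                found = files_by_dir.get(directory, {}).get(soname)
                if found is not None:
                    loaded[soname] = found
                    queue.append(found)
                    break
            else:
                raise LookupError(f"cannot find {soname}")
    return loaded


RTS = "libHSrts-ghc7.8.4.so"
FOO = "libHSfoo-ghc7.8.4.so"  # hypothetical Haskell library

objects = {
    # The exe picks the debug RTS through its own RPATH.
    "exe": {"needed": [FOO, RTS], "rpath": ["lib/foo", "lib/rts-1.0/debug"]},
    # The library always points at the default RTS directory.
    "foo": {"needed": [RTS], "rpath": ["lib/rts-1.0"]},
    "rts-default": {"needed": [], "rpath": []},
    "rts-debug": {"needed": [], "rpath": []},
}
files_by_dir = {
    "lib/foo": {FOO: "foo"},
    "lib/rts-1.0": {RTS: "rts-default"},
    "lib/rts-1.0/debug": {RTS: "rts-debug"},
}

print(load("exe", objects, files_by_dir)[RTS])  # rts-debug
print(load("foo", objects, files_by_dir)[RTS])  # rts-default
```

 The second call models loading the library on its own (as with
 `dlopen()`): its NEEDED entry and RPATH are enough to find the default RTS.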

 So concretely, instead of:

     lib/ghc-${ver}/rts-1.0/libHSrts-ghc${ver}.so
     lib/ghc-${ver}/rts-1.0/libHSrts_thr-ghc${ver}.so
     lib/ghc-${ver}/rts-1.0/libHSrts_debug-ghc${ver}.so
     lib/ghc-${ver}/rts-1.0/libHSrts_l-ghc${ver}.so
     lib/ghc-${ver}/rts-1.0/libHSrts_thr_l-ghc${ver}.so
     lib/ghc-${ver}/rts-1.0/libHSrts_thr_debug-ghc${ver}.so

 each with a different SONAME

 we'd have

     lib/ghc-${ver}/rts-1.0/libHSrts-ghc${ver}.so
     lib/ghc-${ver}/rts-1.0/thr/libHSrts-ghc${ver}.so
     lib/ghc-${ver}/rts-1.0/debug/libHSrts-ghc${ver}.so
     lib/ghc-${ver}/rts-1.0/l/libHSrts-ghc${ver}.so
     lib/ghc-${ver}/rts-1.0/thr_l/libHSrts-ghc${ver}.so
     lib/ghc-${ver}/rts-1.0/thr_debug/libHSrts-ghc${ver}.so

 each with the same SONAME
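
 The two layouts differ only in where the flavour name goes. A small
 sketch of that mapping, with `rts_path_current` and `rts_path_proposed`
 as illustrative names and `7.8.4` standing in for `${ver}`:

```python
import posixpath

# "" stands for the default RTS; the rest are the flavours listed above.
FLAVOURS = ["", "thr", "debug", "l", "thr_l", "thr_debug"]


def rts_path_current(ver, flavour=""):
    """Current layout: the flavour is encoded in the file name (and SONAME)."""
    suffix = f"_{flavour}" if flavour else ""
    return f"lib/ghc-{ver}/rts-1.0/libHSrts{suffix}-ghc{ver}.so"


def rts_path_proposed(ver, flavour=""):
    """Proposed layout: one file name, the flavour selects a subdirectory."""
    subdir = f"{flavour}/" if flavour else ""
    return f"lib/ghc-{ver}/rts-1.0/{subdir}libHSrts-ghc{ver}.so"


for flavour in FLAVOURS:
    print(rts_path_current("7.8.4", flavour), "->",
          rts_path_proposed("7.8.4", flavour))

# Under the proposed layout every flavour shares one file name.
names = {posixpath.basename(rts_path_proposed("7.8.4", f)) for f in FLAVOURS}
print(names)  # {'libHSrts-ghc7.8.4.so'}
```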

 When linking libs we would always use
 `-lHSrts -rpath lib/ghc-${ver}/rts-1.0`.

 When linking exes (or shared libs for external consumption) we would use
 both `-lHSrts` and `-rpath lib/ghc-${ver}/rts-1.0/${rtsflavour}`.
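
 As a sketch of the rule, the function below builds the RTS-related linker
 flags for both cases. It assumes the flavour directory sits under
 `rts-1.0`, as in the directory listing above. The name `rts_link_flags`
 is hypothetical, and how GHC would actually pass these flags to the
 linker is not covered here.

```python
def rts_link_flags(ver, flavour=None):
    """Linker flags for the RTS dependency, spelled as in the text above.

    flavour=None models an intermediate Haskell library, which points at
    the default RTS directory. A flavour name models a final exe or a
    shared lib for external consumption.
    """
    rts_dir = f"lib/ghc-{ver}/rts-1.0"
    if flavour:
        # Assumption: flavour directories live under rts-1.0.
        rts_dir = f"{rts_dir}/{flavour}"
    return ["-lHSrts", "-rpath", rts_dir]


print(" ".join(rts_link_flags("7.8.4")))
# -lHSrts -rpath lib/ghc-7.8.4/rts-1.0
print(" ".join(rts_link_flags("7.8.4", "thr_debug")))
# -lHSrts -rpath lib/ghc-7.8.4/rts-1.0/thr_debug
```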

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/10352>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

