[GHC] #14031: Linker paths carry substantial N*M overhead when many libraries are used

GHC ghc-devs at haskell.org
Tue Jul 25 22:02:25 UTC 2017


#14031: Linker paths carry substantial N*M overhead when many libraries are used
-------------------------------------+-------------------------------------
           Reporter:  nh2            |             Owner:  (none)
               Type:  bug            |            Status:  new
           Priority:  normal         |         Milestone:
          Component:  Compiler       |           Version:  8.0.2
           Keywords:                 |  Operating System:  Unknown/Multiple
       Architecture:                 |   Type of failure:  Compile-time
  Unknown/Multiple                   |  performance bug
          Test Case:                 |        Blocked By:
           Blocking:                 |   Related Tickets:
Differential Rev(s):                 |         Wiki Page:
-------------------------------------+-------------------------------------
 **"The linker search path problem"**

 It's common these days to have 400 (transitive) Hackage dependencies (even
 my hobby projects have that many).

 When linking, statically or dynamically, GHC passes `-L /path/to/foo/dir`
 and `-lHSfoo...` to the linker for each Haskell package `foo`.
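
 A schematic excerpt of what this looks like on the link command line (the
 package names, ABI hashes and store paths below are made up purely for
 illustration):

 {{{
 # one -L and one -l per Haskell package, so N search paths and M library names
 ld -o myexe Main.o ... \
   -L/store/text-1.2.2.2/lib    -lHStext-1.2.2.2-AbC123 \
   -L/store/aeson-1.1.2.0/lib   -lHSaeson-1.1.2.0-DeF456 \
   -L/store/vector-0.12.0.1/lib -lHSvector-0.12.0.1-GhI789 \
   ...
 }}}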

 That results in `N` linker search paths (`-L`) and `M` library names
 (`-l`). To locate each library, the linker probes every search path in
 turn, which translates into up to `N * M` `stat()` syscalls (file
 existence checks).

 For 400 package dependencies, that translates to 160k `stat()` syscalls.
 On my machine (SSD, with a hot file system cache in RAM), that takes `0.7
 seconds` (measured with `strace -c -w`).
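
 A minimal sketch of how to reproduce such a measurement (the actual link
 command is whatever GHC prints with `-v`; the invocation below is only a
 placeholder):

 {{{
 # -c summarises syscall counts, -w uses wall-clock time, -f follows children;
 # the stat rows of the summary show how much time goes into path probing
 strace -c -w -f ld.gold -o myexe ... 2>&1 | grep -E 'stat|total'
 }}}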

 That is a substantial overhead for linking: for example, the executable
 with 400 dependencies with which I found this problem can be linked with
 `ld.gold` in just 1.5 seconds, so the `stat()`ing accounts for roughly
 half of the link time.

 If there is tooling between GHC and the linker that also parses linker
 flags and checks for file existence (for example Nix's `cc-wrapper` and
 `ld-wrapper` scripts, or other instrumentation or build system tooling,
 which is often not written in compiled languages and thus
 [https://github.com/NixOS/nixpkgs/issues/27609 much slower] at `stat()`ing
 than the linker is), then the problem can easily be blown up by another
 10x.

 As a result, you may be waiting a lot longer for your link to finish than
 necessary.

 For static linking, this could be resolved by passing the archive files
 (`libHSfoo....a`) directly to the linker instead of the `-L`/`-l`
 combination.

 For this we would need to check that passing files directly really has the
 same semantics as `-L` and `-l`, but the man pages suggest it does (and
 I'll check it with the binutils authors).
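
 A minimal sketch of the difference, with hypothetical paths and ABI hashes
 (whether the two forms are fully equivalent is exactly what needs to be
 verified):

 {{{
 # current: the linker searches every -L directory for each -l name
 ld ... -L/store/foo-1.0/lib -lHSfoo-1.0-AbC123 ...

 # proposed (static linking only): name the archive directly,
 # so no search and no stat() probing is needed
 ld ... /store/foo-1.0/lib/libHSfoo-1.0-AbC123.a ...
 }}}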

 For dynamic linking, it cannot be fixed this way. The compile-time search
 can be fixed, but the same problem exists again at run time (program
 startup): here all the `-L`s become `RPATH`s when the libraries are not
 installed in a global location (for example, Nix has an `RPATH` for each
 package), and thus program startup can take up to those 0.7 seconds I
 mentioned.
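
 The run-time search paths baked into a dynamically linked executable can
 be inspected like this (`myexe` is a placeholder; the dynamic loader has
 to probe each of these directories for every shared library at startup):

 {{{
 readelf -d myexe | grep -E 'RPATH|RUNPATH'
 }}}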

 To solve the runtime problem, the only solution seems to be to place all
 the libraries into the same directory (i.e. to have `N = 1` essentially).

 This approach (putting all libraries into one directory) would also solve
 it for the static linking case, with the drawback that it is not
 compatible with how cabal sandboxes, stack build and snapshot dirs, Nix,
 Debian and other distributions deliver and build libraries. For some of
 them this approach goes straight against their nature.
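
 For comparison, a sketch of what the single-directory layout would mean
 for the link line, using the `$HOME/.cabal/lib/$ghc_version` location
 suggested in the discussion below:

 {{{
 # one search path for everything => N = 1, so only on the order of M stat() calls
 ld ... -L$HOME/.cabal/lib/ghc-8.0.2 \
        -lHSfoo-1.0-AbC123 -lHSbar-2.1-DeF456 ...
 }}}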

 **Detail discussion:**

 See below the relevant discussion on #ghc:

 {{{
 nh2:        bgamari-: do you know why ghc uses -L and -l for linking
 libHS... things instead of passing the .a file directly?
 hvr:        nh2: btw, fun fact: on IBM AIX, both static and DSOs are
 stored in .a archives
 bgamari-:   nh2, I suspect just since it makes the dynamic and static
 linking codepaths consistent
 hvr:        nh2: and if you use .a for linking, you end up hardwiring the
 DSO path into your executable
 nh2:        bgamari-: I'm looking into whether it would make sense to
 change it, at least for the platforms that would support it, because the
 (-L * -l) searching that it does currently seems to dominate link times
 here
 nh2:        cc hvr
 hvr:        nh2: sounds reasonable; but then you end up with longer CLI
 arguments
 hvr:        nh2: it could be a problem for when the tools don't support @
 bgamari-:   nh2, seriously?
 bgamari-:   the search is really that long?
 bgamari-:   nh2, it seems like that's a linker bug
 hvr:        bgamari-: I guess it depends on how many -L there are
 bgamari-:   nh2, how many libraries are you linking against?
 hvr:        bgamari-: as each -L and -l multiply w/ each other requiring
 several stat() calls iirc
 bgamari-:   hvr, yes
 hvr:        but otoh, I wouldn't expect this to *dominate* linker times
 bgamari-:   right
 bgamari-:   hvr, this is the compile-time equivalent to #11587
 bgamari-:   which I still think we should do
 hvr:        right
 bgamari-:   hmm
 bgamari-:   although comment:9 sort of suggests that this is already the
 case for user-installed libraries
 ***bgamari- forgot about that
 hvr:        I mean, if you have really *many* -L-paths
 hvr:        then the quadratic effect becomes probably significant
 bgamari-:   right
 bgamari-:   nh2, how many libraries are you linking against?
 nh2:        bgamari-: I have approximately L=450 and l=350. There is some
 notoriously slow tooling "in between" that I'm currently investigating,
 namely nix's ld-wrapper, which does the stat'ing in bash, where
 statting that many files easily takes 1 second, but I did a quick test
 with `xargs ls -1` on the "$DIR/lib${name}.a" expansion, and that too takes
 1 second, so it might not be entirely its fault
 hvr:        ok, ~400 is quite a bit
 bgamari-:   sheesh
 bgamari-:   yes, that is... impressive
 geekosaur:  next they'll want this to work on macOS >..
 geekosaur:  >.>
 hvr:        and then windows!
 nh2:        even my hobby projects have 300 recursive deps by now
 nh2:        bgamari-: my strace says that `ls` did the 160k stat()s in 0.7
 seconds, which is half of the time that `gold` needs to link my binary
 nh2:        bgamari-: I have now strace'd `ld` itself, that does it
 slightly faster, 0.5 seconds, but this is just because it's my hobby
 project and has "only" that many deps; for a real project that'd be 2
 seconds just stat()ing
 bgamari-:   nh2, sheesh
 nh2:        bgamari-: of course nix's ld wrapper, being written in bash,
 does the same thing but at least 5x slower
 bgamari-:   nh2, do you just add all of Hackage to your build-depends? ;)
 bgamari-:   nh2, Instead of passing explicit paths I think I would rather
 just put all of the archives in LIBDIR
 nh2:        bgamari-: I blame it on my dependees ;) also, "Haskell has
 high-quality libraries and encourages code reuse" comes back at us here
 bgamari-:   although, admittedly, I haven't really thought through the
 consequences of this yet
 bgamari-:   this does sound like a worthwhile thing to fix though
 bgamari-:   that is a significant amount of time
 nh2:        bgamari-: personally I'd WAY prefer explicit args vs env vars
 because then I can see them in ps, I can easily replay the specific
 command myself to debug it etc
 hvr:        nh2: not every library on Hackage is "high-quality" ;-)
 bgamari-:   I had always assumed that our large linking times were due to,
 well, the "linking"
 bgamari-:   not just the stat()ing
 bgamari-:   nh2, fair enough, but I'd rather not have to add even more
 platform-dependent behavior to our interaction with the system toolchain
 bgamari-:   oh, wait
 bgamari-:   nh2, I didn't mean to pass via env var
 bgamari-:   nh2, I just meant pass one -L
 bgamari-:   nh2, where the static and dynamic libraries can both be
 bgamari-:   found
 nh2:        bgamari-: I don't know that, where can I read about that? Is
 it a flag?
 bgamari-:   nh2, -L?
 bgamari-:   it's just the usual --library-path flag
 nh2:        bgamari-: I got confused by LIBDIR in caps
 bgamari-:   ah
 bgamari-:   right, I should have been more specific
 nh2:        bgamari-: so what do you mean by that, somehow put all the
 libraries into a single dir?
 bgamari-:   nh2, treat static archives the same way we treat dynamic
 libraries
 bgamari-:   which is to place them in, e.g.,
 $HOME/.cabal/lib/$ghc_version/
 bgamari-:   instead of where they are currently,
 $HOME/.cabal/lib/$ghc_version/$library_abi_name
 bgamari-:   then you can just pass -L$HOME/.cabal/lib/$ghc_version to ld
 bgamari-:   and a -l for each library
 bgamari-:   I think this is probably as much a Cabal change as it is a GHC
 change
 nh2:        bgamari-: that doesn't seem to go well with approaches that
 package separately, like debian, nix, cabal sandboxes and stack work dirs
 hvr:        which includes new-build
 bgamari-:   I'm not sure it's fundamentally incompatible
 hvr:        yeah, but it's a lot of work to change everyone :-)
 bgamari-:   to take the case of new-build, all that matters AFAIK is that
 the packages are registered in separate package dbs
 bgamari-:   there's nothing stopping you from throwing all of the
 libraries into one directory
 hvr:        bgamari-: well, I kinda exploit that new-build has nicely
 self-contained folders
 hvr:        this makes it easier to atomically move around artifacts
 nh2:        bgamari-: do you have a dislike for passing them explicitly as
 files beyond cross-platform worries? Because I really find explicit
 arguments the cleanest, easiest and most debuggable approach
 hvr:        as I can stage the new artifact, and then move it into place
 hvr:        atomically
 hvr:        by simply renaming/moving the folder which contains everyting
 bgamari-:   nh2, I'm just currently really not sure that passing libraries
 by path is really equivalent to passing via -l
 hvr:        unfortunately filesystems don't offer many other facilities for
 atomic transactions
 bgamari-:   I would need to run some experiments
 nh2:        bgamari-: gcc's and ld's man pages suggest that to me, looking
 at their respective `-l`; we certainly should double check (or maybe
 simpler, ask the linker authors)
 bgamari-:   if we can really show that the linker treats explicit paths no
 differently then I'd be okay with it
 nh2:        suggests that they are equivalent)
 bgamari-:   but I'm rather wary
 bgamari-:   nh2, are you only suggesting this for static libraries?
 geekosaur:  it's supposed to be equivalent. but shared objects have the
 added issue of runtime discoverability (i.e. passing by full path at link
 time does not install a full path for runtime access)
 geekosaur:  except on macos or older aix where it's always a full path)
 nh2:        bgamari-: no, at least the stat()-taking-time problem is the
 same problem for .so's (though I'm using static linking in many places and
 would appreciate an improvement even if limited to static linking)
 bgamari-:   nh2, the problem is that your approach doesn't help dynamic
 linkage at all
 bgamari-:   since the runtime linker still needs to search N paths
 bgamari-:   so I would argue that new-build, stack, etc. really should be
 putting libraries in the same directory regardless
 bgamari-:   more work or not
 bgamari-:   it's the only way to truly solve the problem
 mpickering: Wonders if David saw this issue
 https://github.com/haskell/cabal/issues/4550
 alanz:      mpickering: what are you planning to use the source plugins
 for?
 nh2:        bgamari-: yeah you're right, the start-up cost of the
 executable will always be there for dynamic linking, I was purely talking
 from the compiler's perspective
 bgamari-:   nh2, my point was that I would prefer to solve the problem
 once and for all
 bgamari-:   rather than have another sharp edge that we then still need to
 solve
 bgamari-:   given that moving the libraries solves both problems, it feels
 like the right solution
 mpickering: alanz: haskell-indexer was the plan
 nh2:        bgamari-: I agree that for dynamic linking that is the only
 solution. But I'm not sure there exists a fits-it-all solution across
 static and dynamic. At least for my work, I'd prefer to have static libs
 that can be anywhere, in nix-style immutable locations, and give up on
 dynamic linking (of Haskell libs) to get that. It seems that hvr's
 thoughts are going in the same direction on the cabal side, and lots of
 build system and OS movements these days seem to converge away from global
 shared dirs to dir-prefixes.
 nh2:        I also suspect that static linking will be even more
 attractive now that sensible -split-sections is in place, which can give
 smaller code sizes, which .so's can't deliver.
 nh2:        So I'm not sure that a big shift to the fits-it-all solution
 that satisfies dynamic linking needs would be the most practical or used;
 it'd certainly be nice to have it for dynamic linking, but I'm not sure
 all the downstreams can be forced to do it that way for non-static if it
 quite fundamentally dictates how they have to work. If they can, it would
 probably take pretty long.
 }}}

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14031>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

