[GHC] #5987: Too many symbols in ghc package DLL
GHC
ghc-devs at haskell.org
Tue Jun 21 13:22:22 UTC 2016
#5987: Too many symbols in ghc package DLL
---------------------------------+----------------------------------------
        Reporter:  igloo         |                 Owner:  Phyx-
            Type:  bug           |                Status:  new
        Priority:  normal        |             Milestone:
       Component:  Compiler      |               Version:  7.5
      Resolution:                |              Keywords:
Operating System:  Windows       |          Architecture:  Unknown/Multiple
 Type of failure:  None/Unknown  |             Test Case:
      Blocked By:                |              Blocking:  5355
 Related Tickets:                |   Differential Rev(s):
       Wiki Page:                |
---------------------------------+----------------------------------------
Changes (by Phyx-):
* owner: => Phyx-
Comment:
I have recently taken a look at this and have an almost working version
that should solve the problem once and for all.
First, we don't actually seem to have enough symbols to go over the limit.
If I measure the number of symbols in the input object files going into
the link and the number coming out, the difference is huge.
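A minimal sketch of that kind of measurement, assuming the text comes from
`nm -g` in its default (BSD) output format, where defined symbols carry an
address and undefined ones do not. `parse_nm` is a hypothetical helper for
illustration, not part of the GHC build system.

```python
def parse_nm(text):
    """Split `nm -g` output (BSD format) into defined and undefined names.

    Assumption: a defined symbol line looks like "<address> <type> <name>",
    an undefined one like "<type> <name>" with no address column.
    """
    defined, undefined = set(), set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            # "<address> <type> <name>": the symbol is defined here.
            defined.add(fields[2])
        elif len(fields) == 2:
            # "<type> <name>": no address, so it is only referenced here.
            undefined.add(fields[1])
        # Anything else (blank lines, "member.o:" headers) is skipped.
    return defined, undefined
```

Running this once over the input objects and once over the produced DLL,
then comparing the sizes of the two `defined` sets, gives the kind of
difference described above.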
Looking at it further, this is because of two things. We never explicitly
use `__declspec`; we just change the names of the functions to match the
conventions that `__declspec` would use. This is fine, but it means that
binutils' default of `--export-all-symbols` is still enabled, which means
we'll re-export any symbol we link from archives as well.
Given that `-dynamic` on Windows always produces an import library
`.dll.a` and the search order for `ld` is
{{{
libxxx.dll.a
xxx.dll.a
libxxx.a
cygxxx.dll (*)
libxxx.dll
xxx.dll
}}}
we always end up picking the import lib. This is recursive: we link
against `gmp`, `base`, etc., so by the time it gets to `GHC` the resulting
import lib is huge and we blow past the symbol limit. Also, `kernel`,
`gdi32`, `mingwex`, etc. are all import libraries for GCC, so we
accumulate a ton of symbols from there as well.
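To make the search order concrete, here is a small sketch that models it
as a plain list lookup. The function names are hypothetical, and the `cyg`
prefix is kept as a parameter because the `(*)` in the list marks it as
the special case; the default value here is only an assumption taken from
the list above.

```python
def import_candidates(name, dll_prefix="cyg"):
    """File names tried for `-l<name>`, in the order listed in the ticket."""
    return [
        f"lib{name}.dll.a",
        f"{name}.dll.a",
        f"lib{name}.a",
        f"{dll_prefix}{name}.dll",  # the entry marked (*) in the list
        f"lib{name}.dll",
        f"{name}.dll",
    ]


def resolve(name, available):
    """Return the first candidate present in `available`, or None."""
    for candidate in import_candidates(name):
        if candidate in available:
            return candidate
    return None
```

With both `libgmp.dll.a` and `libgmp.dll` present, `resolve("gmp", ...)`
picks the import library, which is the behaviour that causes the
accumulation described above.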
So the first thing my changes do is only export symbols defined in the
input object files.
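The filtering idea can be sketched as follows. This is only an
illustration of the principle under the assumption that we already know
which symbols each input object defines; `select_exports` is a
hypothetical name and this is not the actual build change.

```python
def select_exports(defined_by_object, linked_symbols):
    """Keep only the linked symbols that some input object file defines.

    defined_by_object: dict mapping object file name -> set of symbols
                       defined in that object.
    linked_symbols:    every symbol visible in the link, including those
                       pulled in from archives and import libraries.
    """
    own = set().union(*defined_by_object.values())
    # Symbols that only came from archives are dropped from the export list.
    return sorted(sym for sym in linked_symbols if sym in own)
```

Anything that arrived via `gmp`, `mingwex` and friends is then simply
never listed as an export of the DLL being built.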
This not only drastically reduces the size of the resulting DLLs and
import libraries, it also pushes the number of symbols far below the
limit.
In fact, I got rid of `dll-split` altogether and allowed all symbols to go
into the same DLL, and we end up with
{{{
$ nm -g "R:\ghc\libHSghc-8.1-ghc8.1.20160617.dll" | wc -l
49610
}}}
This is down from ~240,000 (mingwex and mingw32 are huge, for instance).
The second thing my build changes do, in order to prevent this from
happening again, is implement an automatic partitioning scheme which
requires no special treatment of the split DLLs.
In case we hit the limit again, the build script will automatically detect
this and split the symbols up per input object file, so that all symbols
of the same object file are in the same DLL.
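One way such a per-object split could look is a simple greedy grouping.
This is a sketch under stated assumptions, not the scheme actually
implemented: `partition_objects` is a hypothetical name, the limit is
passed in rather than hard-coded, and objects are taken in the order
given.

```python
def partition_objects(symbols_by_object, limit):
    """Group object files so that no group exceeds `limit` symbols.

    An object file is never split: all of its symbols land in one group.
    """
    parts, current, count = [], [], 0
    for obj, symbols in symbols_by_object.items():
        n = len(symbols)
        if n > limit:
            raise ValueError(f"{obj} alone has {n} symbols, over the limit")
        if current and count + n > limit:
            # The current group is full; start a new one.
            parts.append(current)
            current, count = [], 0
        current.append(obj)
        count += n
    if current:
        parts.append(current)
    return parts
```

Each returned group would correspond to one of the smaller DLLs.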
As @rassilon suggested before, I'm using import libraries to break the
dependencies, so the specific grouping doesn't matter.
The import libraries point to the location of the DLL which contains the
symbol:
{{{
LIBRARY "libHSCabal-1.25.0.0-ghc8.1.20160617-pt2.dll"
EXPORTS
"__stginit_Cabalzm1zi25zi0zi0_DistributionziCompatziBinary"
"__stginit_Cabalzm1zi25zi0zi0_DistributionziCompatziCopyFile"
...
}}}
And these are used to break the dependencies.
We then end up with smaller dlls with the suffix `-pt<num>.dll` and their
import libraries.
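For illustration, the definition text for one partition could be generated
like this. The layout mirrors the example above; `def_file_text` is a
hypothetical helper, and the part number is passed in because the ticket
does not say where the numbering starts.

```python
def def_file_text(dll_base, part, symbols):
    """Build the LIBRARY/EXPORTS text for one `-pt<num>` partition."""
    lines = [f'LIBRARY "{dll_base}-pt{part}.dll"', "EXPORTS"]
    # One quoted symbol per line, as in the example from the ticket.
    lines.extend(f'"{sym}"' for sym in symbols)
    return "\n".join(lines) + "\n"
```

The resulting text names the partition DLL, so an import library built
from it points at that specific `-pt<num>.dll`.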
The next step is to produce one large, merged import library with the name
of the DLL we were originally supposed to create:
`libHSCabal-1.25.0.0-ghc8.1.20160617.dll.a`, which is just a merge of the
different `-pt` import files.
This has the effect that when `-lHSCabal-1.25.0.0-ghc8.1.20160617` is used
as the link argument (which we do), the import lib is found and the linker
puts a reference to the right dlls. No extra/special handling is needed by
any other tool.
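Conceptually, the merged import library behaves like a single lookup table
from symbol to the partition DLL that provides it. A rough model of that
idea, with hypothetical names and plain dictionaries standing in for the
actual archive members:

```python
def merge_import_maps(symbols_by_dll):
    """Merge per-partition export lists into one symbol -> DLL table."""
    merged = {}
    for dll, symbols in symbols_by_dll.items():
        for sym in symbols:
            if sym in merged:
                # A symbol must live in exactly one partition.
                raise ValueError(f"{sym} exported by {merged[sym]} and {dll}")
            merged[sym] = dll
    return merged
```

A consumer only ever asks for a symbol, and the table says which
`-pt<num>.dll` to reference, which is why no other tool needs to know
about the split.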
Using the import libraries essentially removes the limit, since each
symbol is its own object file in the archive.
(Note that while I recently added support for import libraries to GHCi,
this support only extends to single-DLL import libraries. It needs some
minor modifications to support this too, but LD should work fine.)
This works fine: I can successfully compile a dynamic version of GHC, and
the program runs (but segfaults due to a piece of bit-rotted code I'm
looking at).
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/5987#comment:54>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler