[Git][ghc/ghc][wip/fendor/ghc-iface-sharing-avoid-reserialisation] 6 commits: Add run-time configurability of .hi file compression
Hannes Siebenhandl (@fendor)
gitlab at gitlab.haskell.org
Fri Apr 26 16:12:59 UTC 2024
Hannes Siebenhandl pushed to branch wip/fendor/ghc-iface-sharing-avoid-reserialisation at Glasgow Haskell Compiler / GHC
Commits:
58c19f2d by Matthew Pickering at 2024-04-26T11:15:09+02:00
Add run-time configurability of .hi file compression
Introduce the flag `-fwrite-if-compression=<n>` which allows to
configure the compression level of writing .hi files.
The motivation is that some deduplication operations are too expensive
for the average use case. Hence, we introduce multiple compression
levels that have a minimal impact on performance, but still reduce the
memory residency and `.hi` file size on disk considerably.
We introduce three compression levels:
* `1`: `Normal` mode. This is the least amount of compression.
It deduplicates only `Name` and `FastString`s, and is naturally the
fastest compression mode.
* `2`: `Safe` mode. It has a noticeable impact on .hi file size and is
marginally slower than `Normal` mode. In general, it should be safe to
always use `Safe` mode.
* `3`: `Full` deduplication mode. Deduplicate as much as we can,
resulting in minimal .hi files, but at the cost of additional
compilation time.
Reading .hi files doesn't need to know the initial compression level,
and can always deserialise a `ModIface`.
This allows users to experiment with different compression levels for
packages, without recompilation of dependencies.
Note, the deduplication also has an additional side effect of reduced
memory consumption to implicit sharing of deduplicated elements.
See https://gitlab.haskell.org/ghc/ghc/-/issues/24540 for example where
that matters.
-------------------------
Metric Decrease:
MultiLayerModulesDefsGhciWithCore
T21839c
T24471
-------------------------
- - - - -
ec52327c by Matthew Pickering at 2024-04-26T11:15:36+02:00
Introduce regression tests for `.hi` file sizes
Add regression tests to track how `-fwrite-if-compression` levels affect
the size of `.hi` files.
- - - - -
9227f1a8 by Fendor at 2024-04-26T11:15:36+02:00
Implement TrieMap for IfaceType
- - - - -
4c32894a by Fendor at 2024-04-26T18:10:10+02:00
Improve sharing of duplicated values in `ModIface`
As a `ModIface` contains often duplicated values that are not
necessarily shared, we improve sharing by serialising the `ModIface`
to an in-memory byte array. Serialisation uses deduplication tables, and
deserialisation implicitly shares duplicated values.
This helps reducing the peak memory usage while compiling in
`--make` mode. The peak memory usage is especially reduced when
generating interface files with core expressions
(`-fwrite-if-simplified-core`).
On agda, this reduces the peak memory usage:
* `2.2 GB` to `1.9 GB` for a ghci session.
On `lib:Cabal`, we report:
* `570 MB` to `500 MB` for a ghci session
* `790 MB` to `667 MB` for compiling `lib:Cabal` with ghc
The execution time is not affected.
- - - - -
48b11db1 by Fendor at 2024-04-26T18:10:10+02:00
Avoid unneccessarily re-serialising the `ModIface`
To reduce memory usage of `ModIface`, we serialise `ModIface` to an
in-memory byte array, which implicitly shares duplicated values.
This serailised byte array can be reused to avoid work when we actually
write the `ModIface` to disk.
We introduce a new field to `ModIface` which allows us to save the byte
array, and write it to disk if the `ModIface` wasn't changed after the
initial serialisation.
This requires us to change absolute offsets, for example to jump to the
deduplication table for `Name` or `FastString` with relative offsets, as
the deduplication byte array doesn't contain header information, such as
fingerprints.
To allow us to dump the binary blob to disk, we need to replace all
absolute offsets with relative ones.
This leads to new primitives for `ModIface`, which help to construct
relative offsets.
- - - - -
df5db6c2 by Fendor at 2024-04-26T18:10:10+02:00
Make sure to always invalidate correctly
- - - - -
30 changed files:
- compiler/GHC.hs
- compiler/GHC/Core/Map/Expr.hs
- compiler/GHC/Data/TrieMap.hs
- compiler/GHC/Driver/Backpack.hs
- compiler/GHC/Driver/DynFlags.hs
- compiler/GHC/Driver/Main.hs
- compiler/GHC/Driver/Session.hs
- compiler/GHC/Iface/Binary.hs
- compiler/GHC/Iface/Ext/Binary.hs
- compiler/GHC/Iface/Ext/Fields.hs
- compiler/GHC/Iface/Load.hs
- compiler/GHC/Iface/Make.hs
- compiler/GHC/Iface/Recomp.hs
- compiler/GHC/Iface/Rename.hs
- compiler/GHC/Iface/Type.hs
- + compiler/GHC/Iface/Type/Map.hs
- compiler/GHC/Stg/CSE.hs
- compiler/GHC/Tc/Errors/Hole.hs
- compiler/GHC/Tc/Gen/Splice.hs
- compiler/GHC/Tc/Utils/Backpack.hs
- compiler/GHC/Unit/Module/ModIface.hs
- compiler/GHC/Utils/Binary.hs
- compiler/ghc.cabal.in
- docs/users_guide/using-optimisation.rst
- + testsuite/tests/iface/IfaceSharingIfaceType.hs
- + testsuite/tests/iface/IfaceSharingName.hs
- + testsuite/tests/iface/Lib.hs
- + testsuite/tests/iface/Makefile
- + testsuite/tests/iface/all.T
- + testsuite/tests/iface/if_faststring.hs
The diff was not included because it is too large.
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/21816fcb43f9ba80413a58febf1c9e4406d94aae...df5db6c2050ba85e99ac814b18feb775e60faaac
--
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/21816fcb43f9ba80413a58febf1c9e4406d94aae...df5db6c2050ba85e99ac814b18feb775e60faaac
You're receiving this email because of your account on gitlab.haskell.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240426/dacb34cf/attachment.html>
More information about the ghc-commits
mailing list