[Git][ghc/ghc][wip/fendor/ghc-iface-sharing-avoid-reserialisation] 16 commits: Add performance regression test for '-fwrite-simplified-core'

Hannes Siebenhandl (@fendor) gitlab at gitlab.haskell.org
Wed Apr 24 07:51:26 UTC 2024



Hannes Siebenhandl pushed to branch wip/fendor/ghc-iface-sharing-avoid-reserialisation at Glasgow Haskell Compiler / GHC


Commits:
593f4e04 by Fendor at 2024-04-23T10:19:14-04:00
Add performance regression test for '-fwrite-simplified-core'

- - - - -
1ba39b05 by Fendor at 2024-04-23T10:19:14-04:00
Typecheck corebindings lazily during bytecode generation

This delays typechecking the corebindings until the bytecode generation
happens.

We also avoid allocating a thunk that is retained by `unsafeInterleaveIO`.
In general, we shouldn't retain values of the hydrated `Type`, as not evaluating
the bytecode object keeps it alive.

It is better if we retain the unhydrated `IfaceType`.

See Note [Hydrating Modules]

- - - - -
e916fc92 by Alan Zimmerman at 2024-04-23T10:19:50-04:00
EPA: Keep comments in a CaseAlt match

The comments now live in the surrounding location, not inside the
Match. Make sure we keep them.

Closes #24707

- - - - -
d2b17f32 by Cheng Shao at 2024-04-23T15:01:22-04:00
driver: force merge objects when building dynamic objects

This patch forces the driver to always merge objects when building
dynamic objects even when ar -L is supported. It is an oversight of
!8887: original rationale of that patch is favoring the relatively
cheap ar -L operation over object merging when ar -L is supported,
which makes sense but only if we are building static objects! Omitting
check for whether we are building dynamic objects will result in
broken .so files with undefined reference errors at executable link
time when building GHC with llvm-ar. Fixes #22210.

- - - - -
209d09f5 by Julian Ospald at 2024-04-23T15:02:03-04:00
Allow non-absolute values for bootstrap GHC variable

Fixes #24682

- - - - -
3fff0977 by Matthew Pickering at 2024-04-23T15:02:38-04:00
Don't depend on registerPackage function in Cabal

More recent versions of Cabal modify the behaviour of libAbiHash which
breaks our usage of registerPackage.

It is simpler to inline the part of registerPackage that we need and
avoid any additional dependency and complication using the higher-level
function introduces.

- - - - -
d636f6a2 by Fendor at 2024-04-24T09:47:56+02:00
Refactor the Binary serialisation interface

The goal is simplifiy adding deduplication tables to `ModIface`
interface serialisation.

We identify two main points of interest that make this difficult:

1. UserData hardcodes what `Binary` instances can have deduplication
   tables. Moreover, it heavily uses partial functions.
2. GHC.Iface.Binary hardcodes the deduplication tables for 'Name' and
   'FastString', making it difficult to add more deduplication.

Instead of having a single `UserData` record with fields for all the
types that can have deduplication tables, we allow to provide custom
serialisers for any `Typeable`.
These are wrapped in existentials and stored in a `Map` indexed by their
respective `TypeRep`.
The `Binary` instance of the type to deduplicate still needs to
explicitly look up the decoder via `findUserDataReader` and
`findUserDataWriter`, which is no worse than the status-quo.

`Map` was chosen as microbenchmarks indicate it is the fastest for a
small number of keys (< 10).

To generalise the deduplication table serialisation mechanism, we
introduce the types `ReaderTable` and `WriterTable` which provide a
simple interface that is sufficient to implement a general purpose
deduplication mechanism for `writeBinIface` and `readBinIface`.

This allows us to provide a list of deduplication tables for
serialisation that can be extended more easily, for example for
`IfaceTyCon`, see the issue https://gitlab.haskell.org/ghc/ghc/-/issues/24540
for more motivation.

In addition to this refactoring, we split `UserData` into `ReaderUserData`
and `WriterUserData`, to avoid partial functions and reduce overall
memory usage, as we need fewer mutable variables.

Bump haddock submodule to accomodate for `UserData` split.

-------------------------
Metric Increase:
    MultiLayerModulesTH_Make
    MultiLayerModulesRecomp
    T21839c
-------------------------

- - - - -
be6836a2 by Fendor at 2024-04-24T09:47:56+02:00
Split `BinHandle` into `ReadBinHandle` and `WriteBinHandle`

A `BinHandle` contains too much information for reading data.
For example, it needs to keep a `FastMutInt` and a `IORef BinData`,
when the non-mutable variants would suffice.

Additionally, this change has the benefit that anyone can immediately
tell whether the `BinHandle` is used for reading or writing.

Bump haddock submodule BinHandle split.

- - - - -
a0f3d1f7 by Fendor at 2024-04-24T09:49:28+02:00
Add Eq and Ord instance to `IfaceType`

We add an `Ord` instance so that we can store `IfaceType` in a
`Data.Map` container.
This is required to deduplicate `IfaceType` while writing `.hi` files to
disk. Deduplication has many beneficial consequences to both file size
and memory usage, as the deduplication enables implicit sharing of
values.
See issue #24540 for more motivation.

The `Ord` instance would be unnecessary if we used a `TrieMap` instead
of `Data.Map` for the deduplication process. While in theory this is
clerarly the better option, experiments on the agda code base showed
that a `TrieMap` implementation has worse run-time performance
characteristics.

To the change itself, we mostly derive `Eq` and `Ord`. This requires us
to change occurrences of `FastString` with `LexicalFastString`, since
`FastString` has no `Ord` instance.
We change the definition of `IfLclName` to a newtype of
`LexicalFastString`, to make such changes in the future easier.

Bump haddock submodule for IfLclName changes

- - - - -
61407888 by Fendor at 2024-04-24T09:49:28+02:00
Break cyclic module dependency

- - - - -
9932b31d by Fendor at 2024-04-24T09:49:28+02:00
Add deduplication table for `IfaceType`

The type `IfaceType` is a highly redundant, tree-like data structure.
While benchmarking, we realised that the high redundancy of `IfaceType`
causes high memory consumption in GHCi sessions when byte code is
embedded into the `.hi` file via `-fwrite-if-simplified-core` or
`-fbyte-code-and-object-code`.
Loading such `.hi` files from disk introduces many duplicates of
memory expensive values in `IfaceType`, such as `IfaceTyCon`,
`IfaceTyConApp`, `IA_Arg` and many more.

We improve the memory behaviour of GHCi by adding an additional
deduplication table for `IfaceType` to the serialisation of `ModIface`,
similar to how we deduplicate `Name`s and `FastString`s.
When reading the interface file back, the table allows us to automatically
share identical values of `IfaceType`.

To provide some numbers, we evaluated this patch on the agda code base.
We loaded the full library from the `.hi` files, which contained the
embedded core expressions (`-fwrite-if-simplified-core`).

Before this patch:

* Load time: 11.7 s, 2.5 GB maximum residency.

After this patch:

* Load time:  7.3 s, 1.7 GB maximum residency.

This deduplication has the beneficial side effect to additionally reduce
the size of the on-disk interface files tremendously.

For example, on agda, we reduce the size of `.hi` files (with
`-fwrite-if-simplified-core`):

* Before: 101 MB on disk
* Now:     24 MB on disk

This has even a beneficial side effect on the cabal store. We reduce the
size of the store on disk:

* Before: 341 MB on disk
* Now:    310 MB on disk

Note, none of the dependencies have been compiled with
`-fwrite-if-simplified-core`, but `IfaceType` occurs in multiple
locations in a `ModIface`.

We also add IfaceType deduplication table to .hie serialisation and
refactor .hie file serialisation to use the same infrastrucutre as
`putWithTables`.

Bump haddock submodule to accomodate for changes to the deduplication
table layout and binary interface.

- - - - -
7ddf8ecf by Matthew Pickering at 2024-04-24T09:49:28+02:00
Add run-time configurability of .hi file compression

Introduce the flag `-fwrite-if-compression=<n>` which allows to
configure the compression level of writing .hi files.

The motivation is that some deduplication operations are too expensive
for the average use case. Hence, we introduce multiple compression
levels that have a minimal impact on performance, but still reduce the
memory residency and `.hi` file size on disk considerably.

We introduce three compression levels:

* `1`: `Normal` mode. This is the least amount of compression.
    It deduplicates only `Name` and `FastString`s, and is naturally the
    fastest compression mode.
* `2`: `Safe` mode. It has a noticeable impact on .hi file size and is
  marginally slower than `Normal` mode. In general, it should be safe to
  always use `Safe` mode.
* `3`: `Full` deduplication mode. Deduplicate as much as we can,
  resulting in minimal .hi files, but at the cost of additional
  compilation time.

Reading .hi files doesn't need to know the initial compression level,
and can always deserialise a `ModIface`.
This allows users to experiment with different compression levels for
packages, without recompilation of dependencies.

Note, the deduplication also has an additional side effect of reduced
memory consumption to implicit sharing of deduplicated elements.
See https://gitlab.haskell.org/ghc/ghc/-/issues/24540 for example where
that matters.

-------------------------
Metric Decrease:
    T21839c
    T24471
-------------------------

- - - - -
2c32fbcf by Matthew Pickering at 2024-04-24T09:49:28+02:00
Introduce regression tests for `.hi` file sizes

Add regression tests to track how `-fwrite-if-compression` levels affect
the size of `.hi` files.

- - - - -
5c18a8b8 by Fendor at 2024-04-24T09:49:28+02:00
Implement TrieMap for IfaceType

- - - - -
4db39e3c by Fendor at 2024-04-24T09:51:19+02:00
Share duplicated values in ModIface after generation

This helps keeping down the peak memory usage while compiling in
`--make` mode.

- - - - -
5b7e1def by Fendor at 2024-04-24T09:51:19+02:00
Reuse the 'ReadBinMem' after sharing to avoid recomputations

- - - - -


30 changed files:

- compiler/GHC/Core/Map/Expr.hs
- compiler/GHC/Core/TyCo/Rep.hs
- compiler/GHC/CoreToIface.hs
- compiler/GHC/Data/FastString.hs
- compiler/GHC/Data/TrieMap.hs
- compiler/GHC/Driver/DynFlags.hs
- compiler/GHC/Driver/Main.hs
- compiler/GHC/Driver/Pipeline/Execute.hs
- compiler/GHC/Driver/Session.hs
- compiler/GHC/Iface/Binary.hs
- compiler/GHC/Iface/Decl.hs
- compiler/GHC/Iface/Env.hs
- compiler/GHC/Iface/Ext/Binary.hs
- compiler/GHC/Iface/Ext/Fields.hs
- compiler/GHC/Iface/Ext/Utils.hs
- compiler/GHC/Iface/Load.hs
- compiler/GHC/Iface/Make.hs
- compiler/GHC/Iface/Recomp.hs
- compiler/GHC/Iface/Recomp/Binary.hs
- compiler/GHC/Iface/Recomp/Flags.hs
- compiler/GHC/Iface/Syntax.hs
- compiler/GHC/Iface/Type.hs
- + compiler/GHC/Iface/Type/Map.hs
- compiler/GHC/IfaceToCore.hs
- compiler/GHC/Parser.y
- compiler/GHC/Stg/CSE.hs
- compiler/GHC/StgToJS/Object.hs
- compiler/GHC/Tc/Gen/Splice.hs
- compiler/GHC/Types/Basic.hs
- compiler/GHC/Types/FieldLabel.hs


The diff was not included because it is too large.


View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/d073db1e92b5d41d1b7e07dcc57d467976ceac08...5b7e1def55d5ac2b6d6d0e2766bf19fb00d24218

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/d073db1e92b5d41d1b7e07dcc57d467976ceac08...5b7e1def55d5ac2b6d6d0e2766bf19fb00d24218
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240424/d8149512/attachment-0001.html>


More information about the ghc-commits mailing list