[Git][ghc/ghc][wip/fendor/ghc-iface-sharing-avoid-reserialisation] 29 commits: ghc-bignum: remove obsolete ln script
Hannes Siebenhandl (@fendor)
gitlab at gitlab.haskell.org
Tue Apr 30 09:47:45 UTC 2024
Hannes Siebenhandl pushed to branch wip/fendor/ghc-iface-sharing-avoid-reserialisation at Glasgow Haskell Compiler / GHC
Commits:
c62dc317 by Cheng Shao at 2024-04-25T01:32:02-04:00
ghc-bignum: remove obsolete ln script
This commit removes an obsolete ln script in ghc-bignum/gmp. See
060251c24ad160264ae8553efecbb8bed2f06360 for its original intention,
but it's been obsolete for a long time, especially since the removal
of the make build system. Hence the house cleaning.
- - - - -
6399d52b by Cheng Shao at 2024-04-25T01:32:02-04:00
ghc-bignum: update gmp to 6.3.0
This patch bumps the gmp-tarballs submodule and updates gmp to 6.3.0.
The tarball format is now xz, and gmpsrc.patch has been patched into
the tarball so hadrian no longer needs to deal with patching logic
when building in-tree GMP.
- - - - -
65b4b92f by Cheng Shao at 2024-04-25T01:32:02-04:00
hadrian: remove obsolete Patch logic
This commit removes obsolete Patch logic from hadrian, given we no
longer need to patch the gmp tarball when building in-tree GMP.
- - - - -
71f28958 by Cheng Shao at 2024-04-25T01:32:02-04:00
autoconf: remove obsolete patch detection
This commit removes obsolete deletection logic of the patch command
from autoconf scripts, given we no longer need to patch anything in
the GHC build process.
- - - - -
daeda834 by Sylvain Henry at 2024-04-25T01:32:43-04:00
JS: correctly handle RUBBISH literals (#24664)
- - - - -
8a06ddf6 by Matthew Pickering at 2024-04-25T11:16:16-04:00
Linearise ghc-internal and base build
This is achieved by requesting the final package database for
ghc-internal, which mandates it is fully built as a dependency of
configuring the `base` package. This is at the expense of cross-package
parrallelism between ghc-internal and the base package.
Fixes #24436
- - - - -
94da9365 by Andrei Borzenkov at 2024-04-25T11:16:54-04:00
Fix tuple puns renaming (24702)
Move tuple renaming short cutter from `isBuiltInOcc_maybe` to `isPunOcc_maybe`, so we consider incoming module.
I also fixed some hidden bugs that raised after the change was done.
- - - - -
fa03b1fb by Fendor at 2024-04-26T18:03:13-04:00
Refactor the Binary serialisation interface
The goal is simplifiy adding deduplication tables to `ModIface`
interface serialisation.
We identify two main points of interest that make this difficult:
1. UserData hardcodes what `Binary` instances can have deduplication
tables. Moreover, it heavily uses partial functions.
2. GHC.Iface.Binary hardcodes the deduplication tables for 'Name' and
'FastString', making it difficult to add more deduplication.
Instead of having a single `UserData` record with fields for all the
types that can have deduplication tables, we allow to provide custom
serialisers for any `Typeable`.
These are wrapped in existentials and stored in a `Map` indexed by their
respective `TypeRep`.
The `Binary` instance of the type to deduplicate still needs to
explicitly look up the decoder via `findUserDataReader` and
`findUserDataWriter`, which is no worse than the status-quo.
`Map` was chosen as microbenchmarks indicate it is the fastest for a
small number of keys (< 10).
To generalise the deduplication table serialisation mechanism, we
introduce the types `ReaderTable` and `WriterTable` which provide a
simple interface that is sufficient to implement a general purpose
deduplication mechanism for `writeBinIface` and `readBinIface`.
This allows us to provide a list of deduplication tables for
serialisation that can be extended more easily, for example for
`IfaceTyCon`, see the issue https://gitlab.haskell.org/ghc/ghc/-/issues/24540
for more motivation.
In addition to this refactoring, we split `UserData` into `ReaderUserData`
and `WriterUserData`, to avoid partial functions and reduce overall
memory usage, as we need fewer mutable variables.
Bump haddock submodule to accomodate for `UserData` split.
-------------------------
Metric Increase:
MultiLayerModulesTH_Make
MultiLayerModulesRecomp
T21839c
-------------------------
- - - - -
bac57298 by Fendor at 2024-04-26T18:03:13-04:00
Split `BinHandle` into `ReadBinHandle` and `WriteBinHandle`
A `BinHandle` contains too much information for reading data.
For example, it needs to keep a `FastMutInt` and a `IORef BinData`,
when the non-mutable variants would suffice.
Additionally, this change has the benefit that anyone can immediately
tell whether the `BinHandle` is used for reading or writing.
Bump haddock submodule BinHandle split.
- - - - -
4d6394dd by Simon Peyton Jones at 2024-04-26T18:03:49-04:00
Fix missing escaping-kind check in tcPatSynSig
Note [Escaping kind in type signatures] explains how we deal
with escaping kinds in type signatures, e.g.
f :: forall r (a :: TYPE r). a
where the kind of the body is (TYPE r), but `r` is not in
scope outside the forall-type.
I had missed this subtlety in tcPatSynSig, leading to #24686.
This MR fixes it; and a similar bug in tc_top_lhs_type. (The
latter is tested by T24686a.)
- - - - -
981c2c2c by Alan Zimmerman at 2024-04-26T18:04:25-04:00
EPA: check-exact: check that the roundtrip reproduces the source
Closes #24670
- - - - -
a8616747 by Andrew Lelechenko at 2024-04-26T18:05:01-04:00
Document that setEnv is not thread-safe
- - - - -
1e41de83 by Bryan Richter at 2024-04-26T18:05:37-04:00
CI: Work around frequent Signal 9 errors
- - - - -
a6d5f9da by Naïm Favier at 2024-04-27T17:52:40-04:00
ghc-internal: add MonadFix instance for (,)
Closes https://gitlab.haskell.org/ghc/ghc/-/issues/24288, implements CLC
proposal https://github.com/haskell/core-libraries-committee/issues/238.
Adds a MonadFix instance for tuples, permitting value recursion in the
"native" writer monad and bringing consistency with the existing
instance for transformers's WriterT (and, to a lesser extent, for Solo).
- - - - -
64feadcd by Rodrigo Mesquita at 2024-04-27T17:53:16-04:00
bindist: Fix xattr cleaning
The original fix (725343aa) was incorrect because it used the shell
bracket syntax which is the quoting syntax in autoconf, making the test
for existence be incorrect and therefore `xattr` was never run.
Fixes #24554
- - - - -
e2094df3 by damhiya at 2024-04-28T23:52:00+09:00
Make read accepts binary integer formats
CLC proposal : https://github.com/haskell/core-libraries-committee/issues/177
- - - - -
1c2fd963 by Alan Zimmerman at 2024-04-29T23:17:00-04:00
EPA: Preserve comments in Match Pats
Closes #24708
Closes #24715
Closes #24734
- - - - -
4189d17e by Sylvain Henry at 2024-04-29T23:17:42-04:00
LLVM: better unreachable default destination in Switch (#24717)
See added note.
Co-authored-by: Siddharth Bhat <siddu.druid at gmail.com>
- - - - -
a3725c88 by Cheng Shao at 2024-04-29T23:18:20-04:00
ci: enable wasm jobs for MRs with wasm label
This patch enables wasm jobs for MRs with wasm label. Previously the
wasm label didn't actually have any effect on the CI pipeline, and
full-ci needed to be applied to run wasm jobs which was a waste of
runners when working on the wasm backend, hence the fix here.
- - - - -
702f7964 by Matthew Pickering at 2024-04-29T23:18:56-04:00
Make interface files and object files depend on inplace .conf file
A potential fix for #24737
- - - - -
44986061 by Fendor at 2024-04-30T11:29:50+02:00
Add Eq and Ord instance to `IfaceType`
We add an `Ord` instance so that we can store `IfaceType` in a
`Data.Map` container.
This is required to deduplicate `IfaceType` while writing `.hi` files to
disk. Deduplication has many beneficial consequences to both file size
and memory usage, as the deduplication enables implicit sharing of
values.
See issue #24540 for more motivation.
The `Ord` instance would be unnecessary if we used a `TrieMap` instead
of `Data.Map` for the deduplication process. While in theory this is
clerarly the better option, experiments on the agda code base showed
that a `TrieMap` implementation has worse run-time performance
characteristics.
To the change itself, we mostly derive `Eq` and `Ord`. This requires us
to change occurrences of `FastString` with `LexicalFastString`, since
`FastString` has no `Ord` instance.
We change the definition of `IfLclName` to a newtype of
`LexicalFastString`, to make such changes in the future easier.
Bump haddock submodule for IfLclName changes
- - - - -
795deda1 by Fendor at 2024-04-30T11:29:50+02:00
Break cyclic module dependency
- - - - -
31b93f5a by Fendor at 2024-04-30T11:29:50+02:00
Add deduplication table for `IfaceType`
The type `IfaceType` is a highly redundant, tree-like data structure.
While benchmarking, we realised that the high redundancy of `IfaceType`
causes high memory consumption in GHCi sessions when byte code is
embedded into the `.hi` file via `-fwrite-if-simplified-core` or
`-fbyte-code-and-object-code`.
Loading such `.hi` files from disk introduces many duplicates of
memory expensive values in `IfaceType`, such as `IfaceTyCon`,
`IfaceTyConApp`, `IA_Arg` and many more.
We improve the memory behaviour of GHCi by adding an additional
deduplication table for `IfaceType` to the serialisation of `ModIface`,
similar to how we deduplicate `Name`s and `FastString`s.
When reading the interface file back, the table allows us to automatically
share identical values of `IfaceType`.
To provide some numbers, we evaluated this patch on the agda code base.
We loaded the full library from the `.hi` files, which contained the
embedded core expressions (`-fwrite-if-simplified-core`).
Before this patch:
* Load time: 11.7 s, 2.5 GB maximum residency.
After this patch:
* Load time: 7.3 s, 1.7 GB maximum residency.
This deduplication has the beneficial side effect to additionally reduce
the size of the on-disk interface files tremendously.
For example, on agda, we reduce the size of `.hi` files (with
`-fwrite-if-simplified-core`):
* Before: 101 MB on disk
* Now: 24 MB on disk
This has even a beneficial side effect on the cabal store. We reduce the
size of the store on disk:
* Before: 341 MB on disk
* Now: 310 MB on disk
Note, none of the dependencies have been compiled with
`-fwrite-if-simplified-core`, but `IfaceType` occurs in multiple
locations in a `ModIface`.
We also add IfaceType deduplication table to .hie serialisation and
refactor .hie file serialisation to use the same infrastrucutre as
`putWithTables`.
Bump haddock submodule to accomodate for changes to the deduplication
table layout and binary interface.
- - - - -
c064e2ef by Matthew Pickering at 2024-04-30T11:29:50+02:00
Add run-time configurability of .hi file compression
Introduce the flag `-fwrite-if-compression=<n>` which allows to
configure the compression level of writing .hi files.
The motivation is that some deduplication operations are too expensive
for the average use case. Hence, we introduce multiple compression
levels that have a minimal impact on performance, but still reduce the
memory residency and `.hi` file size on disk considerably.
We introduce three compression levels:
* `1`: `Normal` mode. This is the least amount of compression.
It deduplicates only `Name` and `FastString`s, and is naturally the
fastest compression mode.
* `2`: `Safe` mode. It has a noticeable impact on .hi file size and is
marginally slower than `Normal` mode. In general, it should be safe to
always use `Safe` mode.
* `3`: `Full` deduplication mode. Deduplicate as much as we can,
resulting in minimal .hi files, but at the cost of additional
compilation time.
Reading .hi files doesn't need to know the initial compression level,
and can always deserialise a `ModIface`.
This allows users to experiment with different compression levels for
packages, without recompilation of dependencies.
Note, the deduplication also has an additional side effect of reduced
memory consumption to implicit sharing of deduplicated elements.
See https://gitlab.haskell.org/ghc/ghc/-/issues/24540 for example where
that matters.
-------------------------
Metric Decrease:
MultiLayerModulesDefsGhciWithCore
T21839c
T24471
-------------------------
- - - - -
158af23b by Matthew Pickering at 2024-04-30T11:29:50+02:00
Introduce regression tests for `.hi` file sizes
Add regression tests to track how `-fwrite-if-compression` levels affect
the size of `.hi` files.
- - - - -
485d2cb3 by Fendor at 2024-04-30T11:29:50+02:00
Implement TrieMap for IfaceType
- - - - -
a3703244 by Fendor at 2024-04-30T11:47:20+02:00
Improve sharing of duplicated values in `ModIface`
As a `ModIface` contains often duplicated values that are not
necessarily shared, we improve sharing by serialising the `ModIface`
to an in-memory byte array. Serialisation uses deduplication tables, and
deserialisation implicitly shares duplicated values.
This helps reducing the peak memory usage while compiling in
`--make` mode. The peak memory usage is especially reduced when
generating interface files with core expressions
(`-fwrite-if-simplified-core`).
On agda, this reduces the peak memory usage:
* `2.2 GB` to `1.9 GB` for a ghci session.
On `lib:Cabal`, we report:
* `570 MB` to `500 MB` for a ghci session
* `790 MB` to `667 MB` for compiling `lib:Cabal` with ghc
The execution time is not affected.
- - - - -
ada0e6a3 by Fendor at 2024-04-30T11:47:30+02:00
Avoid unneccessarily re-serialising the `ModIface`
To reduce memory usage of `ModIface`, we serialise `ModIface` to an
in-memory byte array, which implicitly shares duplicated values.
This serailised byte array can be reused to avoid work when we actually
write the `ModIface` to disk.
We introduce a new field to `ModIface` which allows us to save the byte
array, and write it to disk if the `ModIface` wasn't changed after the
initial serialisation.
This requires us to change absolute offsets, for example to jump to the
deduplication table for `Name` or `FastString` with relative offsets, as
the deduplication byte array doesn't contain header information, such as
fingerprints.
To allow us to dump the binary blob to disk, we need to replace all
absolute offsets with relative ones.
This leads to new primitives for `ModIface`, which help to construct
relative offsets.
- - - - -
ebe3a0ad by Fendor at 2024-04-30T11:47:30+02:00
Make sure to always invalidate correctly
- - - - -
30 changed files:
- .gitlab-ci.yml
- .gitlab/generate-ci/gen_ci.hs
- .gitlab/jobs.yaml
- compiler/GHC.hs
- compiler/GHC/Builtin/Types.hs
- compiler/GHC/CmmToLlvm/CodeGen.hs
- compiler/GHC/Core/Map/Expr.hs
- compiler/GHC/Core/TyCo/Rep.hs
- compiler/GHC/CoreToIface.hs
- compiler/GHC/Data/FastString.hs
- compiler/GHC/Data/TrieMap.hs
- compiler/GHC/Driver/DynFlags.hs
- compiler/GHC/Driver/Main.hs
- compiler/GHC/Driver/Session.hs
- compiler/GHC/Iface/Binary.hs
- compiler/GHC/Iface/Decl.hs
- compiler/GHC/Iface/Env.hs
- compiler/GHC/Iface/Ext/Binary.hs
- compiler/GHC/Iface/Ext/Fields.hs
- compiler/GHC/Iface/Ext/Utils.hs
- compiler/GHC/Iface/Load.hs
- compiler/GHC/Iface/Make.hs
- compiler/GHC/Iface/Recomp.hs
- compiler/GHC/Iface/Recomp/Binary.hs
- compiler/GHC/Iface/Recomp/Flags.hs
- compiler/GHC/Iface/Rename.hs
- compiler/GHC/Iface/Syntax.hs
- compiler/GHC/Iface/Type.hs
- + compiler/GHC/Iface/Type/Map.hs
- compiler/GHC/IfaceToCore.hs
The diff was not included because it is too large.
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/b7ada9b6f2949298f9b9dec32eabb3b70e9108ad...ebe3a0ad890dfe6a56d64bd486f69d4dc4fe057a
--
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/b7ada9b6f2949298f9b9dec32eabb3b70e9108ad...ebe3a0ad890dfe6a56d64bd486f69d4dc4fe057a
You're receiving this email because of your account on gitlab.haskell.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240430/e1941e0b/attachment-0001.html>
More information about the ghc-commits
mailing list