[Git][ghc/ghc][wip/compact-sourcetext] 4 commits: compiler: Use compact representation for SourceText

Zubin (@wz1000) gitlab at gitlab.haskell.org
Mon May 15 10:03:13 UTC 2023



Zubin pushed to branch wip/compact-sourcetext at Glasgow Haskell Compiler / GHC


Commits:
1eefcb73 by Zubin Duggal at 2023-05-15T15:19:09+05:30
compiler: Use compact representation for SourceText

SourceText is serialized along with INLINE pragmas into interface files. Many of
these SourceTexts are identical, for example "{-# INLINE#". When deserialized,
each such SourceText was previously expanded into a [Char], which is highly
wasteful of memory. Each instance of the text also allocated its own independent
list, because deserializing breaks any sharing that might have existed.

Instead, we use a `FastString` to represent these, so that each unique text
is interned and stored in a memory-efficient manner.
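
To illustrate the idea of interning described above, here is a minimal,
self-contained sketch. It is not GHC's `FastString` implementation: the
`Interned` type, the `InternTable` map, and the `intern`/`internAll` functions
are hypothetical names used only for this example, and a plain `Data.Map` keyed
on `String` stands in for the real table.

```haskell
import qualified Data.Map.Strict as Map
import Data.List (mapAccumL)

-- Hypothetical stand-in for an interned string: a unique id plus the shared text.
data Interned = Interned { internedId :: !Int, internedText :: String }
  deriving (Eq, Show)

-- Hypothetical intern table: maps each distinct text to its single shared entry.
type InternTable = Map.Map String Interned

-- Look up a text in the table, adding it only if it has not been seen before.
intern :: InternTable -> String -> (InternTable, Interned)
intern table s =
  case Map.lookup s table of
    Just existing -> (table, existing)
    Nothing ->
      let fresh = Interned (Map.size table) s
      in (Map.insert s fresh table, fresh)

-- Intern a whole batch of texts, threading the table through.
internAll :: [String] -> (InternTable, [Interned])
internAll = mapAccumL intern Map.empty

main :: IO ()
main = do
  -- Simulate deserializing many identical pragma texts.
  let pragmas = replicate 1000 "{-# INLINE" ++ replicate 500 "{-# NOINLINE"
      (table, interned) = internAll pragmas
  putStrLn ("texts deserialized: " ++ show (length interned))
  putStrLn ("distinct texts stored: " ++ show (Map.size table))
```

In this sketch, 1500 deserialized texts end up backed by only 2 table entries,
which is the kind of sharing the commit aims for.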

- - - - -
c6d55776 by Zubin Duggal at 2023-05-15T15:25:14+05:30
compiler: Use compact representation/FastStrings for `SourceNote`s

`SourceNote`s should not be stored as [Char], as this is highly wasteful
and in certain scenarios the strings can be highly duplicated.

Metric Decrease:
  hard_hole_fits

- - - - -
745ed77e by Zubin Duggal at 2023-05-15T15:28:22+05:30
compiler: Use compact representation for UsageFile (#22744)

Use FastString to store filepaths in interface files. This data is
highly redundant, so we want to share all instances of a filepath across
the compiler session.

- - - - -
d9394fd9 by Zubin Duggal at 2023-05-15T15:29:42+05:30
testsuite: add test for T22744

This test checks for #22744 by compiling 100 modules which each have
a dependency on 1000 distinct external files.

Previously, when loading these interfaces from disk, each individual instance
of a filepath in the interface would be allocated as an individual object
on the heap, meaning we had heap objects for 100*1000 files, when there are
only 1000 distinct files we care about.
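
The arithmetic above can be checked with a small sketch. The file names and
the `dependencyPaths`/`allUsages` helpers are hypothetical and are not taken
from the actual test; the sketch only counts filepath entries with and without
sharing.

```haskell
import qualified Data.Set as Set

-- Hypothetical file names; the real test's naming scheme is not shown here.
dependencyPaths :: Int -> [FilePath]
dependencyPaths n = [ "dep" ++ show i ++ ".txt" | i <- [1 .. n] ]

-- Every module records a usage entry for every file it depends on.
allUsages :: Int -> Int -> [FilePath]
allUsages modules files = concat (replicate modules (dependencyPaths files))

main :: IO ()
main = do
  let usages = allUsages 100 1000
  -- Without sharing: one heap object per entry.
  putStrLn ("filepath entries across all interfaces: " ++ show (length usages))
  -- With sharing: one heap object per distinct path.
  putStrLn ("distinct filepaths: " ++ show (Set.size (Set.fromList usages)))
```

This prints 100000 entries against 1000 distinct filepaths, a factor of 100
that sharing can reclaim.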

The test verifies this by first compiling the modules normally, then measuring
the peak memory usage in a no-op recompile, as the recompilation checking will
force the allocation of all these filepaths.

- - - - -


30 changed files:

- compiler/GHC/Cmm/CLabel.hs
- compiler/GHC/Cmm/Parser.y
- compiler/GHC/CmmToAsm/AArch64/CodeGen.hs
- compiler/GHC/CmmToAsm/Dwarf.hs
- compiler/GHC/CmmToAsm/PPC/CodeGen.hs
- compiler/GHC/CmmToAsm/X86/CodeGen.hs
- compiler/GHC/Core/Opt/Simplify/Iteration.hs
- compiler/GHC/Core/Opt/WorkWrap.hs
- compiler/GHC/CoreToIface.hs
- compiler/GHC/CoreToStg/Prep.hs
- compiler/GHC/Hs/Binds.hs
- compiler/GHC/Hs/Decls.hs
- compiler/GHC/Hs/Dump.hs
- compiler/GHC/Hs/Expr.hs
- compiler/GHC/Hs/ImpExp.hs
- compiler/GHC/HsToCore/Ticks.hs
- compiler/GHC/HsToCore/Usage.hs
- compiler/GHC/Iface/Load.hs
- compiler/GHC/Iface/Recomp.hs
- compiler/GHC/Iface/Syntax.hs
- compiler/GHC/IfaceToCore.hs
- compiler/GHC/Parser/Lexer.x
- compiler/GHC/Parser/PostProcess.hs
- compiler/GHC/Stg/Debug.hs
- compiler/GHC/StgToCmm/InfoTableProv.hs
- compiler/GHC/ThToHs.hs
- compiler/GHC/Types/Basic.hs
- compiler/GHC/Types/IPE.hs
- compiler/GHC/Types/SourceText.hs
- compiler/GHC/Types/Tickish.hs


The diff was not included because it is too large.


View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/44027418ebd05b1311933f3f066a642ab5b5f06c...d9394fd9a72d4c85da6c53444fc5f99189e7ad8a


