[Git][ghc/ghc][master] 12 commits: Several improvements to the handling of coercions
Marge Bot (@marge-bot)
gitlab at gitlab.haskell.org
Wed Apr 3 05:28:42 UTC 2024
Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC
Commits:
e869605e by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Several improvements to the handling of coercions
* Make `mkSymCo` and `mkInstCo` smarter
Fixes #23642
* Fix return role of `SelCo` in the coercion optimiser.
Fixes #23617
* Make the coercion optimiser `opt_trans_rule` work better for newtypes
Fixes #23619
- - - - -
1efd0714 by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
FloatOut: improve floating for join point
See the new Note [Floating join point bindings].
* Completely get rid of the complicated join_ceiling nonsense, which
I have never understood.
* Do not float join points at all, except perhaps to top level.
* Some refactoring around wantToFloat, to treat Rec and NonRec more
uniformly
- - - - -
9c00154d by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Improve eta-expansion through call stacks
See Note [Eta expanding through CallStacks] in GHC.Core.Opt.Arity
This is a one-line change, that fixes an inconsistency
- || isCallStackPredTy ty
+ || isCallStackPredTy ty || isCallStackTy ty
- - - - -
95a9a172 by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Spelling, layout, pretty-printing only
- - - - -
bdf1660f by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Improve exprIsConApp_maybe a little
Eliminate a redundant case at birth. This sometimes reduces
Simplifier iterations.
See Note [Case elim in exprIsConApp_maybe].
- - - - -
609cd32c by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Inline GHC.HsToCore.Pmc.Solver.Types.trvVarInfo
When exploring compile-time regressions after meddling with the Simplifier, I
discovered that GHC.HsToCore.Pmc.Solver.Types.trvVarInfo was very delicately
balanced. It's a small, heavily used, overloaded function and it's important
that it inlines. By a fluke it was before, but at various times in my journey it
stopped doing so. So I just added an INLINE pragma to it; no sense in depending
on a delicately-balanced fluke.
- - - - -
ae24c9bc by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Slight improvement in WorkWrap
Ensure that WorkWrap preserves lambda binders, in case of join points. Sadly I
have forgotten why I made this change (it was while I was doing a lot of
meddling in the Simplifier, but
* it does no harm,
* it is slightly more efficient, and
* presumably it made something better!
Anyway I have kept it in a separate commit.
- - - - -
e9297181 by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Use named record fields for the CastIt { ... } data constructor
This is a pure refactor
- - - - -
b4581e23 by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Remove a long-commented-out line
Pure refactoring
- - - - -
e026bdf2 by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Simplifier improvements
This MR started as: allow the simplifer to do more in one pass,
arising from places I could see the simplifier taking two iterations
where one would do. But it turned into a larger project, because
these changes unexpectedly made inlining blow up, especially join
points in deeply-nested cases.
The main changes are below. There are also many new or rewritten Notes.
Avoiding simplifying repeatedly
~~~~~~~~~~~~~~~
See Note [Avoiding simplifying repeatedly]
* The SimplEnv now has a seInlineDepth field, which says how deep
in unfoldings we are. See Note [Inline depth] in Simplify.Env.
Currently used only for the next point: avoiding repeatedly
simplifying coercions.
* Avoid repeatedly simplifying coercions.
see Note [Avoid re-simplifying coercions] in Simplify.Iteration
As you'll see from the Note, this makes use of the seInlineDepth.
* Allow Simplify.Iteration.simplAuxBind to inline used-once things.
This is another part of Note [Post-inline for single-use things], and
is really good for reducing simplifier iterations in situations like
case K e of { K x -> blah }
wher x is used once in blah.
* Make GHC.Core.SimpleOpt.exprIsConApp_maybe do some simple case
elimination. Note [Case elim in exprIsConApp_maybe]
* Improve the case-merge transformation:
- Move the main code to `GHC.Core.Utils.mergeCaseAlts`, to join `filterAlts`
and friends. See Note [Merge Nested Cases] in GHC.Core.Utils.
- Add a new case for `tagToEnum#`; see wrinkle (MC3).
- Add a new case to look through join points: see wrinkle (MC4)
postInlineUnconditionally
~~~~~~~~~~~~~~~~~~~~~~~~~
* Allow Simplify.Utils.postInlineUnconditionally to inline variables
that are used exactly once. See Note [Post-inline for single-use things].
* Do not postInlineUnconditionally join point, ever.
Doing so does not reduce allocation, which is the main point,
and with join points that are used a lot it can bloat code.
See point (1) of Note [Duplicating join points] in
GHC.Core.Opt.Simplify.Iteration.
* Do not postInlineUnconditionally a strict (demanded) binding.
It will not allocate a thunk (it'll turn into a case instead)
so again the main point of inlining it doesn't hold. Better
to check per-call-site.
* Improve occurrence analyis for bottoming function calls, to help
postInlineUnconditionally. See Note [Bottoming function calls]
in GHC.Core.Opt.OccurAnal
Inlining generally
~~~~~~~~~~~~~~~~~~
* In GHC.Core.Opt.Simplify.Utils.interestingCallContext,
use RhsCtxt NonRecursive (not BoringCtxt) for a plain-seq case.
See Note [Seq is boring] Also, wrinkle (SB1), inline in that
`seq` context only for INLINE functions (UnfWhen guidance).
* In GHC.Core.Opt.Simplify.Utils.interestingArg,
- return ValueArg for OtherCon [c1,c2, ...], but
- return NonTrivArg for OtherCon []
This makes a function a little less likely to inline if all we
know is that the argument is evaluated, but nothing else.
* isConLikeUnfolding is no longer true for OtherCon {}.
This propagates to exprIsConLike. Con-like-ness has /positive/
information.
Join points
~~~~~~~~~~~
* Be very careful about inlining join points.
See these two long Notes
Note [Duplicating join points] in GHC.Core.Opt.Simplify.Iteration
Note [Inlining join points] in GHC.Core.Opt.Simplify.Inline
* When making join points, don't do so if the join point is so small
it will immediately be inlined; check uncondInlineJoin.
* In GHC.Core.Opt.Simplify.Inline.tryUnfolding, improve the inlining
heuristics for join points. In general we /do not/ want to inline
join points /even if they are small/. See Note [Duplicating join points]
GHC.Core.Opt.Simplify.Iteration.
But sometimes we do: see Note [Inlining join points] in
GHC.Core.Opt.Simplify.Inline; and the new `isBetterUnfoldingThan` function.
* Do not add an unfolding to a join point at birth. This is a tricky one
and has a long Note [Do not add unfoldings to join points at birth]
It shows up in two places
- In `mkDupableAlt` do not add an inlining
- (trickier) In `simplLetUnfolding` don't add an unfolding for a
fresh join point
I am not fully satisifed with this, but it works and is well documented.
* In GHC.Core.Unfold.sizeExpr, make jumps small, so that we don't penalise
having a non-inlined join point.
Performance changes
~~~~~~~~~~~~~~~~~~~
* Binary sizes fall by around 2.6%, according to nofib.
* Compile times improve slightly. Here are the figures over 1%.
I investiate the biggest differnce in T18304. It's a very small module, just
a few hundred nodes. The large percentage difffence is due to a single
function that didn't quite inline before, and does now, making code size a
bit bigger. I decided gains outweighed the losses.
Metrics: compile_time/bytes allocated (changes over +/- 1%)
------------------------------------------------
CoOpt_Singletons(normal) -9.2% GOOD
LargeRecord(normal) -23.5% GOOD
MultiComponentModulesRecomp(normal) +1.2%
MultiLayerModulesTH_OneShot(normal) +4.1% BAD
PmSeriesS(normal) -3.8%
PmSeriesV(normal) -1.5%
T11195(normal) -1.3%
T12227(normal) -20.4% GOOD
T12545(normal) -3.2%
T12707(normal) -2.1% GOOD
T13253(normal) -1.2%
T13253-spj(normal) +8.1% BAD
T13386(normal) -3.1% GOOD
T14766(normal) -2.6% GOOD
T15164(normal) -1.4%
T15304(normal) +1.2%
T15630(normal) -8.2%
T15630a(normal) NEW
T15703(normal) -14.7% GOOD
T16577(normal) -2.3% GOOD
T17516(normal) -39.7% GOOD
T18140(normal) +1.2%
T18223(normal) -17.1% GOOD
T18282(normal) -5.0% GOOD
T18304(normal) +10.8% BAD
T18923(normal) -2.9% GOOD
T1969(normal) +1.0%
T19695(normal) -1.5%
T20049(normal) -12.7% GOOD
T21839c(normal) -4.1% GOOD
T3064(normal) -1.5%
T3294(normal) +1.2% BAD
T4801(normal) +1.2%
T5030(normal) -15.2% GOOD
T5321Fun(normal) -2.2% GOOD
T6048(optasm) -16.8% GOOD
T783(normal) -1.2%
T8095(normal) -6.0% GOOD
T9630(normal) -4.7% GOOD
T9961(normal) +1.9% BAD
WWRec(normal) -1.4%
info_table_map_perf(normal) -1.3%
parsing001(normal) +1.5%
geo. mean -2.0%
minimum -39.7%
maximum +10.8%
* Runtimes generally improve. In the testsuite perf/should_run gives:
Metrics: runtime/bytes allocated
------------------------------------------
Conversions(normal) -0.3%
T13536a(optasm) -41.7% GOOD
T4830(normal) -0.1%
haddock.Cabal(normal) -0.1%
haddock.base(normal) -0.1%
haddock.compiler(normal) -0.1%
geo. mean -0.8%
minimum -41.7%
maximum +0.0%
* For runtime, nofib is a better test. The news is mostly good.
Here are the number more than +/- 0.1%:
# bytes allocated
==========================++==========
imaginary/digits-of-e1 || -14.40%
imaginary/digits-of-e2 || -4.41%
imaginary/paraffins || -0.17%
imaginary/rfib || -0.15%
imaginary/wheel-sieve2 || -0.10%
real/compress || -0.47%
real/fluid || -0.10%
real/fulsom || +0.14%
real/gamteb || -1.47%
real/gg || -0.20%
real/infer || +0.24%
real/pic || -0.23%
real/prolog || -0.36%
real/scs || -0.46%
real/smallpt || +4.03%
shootout/k-nucleotide || -20.23%
shootout/n-body || -0.42%
shootout/spectral-norm || -0.13%
spectral/boyer2 || -3.80%
spectral/constraints || -0.27%
spectral/hartel/ida || -0.82%
spectral/mate || -20.34%
spectral/para || +0.46%
spectral/rewrite || +1.30%
spectral/sphere || -0.14%
==========================++==========
geom mean || -0.59%
real/smallpt has a huge nest of local definitions, and I
could not pin down a reason for a regression. But there are
three big wins!
Metric Decrease:
CoOpt_Singletons
LargeRecord
T12227
T12707
T13386
T13536a
T14766
T15703
T16577
T17516
T18223
T18282
T18923
T21839c
T20049
T5321Fun
T5030
T6048
T8095
T9630
T783
Metric Increase:
MultiLayerModulesTH_OneShot
T13253-spj
T18304
T18698a
T9961
T3294
- - - - -
27db3c5e by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Testsuite message changes from simplifier improvements
- - - - -
271a7812 by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Account for bottoming functions in OccurAnal
This fixes #24582, a small but long-standing bug
- - - - -
14 changed files:
- compiler/GHC/Core.hs
- compiler/GHC/Core/Coercion.hs
- compiler/GHC/Core/Coercion/Opt.hs
- compiler/GHC/Core/Opt/Arity.hs
- compiler/GHC/Core/Opt/FloatOut.hs
- compiler/GHC/Core/Opt/Monad.hs
- compiler/GHC/Core/Opt/OccurAnal.hs
- compiler/GHC/Core/Opt/Pipeline.hs
- compiler/GHC/Core/Opt/SetLevels.hs
- compiler/GHC/Core/Opt/Simplify.hs
- compiler/GHC/Core/Opt/Simplify/Env.hs
- compiler/GHC/Core/Opt/Simplify/Inline.hs
- compiler/GHC/Core/Opt/Simplify/Iteration.hs
- compiler/GHC/Core/Opt/Simplify/Monad.hs
The diff was not included because it is too large.
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/8d95096857dbea39b80befacab35182e0f64ae62...271a7812cbb47494c74b1dc3b7d2a26fd8d88365
--
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/8d95096857dbea39b80befacab35182e0f64ae62...271a7812cbb47494c74b1dc3b7d2a26fd8d88365
You're receiving this email because of your account on gitlab.haskell.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240403/ad31c11d/attachment-0001.html>
More information about the ghc-commits
mailing list