[Git][ghc/ghc][wip/simplifier-tweaks] 14 commits: EPA: Extend StringLiteral range to include trailing commas

Mon Apr 1 08:38:30 UTC 2024

Simon Peyton Jones pushed to branch wip/simplifier-tweaks at Glasgow Haskell Compiler / GHC

Commits:
00d3ecf0 by Alan Zimmerman at 2024-03-29T12:19:10+00:00
EPA: Extend StringLiteral range to include trailing commas

This goes slightly against the exact printing philosophy where
trailing decorations should be in an annotation, but the
practicalities of adding it to the WarningTxt environment, and the
problems caused by deviating do not make a more principles approach
worthwhile.

- - - - -
efab3649 by brandon s allbery kf8nh at 2024-03-31T20:04:01-04:00
clarify Note [Preproccesing invocations]

- - - - -
ac55fe7a by Simon Peyton Jones at 2024-04-01T09:37:28+01:00
Several improvements to the handling of coercions

* Make `mkSymCo` and `mkInstCo` smarter
  Fixes #23642

* Fix return role of `SelCo` in the coercion optimiser.
  Fixes #23617

* Make the coercion optimiser `opt_trans_rule` work better for newtypes
  Fixes #23619

- - - - -
3e4ac7b3 by Simon Peyton Jones at 2024-04-01T09:37:28+01:00
FloatOut: improve floating for join point

See the new Note [Floating join point bindings].

* Completely get rid of the complicated join_ceiling nonsense, which
  I have never understood.

* Do not float join points at all, except perhaps to top level.

* Some refactoring around wantToFloat, to treat Rec and NonRec more
  uniformly

- - - - -
f1b36c74 by Simon Peyton Jones at 2024-04-01T09:37:28+01:00
Improve eta-expansion through call stacks

See Note [Eta expanding through CallStacks] in GHC.Core.Opt.Arity

This is a one-line change, that fixes an inconsistency
-               || isCallStackPredTy ty
+               || isCallStackPredTy ty || isCallStackTy ty

- - - - -
346a21f7 by Simon Peyton Jones at 2024-04-01T09:37:29+01:00
Spelling, layout, pretty-printing only

- - - - -
fae6aa6a by Simon Peyton Jones at 2024-04-01T09:37:29+01:00
Improve exprIsConApp_maybe a little

Eliminate a redundant case at birth.  This sometimes reduces
Simplifier iterations.

See Note [Case elim in exprIsConApp_maybe].

- - - - -
84236391 by Simon Peyton Jones at 2024-04-01T09:37:29+01:00
Inline GHC.HsToCore.Pmc.Solver.Types.trvVarInfo

When exploring compile-time regressions after meddling with the Simplifier, I
discovered that GHC.HsToCore.Pmc.Solver.Types.trvVarInfo was very delicately
balanced.  It's a small, heavily used, overloaded function and it's important
that it inlines. By a fluke it was before, but at various times in my journey it
stopped doing so.  So I just added an INLINE pragma to it; no sense in depending
on a delicately-balanced fluke.

- - - - -
8e4bc1d5 by Simon Peyton Jones at 2024-04-01T09:37:29+01:00
Slight improvement in WorkWrap

Ensure that WorkWrap preserves lambda binders, in case of join points.  Sadly I
have forgotten why I made this change (it was while I was doing a lot of
meddling in the Simplifier, but
  * it does no harm,
  * it is slightly more efficient, and
  * presumably it made something better!

Anyway I have kept it in a separate commit.

- - - - -
54c038a8 by Simon Peyton Jones at 2024-04-01T09:37:29+01:00
Use named record fields for the CastIt { ... } data constructor

This is a pure refactor

- - - - -
42b4a18d by Simon Peyton Jones at 2024-04-01T09:37:29+01:00
Remove a long-commented-out line

Pure refactoring

- - - - -
af913d6a by Simon Peyton Jones at 2024-04-01T09:37:32+01:00
Simplifier improvements

This MR started as: allow the simplifer to do more in one pass,
arising from places I could see the simplifier taking two iterations
where one would do.  But it turned into a larger project, because
these changes unexpectedly made inlining blow up, especially join
points in deeply-nested cases.

The main changes are below.  There are also many new or rewritten Notes.

Avoiding simplifying repeatedly
~~~~~~~~~~~~~~~
See Note [Avoiding simplifying repeatedly]

* The SimplEnv now has a seInlineDepth field, which says how deep
  in unfoldings we are.  See Note [Inline depth] in Simplify.Env.
  Currently used only for the next point: avoiding repeatedly
  simplifying coercions.

* Avoid repeatedly simplifying coercions.
  see Note [Avoid re-simplifying coercions] in Simplify.Iteration
  As you'll see from the Note, this makes use of the seInlineDepth.

* Allow Simplify.Iteration.simplAuxBind to inline used-once things.
  This is another part of Note [Post-inline for single-use things], and
  is really good for reducing simplifier iterations in situations like
      case K e of { K x -> blah }
  wher x is used once in blah.

* Make GHC.Core.SimpleOpt.exprIsConApp_maybe do some simple case
  elimination.  Note [Case elim in exprIsConApp_maybe]

* Improve the case-merge transformation:
  - Move the main code to `GHC.Core.Utils.mergeCaseAlts`, to join `filterAlts`
    and friends.  See Note [Merge Nested Cases] in GHC.Core.Utils.
  - Add a new case for `tagToEnum#`; see wrinkle (MC3).
  - Add a new case to look through join points: see wrinkle (MC4)

postInlineUnconditionally
~~~~~~~~~~~~~~~~~~~~~~~~~
* Allow Simplify.Utils.postInlineUnconditionally to inline variables
  that are used exactly once. See Note [Post-inline for single-use things].

* Do not postInlineUnconditionally join point, ever.
  Doing so does not reduce allocation, which is the main point,
  and with join points that are used a lot it can bloat code.
  See point (1) of Note [Duplicating join points] in
  GHC.Core.Opt.Simplify.Iteration.

* Do not postInlineUnconditionally a strict (demanded) binding.
  It will not allocate a thunk (it'll turn into a case instead)
  so again the main point of inlining it doesn't hold.  Better
  to check per-call-site.

* Improve occurrence analyis for bottoming function calls, to help
  postInlineUnconditionally.  See Note [Bottoming function calls]
  in GHC.Core.Opt.OccurAnal

Inlining generally
~~~~~~~~~~~~~~~~~~
* In GHC.Core.Opt.Simplify.Utils.interestingCallContext,
  use RhsCtxt NonRecursive (not BoringCtxt) for a plain-seq case.
  See Note [Seq is boring]  Also, wrinkle (SB1), inline in that
  `seq` context only for INLINE functions (UnfWhen guidance).

* In GHC.Core.Opt.Simplify.Utils.interestingArg,
  - return ValueArg for OtherCon [c1,c2, ...], but
  - return NonTrivArg for OtherCon []
  This makes a function a little less likely to inline if all we
  know is that the argument is evaluated, but nothing else.

* isConLikeUnfolding is no longer true for OtherCon {}.
  This propagates to exprIsConLike.  Con-like-ness has /positive/
  information.

Join points
~~~~~~~~~~~
* Be very careful about inlining join points.
  See these two long Notes
    Note [Duplicating join points] in GHC.Core.Opt.Simplify.Iteration
    Note [Inlining join points] in GHC.Core.Opt.Simplify.Inline

* When making join points, don't do so if the join point is so small
  it will immediately be inlined; check uncondInlineJoin.

* In GHC.Core.Opt.Simplify.Inline.tryUnfolding, improve the inlining
  heuristics for join points. In general we /do not/ want to inline
  join points /even if they are small/.  See Note [Duplicating join points]
  GHC.Core.Opt.Simplify.Iteration.

  But sometimes we do: see Note [Inlining join points] in
  GHC.Core.Opt.Simplify.Inline; and the new `isBetterUnfoldingThan` function.

* Do not add an unfolding to a join point at birth.  This is a tricky one
  and has a long Note [Do not add unfoldings to join points at birth]
  It shows up in two places
  - In `mkDupableAlt` do not add an inlining
  - (trickier) In `simplLetUnfolding` don't add an unfolding for a
    fresh join point
  I am not fully satisifed with this, but it works and is well documented.

* In GHC.Core.Unfold.sizeExpr, make jumps small, so that we don't penalise
  having a non-inlined join point.

Performance changes
~~~~~~~~~~~~~~~~~~~
* Binary sizes fall by around 2.6%, according to nofib.

* Compile times improve slightly. Here are the figures over 1%.

  I investiate the biggest differnce in T18304. It's a very small module, just
  a few hundred nodes. The large percentage difffence is due to a single
  function that didn't quite inline before, and does now, making code size a
  bit bigger.  I decided gains outweighed the losses.

    Metrics: compile_time/bytes allocated (changes over +/- 1%)
    ------------------------------------------------
               CoOpt_Singletons(normal)   -9.2% GOOD
                    LargeRecord(normal)  -23.5% GOOD
    MultiComponentModulesRecomp(normal)   +1.1%
    MultiLayerModulesTH_OneShot(normal)   +4.1%  BAD
                      PmSeriesS(normal)   -4.0%
                      PmSeriesV(normal)   -1.7%
                         T11195(normal)   -1.3%
                         T12227(normal)  -20.5% GOOD
                         T12545(normal)   -3.2%
                         T12707(normal)   -2.2% GOOD
                         T13253(normal)   -1.5%
                     T13253-spj(normal)   +8.1%  BAD
                         T13386(normal)   -3.0% GOOD
                         T14766(normal)   -2.7% GOOD
                         T15164(normal)   -1.4%
                         T15304(normal)   +1.2%
                         T15630(normal)   -8.2%
                         T15703(normal)  -14.8% GOOD
                         T16577(normal)   -2.3% GOOD
                         T16875(normal)   -0.1%
                         T17516(normal)  -39.7% GOOD
                         T18140(normal)   +1.1%
                         T18223(normal)  -17.2% GOOD
                         T18282(normal)   -5.1% GOOD
                         T18304(normal)  +10.8%  BAD
                         T18923(normal)   -3.0% GOOD
                         T19695(normal)   -1.5%
                         T20049(normal)  -12.8% GOOD
                        T21839c(normal)   -4.3% GOOD
                          T3064(normal)   -1.5%
                          T3294(normal)   +1.2%  BAD
                          T4801(normal)   +1.2%
                          T5030(normal)  -15.2% GOOD
                       T5321Fun(normal)   -2.2% GOOD
                          T6048(optasm)  -16.9% GOOD
                           T783(normal)   -1.2%
                          T8095(normal)   -6.0% GOOD
                          T9630(normal)   -4.8% GOOD
                          T9961(normal)   +1.8%  BAD
                          WWRec(normal)   -1.4%
            info_table_map_perf(normal)   -1.4%
                     parsing001(normal)   +1.4%

                              geo. mean   -2.1%
                              minimum    -39.7%
                              maximum    +10.8%

* Runtimes generally improve. In the testsuite perf/should_run gives:
   Metrics: runtime/bytes allocated
   ------------------------------------------
             Conversions(normal)   -0.3%
                 T13536a(optasm)  -41.7% GOOD
                   T4830(normal)   -0.1%
           haddock.Cabal(normal)   -0.1%
            haddock.base(normal)   -0.1%
        haddock.compiler(normal)   -0.1%

                       geo. mean   -0.8%
                       minimum    -41.7%
                       maximum     +0.0%

* For runtime, nofib is a better test.  The news is mostly good.
  Here are the number more than +/- 0.1%:

    # bytes allocated
    ==========================++==========
       imaginary/digits-of-e1 ||  -14.40%
       imaginary/digits-of-e2 ||   -4.41%
          imaginary/paraffins ||   -0.17%
               imaginary/rfib ||   -0.15%
       imaginary/wheel-sieve2 ||   -0.10%
                real/compress ||   -0.47%
                   real/fluid ||   -0.10%
                  real/fulsom ||   +0.14%
                  real/gamteb ||   -1.47%
                      real/gg ||   -0.20%
                   real/infer ||   +0.24%
                     real/pic ||   -0.23%
                  real/prolog ||   -0.36%
                     real/scs ||   -0.46%
                 real/smallpt ||   +4.03%
        shootout/k-nucleotide ||  -20.23%
              shootout/n-body ||   -0.42%
       shootout/spectral-norm ||   -0.13%
              spectral/boyer2 ||   -3.80%
         spectral/constraints ||   -0.27%
          spectral/hartel/ida ||   -0.82%
                spectral/mate ||  -20.34%
                spectral/para ||   +0.46%
             spectral/rewrite ||   +1.30%
              spectral/sphere ||   -0.14%
    ==========================++==========
                    geom mean ||   -0.59%

    real/smallpt has a huge nest of local definitions, and I
    could not pin down a reason for a regression.  But there are
    three big wins!

Metric Decrease:
    CoOpt_Singletons
    LargeRecord
    T12227
    T12707
    T13386
    T13536a
    T14766
    T15703
    T16577
    T17516
    T18223
    T18282
    T18923
    T21839c
    T20049
    T5321Fun
    T5030
    T6048
    T8095
    T9630
    T783
Metric Increase:
    MultiLayerModulesTH_OneShot
    T13253-spj
    T18304
    T3294
    T9961

- - - - -
616c0d4a by Simon Peyton Jones at 2024-04-01T09:38:19+01:00
Testsuite message changes from simplifier improvements

- - - - -
c556cf6d by Simon Peyton Jones at 2024-04-01T09:38:20+01:00
Account for bottoming functions in OccurAnal

This fixes #24582, a small but long-standing bug

- - - - -

14 changed files:

- compiler/GHC/Core.hs
- compiler/GHC/Core/Coercion.hs
- compiler/GHC/Core/Coercion/Opt.hs
- compiler/GHC/Core/Opt/Arity.hs
- compiler/GHC/Core/Opt/FloatOut.hs
- compiler/GHC/Core/Opt/Monad.hs
- compiler/GHC/Core/Opt/OccurAnal.hs
- compiler/GHC/Core/Opt/Pipeline.hs
- compiler/GHC/Core/Opt/SetLevels.hs
- compiler/GHC/Core/Opt/Simplify.hs
- compiler/GHC/Core/Opt/Simplify/Env.hs
- compiler/GHC/Core/Opt/Simplify/Inline.hs
- compiler/GHC/Core/Opt/Simplify/Iteration.hs
- compiler/GHC/Core/Opt/Simplify/Monad.hs

The diff was not included because it is too large.

View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/6483e76a896546107d6323c9d43d9c63af3bf08b...c556cf6d71e096eb62851503fc04266d1e32895b

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/6483e76a896546107d6323c9d43d9c63af3bf08b...c556cf6d71e096eb62851503fc04266d1e32895b
You're receiving this email because of your account on gitlab.haskell.org.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240401/1eddb1b7/attachment-0001.html>