<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html lang="en" style='--code-editor-font: var(--default-mono-font, "GitLab Mono"), JetBrains Mono, Menlo, DejaVu Sans Mono, Liberation Mono, Consolas, Ubuntu Mono, Courier New, andale mono, lucida console, monospace;'>
<head>
<meta content="text/html; charset=US-ASCII" http-equiv="Content-Type">
<title>
GitLab
</title>

<style data-premailer="ignore" type="text/css">
a { color: #1068bf; }
</style>


<style>img {
max-width: 100%; height: auto;
}
body {
font-size: 0.875rem;
}
body {
-webkit-text-shadow: rgba(255,255,255,0.01) 0 0 1px;
}
body {
font-family: var(--default-regular-font, "GitLab Sans"),-apple-system,BlinkMacSystemFont,"Segoe UI",Roboto,"Noto Sans",Ubuntu,Cantarell,"Helvetica Neue",sans-serif,"Apple Color Emoji","Segoe UI Emoji","Segoe UI Symbol","Noto Color Emoji"; font-size: inherit;
}
</style>
</head>
<body style='font-size: inherit; -webkit-text-shadow: rgba(255,255,255,0.01) 0 0 1px; font-family: var(--default-regular-font, "GitLab Sans"),-apple-system,BlinkMacSystemFont,"Segoe UI",Roboto,"Noto Sans",Ubuntu,Cantarell,"Helvetica Neue",sans-serif,"Apple Color Emoji","Segoe UI Emoji","Segoe UI Symbol","Noto Color Emoji";'>
<div class="content">

<h3 style="margin-top: 20px; margin-bottom: 10px;">
Simon Peyton Jones pushed to branch wip/simplifier-tweaks at <a href="https://gitlab.haskell.org/ghc/ghc">Glasgow Haskell Compiler / GHC</a>
</h3>
<h4 style="margin-top: 10px; margin-bottom: 10px;">
Commits:
</h4>
<ul>
<li>
<strong style="font-weight: bold;"><a href="https://gitlab.haskell.org/ghc/ghc/-/commit/8b2a088d23427174de6bf5e7f463c81c2cd0c518">8b2a088d</a></strong>
<div>
<span> by Simon Peyton Jones </span> <i> at 2024-04-02T15:52:49+01:00 </i>
</div>
<pre class="commit-message" style='white-space: pre-wrap; display: block; font-size: 14px; color: #333238; position: relative; font-family: var(--default-mono-font, "GitLab Mono"),"JetBrains Mono","Menlo","DejaVu Sans Mono","Liberation Mono","Consolas","Ubuntu Mono","Courier New","andale mono","lucida console",monospace; word-break: break-all; word-wrap: break-word; background-color: #fbfafd; border-radius: 2px; margin: 0; padding: 8px 12px; border: 1px solid #dcdcde;'>Simplifier improvements

This MR started as: allow the simplifer to do more in one pass,
arising from places I could see the simplifier taking two iterations
where one would do.  But it turned into a larger project, because
these changes unexpectedly made inlining blow up, especially join
points in deeply-nested cases.

The main changes are below.  There are also many new or rewritten Notes.

Avoiding simplifying repeatedly
~~~~~~~~~~~~~~~
See Note [Avoiding simplifying repeatedly]

* The SimplEnv now has a seInlineDepth field, which says how deep
  in unfoldings we are.  See Note [Inline depth] in Simplify.Env.
  Currently used only for the next point: avoiding repeatedly
  simplifying coercions.

* Avoid repeatedly simplifying coercions.
  see Note [Avoid re-simplifying coercions] in Simplify.Iteration
  As you'll see from the Note, this makes use of the seInlineDepth.

* Allow Simplify.Iteration.simplAuxBind to inline used-once things.
  This is another part of Note [Post-inline for single-use things], and
  is really good for reducing simplifier iterations in situations like
      case K e of { K x -> blah }
  wher x is used once in blah.

* Make GHC.Core.SimpleOpt.exprIsConApp_maybe do some simple case
  elimination.  Note [Case elim in exprIsConApp_maybe]

* Improve the case-merge transformation:
  - Move the main code to `GHC.Core.Utils.mergeCaseAlts`, to join `filterAlts`
    and friends.  See Note [Merge Nested Cases] in GHC.Core.Utils.
  - Add a new case for `tagToEnum#`; see wrinkle (MC3).
  - Add a new case to look through join points: see wrinkle (MC4)

postInlineUnconditionally
~~~~~~~~~~~~~~~~~~~~~~~~~
* Allow Simplify.Utils.postInlineUnconditionally to inline variables
  that are used exactly once. See Note [Post-inline for single-use things].

* Do not postInlineUnconditionally join point, ever.
  Doing so does not reduce allocation, which is the main point,
  and with join points that are used a lot it can bloat code.
  See point (1) of Note [Duplicating join points] in
  GHC.Core.Opt.Simplify.Iteration.

* Do not postInlineUnconditionally a strict (demanded) binding.
  It will not allocate a thunk (it'll turn into a case instead)
  so again the main point of inlining it doesn't hold.  Better
  to check per-call-site.

* Improve occurrence analyis for bottoming function calls, to help
  postInlineUnconditionally.  See Note [Bottoming function calls]
  in GHC.Core.Opt.OccurAnal

Inlining generally
~~~~~~~~~~~~~~~~~~
* In GHC.Core.Opt.Simplify.Utils.interestingCallContext,
  use RhsCtxt NonRecursive (not BoringCtxt) for a plain-seq case.
  See Note [Seq is boring]  Also, wrinkle (SB1), inline in that
  `seq` context only for INLINE functions (UnfWhen guidance).

* In GHC.Core.Opt.Simplify.Utils.interestingArg,
  - return ValueArg for OtherCon [c1,c2, ...], but
  - return NonTrivArg for OtherCon []
  This makes a function a little less likely to inline if all we
  know is that the argument is evaluated, but nothing else.

* isConLikeUnfolding is no longer true for OtherCon {}.
  This propagates to exprIsConLike.  Con-like-ness has /positive/
  information.

Join points
~~~~~~~~~~~
* Be very careful about inlining join points.
  See these two long Notes
    Note [Duplicating join points] in GHC.Core.Opt.Simplify.Iteration
    Note [Inlining join points] in GHC.Core.Opt.Simplify.Inline

* When making join points, don't do so if the join point is so small
  it will immediately be inlined; check uncondInlineJoin.

* In GHC.Core.Opt.Simplify.Inline.tryUnfolding, improve the inlining
  heuristics for join points. In general we /do not/ want to inline
  join points /even if they are small/.  See Note [Duplicating join points]
  GHC.Core.Opt.Simplify.Iteration.

  But sometimes we do: see Note [Inlining join points] in
  GHC.Core.Opt.Simplify.Inline; and the new `isBetterUnfoldingThan` function.

* Do not add an unfolding to a join point at birth.  This is a tricky one
  and has a long Note [Do not add unfoldings to join points at birth]
  It shows up in two places
  - In `mkDupableAlt` do not add an inlining
  - (trickier) In `simplLetUnfolding` don't add an unfolding for a
    fresh join point
  I am not fully satisifed with this, but it works and is well documented.

* In GHC.Core.Unfold.sizeExpr, make jumps small, so that we don't penalise
  having a non-inlined join point.

Performance changes
~~~~~~~~~~~~~~~~~~~
* Binary sizes fall by around 2.6%, according to nofib.

* Compile times improve slightly. Here are the figures over 1%.

  I investiate the biggest differnce in T18304. It's a very small module, just
  a few hundred nodes. The large percentage difffence is due to a single
  function that didn't quite inline before, and does now, making code size a
  bit bigger.  I decided gains outweighed the losses.

    Metrics: compile_time/bytes allocated (changes over +/- 1%)
    ------------------------------------------------
           CoOpt_Singletons(normal)   -9.2% GOOD
                LargeRecord(normal)  -23.5% GOOD
MultiComponentModulesRecomp(normal)   +1.2%
MultiLayerModulesTH_OneShot(normal)   +4.1%  BAD
                  PmSeriesS(normal)   -3.8%
                  PmSeriesV(normal)   -1.5%
                     T11195(normal)   -1.3%
                     T12227(normal)  -20.4% GOOD
                     T12545(normal)   -3.2%
                     T12707(normal)   -2.1% GOOD
                     T13253(normal)   -1.2%
                 T13253-spj(normal)   +8.1%  BAD
                     T13386(normal)   -3.1% GOOD
                     T14766(normal)   -2.6% GOOD
                     T15164(normal)   -1.4%
                     T15304(normal)   +1.2%
                     T15630(normal)   -8.2%
                    T15630a(normal)          NEW
                     T15703(normal)  -14.7% GOOD
                     T16577(normal)   -2.3% GOOD
                     T17516(normal)  -39.7% GOOD
                     T18140(normal)   +1.2%
                     T18223(normal)  -17.1% GOOD
                     T18282(normal)   -5.0% GOOD
                     T18304(normal)  +10.8%  BAD
                     T18923(normal)   -2.9% GOOD
                      T1969(normal)   +1.0%
                     T19695(normal)   -1.5%
                     T20049(normal)  -12.7% GOOD
                    T21839c(normal)   -4.1% GOOD
                      T3064(normal)   -1.5%
                      T3294(normal)   +1.2%  BAD
                      T4801(normal)   +1.2%
                      T5030(normal)  -15.2% GOOD
                   T5321Fun(normal)   -2.2% GOOD
                      T6048(optasm)  -16.8% GOOD
                       T783(normal)   -1.2%
                      T8095(normal)   -6.0% GOOD
                      T9630(normal)   -4.7% GOOD
                      T9961(normal)   +1.9%  BAD
                      WWRec(normal)   -1.4%
        info_table_map_perf(normal)   -1.3%
                 parsing001(normal)   +1.5%

                          geo. mean   -2.0%
                          minimum    -39.7%
                          maximum    +10.8%

* Runtimes generally improve. In the testsuite perf/should_run gives:
   Metrics: runtime/bytes allocated
   ------------------------------------------
             Conversions(normal)   -0.3%
                 T13536a(optasm)  -41.7% GOOD
                   T4830(normal)   -0.1%
           haddock.Cabal(normal)   -0.1%
            haddock.base(normal)   -0.1%
        haddock.compiler(normal)   -0.1%

                       geo. mean   -0.8%
                       minimum    -41.7%
                       maximum     +0.0%

* For runtime, nofib is a better test.  The news is mostly good.
  Here are the number more than +/- 0.1%:

    # bytes allocated
    ==========================++==========
       imaginary/digits-of-e1 ||  -14.40%
       imaginary/digits-of-e2 ||   -4.41%
          imaginary/paraffins ||   -0.17%
               imaginary/rfib ||   -0.15%
       imaginary/wheel-sieve2 ||   -0.10%
                real/compress ||   -0.47%
                   real/fluid ||   -0.10%
                  real/fulsom ||   +0.14%
                  real/gamteb ||   -1.47%
                      real/gg ||   -0.20%
                   real/infer ||   +0.24%
                     real/pic ||   -0.23%
                  real/prolog ||   -0.36%
                     real/scs ||   -0.46%
                 real/smallpt ||   +4.03%
        shootout/k-nucleotide ||  -20.23%
              shootout/n-body ||   -0.42%
       shootout/spectral-norm ||   -0.13%
              spectral/boyer2 ||   -3.80%
         spectral/constraints ||   -0.27%
          spectral/hartel/ida ||   -0.82%
                spectral/mate ||  -20.34%
                spectral/para ||   +0.46%
             spectral/rewrite ||   +1.30%
              spectral/sphere ||   -0.14%
    ==========================++==========
                    geom mean ||   -0.59%

    real/smallpt has a huge nest of local definitions, and I
    could not pin down a reason for a regression.  But there are
    three big wins!

Metric Decrease:
    CoOpt_Singletons
    LargeRecord
    T12227
    T12707
    T13386
    T13536a
    T14766
    T15703
    T16577
    T17516
    T18223
    T18282
    T18923
    T21839c
    T20049
    T5321Fun
    T5030
    T6048
    T8095
    T9630
    T783
Metric Increase:
    T13253-spj
    T18304
    T18698a
    T9961
    T3294
</pre>
</li>
<li>
<strong style="font-weight: bold;"><a href="https://gitlab.haskell.org/ghc/ghc/-/commit/e9149c8c55da2c0089540788dc5f70424210ee62">e9149c8c</a></strong>
<div>
<span> by Simon Peyton Jones </span> <i> at 2024-04-02T15:52:49+01:00 </i>
</div>
<pre class="commit-message" style='white-space: pre-wrap; display: block; font-size: 14px; color: #333238; position: relative; font-family: var(--default-mono-font, "GitLab Mono"),"JetBrains Mono","Menlo","DejaVu Sans Mono","Liberation Mono","Consolas","Ubuntu Mono","Courier New","andale mono","lucida console",monospace; word-break: break-all; word-wrap: break-word; background-color: #fbfafd; border-radius: 2px; margin: 0; padding: 8px 12px; border: 1px solid #dcdcde;'>Testsuite message changes from simplifier improvements
</pre>
</li>
<li>
<strong style="font-weight: bold;"><a href="https://gitlab.haskell.org/ghc/ghc/-/commit/848f360fe1bcbdac28e9cc0014665bf8ccbfd259">848f360f</a></strong>
<div>
<span> by Simon Peyton Jones </span> <i> at 2024-04-02T15:52:49+01:00 </i>
</div>
<pre class="commit-message" style='white-space: pre-wrap; display: block; font-size: 14px; color: #333238; position: relative; font-family: var(--default-mono-font, "GitLab Mono"),"JetBrains Mono","Menlo","DejaVu Sans Mono","Liberation Mono","Consolas","Ubuntu Mono","Courier New","andale mono","lucida console",monospace; word-break: break-all; word-wrap: break-word; background-color: #fbfafd; border-radius: 2px; margin: 0; padding: 8px 12px; border: 1px solid #dcdcde;'>Account for bottoming functions in OccurAnal

This fixes #24582, a small but long-standing bug
</pre>
</li>
</ul>
<h4 style="margin-top: 10px; margin-bottom: 10px;">
24 changed files:
</h4>
<ul>
<li class="file-stats">
<a href="#182d6a315e784018aa9c8b2ad736036b97bd5d48">
compiler/GHC/Core.hs
</a>
</li>
<li class="file-stats">
<a href="#975dc08a8e7942b32d621f617d5a9c1b668601dd">
compiler/GHC/Core/Coercion/Opt.hs
</a>
</li>
<li class="file-stats">
<a href="#bac3d5159a5688007de3aa3f5c4e50569677b347">
compiler/GHC/Core/Opt/OccurAnal.hs
</a>
</li>
<li class="file-stats">
<a href="#f168a93cde5e2aec2441d6331dfe500172df4af3">
compiler/GHC/Core/Opt/Simplify.hs
</a>
</li>
<li class="file-stats">
<a href="#2f46b19cb85e3f7b4e72305644bc50015628c41d">
compiler/GHC/Core/Opt/Simplify/Env.hs
</a>
</li>
<li class="file-stats">
<a href="#8664fd2691eea70244c0a1bae2f51f2ad8a36eef">
compiler/GHC/Core/Opt/Simplify/Inline.hs
</a>
</li>
<li class="file-stats">
<a href="#ae6d91a5d028418bbf1431347d659e744e0a3128">
compiler/GHC/Core/Opt/Simplify/Iteration.hs
</a>
</li>
<li class="file-stats">
<a href="#48fbb5cdea308650de5756521feb28ec68819b9b">
compiler/GHC/Core/Opt/Simplify/Utils.hs
</a>
</li>
<li class="file-stats">
<a href="#a2c999a27f983737a4e40710396b1c7aaa5d4015">
compiler/GHC/Core/Opt/Stats.hs
</a>
</li>
<li class="file-stats">
<a href="#2811a7297b8aa206197ac1f5dabd0818e3c7ec5a">
compiler/GHC/Core/Unfold.hs
</a>
</li>
<li class="file-stats">
<a href="#940913dd549c6b1c334daafbc4b7eef29c94d924">
compiler/GHC/Core/Unfold/Make.hs
</a>
</li>
<li class="file-stats">
<a href="#1a7aba0daeafab195716dd25432479804a55ab60">
compiler/GHC/Core/Utils.hs
</a>
</li>
<li class="file-stats">
<a href="#06ff7bac58fd8cfe0c17b81963b03f4cce86a065">
compiler/GHC/IfaceToCore.hs
</a>
</li>
<li class="file-stats">
<a href="#20aaf8344f379f354fe31dd0c1c4db4ddc5b17aa">
compiler/GHC/Types/Id.hs
</a>
</li>
<li class="file-stats">
<a href="#930ade98b040b89136455bb644c3cee18c23ffdc">
compiler/GHC/Types/Tickish.hs
</a>
</li>
<li class="file-stats">
<a href="#46fdea7197af7a69a0a0d1b96b0a0dab860a614b">
testsuite/tests/arityanal/should_compile/Arity01.stderr
</a>
</li>
<li class="file-stats">
<a href="#d17dc73d83ae4f8e388a6375f61874dd8423a3d7">
testsuite/tests/arityanal/should_compile/Arity02.stderr
</a>
</li>
<li class="file-stats">
<a href="#943adaed1cf65cfe70778088aad52155f98e4206">
testsuite/tests/arityanal/should_compile/Arity09.stderr
</a>
</li>
<li class="file-stats">
<a href="#65bdc9cf6758e69932ae905d6326c402a48cb991">
testsuite/tests/arityanal/should_compile/Arity13.stderr
</a>
</li>
<li class="file-stats">
<a href="#782a0e6faf0eada0a94e75ce3b5bd4bd55192a3d">
testsuite/tests/cpranal/should_compile/T18401.stderr
</a>
</li>
<li class="file-stats">
<a href="#a65c5dfcf1c6d1e05003fb0b52f8ffef3e577e74">
testsuite/tests/driver/inline-check.stderr
</a>
</li>
<li class="file-stats">
<a href="#14582a9b3e149cd5da0ea57352802ae4299e978f">
testsuite/tests/lib/integer/Makefile
</a>
</li>
<li class="file-stats">
<a href="#40f72248511536d518e207be3b158d8c1d12e740">
testsuite/tests/numeric/should_compile/T19641.stderr
</a>
</li>
<li class="file-stats">
<a href="#852d4ab9cf8a80734f7f06cc119356a8eeb2f3ed">
testsuite/tests/perf/compiler/T15630.hs
</a>
</li>
</ul>
<h5 style="margin-top: 10px; margin-bottom: 10px; font-size: 0.875rem;">
The diff was not included because it is too large.
</h5>

</div>
<div class="footer" style="margin-top: 10px;">
<p style="font-size: small; color: #737278;">

<br>
<a href="https://gitlab.haskell.org/ghc/ghc/-/compare/c556cf6d71e096eb62851503fc04266d1e32895b...848f360fe1bcbdac28e9cc0014665bf8ccbfd259">View it on GitLab</a>.
<br>
You're receiving this email because of your account on <a target="_blank" rel="noopener noreferrer" href="https://gitlab.haskell.org">gitlab.haskell.org</a>. <a href="https://gitlab.haskell.org/-/profile/notifications" target="_blank" rel="noopener noreferrer" class="mng-notif-link">Manage all notifications</a> · <a href="https://gitlab.haskell.org/help" target="_blank" rel="noopener noreferrer" class="help-link">Help</a>



</p>
</div>
</body>
</html>