[Git][ghc/ghc][wip/andreask/stgLintFix] 16 commits: Add fragmentation statistic to GHC.Stats

Andreas Klebinger (@AndreasK) gitlab at gitlab.haskell.org
Thu Sep 29 15:29:01 UTC 2022



Andreas Klebinger pushed to branch wip/andreask/stgLintFix at Glasgow Haskell Compiler / GHC


Commits:
c0ba775d by Teo Camarasu at 2022-09-21T14:30:37-04:00
Add fragmentation statistic to GHC.Stats

Implements #21537

- - - - -
2463df2f by Torsten Schmits at 2022-09-21T14:31:24-04:00
Rename Solo[constructor] to MkSolo

Part of proposal 475 (https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0475-tuple-syntax.rst)

Moves all tuples to GHC.Tuple.Prim
Updates ghc-prim version (and bumps bounds in dependents)

updates haddock submodule
updates deepseq submodule
updates text submodule

- - - - -
9034fada by Matthew Pickering at 2022-09-22T09:25:29-04:00
Update filepath to filepath-1.4.100.0

Updates submodule

* Always rely on vendored filepath
* filepath must be built as stage0 dependency because it uses
  template-haskell.

Towards #22098

- - - - -
615e2278 by Krzysztof Gogolewski at 2022-09-22T09:26:05-04:00
Minor refactor around Outputable

* Replace 'text . show' and 'ppr' with 'int'.
* Remove Outputable.hs-boot, no longer needed
* Use pprWithCommas
* Factor out instructions in AArch64 codegen

- - - - -
aeafdba5 by Sebastian Graf at 2022-09-27T15:14:54+02:00
Demand: Clear distinction between Call SubDmd and eval Dmd (#21717)

In #21717 we saw a reportedly unsound strictness signature due to an unsound
definition of plusSubDmd on Calls. This patch contains a description and the fix
to the unsoundness as outlined in `Note [Call SubDemand vs. evaluation Demand]`.

This fix means we also get rid of the special handling of `-fpedantic-bottoms`
in eta-reduction. Thanks to less strict and actually sound strictness results,
we will no longer eta-reduce the problematic cases in the first place, even
without `-fpedantic-bottoms`.

So fixing the unsoundness also makes our eta-reduction code simpler with less
hacks to explain. But there is another, more unfortunate side-effect:
We *unfix* #21085, but fortunately we have a new fix ready:
See `Note [mkCall and plusSubDmd]`.

There's another change:
I decided to make `Note [SubDemand denotes at least one evaluation]` a lot
simpler by using `plusSubDmd` (instead of `lubPlusSubDmd`) even if both argument
demands are lazy. That leads to less precise results, but in turn rids ourselves
from the need for 4 different `OpMode`s and the complication of
`Note [Manual specialisation of lub*Dmd/plus*Dmd]`. The result is simpler code
that is in line with the paper draft on Demand Analysis.

I left the abandoned idea in `Note [Unrealised opportunity in plusDmd]` for
posterity. The fallout in terms of regressions is negligible, as the testsuite
and NoFib shows.

```
        Program         Allocs    Instrs
--------------------------------------------------------------------------------
         hidden          +0.2%     -0.2%
         linear          -0.0%     -0.7%
--------------------------------------------------------------------------------
            Min          -0.0%     -0.7%
            Max          +0.2%     +0.0%
 Geometric Mean          +0.0%     -0.0%
```

Fixes #21717.

- - - - -
9b1595c8 by Ross Paterson at 2022-09-27T14:12:01-04:00
implement proposal 106 (Define Kinds Without Promotion) (fixes #6024)

includes corresponding changes to haddock submodule

- - - - -
c2d73cb4 by Andreas Klebinger at 2022-09-28T15:07:30-04:00
Apply some tricks to speed up core lint.

Below are the noteworthy changes and if given their impact on compiler
allocations for a type heavy module:

* Use the oneShot trick on LintM
* Use a unboxed tuple for the result of LintM: ~6% reduction
* Avoid a thunk for the result of typeKind in lintType: ~5% reduction
* lint_app: Don't allocate the error msg in the hot code path: ~4%
  reduction
* lint_app: Eagerly force the in scope set: ~4%
* nonDetCmpType: Try to short cut using reallyUnsafePtrEquality#: ~2%
* lintM: Use a unboxed maybe for the `a` result: ~12%
* lint_app: make go_app tail recursive to avoid allocating the go function
            as heap closure: ~7%
* expandSynTyCon_maybe: Use a specialized data type

For a less type heavy module like nofib/spectral/simple compiled with
-O -dcore-lint allocations went down by ~24% and compile time by ~9%.

-------------------------
Metric Decrease:
    T1969
-------------------------

- - - - -
b74b6191 by sheaf at 2022-09-28T15:08:10-04:00
matchLocalInst: do domination analysis

When multiple Given quantified constraints match a Wanted, and there is
a quantified constraint that dominates all others, we now pick it
to solve the Wanted.

See Note [Use only the best matching quantified constraint].

For example:

  [G] d1: forall a b. ( Eq a, Num b, C a b  ) => D a b
  [G] d2: forall a  .                C a Int  => D a Int
  [W] {w}: D a Int

When solving the Wanted, we find that both Givens match, but we pick
the second, because it has a weaker precondition, C a Int, compared
to (Eq a, Num Int, C a Int). We thus say that d2 dominates d1;
see Note [When does a quantified instance dominate another?].

This domination test is done purely in terms of superclass expansion,
in the function GHC.Tc.Solver.Interact.impliedBySCs. We don't attempt
to do a full round of constraint solving; this simple check suffices
for now.

Fixes #22216 and #22223

- - - - -
2a53ac18 by Simon Peyton Jones at 2022-09-28T17:49:09-04:00
Improve aggressive specialisation

This patch fixes #21286, by not unboxing dictionaries in
worker/wrapper (ever). The main payload is tiny:

* In `GHC.Core.Opt.DmdAnal.finaliseArgBoxities`, do not unbox
  dictionaries in `get_dmd`.  See Note [Do not unbox class dictionaries]
  in that module

* I also found that imported wrappers were being fruitlessly
  specialised, so I fixed that too, in canSpecImport.
  See Note [Specialising imported functions] point (2).

In doing due diligence in the testsuite I fixed a number of
other things:

* Improve Note [Specialising unfoldings] in GHC.Core.Unfold.Make,
  and Note [Inline specialisations] in GHC.Core.Opt.Specialise,
  and remove duplication between the two. The new Note describes
  how we specialise functions with an INLINABLE pragma.

  And simplify the defn of `spec_unf` in `GHC.Core.Opt.Specialise.specCalls`.

* Improve Note [Worker/wrapper for INLINABLE functions] in
  GHC.Core.Opt.WorkWrap.

  And (critially) make an actual change which is to propagate the
  user-written pragma from the original function to the wrapper; see
  `mkStrWrapperInlinePrag`.

* Write new Note [Specialising imported functions] in
  GHC.Core.Opt.Specialise

All this has a big effect on some compile times. This is
compiler/perf, showing only changes over 1%:

Metrics: compile_time/bytes allocated
-------------------------------------
                LargeRecord(normal)  -50.2% GOOD
           ManyConstructors(normal)   +1.0%
MultiLayerModulesTH_OneShot(normal)   +2.6%
                  PmSeriesG(normal)   -1.1%
                     T10547(normal)   -1.2%
                     T11195(normal)   -1.2%
                     T11276(normal)   -1.0%
                    T11303b(normal)   -1.6%
                     T11545(normal)   -1.4%
                     T11822(normal)   -1.3%
                     T12150(optasm)   -1.0%
                     T12234(optasm)   -1.2%
                     T13056(optasm)   -9.3% GOOD
                     T13253(normal)   -3.8% GOOD
                     T15164(normal)   -3.6% GOOD
                     T16190(normal)   -2.1%
                     T16577(normal)   -2.8% GOOD
                     T16875(normal)   -1.6%
                     T17836(normal)   +2.2%
                    T17977b(normal)   -1.0%
                     T18223(normal)  -33.3% GOOD
                     T18282(normal)   -3.4% GOOD
                     T18304(normal)   -1.4%
                    T18698a(normal)   -1.4% GOOD
                    T18698b(normal)   -1.3% GOOD
                     T19695(normal)   -2.5% GOOD
                      T5837(normal)   -2.3%
                      T9630(normal)  -33.0% GOOD
                      WWRec(normal)   -9.7% GOOD
             hard_hole_fits(normal)   -2.1% GOOD
                     hie002(normal)   +1.6%

                          geo. mean   -2.2%
                          minimum    -50.2%
                          maximum     +2.6%

I diligently investigated some of the big drops.

* Caused by not doing w/w for dictionaries:
    T13056, T15164, WWRec, T18223

* Caused by not fruitlessly specialising wrappers
    LargeRecord, T9630

For runtimes, here is perf/should+_run:

Metrics: runtime/bytes allocated
--------------------------------
               T12990(normal)   -3.8%
                T5205(normal)   -1.3%
                T9203(normal)  -10.7% GOOD
        haddock.Cabal(normal)   +0.1%
         haddock.base(normal)   -1.1%
     haddock.compiler(normal)   -0.3%
        lazy-bs-alloc(normal)   -0.2%
------------------------------------------
                    geo. mean   -0.3%
                    minimum    -10.7%
                    maximum     +0.1%

I did not investigate exactly what happens in T9203.

Nofib is a wash:

+-------------------------------++--+-----------+-----------+
|                               ||  | tsv (rel) | std. err. |
+===============================++==+===========+===========+
|                     real/anna ||  |    -0.13% |      0.0% |
|                      real/fem ||  |    +0.13% |      0.0% |
|                   real/fulsom ||  |    -0.16% |      0.0% |
|                     real/lift ||  |    -1.55% |      0.0% |
|                  real/reptile ||  |    -0.11% |      0.0% |
|                  real/smallpt ||  |    +0.51% |      0.0% |
|          spectral/constraints ||  |    +0.20% |      0.0% |
|               spectral/dom-lt ||  |    +1.80% |      0.0% |
|               spectral/expert ||  |    +0.33% |      0.0% |
+===============================++==+===========+===========+
|                     geom mean ||  |           |           |
+-------------------------------++--+-----------+-----------+

I spent quite some time investigating dom-lt, but it's pretty
complicated.  See my note on !7847.  Conclusion: it's just a delicate
inlining interaction, and we have plenty of those.

Metric Decrease:
    LargeRecord
    T13056
    T13253
    T15164
    T16577
    T18223
    T18282
    T18698a
    T18698b
    T19695
    T9630
    WWRec
    hard_hole_fits
    T9203

- - - - -
addeefc0 by Simon Peyton Jones at 2022-09-28T17:49:09-04:00
Refactor UnfoldingSource and IfaceUnfolding

I finally got tired of the way that IfaceUnfolding reflected
a previous structure of unfoldings, not the current one. This
MR refactors UnfoldingSource and IfaceUnfolding to be simpler
and more consistent.

It's largely just a refactor, but in UnfoldingSource (which moves
to GHC.Types.Basic, since it is now used in IfaceSyn too), I
distinguish between /user-specified/ and /system-generated/ stable
unfoldings.

    data UnfoldingSource
      = VanillaSrc
      | StableUserSrc   -- From a user-specified pragma
      | StableSystemSrc -- From a system-generated unfolding
      | CompulsorySrc

This has a minor effect in CSE (see the use of isisStableUserUnfolding
in GHC.Core.Opt.CSE), which I tripped over when working on
specialisation, but it seems like a Good Thing to know anyway.

- - - - -
7be6f9a4 by Simon Peyton Jones at 2022-09-28T17:49:09-04:00
INLINE/INLINEABLE pragmas in Foreign.Marshal.Array

Foreign.Marshal.Array contains many small functions, all of which are
overloaded, and which are critical for performance. Yet none of them
had pragmas, so it was a fluke whether or not they got inlined.

This patch makes them all either INLINE (small ones) or
INLINEABLE and hence specialisable (larger ones).

See Note [Specialising array operations] in that module.

- - - - -
b0c89dfa by Jade Lovelace at 2022-09-28T17:49:49-04:00
Export OnOff from GHC.Driver.Session

I was working on fixing an issue where HLS was trying to pass its
DynFlags to HLint, but didn't pass any of the disabled language
extensions, which HLint would then assume are on because of their
default values.

Currently it's not possible to get any of the "No" flags because the
`DynFlags.extensions` field can't really be used since it is [OnOff
Extension] and OnOff is not exported.

So let's export it.

- - - - -
2f050687 by Bodigrim at 2022-09-28T17:50:28-04:00
Avoid Data.List.group; prefer Data.List.NonEmpty.group

This allows to avoid further partiality, e. g., map head . group is
replaced by map NE.head . NE.group, and there are less panic calls.

- - - - -
bc0020fa by M Farkas-Dyck at 2022-09-28T22:51:59-04:00
Clean up `findWiredInUnit`. In particular, avoid `head`.

- - - - -
6a2eec98 by Bodigrim at 2022-09-28T22:52:38-04:00
Eliminate headFS, use unconsFS instead

A small step towards #22185 to avoid partial functions + safe implementation
of `startsWithUnderscore`.

- - - - -
c7be73fc by Andreas Klebinger at 2022-09-29T15:28:59+00:00
Improve stg lint for unboxed sums.

It now properly lints cases where sums end up distributed
over multiple args after unarise.

Fixes #22026.

- - - - -


30 changed files:

- compiler/GHC/Builtin/Names.hs
- compiler/GHC/Builtin/Types.hs
- compiler/GHC/Cmm/CLabel.hs
- compiler/GHC/Cmm/DebugBlock.hs
- compiler/GHC/Cmm/Switch.hs
- compiler/GHC/CmmToAsm.hs
- compiler/GHC/CmmToAsm/AArch64/Ppr.hs
- compiler/GHC/CmmToAsm/Reg/Liveness.hs
- compiler/GHC/CmmToLlvm/Base.hs
- compiler/GHC/Core.hs
- compiler/GHC/Core/Coercion.hs
- compiler/GHC/Core/Coercion/Axiom.hs
- compiler/GHC/Core/DataCon.hs
- compiler/GHC/Core/FamInstEnv.hs
- compiler/GHC/Core/Lint.hs
- compiler/GHC/Core/Opt/Arity.hs
- compiler/GHC/Core/Opt/CSE.hs
- compiler/GHC/Core/Opt/DmdAnal.hs
- compiler/GHC/Core/Opt/Simplify/Iteration.hs
- compiler/GHC/Core/Opt/Simplify/Utils.hs
- compiler/GHC/Core/Opt/SpecConstr.hs
- compiler/GHC/Core/Opt/Specialise.hs
- compiler/GHC/Core/Opt/Stats.hs
- compiler/GHC/Core/Opt/WorkWrap.hs
- compiler/GHC/Core/Ppr.hs
- compiler/GHC/Core/SimpleOpt.hs
- compiler/GHC/Core/Tidy.hs
- compiler/GHC/Core/TyCon.hs
- compiler/GHC/Core/Type.hs
- compiler/GHC/Core/Unfold.hs


The diff was not included because it is too large.


View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/5caedd50db499638420207a5a871b7d3f43f189f...c7be73fc761501317244999e262b08616ed03955

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/5caedd50db499638420207a5a871b7d3f43f189f...c7be73fc761501317244999e262b08616ed03955
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20220929/35c8a8de/attachment-0001.html>


More information about the ghc-commits mailing list