[Git][ghc/ghc][wip/clc-85] 23 commits: Demand: Clear distinction between Call SubDmd and eval Dmd (#21717)

Ryan Scott (@RyanGlScott) gitlab at gitlab.haskell.org
Thu Oct 6 11:49:40 UTC 2022



Ryan Scott pushed to branch wip/clc-85 at Glasgow Haskell Compiler / GHC


Commits:
aeafdba5 by Sebastian Graf at 2022-09-27T15:14:54+02:00
Demand: Clear distinction between Call SubDmd and eval Dmd (#21717)

In #21717 we saw a reportedly unsound strictness signature due to an unsound
definition of plusSubDmd on Calls. This patch contains a description and the fix
to the unsoundness as outlined in `Note [Call SubDemand vs. evaluation Demand]`.

This fix means we also get rid of the special handling of `-fpedantic-bottoms`
in eta-reduction. Thanks to less strict and actually sound strictness results,
we will no longer eta-reduce the problematic cases in the first place, even
without `-fpedantic-bottoms`.

So fixing the unsoundness also makes our eta-reduction code simpler with less
hacks to explain. But there is another, more unfortunate side-effect:
We *unfix* #21085, but fortunately we have a new fix ready:
See `Note [mkCall and plusSubDmd]`.

There's another change:
I decided to make `Note [SubDemand denotes at least one evaluation]` a lot
simpler by using `plusSubDmd` (instead of `lubPlusSubDmd`) even if both argument
demands are lazy. That leads to less precise results, but in turn rids ourselves
from the need for 4 different `OpMode`s and the complication of
`Note [Manual specialisation of lub*Dmd/plus*Dmd]`. The result is simpler code
that is in line with the paper draft on Demand Analysis.

I left the abandoned idea in `Note [Unrealised opportunity in plusDmd]` for
posterity. The fallout in terms of regressions is negligible, as the testsuite
and NoFib shows.

```
        Program         Allocs    Instrs
--------------------------------------------------------------------------------
         hidden          +0.2%     -0.2%
         linear          -0.0%     -0.7%
--------------------------------------------------------------------------------
            Min          -0.0%     -0.7%
            Max          +0.2%     +0.0%
 Geometric Mean          +0.0%     -0.0%
```

Fixes #21717.

- - - - -
9b1595c8 by Ross Paterson at 2022-09-27T14:12:01-04:00
implement proposal 106 (Define Kinds Without Promotion) (fixes #6024)

includes corresponding changes to haddock submodule

- - - - -
c2d73cb4 by Andreas Klebinger at 2022-09-28T15:07:30-04:00
Apply some tricks to speed up core lint.

Below are the noteworthy changes and if given their impact on compiler
allocations for a type heavy module:

* Use the oneShot trick on LintM
* Use a unboxed tuple for the result of LintM: ~6% reduction
* Avoid a thunk for the result of typeKind in lintType: ~5% reduction
* lint_app: Don't allocate the error msg in the hot code path: ~4%
  reduction
* lint_app: Eagerly force the in scope set: ~4%
* nonDetCmpType: Try to short cut using reallyUnsafePtrEquality#: ~2%
* lintM: Use a unboxed maybe for the `a` result: ~12%
* lint_app: make go_app tail recursive to avoid allocating the go function
            as heap closure: ~7%
* expandSynTyCon_maybe: Use a specialized data type

For a less type heavy module like nofib/spectral/simple compiled with
-O -dcore-lint allocations went down by ~24% and compile time by ~9%.

-------------------------
Metric Decrease:
    T1969
-------------------------

- - - - -
b74b6191 by sheaf at 2022-09-28T15:08:10-04:00
matchLocalInst: do domination analysis

When multiple Given quantified constraints match a Wanted, and there is
a quantified constraint that dominates all others, we now pick it
to solve the Wanted.

See Note [Use only the best matching quantified constraint].

For example:

  [G] d1: forall a b. ( Eq a, Num b, C a b  ) => D a b
  [G] d2: forall a  .                C a Int  => D a Int
  [W] {w}: D a Int

When solving the Wanted, we find that both Givens match, but we pick
the second, because it has a weaker precondition, C a Int, compared
to (Eq a, Num Int, C a Int). We thus say that d2 dominates d1;
see Note [When does a quantified instance dominate another?].

This domination test is done purely in terms of superclass expansion,
in the function GHC.Tc.Solver.Interact.impliedBySCs. We don't attempt
to do a full round of constraint solving; this simple check suffices
for now.

Fixes #22216 and #22223

- - - - -
2a53ac18 by Simon Peyton Jones at 2022-09-28T17:49:09-04:00
Improve aggressive specialisation

This patch fixes #21286, by not unboxing dictionaries in
worker/wrapper (ever). The main payload is tiny:

* In `GHC.Core.Opt.DmdAnal.finaliseArgBoxities`, do not unbox
  dictionaries in `get_dmd`.  See Note [Do not unbox class dictionaries]
  in that module

* I also found that imported wrappers were being fruitlessly
  specialised, so I fixed that too, in canSpecImport.
  See Note [Specialising imported functions] point (2).

In doing due diligence in the testsuite I fixed a number of
other things:

* Improve Note [Specialising unfoldings] in GHC.Core.Unfold.Make,
  and Note [Inline specialisations] in GHC.Core.Opt.Specialise,
  and remove duplication between the two. The new Note describes
  how we specialise functions with an INLINABLE pragma.

  And simplify the defn of `spec_unf` in `GHC.Core.Opt.Specialise.specCalls`.

* Improve Note [Worker/wrapper for INLINABLE functions] in
  GHC.Core.Opt.WorkWrap.

  And (critially) make an actual change which is to propagate the
  user-written pragma from the original function to the wrapper; see
  `mkStrWrapperInlinePrag`.

* Write new Note [Specialising imported functions] in
  GHC.Core.Opt.Specialise

All this has a big effect on some compile times. This is
compiler/perf, showing only changes over 1%:

Metrics: compile_time/bytes allocated
-------------------------------------
                LargeRecord(normal)  -50.2% GOOD
           ManyConstructors(normal)   +1.0%
MultiLayerModulesTH_OneShot(normal)   +2.6%
                  PmSeriesG(normal)   -1.1%
                     T10547(normal)   -1.2%
                     T11195(normal)   -1.2%
                     T11276(normal)   -1.0%
                    T11303b(normal)   -1.6%
                     T11545(normal)   -1.4%
                     T11822(normal)   -1.3%
                     T12150(optasm)   -1.0%
                     T12234(optasm)   -1.2%
                     T13056(optasm)   -9.3% GOOD
                     T13253(normal)   -3.8% GOOD
                     T15164(normal)   -3.6% GOOD
                     T16190(normal)   -2.1%
                     T16577(normal)   -2.8% GOOD
                     T16875(normal)   -1.6%
                     T17836(normal)   +2.2%
                    T17977b(normal)   -1.0%
                     T18223(normal)  -33.3% GOOD
                     T18282(normal)   -3.4% GOOD
                     T18304(normal)   -1.4%
                    T18698a(normal)   -1.4% GOOD
                    T18698b(normal)   -1.3% GOOD
                     T19695(normal)   -2.5% GOOD
                      T5837(normal)   -2.3%
                      T9630(normal)  -33.0% GOOD
                      WWRec(normal)   -9.7% GOOD
             hard_hole_fits(normal)   -2.1% GOOD
                     hie002(normal)   +1.6%

                          geo. mean   -2.2%
                          minimum    -50.2%
                          maximum     +2.6%

I diligently investigated some of the big drops.

* Caused by not doing w/w for dictionaries:
    T13056, T15164, WWRec, T18223

* Caused by not fruitlessly specialising wrappers
    LargeRecord, T9630

For runtimes, here is perf/should+_run:

Metrics: runtime/bytes allocated
--------------------------------
               T12990(normal)   -3.8%
                T5205(normal)   -1.3%
                T9203(normal)  -10.7% GOOD
        haddock.Cabal(normal)   +0.1%
         haddock.base(normal)   -1.1%
     haddock.compiler(normal)   -0.3%
        lazy-bs-alloc(normal)   -0.2%
------------------------------------------
                    geo. mean   -0.3%
                    minimum    -10.7%
                    maximum     +0.1%

I did not investigate exactly what happens in T9203.

Nofib is a wash:

+-------------------------------++--+-----------+-----------+
|                               ||  | tsv (rel) | std. err. |
+===============================++==+===========+===========+
|                     real/anna ||  |    -0.13% |      0.0% |
|                      real/fem ||  |    +0.13% |      0.0% |
|                   real/fulsom ||  |    -0.16% |      0.0% |
|                     real/lift ||  |    -1.55% |      0.0% |
|                  real/reptile ||  |    -0.11% |      0.0% |
|                  real/smallpt ||  |    +0.51% |      0.0% |
|          spectral/constraints ||  |    +0.20% |      0.0% |
|               spectral/dom-lt ||  |    +1.80% |      0.0% |
|               spectral/expert ||  |    +0.33% |      0.0% |
+===============================++==+===========+===========+
|                     geom mean ||  |           |           |
+-------------------------------++--+-----------+-----------+

I spent quite some time investigating dom-lt, but it's pretty
complicated.  See my note on !7847.  Conclusion: it's just a delicate
inlining interaction, and we have plenty of those.

Metric Decrease:
    LargeRecord
    T13056
    T13253
    T15164
    T16577
    T18223
    T18282
    T18698a
    T18698b
    T19695
    T9630
    WWRec
    hard_hole_fits
    T9203

- - - - -
addeefc0 by Simon Peyton Jones at 2022-09-28T17:49:09-04:00
Refactor UnfoldingSource and IfaceUnfolding

I finally got tired of the way that IfaceUnfolding reflected
a previous structure of unfoldings, not the current one. This
MR refactors UnfoldingSource and IfaceUnfolding to be simpler
and more consistent.

It's largely just a refactor, but in UnfoldingSource (which moves
to GHC.Types.Basic, since it is now used in IfaceSyn too), I
distinguish between /user-specified/ and /system-generated/ stable
unfoldings.

    data UnfoldingSource
      = VanillaSrc
      | StableUserSrc   -- From a user-specified pragma
      | StableSystemSrc -- From a system-generated unfolding
      | CompulsorySrc

This has a minor effect in CSE (see the use of isisStableUserUnfolding
in GHC.Core.Opt.CSE), which I tripped over when working on
specialisation, but it seems like a Good Thing to know anyway.

- - - - -
7be6f9a4 by Simon Peyton Jones at 2022-09-28T17:49:09-04:00
INLINE/INLINEABLE pragmas in Foreign.Marshal.Array

Foreign.Marshal.Array contains many small functions, all of which are
overloaded, and which are critical for performance. Yet none of them
had pragmas, so it was a fluke whether or not they got inlined.

This patch makes them all either INLINE (small ones) or
INLINEABLE and hence specialisable (larger ones).

See Note [Specialising array operations] in that module.

- - - - -
b0c89dfa by Jade Lovelace at 2022-09-28T17:49:49-04:00
Export OnOff from GHC.Driver.Session

I was working on fixing an issue where HLS was trying to pass its
DynFlags to HLint, but didn't pass any of the disabled language
extensions, which HLint would then assume are on because of their
default values.

Currently it's not possible to get any of the "No" flags because the
`DynFlags.extensions` field can't really be used since it is [OnOff
Extension] and OnOff is not exported.

So let's export it.

- - - - -
2f050687 by Bodigrim at 2022-09-28T17:50:28-04:00
Avoid Data.List.group; prefer Data.List.NonEmpty.group

This allows to avoid further partiality, e. g., map head . group is
replaced by map NE.head . NE.group, and there are less panic calls.

- - - - -
bc0020fa by M Farkas-Dyck at 2022-09-28T22:51:59-04:00
Clean up `findWiredInUnit`. In particular, avoid `head`.

- - - - -
6a2eec98 by Bodigrim at 2022-09-28T22:52:38-04:00
Eliminate headFS, use unconsFS instead

A small step towards #22185 to avoid partial functions + safe implementation
of `startsWithUnderscore`.

- - - - -
5a535172 by Sebastian Graf at 2022-09-29T17:04:20+02:00
Demand: Format Call SubDemands `Cn(sd)` as `C(n,sd)` (#22231)

Justification in #22231. Short form: In a demand like `1C1(C1(L))`
it was too easy to confuse which `1` belongs to which `C`. Now
that should be more obvious.

Fixes #22231

- - - - -
ea0083bf by Bryan Richter at 2022-09-29T15:48:38-04:00
Revert "ci: enable parallel compression for xz"

Combined wxth XZ_OPT=9, this blew the memory capacity of CI runners.

This reverts commit a5f9c35f5831ef5108e87813a96eac62803852ab.

- - - - -
f5e8f493 by Sebastian Graf at 2022-09-30T18:42:13+02:00
Boxity: Don't update Boxity unless worker/wrapper follows (#21754)

A small refactoring in our Core Opt pipeline and some new functions for
transfering argument boxities from one signature to another to facilitate
`Note [Don't change boxity without worker/wrapper]`.

Fixes #21754.

- - - - -
4baf7b1c by M Farkas-Dyck at 2022-09-30T17:45:47-04:00
Scrub various partiality involving empty lists.

Avoids some uses of `head` and `tail`, and some panics when an argument is null.

- - - - -
95ead839 by Alexis King at 2022-10-01T00:37:43-04:00
Fix a bug in continuation capture across multiple stack chunks

- - - - -
22096652 by Bodigrim at 2022-10-01T00:38:22-04:00
Enforce internal invariant of OrdList and fix bugs in viewCons / viewSnoc

`viewCons` used to ignore `Many` constructor completely, returning `VNothing`.
`viewSnoc` violated internal invariant of `Many` being a non-empty list.

- - - - -
48ab9ca5 by Nicolas Trangez at 2022-10-04T20:34:10-04:00
chore: extend `.editorconfig` for C files

- - - - -
b8df5c72 by Brandon Chinn at 2022-10-04T20:34:46-04:00
Fix docs for pattern synonyms
- - - - -
463ffe02 by Oleg Grenrus at 2022-10-04T20:35:24-04:00
Use sameByteArray# in sameByteArray

- - - - -
fbe1e86e by Pierre Le Marre at 2022-10-05T15:58:43+02:00
Minor fixes following Unicode 15.0.0 update

- Fix changelog for Unicode 15.0.0
- Fix the checksums of the downloaded Unicode files, in base's tool: "ucd2haskell".

- - - - -
8a31d02e by Cheng Shao at 2022-10-05T20:40:41-04:00
rts: don't enforce aligned((8)) on 32-bit targets

We simply need to align to the word size for pointer tagging to work. On
32-bit targets, aligned((8)) is wasteful.

- - - - -
532de368 by Ryan Scott at 2022-10-06T07:45:46-04:00
Export symbolSing, SSymbol, and friends (CLC#85)

This implements this Core Libraries Proposal:
https://github.com/haskell/core-libraries-committee/issues/85

In particular, it:

1. Exposes the `symbolSing` method of `KnownSymbol`,
2. Exports the abstract `SSymbol` type used in `symbolSing`, and
3. Defines an API for interacting with `SSymbol`.

This also makes corresponding changes for `natSing`/`KnownNat`/`SNat` and
`charSing`/`KnownChar`/`SChar`. This fixes #15183 and addresses part (2)
of #21568.

- - - - -


30 changed files:

- .editorconfig
- .gitlab/ci.sh
- compiler/GHC/Cmm/Node.hs
- compiler/GHC/Cmm/Switch.hs
- compiler/GHC/CmmToAsm.hs
- compiler/GHC/CmmToAsm/Reg/Liveness.hs
- compiler/GHC/CmmToC.hs
- compiler/GHC/CmmToLlvm.hs
- compiler/GHC/CmmToLlvm/Base.hs
- compiler/GHC/Core.hs
- compiler/GHC/Core/Coercion.hs
- compiler/GHC/Core/DataCon.hs
- compiler/GHC/Core/FamInstEnv.hs
- compiler/GHC/Core/Lint.hs
- compiler/GHC/Core/Opt/Arity.hs
- compiler/GHC/Core/Opt/CSE.hs
- compiler/GHC/Core/Opt/DmdAnal.hs
- compiler/GHC/Core/Opt/OccurAnal.hs
- compiler/GHC/Core/Opt/Pipeline.hs
- compiler/GHC/Core/Opt/Pipeline/Types.hs
- compiler/GHC/Core/Opt/Simplify/Iteration.hs
- compiler/GHC/Core/Opt/Simplify/Utils.hs
- compiler/GHC/Core/Opt/Specialise.hs
- compiler/GHC/Core/Opt/Stats.hs
- compiler/GHC/Core/Opt/WorkWrap.hs
- compiler/GHC/Core/Ppr.hs
- compiler/GHC/Core/SimpleOpt.hs
- compiler/GHC/Core/Tidy.hs
- compiler/GHC/Core/TyCon.hs
- compiler/GHC/Core/Type.hs


The diff was not included because it is too large.


View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/5ecb4ecebb3b439850c2719f0341109e2d1ffebf...532de36870ed9e880d5f146a478453701e9db25d

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/5ecb4ecebb3b439850c2719f0341109e2d1ffebf...532de36870ed9e880d5f146a478453701e9db25d
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20221006/1ba570ea/attachment-0001.html>


More information about the ghc-commits mailing list