[Git][ghc/ghc][wip/simplifier-tweaks] 18 commits: rts: Fix TSAN_ENABLED CPP guard
Simon Peyton Jones (@simonpj)
gitlab at gitlab.haskell.org
Tue Apr 2 20:22:48 UTC 2024
Simon Peyton Jones pushed to branch wip/simplifier-tweaks at Glasgow Haskell Compiler / GHC
Commits:
c8a4c050 by Ben Gamari at 2024-04-02T12:50:35-04:00
rts: Fix TSAN_ENABLED CPP guard
This should be `#if defined(TSAN_ENABLED)`, not `#if TSAN_ENABLED`,
lest we suffer warnings.
- - - - -
e91dad93 by Cheng Shao at 2024-04-02T12:50:35-04:00
rts: fix errors when compiling with TSAN
This commit fixes rts compilation errors when compiling with TSAN:
- xxx_FENCE macros are redefined and trigger CPP warnings.
- Use SIZEOF_W. WORD_SIZE_IN_BITS is provided by MachDeps.h which
Cmm.h doesn't include by default.
- - - - -
a9ab9455 by Cheng Shao at 2024-04-02T12:50:35-04:00
rts: fix clang-specific errors when compiling with TSAN
This commit fixes clang-specific rts compilation errors when compiling
with TSAN:
- clang doesn't have -Wtsan flag
- Fix prototype of ghc_tsan_* helper functions
- __tsan_atomic_* functions aren't clang built-ins and
sanitizer/tsan_interface_atomic.h needs to be included
- On macOS, TSAN runtime library is
libclang_rt.tsan_osx_dynamic.dylib, not libtsan. -fsanitize-thread
as a link-time flag will take care of linking the TSAN runtime
library anyway so remove tsan as an rts extra library
- - - - -
865bd717 by Cheng Shao at 2024-04-02T12:50:35-04:00
compiler: fix github link to __tsan_memory_order in a comment
- - - - -
07cb627c by Cheng Shao at 2024-04-02T12:50:35-04:00
ci: improve TSAN CI jobs
- Run TSAN jobs with +thread_sanitizer_cmm which enables Cmm
instrumentation as well.
- Run TSAN jobs in deb12 which ships gcc-12, a reasonably recent gcc
that @bgamari confirms he's using in #GHC:matrix.org. Ideally we
should be using latest clang release for latest improvements in
sanitizers, though that's left as future work.
- Mark TSAN jobs as manual+allow_failure in validate pipelines. The
purpose is to demonstrate that we have indeed at least fixed
building of TSAN mode in CI without blocking the patch to land, and
once merged other people can begin playing with TSAN using their own
dev setups and feature branches.
- - - - -
a1c18c7b by Andrei Borzenkov at 2024-04-02T12:51:11-04:00
Merge tc_infer_hs_type and tc_hs_type into one function using ExpType philosophy (#24299, #23639)
This patch implements refactoring which is a prerequisite to
updating kind checking of type patterns. This is a huge simplification
of the main worker that checks kind of HsType.
It also fixes the issues caused by previous code duplication, e.g.
that we didn't add module finalizers from splices in inference mode.
- - - - -
c93eecf1 by Simon Peyton Jones at 2024-04-02T21:20:44+01:00
Several improvements to the handling of coercions
* Make `mkSymCo` and `mkInstCo` smarter
Fixes #23642
* Fix return role of `SelCo` in the coercion optimiser.
Fixes #23617
* Make the coercion optimiser `opt_trans_rule` work better for newtypes
Fixes #23619
- - - - -
0a2c3c2d by Simon Peyton Jones at 2024-04-02T21:20:44+01:00
FloatOut: improve floating for join point
See the new Note [Floating join point bindings].
* Completely get rid of the complicated join_ceiling nonsense, which
I have never understood.
* Do not float join points at all, except perhaps to top level.
* Some refactoring around wantToFloat, to treat Rec and NonRec more
uniformly
- - - - -
6d605aa5 by Simon Peyton Jones at 2024-04-02T21:20:44+01:00
Improve eta-expansion through call stacks
See Note [Eta expanding through CallStacks] in GHC.Core.Opt.Arity
This is a one-line change, that fixes an inconsistency
- || isCallStackPredTy ty
+ || isCallStackPredTy ty || isCallStackTy ty
- - - - -
447f2ac0 by Simon Peyton Jones at 2024-04-02T21:20:44+01:00
Spelling, layout, pretty-printing only
- - - - -
f36b9603 by Simon Peyton Jones at 2024-04-02T21:20:44+01:00
Improve exprIsConApp_maybe a little
Eliminate a redundant case at birth. This sometimes reduces
Simplifier iterations.
See Note [Case elim in exprIsConApp_maybe].
- - - - -
7fc8cfc3 by Simon Peyton Jones at 2024-04-02T21:20:44+01:00
Inline GHC.HsToCore.Pmc.Solver.Types.trvVarInfo
When exploring compile-time regressions after meddling with the Simplifier, I
discovered that GHC.HsToCore.Pmc.Solver.Types.trvVarInfo was very delicately
balanced. It's a small, heavily used, overloaded function and it's important
that it inlines. By a fluke it was before, but at various times in my journey it
stopped doing so. So I just added an INLINE pragma to it; no sense in depending
on a delicately-balanced fluke.
- - - - -
a818823d by Simon Peyton Jones at 2024-04-02T21:20:44+01:00
Slight improvement in WorkWrap
Ensure that WorkWrap preserves lambda binders, in case of join points. Sadly I
have forgotten why I made this change (it was while I was doing a lot of
meddling in the Simplifier, but
* it does no harm,
* it is slightly more efficient, and
* presumably it made something better!
Anyway I have kept it in a separate commit.
- - - - -
c471144d by Simon Peyton Jones at 2024-04-02T21:20:44+01:00
Use named record fields for the CastIt { ... } data constructor
This is a pure refactor
- - - - -
346115e0 by Simon Peyton Jones at 2024-04-02T21:20:44+01:00
Remove a long-commented-out line
Pure refactoring
- - - - -
57f42293 by Simon Peyton Jones at 2024-04-02T21:20:49+01:00
Simplifier improvements
This MR started as: allow the simplifer to do more in one pass,
arising from places I could see the simplifier taking two iterations
where one would do. But it turned into a larger project, because
these changes unexpectedly made inlining blow up, especially join
points in deeply-nested cases.
The main changes are below. There are also many new or rewritten Notes.
Avoiding simplifying repeatedly
~~~~~~~~~~~~~~~
See Note [Avoiding simplifying repeatedly]
* The SimplEnv now has a seInlineDepth field, which says how deep
in unfoldings we are. See Note [Inline depth] in Simplify.Env.
Currently used only for the next point: avoiding repeatedly
simplifying coercions.
* Avoid repeatedly simplifying coercions.
see Note [Avoid re-simplifying coercions] in Simplify.Iteration
As you'll see from the Note, this makes use of the seInlineDepth.
* Allow Simplify.Iteration.simplAuxBind to inline used-once things.
This is another part of Note [Post-inline for single-use things], and
is really good for reducing simplifier iterations in situations like
case K e of { K x -> blah }
wher x is used once in blah.
* Make GHC.Core.SimpleOpt.exprIsConApp_maybe do some simple case
elimination. Note [Case elim in exprIsConApp_maybe]
* Improve the case-merge transformation:
- Move the main code to `GHC.Core.Utils.mergeCaseAlts`, to join `filterAlts`
and friends. See Note [Merge Nested Cases] in GHC.Core.Utils.
- Add a new case for `tagToEnum#`; see wrinkle (MC3).
- Add a new case to look through join points: see wrinkle (MC4)
postInlineUnconditionally
~~~~~~~~~~~~~~~~~~~~~~~~~
* Allow Simplify.Utils.postInlineUnconditionally to inline variables
that are used exactly once. See Note [Post-inline for single-use things].
* Do not postInlineUnconditionally join point, ever.
Doing so does not reduce allocation, which is the main point,
and with join points that are used a lot it can bloat code.
See point (1) of Note [Duplicating join points] in
GHC.Core.Opt.Simplify.Iteration.
* Do not postInlineUnconditionally a strict (demanded) binding.
It will not allocate a thunk (it'll turn into a case instead)
so again the main point of inlining it doesn't hold. Better
to check per-call-site.
* Improve occurrence analyis for bottoming function calls, to help
postInlineUnconditionally. See Note [Bottoming function calls]
in GHC.Core.Opt.OccurAnal
Inlining generally
~~~~~~~~~~~~~~~~~~
* In GHC.Core.Opt.Simplify.Utils.interestingCallContext,
use RhsCtxt NonRecursive (not BoringCtxt) for a plain-seq case.
See Note [Seq is boring] Also, wrinkle (SB1), inline in that
`seq` context only for INLINE functions (UnfWhen guidance).
* In GHC.Core.Opt.Simplify.Utils.interestingArg,
- return ValueArg for OtherCon [c1,c2, ...], but
- return NonTrivArg for OtherCon []
This makes a function a little less likely to inline if all we
know is that the argument is evaluated, but nothing else.
* isConLikeUnfolding is no longer true for OtherCon {}.
This propagates to exprIsConLike. Con-like-ness has /positive/
information.
Join points
~~~~~~~~~~~
* Be very careful about inlining join points.
See these two long Notes
Note [Duplicating join points] in GHC.Core.Opt.Simplify.Iteration
Note [Inlining join points] in GHC.Core.Opt.Simplify.Inline
* When making join points, don't do so if the join point is so small
it will immediately be inlined; check uncondInlineJoin.
* In GHC.Core.Opt.Simplify.Inline.tryUnfolding, improve the inlining
heuristics for join points. In general we /do not/ want to inline
join points /even if they are small/. See Note [Duplicating join points]
GHC.Core.Opt.Simplify.Iteration.
But sometimes we do: see Note [Inlining join points] in
GHC.Core.Opt.Simplify.Inline; and the new `isBetterUnfoldingThan` function.
* Do not add an unfolding to a join point at birth. This is a tricky one
and has a long Note [Do not add unfoldings to join points at birth]
It shows up in two places
- In `mkDupableAlt` do not add an inlining
- (trickier) In `simplLetUnfolding` don't add an unfolding for a
fresh join point
I am not fully satisifed with this, but it works and is well documented.
* In GHC.Core.Unfold.sizeExpr, make jumps small, so that we don't penalise
having a non-inlined join point.
Performance changes
~~~~~~~~~~~~~~~~~~~
* Binary sizes fall by around 2.6%, according to nofib.
* Compile times improve slightly. Here are the figures over 1%.
I investiate the biggest differnce in T18304. It's a very small module, just
a few hundred nodes. The large percentage difffence is due to a single
function that didn't quite inline before, and does now, making code size a
bit bigger. I decided gains outweighed the losses.
Metrics: compile_time/bytes allocated (changes over +/- 1%)
------------------------------------------------
CoOpt_Singletons(normal) -9.2% GOOD
LargeRecord(normal) -23.5% GOOD
MultiComponentModulesRecomp(normal) +1.2%
MultiLayerModulesTH_OneShot(normal) +4.1% BAD
PmSeriesS(normal) -3.8%
PmSeriesV(normal) -1.5%
T11195(normal) -1.3%
T12227(normal) -20.4% GOOD
T12545(normal) -3.2%
T12707(normal) -2.1% GOOD
T13253(normal) -1.2%
T13253-spj(normal) +8.1% BAD
T13386(normal) -3.1% GOOD
T14766(normal) -2.6% GOOD
T15164(normal) -1.4%
T15304(normal) +1.2%
T15630(normal) -8.2%
T15630a(normal) NEW
T15703(normal) -14.7% GOOD
T16577(normal) -2.3% GOOD
T17516(normal) -39.7% GOOD
T18140(normal) +1.2%
T18223(normal) -17.1% GOOD
T18282(normal) -5.0% GOOD
T18304(normal) +10.8% BAD
T18923(normal) -2.9% GOOD
T1969(normal) +1.0%
T19695(normal) -1.5%
T20049(normal) -12.7% GOOD
T21839c(normal) -4.1% GOOD
T3064(normal) -1.5%
T3294(normal) +1.2% BAD
T4801(normal) +1.2%
T5030(normal) -15.2% GOOD
T5321Fun(normal) -2.2% GOOD
T6048(optasm) -16.8% GOOD
T783(normal) -1.2%
T8095(normal) -6.0% GOOD
T9630(normal) -4.7% GOOD
T9961(normal) +1.9% BAD
WWRec(normal) -1.4%
info_table_map_perf(normal) -1.3%
parsing001(normal) +1.5%
geo. mean -2.0%
minimum -39.7%
maximum +10.8%
* Runtimes generally improve. In the testsuite perf/should_run gives:
Metrics: runtime/bytes allocated
------------------------------------------
Conversions(normal) -0.3%
T13536a(optasm) -41.7% GOOD
T4830(normal) -0.1%
haddock.Cabal(normal) -0.1%
haddock.base(normal) -0.1%
haddock.compiler(normal) -0.1%
geo. mean -0.8%
minimum -41.7%
maximum +0.0%
* For runtime, nofib is a better test. The news is mostly good.
Here are the number more than +/- 0.1%:
# bytes allocated
==========================++==========
imaginary/digits-of-e1 || -14.40%
imaginary/digits-of-e2 || -4.41%
imaginary/paraffins || -0.17%
imaginary/rfib || -0.15%
imaginary/wheel-sieve2 || -0.10%
real/compress || -0.47%
real/fluid || -0.10%
real/fulsom || +0.14%
real/gamteb || -1.47%
real/gg || -0.20%
real/infer || +0.24%
real/pic || -0.23%
real/prolog || -0.36%
real/scs || -0.46%
real/smallpt || +4.03%
shootout/k-nucleotide || -20.23%
shootout/n-body || -0.42%
shootout/spectral-norm || -0.13%
spectral/boyer2 || -3.80%
spectral/constraints || -0.27%
spectral/hartel/ida || -0.82%
spectral/mate || -20.34%
spectral/para || +0.46%
spectral/rewrite || +1.30%
spectral/sphere || -0.14%
==========================++==========
geom mean || -0.59%
real/smallpt has a huge nest of local definitions, and I
could not pin down a reason for a regression. But there are
three big wins!
Metric Decrease:
CoOpt_Singletons
LargeRecord
T12227
T12707
T13386
T13536a
T14766
T15703
T16577
T17516
T18223
T18282
T18923
T21839c
T20049
T5321Fun
T5030
T6048
T8095
T9630
T783
Metric Increase:
MultiLayerModulesTH_OneShot
T13253-spj
T18304
T18698a
T9961
T3294
- - - - -
ec584eaf by Simon Peyton Jones at 2024-04-02T21:21:23+01:00
Testsuite message changes from simplifier improvements
- - - - -
9c910ae1 by Simon Peyton Jones at 2024-04-02T21:21:23+01:00
Account for bottoming functions in OccurAnal
This fixes #24582, a small but long-standing bug
- - - - -
15 changed files:
- .gitlab/generate-ci/gen_ci.hs
- .gitlab/jobs.yaml
- compiler/GHC/Cmm/ThreadSanitizer.hs
- compiler/GHC/Core.hs
- compiler/GHC/Core/Coercion.hs
- compiler/GHC/Core/Coercion/Opt.hs
- compiler/GHC/Core/Opt/Arity.hs
- compiler/GHC/Core/Opt/FloatOut.hs
- compiler/GHC/Core/Opt/Monad.hs
- compiler/GHC/Core/Opt/OccurAnal.hs
- compiler/GHC/Core/Opt/Pipeline.hs
- compiler/GHC/Core/Opt/SetLevels.hs
- compiler/GHC/Core/Opt/Simplify.hs
- compiler/GHC/Core/Opt/Simplify/Env.hs
- compiler/GHC/Core/Opt/Simplify/Inline.hs
The diff was not included because it is too large.
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/848f360fe1bcbdac28e9cc0014665bf8ccbfd259...9c910ae1a616e74f5389828ad9b9126d0282764e
--
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/848f360fe1bcbdac28e9cc0014665bf8ccbfd259...9c910ae1a616e74f5389828ad9b9126d0282764e
You're receiving this email because of your account on gitlab.haskell.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240402/66e7a572/attachment-0001.html>
More information about the ghc-commits
mailing list