[Git][ghc/ghc][wip/T22084] 4 commits: Make rewrite rules "win" over inlining
Simon Peyton Jones (@simonpj)
gitlab at gitlab.haskell.org
Tue Oct 11 07:40:39 UTC 2022
Simon Peyton Jones pushed to branch wip/T22084 at Glasgow Haskell Compiler / GHC
Commits:
96d32ff2 by Simon Peyton Jones at 2022-10-10T22:30:21+01:00
Make rewrite rules "win" over inlining
If a rewrite rule and an inlining compete in the simplifier, this
patch makes sure that the rewrite rule "wins". That is, in general,
a bit fragile, but it's a huge help when making specialisation work
reliably, as #21851 and #22097 showed.
The change is fairly straightforward, and documented in
Note [Rewrite rules and inlining]
in GHC.Core.Opt.Simplify.Iteration.
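As a rough illustration of the ordering described above, here is a minimal sketch of the "rules first, then inlining" state machine. Only the constructor names TryRules, TryInlining and TryNothing come from the patch; the rule representation (a name paired with an arity) and the helper names mkTryRulesSketch, argSimplified and attemptFailed are hypothetical, and the real simplifier carries much more state.

```haskell
-- Simplified model of the ai_rewrite state machine.
-- Only the constructor names are taken from the patch; everything
-- else here is an assumption made for illustration.
data RewriteCall rule
  = TryRules Int [rule]  -- arguments still wanted before the rules are tried
  | TryInlining          -- rules have failed (or there were none)
  | TryNothing           -- both rules and inlining have been tried
  deriving (Eq, Show)

-- Initial state: wait for as many arguments as the highest-arity rule
-- needs. With no rules at all, go straight to inlining.
mkTryRulesSketch :: [(String, Int)] -> RewriteCall (String, Int)
mkTryRulesSketch [] = TryInlining
mkTryRulesSketch rs = TryRules (maximum (map snd rs)) rs

-- One more argument has been simplified.
argSimplified :: RewriteCall rule -> RewriteCall rule
argSimplified (TryRules n rs) = TryRules (n - 1) rs
argSimplified rew             = rew

-- The current attempt did not fire: fall through to the next one.
attemptFailed :: RewriteCall rule -> RewriteCall rule
attemptFailed (TryRules _ _) = TryInlining
attemptFailed TryInlining    = TryNothing
attemptFailed TryNothing     = TryNothing
```

In this model a call with rules of arity 2 and 1 starts in `TryRules 2`, reaches `TryRules 0` after two arguments, and only moves to `TryInlining` once the rules have failed to match.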
Compile-times change, up and down a bit -- in some cases because
we get better specialisation. But the payoff (more reliable
specialisation) is large.
Metrics: compile_time/bytes allocated
-----------------------------------------------
T10421(normal) +3.7% BAD
T10421a(normal) +5.5%
T13253(normal) +1.3%
T14052(ghci) +1.8%
T15304(normal) -1.4%
T16577(normal) +3.1% BAD
T17516(normal) +2.3%
T17836(normal) -1.9%
T18223(normal) -1.8%
T8095(normal) -1.3%
T9961(normal) +2.5% BAD
geo. mean +0.0%
minimum -1.9%
maximum +5.5%
Nofib results are (bytes allocated)
+-------------------------------++----------+
| ||tsv (rel) |
+===============================++==========+
| imaginary/paraffins || +0.27% |
| imaginary/rfib || -0.04% |
| real/anna || +0.02% |
| real/fem || -0.04% |
| real/fluid || +1.68% |
| real/gamteb || -0.34% |
| real/gg || +1.54% |
| real/hidden || -0.01% |
| real/hpg || -0.03% |
| real/infer || -0.03% |
| real/prolog || +0.02% |
| real/veritas || -0.47% |
| shootout/fannkuch-redux || -0.03% |
| shootout/k-nucleotide || -0.02% |
| shootout/n-body || -0.06% |
| shootout/spectral-norm || -0.01% |
| spectral/cryptarithm2 || +1.25% |
| spectral/fibheaps || +18.33% |
| spectral/last-piece || -0.34% |
+===============================++==========+
| geom mean || +0.17% |
There are extensive notes in !8897 about the regressions.
Briefly
* fibheaps: there was a very delicately balanced inlining that
tipped over the wrong way after this change.
* cryptarithm2 and paraffins are caused by #22274, which is
a separate issue really. (I.e. the right fix is *not* to
make inlining "win" over rules.)
So I'm accepting these changes
Metric Increase:
T10421
T16577
T9961
- - - - -
ed4b5885 by Joachim Breitner at 2022-10-10T23:16:11-04:00
Utils.JSON: do not escapeJsonString in ToJson String instance
as `escapeJsonString` is used in `renderJSON`, so the `JSString`
constructor is meant to carry the unescaped string.
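To see why escaping in both places is a problem, here is a small self-contained sketch. It is not the code in GHC.Utils.Json: the JsonDoc type is cut down to two constructors, and escape, render, toJsonUnescaped and toJsonEscaped are hypothetical stand-ins that handle only a few characters.

```haskell
import Data.List (intercalate)

-- Cut-down JSON document type, for illustration only.
data JsonDoc = JSString String | JSArray [JsonDoc]

-- Escape a few characters that may not appear raw in a JSON string.
escape :: String -> String
escape = concatMap esc
  where
    esc '"'  = "\\\""
    esc '\\' = "\\\\"
    esc '\n' = "\\n"
    esc c    = [c]

-- The renderer is the single place where escaping happens.
render :: JsonDoc -> String
render (JSString s) = "\"" ++ escape s ++ "\""
render (JSArray xs) = "[" ++ intercalate "," (map render xs) ++ "]"

-- Conversion that stores the raw string (the behaviour after the patch).
toJsonUnescaped :: String -> JsonDoc
toJsonUnescaped = JSString

-- Conversion that escapes eagerly; rendering then escapes a second time.
toJsonEscaped :: String -> JsonDoc
toJsonEscaped = JSString . escape
```

With these definitions, rendering `toJsonUnescaped "a\"b"` puts one backslash before the inner quote, whereas rendering `toJsonEscaped "a\"b"` puts three, which is the double-escaping the commit avoids.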
- - - - -
2fb056d7 by Simon Peyton Jones at 2022-10-11T08:42:27+01:00
Don't keep exit join points so much
We were religiously keeping exit join points throughout, which
had some bad effects (#21148, #22084).
This MR does two things:
* Arranges that exit join points are inhibited from inlining
only in /one/ Simplifier pass (right after Exitification).
See Note [Be selective about not-inlining exit join points]
in GHC.Core.Opt.Exitify
It's not a big deal, but it shaves 0.1% off compile times.
* Inline used-once non-recursive join points very aggressively
Given join j x = rhs in
joinrec k y = ....j x....
where this is the only occurrence of `j`, we want to inline `j`.
(Unless sm_keep_exits is on.)
See Note [Inline used-once non-recursive join points] in
GHC.Core.Opt.Simplify.Utils
This is just a tidy-up really. It doesn't change allocation, but
getting rid of a binding is always good.
Very little effect on nofib -- some up and down.
- - - - -
e4cac1b8 by Simon Peyton Jones at 2022-10-11T08:42:27+01:00
Make SpecConstr bale out less often
When doing performance debugging on #22084 / !8901, I found that the
algorithm in SpecConstr.decreaseSpecCount was so aggressive that if
there were /more/ specialisations available for an outer function,
that could more or less kill off specialisation for an /inner/
function. (An example was in nofib/spectral/fibheaps.)
This patch makes the algorithm a bit less aggressive, by dividing the
count by 2, rather than by the number of outer specialisations.
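The arithmetic can be sketched as follows. This is a hypothetical model, not the code of SpecConstr.decreaseSpecCount: the function names oldInnerBudget and newInnerBudget are invented, the budget is treated as a plain Int, and the guard against dividing by zero is an assumption.

```haskell
-- Hypothetical model of how the specialisation budget for an inner
-- function is derived from the budget of the enclosing function.

-- Before: divide by the number of outer specialisations
-- (the `max 1` guard is an assumption of this sketch).
oldInnerBudget :: Int -> Int -> Int
oldInnerBudget nOuterSpecs budget = budget `div` max 1 nOuterSpecs

-- After: always divide by 2, however many outer specialisations exist.
newInnerBudget :: Int -> Int
newInnerBudget budget = budget `div` 2
```

For example, with a budget of 3 and four outer specialisations, the old scheme leaves the inner function a budget of 0, while the new one leaves it 1.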
This makes the program bigger, temporarily:
T19695(normal) ghc/alloc +11.3% BAD
because we get more specialisation. But lots of other programs
compile a bit faster and the geometric mean in perf/compiler
is 0.0%.
Metric Increase:
T19695
- - - - -
22 changed files:
- compiler/GHC/Core/Opt/Exitify.hs
- compiler/GHC/Core/Opt/Pipeline.hs
- compiler/GHC/Core/Opt/Simplify/Env.hs
- compiler/GHC/Core/Opt/Simplify/Iteration.hs
- compiler/GHC/Core/Opt/Simplify/Utils.hs
- compiler/GHC/Core/Opt/SpecConstr.hs
- compiler/GHC/Driver/Config/Core/Opt/Simplify.hs
- compiler/GHC/Utils/Json.hs
- compiler/GHC/Utils/Monad.hs
- testsuite/tests/lib/integer/Makefile
- + testsuite/tests/simplCore/should_compile/T21148.hs
- + testsuite/tests/simplCore/should_compile/T21148.stderr
- + testsuite/tests/simplCore/should_compile/T21851.hs
- + testsuite/tests/simplCore/should_compile/T21851.stderr
- + testsuite/tests/simplCore/should_compile/T21851a.hs
- + testsuite/tests/simplCore/should_compile/T22097.hs
- + testsuite/tests/simplCore/should_compile/T22097.stderr
- + testsuite/tests/simplCore/should_compile/T22097a.hs
- testsuite/tests/simplCore/should_compile/T6056.stderr
- testsuite/tests/simplCore/should_compile/all.T
- testsuite/tests/stranal/should_compile/T21128.hs
- testsuite/tests/stranal/should_compile/T21128.stderr
Changes:
=====================================
compiler/GHC/Core/Opt/Exitify.hs
=====================================
@@ -433,6 +433,7 @@ inlining.
Exit join points, recognizable using `isExitJoinId` are join points with an
occurrence in a recursive group, and can be recognized (after the occurrence
analyzer ran!) using `isExitJoinId`.
+
This function detects joinpoints with `occ_in_lam (idOccinfo id) == True`,
because the lambdas of a non-recursive join point are not considered for
`occ_in_lam`. For example, in the following code, `j1` is /not/ marked
@@ -446,6 +447,29 @@ To prevent inlining, we check for isExitJoinId
* In `simplLetUnfolding` we simply give exit join points no unfolding, which
prevents inlining in `postInlineUnconditionally` and call sites.
+But see Note [Be selective about not-inlining exit join points]
+
+Note [Be selective about not-inlining exit join points]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+If we follow the "do not inline exit join points" mantra throughout,
+some bad things happen.
+
+* We can lose CPR information: see #21148
+
+* We get useless clutter (#22084) that
+ - makes the program bigger (including duplicated code #20739), and
+ - adds extra jumps (and maybe stack saves) at runtime
+
+So instead we follow "do not inline exit join points" for a /single run/
+of the simplifier, right after Exitification. That should give a
+sufficient chance for used-once things to inline, but subsequent runs
+will inline them back in. (Annoyingly, as things stand, only with -O2
+is there a subsequent run, but that might change, and it's not a huge
+deal anyway.)
+
+This is controlled by the Simplifier's sm_keep_exits flag; see
+GHC.Core.Opt.Pipeline.
+
Note [Placement of the exitification pass]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I (Joachim) experimented with multiple positions for the Exitification pass in
=====================================
compiler/GHC/Core/Opt/Pipeline.hs
=====================================
@@ -15,7 +15,7 @@ import GHC.Driver.Plugins ( withPlugins, installCoreToDos )
import GHC.Driver.Env
import GHC.Driver.Config.Core.Lint ( endPass )
import GHC.Driver.Config.Core.Opt.LiberateCase ( initLiberateCaseOpts )
-import GHC.Driver.Config.Core.Opt.Simplify ( initSimplifyOpts, initSimplMode, initGentleSimplMode )
+import GHC.Driver.Config.Core.Opt.Simplify ( initSimplifyOpts, initSimplMode )
import GHC.Driver.Config.Core.Opt.WorkWrap ( initWorkWrapOpts )
import GHC.Driver.Config.Core.Rules ( initRuleOpts )
import GHC.Platform.Ways ( hasWay, Way(WayProf) )
@@ -28,6 +28,7 @@ import GHC.Core.Utils ( dumpIdInfoOfProgram )
import GHC.Core.Lint ( lintAnnots )
import GHC.Core.Lint.Interactive ( interactiveInScope )
import GHC.Core.Opt.Simplify ( simplifyExpr, simplifyPgm )
+import GHC.Core.Opt.Simplify.Env( SimplMode(..) )
import GHC.Core.Opt.Simplify.Monad
import GHC.Core.Opt.Monad
import GHC.Core.Opt.Pipeline.Types
@@ -154,32 +155,45 @@ getCoreToDo dflags rule_base extra_vars
maybe_strictness_before _
= CoreDoNothing
- simpl_phase phase name iter
- = CoreDoPasses
- $ [ maybe_strictness_before phase
- , CoreDoSimplify $ initSimplifyOpts dflags extra_vars iter
- (initSimplMode dflags phase name) rule_base
- , maybe_rule_check phase ]
+ ----------------------------
+ base_simpl_mode :: SimplMode
+ base_simpl_mode = initSimplMode dflags
- -- Run GHC's internal simplification phase, after all rules have run.
- -- See Note [Compiler phases] in GHC.Types.Basic
- simplify name = simpl_phase FinalPhase name max_iter
-
- -- initial simplify: mk specialiser happy: minimum effort please
+ -- gentle_mode: make specialiser happy: minimum effort please
-- See Note [Inline in InitialPhase]
-- See Note [RULEs enabled in InitialPhase]
- simpl_gently = CoreDoSimplify $ initSimplifyOpts dflags extra_vars max_iter
- (initGentleSimplMode dflags) rule_base
+ gentle_mode = base_simpl_mode { sm_names = ["Gentle"]
+ , sm_phase = InitialPhase
+ , sm_case_case = False }
+
+ simpl_mode phase name
+ = base_simpl_mode { sm_names = [name], sm_phase = phase }
+
+ keep_exits :: SimplMode -> SimplMode
+ -- See Note [Be selective about not-inlining exit join points]
+ -- in GHC.Core.Opt.Exitify
+ keep_exits mode = mode { sm_keep_exits = True }
+
+ ----------------------------
+ run_simplifier mode iter
+ = CoreDoSimplify $ initSimplifyOpts dflags extra_vars iter mode rule_base
+ simpl_phase phase name iter = CoreDoPasses $
+ [ maybe_strictness_before phase
+ , run_simplifier (simpl_mode phase name) iter
+ , maybe_rule_check phase ]
+
+ -- Run GHC's internal simplification phase, after all rules have run.
+ -- See Note [Compiler phases] in GHC.Types.Basic
+ simpl_gently = run_simplifier gentle_mode max_iter
+ simplify_final name = run_simplifier ( simpl_mode FinalPhase name) max_iter
+ simpl_keep_exits name = run_simplifier (keep_exits $ simpl_mode FinalPhase name) max_iter
+
+ ----------------------------
dmd_cpr_ww = if ww_on then [CoreDoDemand True,CoreDoCpr,CoreDoWorkerWrapper]
else [CoreDoDemand False] -- NB: No CPR! See Note [Don't change boxity without worker/wrapper]
- demand_analyser = (CoreDoPasses (
- dmd_cpr_ww ++
- [simplify "post-worker-wrapper"]
- ))
-
-- Static forms are moved to the top level with the FloatOut pass.
-- See Note [Grand plan for static forms] in GHC.Iface.Tidy.StaticPtrTable.
static_ptrs_float_outwards =
@@ -269,14 +283,16 @@ getCoreToDo dflags rule_base extra_vars
runWhen call_arity $ CoreDoPasses
[ CoreDoCallArity
- , simplify "post-call-arity"
+ , simplify_final "post-call-arity"
],
-- Strictness analysis
- runWhen strictness demand_analyser,
+ runWhen strictness $ CoreDoPasses
+ (dmd_cpr_ww ++ [simplify_final "post-worker-wrapper"]),
runWhen exitification CoreDoExitify,
-- See Note [Placement of the exitification pass]
+ -- in GHC.Core.Opt.Exitify
runWhen full_laziness $
CoreDoFloatOutwards FloatOutSwitches {
@@ -298,7 +314,17 @@ getCoreToDo dflags rule_base extra_vars
runWhen do_float_in CoreDoFloatInwards,
- simplify "final", -- Final tidy-up
+ -- Final tidy-up run of the simplifier
+ simpl_keep_exits "final tidy up",
+ -- Keep exit join points because this is the first
+ -- Simplifier run after Exitify. Subsequent runs will
+ -- re-inline those exit join points; their work is done.
+ -- See Note [Be selective about not-inlining exit join points]
+ -- in GHC.Core.Opt.Exitify
+ --
+ -- Annoyingly, we only /have/ a subsequent run with -O2. With
+ -- plain -O we'll still have those exit join points hanging around.
+ -- Oh well.
maybe_rule_check FinalPhase,
@@ -308,31 +334,31 @@ getCoreToDo dflags rule_base extra_vars
-- Case-liberation for -O2. This should be after
-- strictness analysis and the simplification which follows it.
runWhen liberate_case $ CoreDoPasses
- [ CoreLiberateCase, simplify "post-liberate-case" ],
+ [ CoreLiberateCase, simplify_final "post-liberate-case" ],
-- Run the simplifier after LiberateCase to vastly
-- reduce the possibility of shadowing
-- Reason: see Note [Shadowing] in GHC.Core.Opt.SpecConstr
runWhen spec_constr $ CoreDoPasses
- [ CoreDoSpecConstr, simplify "post-spec-constr"],
+ [ CoreDoSpecConstr, simplify_final "post-spec-constr"],
-- See Note [Simplify after SpecConstr]
maybe_rule_check FinalPhase,
runWhen late_specialise $ CoreDoPasses
- [ CoreDoSpecialising, simplify "post-late-spec"],
+ [ CoreDoSpecialising, simplify_final "post-late-spec"],
-- LiberateCase can yield new CSE opportunities because it peels
-- off one layer of a recursive function (concretely, I saw this
-- in wheel-sieve1), and I'm guessing that SpecConstr can too
-- And CSE is a very cheap pass. So it seems worth doing here.
runWhen ((liberate_case || spec_constr) && cse) $ CoreDoPasses
- [ CoreCSE, simplify "post-final-cse" ],
+ [ CoreCSE, simplify_final "post-final-cse" ],
--------- End of -O2 passes --------------
runWhen late_dmd_anal $ CoreDoPasses (
- dmd_cpr_ww ++ [simplify "post-late-ww"]
+ dmd_cpr_ww ++ [simplify_final "post-late-ww"]
),
-- Final run of the demand_analyser, ensures that one-shot thunks are
=====================================
compiler/GHC/Core/Opt/Simplify/Env.hs
=====================================
@@ -248,13 +248,16 @@ data SimplMode = SimplMode -- See comments in GHC.Core.Opt.Simplify.Monad
, sm_uf_opts :: !UnfoldingOpts -- ^ Unfolding options
, sm_case_case :: !Bool -- ^ Whether case-of-case is enabled
, sm_pre_inline :: !Bool -- ^ Whether pre-inlining is enabled
- , sm_float_enable :: !FloatEnable -- ^ Whether to enable floating out
+ , sm_keep_exits :: !Bool -- ^ True <=> keep ExitJoinIds
+ -- See Note [Do not inline exit join points]
+ -- in GHC.Core.Opt.Exitify
+ , sm_float_enable :: !FloatEnable -- ^ Whether to enable floating out
, sm_do_eta_reduction :: !Bool
- , sm_arity_opts :: !ArityOpts
- , sm_rule_opts :: !RuleOpts
- , sm_case_folding :: !Bool
- , sm_case_merge :: !Bool
- , sm_co_opt_opts :: !OptCoercionOpts -- ^ Coercion optimiser options
+ , sm_arity_opts :: !ArityOpts
+ , sm_rule_opts :: !RuleOpts
+ , sm_case_folding :: !Bool
+ , sm_case_merge :: !Bool
+ , sm_co_opt_opts :: !OptCoercionOpts -- ^ Coercion optimiser options
}
instance Outputable SimplMode where
=====================================
compiler/GHC/Core/Opt/Simplify/Iteration.hs
=====================================
@@ -1919,7 +1919,9 @@ wrapJoinCont env cont thing_inside
--------------------
-trimJoinCont :: Id -> Maybe JoinArity -> SimplCont -> SimplCont
+trimJoinCont :: Id -- Used only in error message
+ -> Maybe JoinArity
+ -> SimplCont -> SimplCont
-- Drop outer context from join point invocation (jump)
-- See Note [Join points and case-of-case]
@@ -2017,6 +2019,17 @@ outside. Surprisingly tricky!
Variables
* *
************************************************************************
+
+Note [zapSubstEnv]
+~~~~~~~~~~~~~~~~~~
+When simplifying something that has already been simplified, be sure to
+zap the SubstEnv. This is VITAL. Consider
+ let x = e in
+ let y = \z -> ...x... in
+ \ x -> ...y...
+
+We'll clone the inner \x, adding x->x' in the id_subst. Then when we
+inline y, we must *not* replace x by x' in the inlined copy!!
-}
simplVar :: SimplEnv -> InVar -> SimplM OutExpr
@@ -2035,86 +2048,28 @@ simplVar env var
simplIdF :: SimplEnv -> InId -> SimplCont -> SimplM (SimplFloats, OutExpr)
simplIdF env var cont
= case substId env var of
- ContEx tvs cvs ids e ->
- let env' = setSubstEnv env tvs cvs ids
- in simplExprF env' e cont
- -- Don't trim; haven't already simplified e,
- -- so the cont is not embodied in e
-
- DoneId var1 -> do
- logger <- getLogger
- let cont' = trimJoinCont var (isJoinId_maybe var1) cont
- completeCall logger env var1 cont'
-
- DoneEx e mb_join ->
- let env' = zapSubstEnv env
- cont' = trimJoinCont var mb_join cont
- in simplExprF env' e cont'
- -- Note [zapSubstEnv]
- -- ~~~~~~~~~~~~~~~~~~
- -- The template is already simplified, so don't re-substitute.
- -- This is VITAL. Consider
- -- let x = e in
- -- let y = \z -> ...x... in
- -- \ x -> ...y...
- -- We'll clone the inner \x, adding x->x' in the id_subst
- -- Then when we inline y, we must *not* replace x by x' in
- -- the inlined copy!!
-
----------------------------------------------------------
--- Dealing with a call site
-
-completeCall :: Logger -> SimplEnv -> OutId -> SimplCont -> SimplM (SimplFloats, OutExpr)
-completeCall logger env var cont
- | Just expr <- callSiteInline logger uf_opts case_depth var active_unf
- lone_variable arg_infos interesting_cont
- -- Inline the variable's RHS
- = do { checkedTick (UnfoldingDone var)
- ; dump_inline expr cont
- ; let env1 = zapSubstEnv env
- ; simplExprF env1 expr cont }
-
- | otherwise
- -- Don't inline; instead rebuild the call
- = do { rule_base <- getSimplRules
- ; let rules = getRules rule_base var
- info = mkArgInfo env var rules
- n_val_args call_cont
- ; rebuildCall env info cont }
+ ContEx tvs cvs ids e -> simplExprF env' e cont
+ -- Don't trimJoinCont; haven't already simplified e,
+ -- so the cont is not embodied in e
+ where
+ env' = setSubstEnv env tvs cvs ids
- where
- uf_opts = seUnfoldingOpts env
- case_depth = seCaseDepth env
- (lone_variable, arg_infos, call_cont) = contArgs cont
- n_val_args = length arg_infos
- interesting_cont = interestingCallContext env call_cont
- active_unf = activeUnfolding (seMode env) var
+ DoneId var1 ->
+ do { rule_base <- getSimplRules
+ ; let cont' = trimJoinCont var1 (isJoinId_maybe var1) cont
+ info = mkArgInfo env rule_base var1 cont'
+ ; rebuildCall env info cont' }
- log_inlining doc
- = liftIO $ logDumpFile logger (mkDumpStyle alwaysQualify)
- Opt_D_dump_inlinings
- "" FormatText doc
+ DoneEx e mb_join -> simplExprF env' e cont'
+ where
+ cont' = trimJoinCont var mb_join cont
+ env' = zapSubstEnv env -- See Note [zapSubstEnv]
- dump_inline unfolding cont
- | not (logHasDumpFlag logger Opt_D_dump_inlinings) = return ()
- | not (logHasDumpFlag logger Opt_D_verbose_core2core)
- = when (isExternalName (idName var)) $
- log_inlining $
- sep [text "Inlining done:", nest 4 (ppr var)]
- | otherwise
- = log_inlining $
- sep [text "Inlining done: " <> ppr var,
- nest 4 (vcat [text "Inlined fn: " <+> nest 2 (ppr unfolding),
- text "Cont: " <+> ppr cont])]
+---------------------------------------------------------
+-- Dealing with a call site
-rebuildCall :: SimplEnv
- -> ArgInfo
- -> SimplCont
+rebuildCall :: SimplEnv -> ArgInfo -> SimplCont
-> SimplM (SimplFloats, OutExpr)
--- We decided not to inline, so
--- - simplify the arguments
--- - try rewrite rules
--- - and rebuild
---------- Bottoming applications --------------
rebuildCall env (ArgInfo { ai_fun = fun, ai_args = rev_args, ai_dmds = [] }) cont
@@ -2137,27 +2092,48 @@ rebuildCall env (ArgInfo { ai_fun = fun, ai_args = rev_args, ai_dmds = [] }) con
res = argInfoExpr fun rev_args
cont_ty = contResultType cont
----------- Try rewrite RULES --------------
--- See Note [Trying rewrite rules]
+---------- Try inlining, if ai_rewrite = TryInlining --------
+-- In the TryInlining case we try inlining immediately, before simplifying
+-- any (more) arguments. Why? See Note [Rewrite rules and inlining].
+--
+-- If there are rewrite rules we'll skip this case until we have
+-- simplified enough args to satisfy nr_wanted==0 in the TryRules case below
+-- Then we'll try the rules, and if that fails, we'll do TryInlining
+rebuildCall env info@(ArgInfo { ai_fun = fun, ai_args = rev_args
+ , ai_rewrite = TryInlining }) cont
+ = do { logger <- getLogger
+ ; let full_cont = pushSimplifiedRevArgs env rev_args cont
+ ; mb_inline <- tryInlining env logger fun full_cont
+ ; case mb_inline of
+ Just expr -> do { checkedTick (UnfoldingDone fun)
+ ; let env1 = zapSubstEnv env
+ ; simplExprF env1 expr full_cont }
+ Nothing -> rebuildCall env (info { ai_rewrite = TryNothing }) cont
+ }
+
+---------- Try rewrite RULES, if ai_rewrite = TryRules --------------
+-- See Note [Rewrite rules and inlining]
+-- See also Note [Trying rewrite rules]
rebuildCall env info@(ArgInfo { ai_fun = fun, ai_args = rev_args
- , ai_rules = Just (nr_wanted, rules) }) cont
+ , ai_rewrite = TryRules nr_wanted rules }) cont
| nr_wanted == 0 || no_more_args
- , let info' = info { ai_rules = Nothing }
= -- We've accumulated a simplified call in <fun,rev_args>
-- so try rewrite rules; see Note [RULES apply to simplified arguments]
-- See also Note [Rules for recursive functions]
do { mb_match <- tryRules env rules fun (reverse rev_args) cont
; case mb_match of
Just (env', rhs, cont') -> simplExprF env' rhs cont'
- Nothing -> rebuildCall env info' cont }
+ Nothing -> rebuildCall env (info { ai_rewrite = TryInlining }) cont }
where
+ -- If we have run out of arguments, just try the rules; there might
+ -- be some with lower arity. Casts get in the way -- they aren't
+ -- allowed on rule LHSs
no_more_args = case cont of
ApplyToTy {} -> False
ApplyToVal {} -> False
_ -> True
-
----------- Simplify applications and casts --------------
+---------- Simplify type applications and casts --------------
rebuildCall env info (CastIt co cont)
= rebuildCall env (addCastTo info co) cont
@@ -2202,6 +2178,7 @@ rebuildCall env (ArgInfo { ai_fun = fun_id, ai_args = rev_args })
call' = mkApps (Var fun_id) [mkTyArg rr', mkTyArg ty', arg']
; return (emptyFloats env, call') }
+---------- Simplify value arguments --------------------
rebuildCall env fun_info
(ApplyToVal { sc_arg = arg, sc_env = arg_se
, sc_dup = dup_flag, sc_hole_ty = fun_ty
@@ -2237,6 +2214,42 @@ rebuildCall env fun_info
rebuildCall env (ArgInfo { ai_fun = fun, ai_args = rev_args }) cont
= rebuild env (argInfoExpr fun rev_args) cont
+-----------------------------------
+tryInlining :: SimplEnv -> Logger -> OutId -> SimplCont -> SimplM (Maybe OutExpr)
+tryInlining env logger var cont
+ | Just expr <- callSiteInline logger uf_opts case_depth var active_unf
+ lone_variable arg_infos interesting_cont
+ = do { dump_inline expr cont
+ ; return (Just expr) }
+
+ | otherwise
+ = return Nothing
+
+ where
+ uf_opts = seUnfoldingOpts env
+ case_depth = seCaseDepth env
+ (lone_variable, arg_infos, call_cont) = contArgs cont
+ interesting_cont = interestingCallContext env call_cont
+ active_unf = activeUnfolding (seMode env) var
+
+ log_inlining doc
+ = liftIO $ logDumpFile logger (mkDumpStyle alwaysQualify)
+ Opt_D_dump_inlinings
+ "" FormatText doc
+
+ dump_inline unfolding cont
+ | not (logHasDumpFlag logger Opt_D_dump_inlinings) = return ()
+ | not (logHasDumpFlag logger Opt_D_verbose_core2core)
+ = when (isExternalName (idName var)) $
+ log_inlining $
+ sep [text "Inlining done:", nest 4 (ppr var)]
+ | otherwise
+ = log_inlining $
+ sep [text "Inlining done: " <> ppr var,
+ nest 4 (vcat [text "Inlined fn: " <+> nest 2 (ppr unfolding),
+ text "Cont: " <+> ppr cont])]
+
+
{- Note [Trying rewrite rules]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Consider an application (f e1 e2 e3) where the e1,e2,e3 are not yet
@@ -2272,6 +2285,38 @@ makes a particularly big difference when superclass selectors are involved:
op ($p1 ($p2 (df d)))
We want all this to unravel in one sweep.
+Note [Rewrite rules and inlining]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+In general we try to arrange that inlining is disabled (via a pragma) if
+a rewrite rule should apply, so that the rule has a decent chance to fire
+before we inline the function.
+
+But it turns out that (especially when type-class specialisation or
+SpecConstr is involved) it is very helpful for the rewrite rule to
+"win" over inlining when both are active at once: see #21851, #22097.
+
+The simplifier arranges to do this, as follows. In effect, the ai_rewrite
+field of the ArgInfo record is the state of a little state-machine:
+
+* mkArgInfo sets the ai_rewrite field to TryRules if there are any rewrite
+ rules available for that function.
+
+* rebuildCall simplifies arguments until enough are simplified to match the
+ rule with greatest arity. See Note [RULES apply to simplified arguments]
+ and the first field of `TryRules`.
+
+ But no more! As soon as we have simplified enough arguments to satisfy the
+ maximum-arity rules, we try the rules; see Note [Trying rewrite rules].
+
+* Once we have tried rules (or immediately if there are no rules) set
+ ai_rewrite to TryInlining, and the Simplifier will try to inline the
+ function. We want to try this immediately (before simplifying any (more)
+ arguments). Why? Consider
+ f BIG where f = \x{OneOcc}. ...x...
+ If we inline `f` before simplifying `BIG` we'll use preInlineUnconditionally,
+ and we'll simplify BIG once, at x's occurrence, rather than twice.
+
+
Note [Avoid redundant simplification]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Because RULES apply to simplified arguments, there's a danger of repeatedly
@@ -2327,7 +2372,8 @@ See Note [No free join points in arityType] in GHC.Core.Opt.Arity
-}
tryRules :: SimplEnv -> [CoreRule]
- -> Id -> [ArgSpec]
+ -> Id
+ -> [ArgSpec] -- In /normal, forward/ order
-> SimplCont
-> SimplM (Maybe (SimplEnv, CoreExpr, SimplCont))
@@ -3668,7 +3714,7 @@ mkDupableStrictBind env arg_bndr join_rhs res_ty
| otherwise
= do { join_bndr <- newJoinId [arg_bndr] res_ty
; let arg_info = ArgInfo { ai_fun = join_bndr
- , ai_rules = Nothing, ai_args = []
+ , ai_rewrite = TryNothing, ai_args = []
, ai_encl = False, ai_dmds = repeat topDmd
, ai_discs = repeat 0 }
; return ( addJoinFloats (emptyFloats env) $
=====================================
compiler/GHC/Core/Opt/Simplify/Utils.hs
=====================================
@@ -30,9 +30,10 @@ module GHC.Core.Opt.Simplify.Utils (
interestingCallContext,
-- ArgInfo
- ArgInfo(..), ArgSpec(..), mkArgInfo,
+ ArgInfo(..), ArgSpec(..), RewriteCall(..), mkArgInfo,
addValArgTo, addCastTo, addTyArgTo,
- argInfoExpr, argInfoAppArgs, pushSimplifiedArgs,
+ argInfoExpr, argInfoAppArgs,
+ pushSimplifiedArgs, pushSimplifiedRevArgs,
isStrictArgInfo, lazyArgContext,
abstractFloats,
@@ -52,6 +53,7 @@ import GHC.Core.Ppr
import GHC.Core.TyCo.Ppr ( pprParendType )
import GHC.Core.FVs
import GHC.Core.Utils
+import GHC.Core.Rules( getRules )
import GHC.Core.Opt.Arity
import GHC.Core.Unfold
import GHC.Core.Unfold.Make
@@ -210,6 +212,7 @@ data SimplCont
type StaticEnv = SimplEnv -- Just the static part is relevant
+-- See Note [DupFlag invariants]
data DupFlag = NoDup -- Unsimplified, might be big
| Simplified -- Simplified
| OkToDup -- Simplified and small
@@ -226,8 +229,9 @@ perhapsSubstTy dup env ty
{- Note [StaticEnv invariant]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We pair up an InExpr or InAlts with a StaticEnv, which establishes the
-lexical scope for that InExpr. When we simplify that InExpr/InAlts, we
-use
+lexical scope for that InExpr.
+
+When we simplify that InExpr/InAlts, we use
- Its captured StaticEnv
- Overriding its InScopeSet with the larger one at the
simplification point.
@@ -244,13 +248,14 @@ isn't big enough.
Note [DupFlag invariants]
~~~~~~~~~~~~~~~~~~~~~~~~~
-In both (ApplyToVal dup _ env k)
- and (Select dup _ _ env k)
+In both ApplyToVal { sc_dup = dup, sc_env = env, sc_cont = k}
+ and Select { sc_dup = dup, sc_env = env, sc_cont = k}
the following invariants hold
(a) if dup = OkToDup, then continuation k is also ok-to-dup
- (b) if dup = OkToDup or Simplified, the subst-env is empty
- (and hence no need to re-simplify)
+ (b) if dup = OkToDup or Simplified, the subst-env is empty,
+ or at least is always ignored; the payload is
+ already an OutThing
-}
instance Outputable DupFlag where
@@ -309,7 +314,8 @@ data ArgInfo
ai_fun :: OutId, -- The function
ai_args :: [ArgSpec], -- ...applied to these args (which are in *reverse* order)
- ai_rules :: FunRules, -- Rules for this function
+ ai_rewrite :: RewriteCall, -- What transformation to try next for this call
+ -- See Note [Rewrite rules and inlining] in GHC.Core.Opt.Simplify.Iteration
ai_encl :: Bool, -- Flag saying whether this function
-- or an enclosing one has rules (recursively)
@@ -325,6 +331,12 @@ data ArgInfo
-- Always infinite
}
+data RewriteCall -- What rewriting to try next for this call
+ -- See Note [Rewrite rules and inlining] in GHC.Core.Opt.Simplify.Iteration
+ = TryRules FullArgCount [CoreRule]
+ | TryInlining
+ | TryNothing
+
data ArgSpec
= ValArg { as_dmd :: Demand -- Demand placed on this argument
, as_arg :: OutExpr -- Apply to this (coercion or value); c.f. ApplyToVal
@@ -349,20 +361,20 @@ instance Outputable ArgSpec where
addValArgTo :: ArgInfo -> OutExpr -> OutType -> ArgInfo
addValArgTo ai arg hole_ty
- | ArgInfo { ai_dmds = dmd:dmds, ai_discs = _:discs, ai_rules = rules } <- ai
+ | ArgInfo { ai_dmds = dmd:dmds, ai_discs = _:discs, ai_rewrite = rew } <- ai
-- Pop the top demand and discounts off
, let arg_spec = ValArg { as_arg = arg, as_hole_ty = hole_ty, as_dmd = dmd }
- = ai { ai_args = arg_spec : ai_args ai
- , ai_dmds = dmds
- , ai_discs = discs
- , ai_rules = decRules rules }
+ = ai { ai_args = arg_spec : ai_args ai
+ , ai_dmds = dmds
+ , ai_discs = discs
+ , ai_rewrite = decArgCount rew }
| otherwise
= pprPanic "addValArgTo" (ppr ai $$ ppr arg)
-- There should always be enough demands and discounts
addTyArgTo :: ArgInfo -> OutType -> OutType -> ArgInfo
-addTyArgTo ai arg_ty hole_ty = ai { ai_args = arg_spec : ai_args ai
- , ai_rules = decRules (ai_rules ai) }
+addTyArgTo ai arg_ty hole_ty = ai { ai_args = arg_spec : ai_args ai
+ , ai_rewrite = decArgCount (ai_rewrite ai) }
where
arg_spec = TyArg { as_arg_ty = arg_ty, as_hole_ty = hole_ty }
@@ -381,19 +393,22 @@ argInfoAppArgs (CastBy {} : _) = [] -- Stop at a cast
argInfoAppArgs (ValArg { as_arg = arg } : as) = arg : argInfoAppArgs as
argInfoAppArgs (TyArg { as_arg_ty = ty } : as) = Type ty : argInfoAppArgs as
-pushSimplifiedArgs :: SimplEnv -> [ArgSpec] -> SimplCont -> SimplCont
-pushSimplifiedArgs _env [] k = k
-pushSimplifiedArgs env (arg : args) k
- = case arg of
- TyArg { as_arg_ty = arg_ty, as_hole_ty = hole_ty }
- -> ApplyToTy { sc_arg_ty = arg_ty, sc_hole_ty = hole_ty, sc_cont = rest }
- ValArg { as_arg = arg, as_hole_ty = hole_ty }
- -> ApplyToVal { sc_arg = arg, sc_env = env, sc_dup = Simplified
- , sc_hole_ty = hole_ty, sc_cont = rest }
- CastBy c -> CastIt c rest
- where
- rest = pushSimplifiedArgs env args k
- -- The env has an empty SubstEnv
+pushSimplifiedArgs, pushSimplifiedRevArgs
+ :: SimplEnv
+ -> [ArgSpec] -- In normal, forward order for pushSimplifiedArgs,
+ -- in /reverse/ order for pushSimplifiedRevArgs
+ -> SimplCont -> SimplCont
+pushSimplifiedArgs env args cont = foldr (pushSimplifiedArg env) cont args
+pushSimplifiedRevArgs env args cont = foldl' (\k a -> pushSimplifiedArg env a k) cont args
+
+pushSimplifiedArg :: SimplEnv -> ArgSpec -> SimplCont -> SimplCont
+pushSimplifiedArg _env (TyArg { as_arg_ty = arg_ty, as_hole_ty = hole_ty }) cont
+ = ApplyToTy { sc_arg_ty = arg_ty, sc_hole_ty = hole_ty, sc_cont = cont }
+pushSimplifiedArg env (ValArg { as_arg = arg, as_hole_ty = hole_ty }) cont
+ = ApplyToVal { sc_arg = arg, sc_env = env, sc_dup = Simplified
+ -- The SubstEnv will be ignored since sc_dup=Simplified
+ , sc_hole_ty = hole_ty, sc_cont = cont }
+pushSimplifiedArg _ (CastBy c) cont = CastIt c cont
argInfoExpr :: OutId -> [ArgSpec] -> OutExpr
-- NB: the [ArgSpec] is reversed so that the first arg
@@ -406,18 +421,14 @@ argInfoExpr fun rev_args
go (TyArg { as_arg_ty = ty } : as) = go as `App` Type ty
go (CastBy co : as) = mkCast (go as) co
+decArgCount :: RewriteCall -> RewriteCall
+decArgCount (TryRules n rules) = TryRules (n-1) rules
+decArgCount rew = rew
-type FunRules = Maybe (Int, [CoreRule]) -- Remaining rules for this function
- -- Nothing => No rules
- -- Just (n, rules) => some rules, requiring at least n more type/value args
-
-decRules :: FunRules -> FunRules
-decRules (Just (n, rules)) = Just (n-1, rules)
-decRules Nothing = Nothing
-
-mkFunRules :: [CoreRule] -> FunRules
-mkFunRules [] = Nothing
-mkFunRules rs = Just (n_required, rs)
+mkTryRules :: [CoreRule] -> RewriteCall
+-- See Note [Rewrite rules and inlining] in GHC.Core.Opt.Simplify.Iteration
+mkTryRules [] = TryInlining
+mkTryRules rs = TryRules n_required rs
where
n_required = maximum (map ruleArity rs)
@@ -516,6 +527,7 @@ contHoleScaling (StrictArg { sc_fun_ty = fun_ty, sc_cont = k })
contHoleScaling (ApplyToTy { sc_cont = k }) = contHoleScaling k
contHoleScaling (ApplyToVal { sc_cont = k }) = contHoleScaling k
contHoleScaling (TickIt _ k) = contHoleScaling k
+
-------------------
countArgs :: SimplCont -> Int
-- Count all arguments, including types, coercions,
@@ -525,6 +537,14 @@ countArgs (ApplyToVal { sc_cont = cont }) = 1 + countArgs cont
countArgs (CastIt _ cont) = countArgs cont
countArgs _ = 0
+countValArgs :: SimplCont -> Int
+-- Count value arguments only
+countValArgs (ApplyToTy { sc_cont = cont }) = countValArgs cont
+countValArgs (ApplyToVal { sc_cont = cont }) = 1 + countValArgs cont
+countValArgs (CastIt _ cont) = countValArgs cont
+countValArgs _ = 0
+
+-------------------
contArgs :: SimplCont -> (Bool, [ArgSummary], SimplCont)
-- Summarises value args, discards type args and coercions
-- The returned continuation of the call is only used to
@@ -579,29 +599,26 @@ contEvalContext k = case k of
-- and case binder dmds, see addCaseBndrDmd. No priority right now.
-------------------
-mkArgInfo :: SimplEnv
- -> Id
- -> [CoreRule] -- Rules for function
- -> Int -- Number of value args
- -> SimplCont -- Context of the call
- -> ArgInfo
-
-mkArgInfo env fun rules n_val_args call_cont
+mkArgInfo :: SimplEnv -> RuleEnv -> Id -> SimplCont -> ArgInfo
+
+mkArgInfo env rule_base fun cont
| n_val_args < idArity fun -- Note [Unsaturated functions]
= ArgInfo { ai_fun = fun, ai_args = []
- , ai_rules = fun_rules
+ , ai_rewrite = fun_rules
, ai_encl = False
, ai_dmds = vanilla_dmds
, ai_discs = vanilla_discounts }
| otherwise
= ArgInfo { ai_fun = fun
, ai_args = []
- , ai_rules = fun_rules
- , ai_encl = interestingArgContext rules call_cont
+ , ai_rewrite = fun_rules
+ , ai_encl = notNull rules || contHasRules cont
, ai_dmds = add_type_strictness (idType fun) arg_dmds
, ai_discs = arg_discounts }
where
- fun_rules = mkFunRules rules
+ rules = getRules rule_base fun
+ fun_rules = mkTryRules rules
+ n_val_args = countValArgs cont
vanilla_discounts, arg_discounts :: [Int]
vanilla_discounts = repeat 0
@@ -814,7 +831,7 @@ interestingCallContext env cont
-- a build it's *great* to inline it here. So we must ensure that
-- the context for (f x) is not totally uninteresting.
-interestingArgContext :: [CoreRule] -> SimplCont -> Bool
+contHasRules :: SimplCont -> Bool
-- If the argument has form (f x y), where x,y are boring,
-- and f is marked INLINE, then we don't want to inline f.
-- But if the context of the argument is
@@ -822,33 +839,29 @@ interestingArgContext :: [CoreRule] -> SimplCont -> Bool
-- where g has rules, then we *do* want to inline f, in case it
-- exposes a rule that might fire. Similarly, if the context is
-- h (g (f x x))
--- where h has rules, then we do want to inline f; hence the
--- call_cont argument to interestingArgContext
+-- where h has rules, then we do want to inline f. So contHasRules
+-- tries to see if the context of the f-call is a call to a function
+-- with rules.
--
--- The ai-rules flag makes this happen; if it's
+-- The ai_encl flag makes this happen; if it's
-- set, the inliner gets just enough keener to inline f
-- regardless of how boring f's arguments are, if it's marked INLINE
--
-- The alternative would be to *always* inline an INLINE function,
-- regardless of how boring its context is; but that seems overkill
-- For example, it'd mean that wrapper functions were always inlined
---
--- The call_cont passed to interestingArgContext is the context of
--- the call itself, e.g. g <hole> in the example above
-interestingArgContext rules call_cont
- = notNull rules || enclosing_fn_has_rules
+contHasRules cont
+ = go cont
where
- enclosing_fn_has_rules = go call_cont
-
- go (Select {}) = False
- go (ApplyToVal {}) = False -- Shouldn't really happen
- go (ApplyToTy {}) = False -- Ditto
- go (StrictArg { sc_fun = fun }) = ai_encl fun
- go (StrictBind {}) = False -- ??
- go (CastIt _ c) = go c
- go (Stop _ RuleArgCtxt _) = True
- go (Stop _ _ _) = False
- go (TickIt _ c) = go c
+ go (ApplyToVal { sc_cont = cont }) = go cont
+ go (ApplyToTy { sc_cont = cont }) = go cont
+ go (CastIt _ cont) = go cont
+ go (StrictArg { sc_fun = fun }) = ai_encl fun
+ go (Stop _ RuleArgCtxt _) = True
+ go (TickIt _ c) = go c
+ go (Select {}) = False
+ go (StrictBind {}) = False -- ??
+ go (Stop _ _ _) = False
{- Note [Interesting arguments]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1320,11 +1333,11 @@ preInlineUnconditionally
-- Reason: we don't want to inline single uses, or discard dead bindings,
-- for unlifted, side-effect-ful bindings
preInlineUnconditionally env top_lvl bndr rhs rhs_env
- | not pre_inline_unconditionally = Nothing
+ | not pre_inline = Nothing
| not active = Nothing
| isTopLevel top_lvl && isDeadEndId bndr = Nothing -- Note [Top-level bottoming Ids]
| isCoVar bndr = Nothing -- Note [Do not inline CoVars unconditionally]
- | isExitJoinId bndr = Nothing -- Note [Do not inline exit join points]
+ | keep_exits, isExitJoinId bndr = Nothing -- Note [Do not inline exit join points]
-- in module Exitify
| not (one_occ (idOccInfo bndr)) = Nothing
| not (isStableUnfolding unf) = Just $! (extend_subst_with rhs)
@@ -1334,19 +1347,36 @@ preInlineUnconditionally env top_lvl bndr rhs rhs_env
, Just inl <- maybeUnfoldingTemplate unf = Just $! (extend_subst_with inl)
| otherwise = Nothing
where
+ mode = seMode env
+ phase = sm_phase mode
+ keep_exits = sm_keep_exits mode
+ pre_inline = sm_pre_inline mode
+
unf = idUnfolding bndr
extend_subst_with inl_rhs = extendIdSubst env bndr $! (mkContEx rhs_env inl_rhs)
one_occ IAmDead = True -- Happens in ((\x.1) v)
+
one_occ OneOcc{ occ_n_br = 1
, occ_in_lam = NotInsideLam } = isNotTopLevel top_lvl || early_phase
+
one_occ OneOcc{ occ_n_br = 1
, occ_in_lam = IsInsideLam
, occ_int_cxt = IsInteresting } = canInlineInLam rhs
- one_occ _ = False
- pre_inline_unconditionally = sePreInline env
- active = isActive (sePhase env) (inlinePragmaActivation inline_prag)
+ one_occ OneOcc{ occ_n_br = 1 } -- Inline join points that are used once, even inside
+ | isJoinId bndr = True -- lambdas (which are presumably other join points)
+ -- E.g. join j x = rhs in
+ -- joinrec k y = ....j x....
+ -- Here j must be an exit for k, and we can safely inline it under the lambda
+ -- This includes the case where j is nullary: a nullary join point is just the
+ -- same as an arity-1 one. So we don't look at occ_int_cxt.
+ -- All of this only applies if keep_exits is False, otherwise the
+ -- earlier guard on preInlineUnconditionally would have fired
+
+ one_occ _ = False
+
+ active = isActive phase (inlinePragmaActivation inline_prag)
-- See Note [pre/postInlineUnconditionally in gentle mode]
inline_prag = idInlinePragma bndr
@@ -1378,7 +1408,7 @@ preInlineUnconditionally env top_lvl bndr rhs rhs_env
-- not ticks. Counting ticks cannot be duplicated, and non-counting
-- ticks around a Lam will disappear anyway.
- early_phase = sePhase env /= FinalPhase
+ early_phase = phase /= FinalPhase
-- If we don't have this early_phase test, consider
-- x = length [1,2,3]
-- The full laziness pass carefully floats all the cons cells to
=====================================
compiler/GHC/Core/Opt/SpecConstr.hs
=====================================
@@ -881,7 +881,7 @@ data SpecConstrOpts = SpecConstrOpts
, sc_count :: !(Maybe Int)
-- ^ Max # of specialisations for any one function. Nothing => no limit.
- -- See Note [Avoiding exponential blowup].
+ -- See Note [Avoiding exponential blowup] and decreaseSpecCount
, sc_recursive :: !Int
-- ^ Max # of specialisations over recursive type. Stops
@@ -1098,16 +1098,20 @@ extendCaseBndrs env scrut case_bndr con alt_bndrs
decreaseSpecCount :: ScEnv -> Int -> ScEnv
-- See Note [Avoiding exponential blowup]
-decreaseSpecCount env n_specs
+decreaseSpecCount env _n_specs
= env { sc_force = False -- See Note [Forcing specialisation]
- , sc_opts = (sc_opts env)
- { sc_count = case sc_count $ sc_opts env of
- Nothing -> Nothing
- Just n -> Just $! (n `div` (n_specs + 1))
+ , sc_opts = opts { sc_count = case sc_count opts of
+ Nothing -> Nothing
+ Just n -> Just $! dec n
}
}
- -- The "+1" takes account of the original function;
- -- See Note [Avoiding exponential blowup]
+ where
+ opts = sc_opts env
+ dec n = n `div` 2 -- See Note [Avoiding exponential blowup]
+
+ -- Or: n `div` (n_specs + 1)
+ -- See the historical note part of Note [Avoiding exponential blowup]
+ -- The "+1" takes account of the original function;
---------------------------------------------------
-- See Note [Forcing specialisation]
@@ -1183,9 +1187,20 @@ we can specialise $j2, and similarly $j3. Even if we make just *one*
specialisation of each, because we also have the original we'll get 2^n
copies of $j3, which is not good.
-So when recursively specialising we divide the sc_count by the number of
-copies we are making at this level, including the original.
-
+So when recursively specialising we divide the sc_count (the maximum
+number of specialisations, in the ScEnv) by two. You might think that
+gives us n*(n/2)*(n/4)... copies of the innermost thing, which is
+still exponential in the depth. But we use integer division, rounding
+down, so if the starting sc_count is 3, we'll get 3 -> 1 -> 0, and
+stop. In fact, simply subtracting 1 would be good enough, for the same
+reason.
+
+Historical note: in the past we divided by (n_specs+1), where n_specs
+is the number of specialisations at this level; but that gets us down
+to zero jolly quickly, which I found led to some regressions. (An
+example is nofib/spectral/fibheaps, the getMin' function inside the
+outer function $sfibToList, which has several interesting call
+patterns.)
************************************************************************
* *
@@ -1512,8 +1527,10 @@ scExpr' env (Case scrut b ty alts)
scrut_occ = case con of
DataAlt dc -- See Note [Do not specialise evals]
| not (single_alt && all deadArgOcc arg_occs)
- -> ScrutOcc (unitUFM dc arg_occs)
- _ -> UnkOcc
+ -> -- pprTrace "sc_alt1" (ppr b' $$ ppr con $$ ppr bs $$ ppr arg_occs) $
+ ScrutOcc (unitUFM dc arg_occs)
+ _ -> -- pprTrace "sc_alt1" (ppr b' $$ ppr con $$ ppr bs $$ ppr arg_occs) $
+ UnkOcc
; return (usg', b_occ `combineOcc` scrut_occ, Alt con bs2 rhs') }
@@ -1792,16 +1809,19 @@ specialise env bind_calls (RI { ri_fn = fn, ri_lam_bndrs = arg_bndrs
, not (null arg_bndrs) -- Only specialise functions
, Just all_calls <- lookupVarEnv bind_calls fn -- Some calls to it
= -- pprTrace "specialise entry {" (ppr fn <+> ppr all_calls) $
- do { (boring_call, new_pats) <- callsToNewPats env fn spec_info arg_occs all_calls
+ do { (boring_call, pats_discarded, new_pats)
+ <- callsToNewPats env fn spec_info arg_occs all_calls
; let n_pats = length new_pats
--- ; if (not (null new_pats) || isJust mb_unspec) then
--- pprTrace "specialise" (vcat [ ppr fn <+> text "with" <+> int n_pats <+> text "good patterns"
--- , text "mb_unspec" <+> ppr (isJust mb_unspec)
--- , text "arg_occs" <+> ppr arg_occs
--- , text "good pats" <+> ppr new_pats]) $
--- return ()
--- else return ()
+-- ; when (not (null new_pats) || isJust mb_unspec) $
+-- pprTraceM "specialise" (vcat [ ppr fn <+> text "with" <+> int n_pats <+> text "good patterns"
+-- , text "boring_call:" <+> ppr boring_call
+-- , text "pats_discarded:" <+> ppr pats_discarded
+-- , text "old spec_count" <+> ppr spec_count
+-- , text "spec count limit" <+> ppr (sc_count (sc_opts env))
+-- , text "mb_unspec" <+> ppr (isJust mb_unspec)
+-- , text "arg_occs" <+> ppr arg_occs
+-- , text "new_pats" <+> ppr new_pats])
; let spec_env = decreaseSpecCount env n_pats
; (spec_usgs, new_specs) <- mapAndUnzipM (spec_one spec_env fn arg_bndrs body)
@@ -1810,7 +1830,7 @@ specialise env bind_calls (RI { ri_fn = fn, ri_lam_bndrs = arg_bndrs
; let spec_usg = combineUsages spec_usgs
- unspec_rhs_needed = boring_call || isExportedId fn
+ unspec_rhs_needed = pats_discarded || boring_call || isExportedId fn
-- If there were any boring calls among the seeds (= all_calls), then those
-- calls will call the un-specialised function. So we should use the seeds
@@ -1821,15 +1841,14 @@ specialise env bind_calls (RI { ri_fn = fn, ri_lam_bndrs = arg_bndrs
-> (spec_usg `combineUsage` rhs_usg, Nothing)
_ -> (spec_usg, mb_unspec)
--- ; pprTrace "specialise return }"
--- (vcat [ ppr fn
--- , text "boring_call:" <+> ppr boring_call
--- , text "new calls:" <+> ppr (scu_calls new_usg)]) $
--- return ()
+-- ; pprTraceM "specialise return }" $
+-- vcat [ ppr fn
+-- , text "unspec_rhs_needed:" <+> ppr unspec_rhs_needed
+-- , text "new calls:" <+> ppr (scu_calls new_usg)]
- ; return (new_usg, SI { si_specs = new_specs ++ specs
- , si_n_specs = spec_count + n_pats
- , si_mb_unspec = mb_unspec' }) }
+ ; return (new_usg, SI { si_specs = new_specs ++ specs
+ , si_n_specs = spec_count + n_pats
+ , si_mb_unspec = mb_unspec' }) }
| otherwise -- No calls, inactive, or not a function
-- Behave as if there was a single, boring call
@@ -1872,7 +1891,9 @@ spec_one :: ScEnv
spec_one env fn arg_bndrs body (call_pat, rule_number)
| CP { cp_qvars = qvars, cp_args = pats, cp_strict_args = cbv_args } <- call_pat
- = do { spec_uniq <- getUniqueM
+ = do { -- pprTraceM "spec_one {" (ppr fn <+> ppr pats)
+
+ ; spec_uniq <- getUniqueM
; let env1 = extendScSubstList (extendScInScope env qvars)
(arg_bndrs `zip` pats)
(body_env, extra_bndrs) = extendBndrs env1 (dropList pats arg_bndrs)
@@ -1898,9 +1919,6 @@ spec_one env fn arg_bndrs body (call_pat, rule_number)
-- ; pprTraceM "body_subst_for" $ ppr (spec_occ) $$ ppr (sc_subst body_env)
; (spec_usg, spec_body) <- scExpr body_env body
--- ; pprTrace "done spec_one }" (ppr fn $$ ppr (scu_calls spec_usg)) $
--- return ()
-
-- And build the results
; (qvars', pats') <- generaliseDictPats qvars pats
; let spec_body_ty = exprType spec_body
@@ -1944,21 +1962,22 @@ spec_one env fn arg_bndrs body (call_pat, rule_number)
fn_name qvars' pats' rule_rhs
-- See Note [Transfer activation]
- -- ; pprTraceM "spec_one {" (vcat [ text "function:" <+> ppr fn <+> braces (ppr (idUnique fn))
- -- , text "sc_count:" <+> ppr (sc_count env)
- -- , text "pats:" <+> ppr pats
- -- , text "call_pat:" <+> ppr call_pat
- -- , text "-->" <+> ppr spec_name
- -- , text "bndrs" <+> ppr arg_bndrs
- -- , text "extra_bndrs" <+> ppr extra_bndrs
- -- , text "cbv_args" <+> ppr cbv_args
- -- , text "spec_lam_args" <+> ppr spec_lam_args
- -- , text "spec_call_args" <+> ppr spec_call_args
- -- , text "rule_rhs" <+> ppr rule_rhs
- -- , text "adds_void_worker_arg" <+> ppr add_void_arg
- -- , text "body" <+> ppr body
- -- , text "spec_rhs" <+> ppr spec_rhs
- -- , text "how_bound" <+> ppr (sc_how_bound env) ])
+-- ; pprTraceM "spec_one end }" $
+-- vcat [ text "function:" <+> ppr fn <+> braces (ppr (idUnique fn))
+-- , text "pats:" <+> ppr pats
+-- , text "call_pat:" <+> ppr call_pat
+-- , text "-->" <+> ppr spec_name
+-- , text "bndrs" <+> ppr arg_bndrs
+-- , text "extra_bndrs" <+> ppr extra_bndrs
+-- , text "cbv_args" <+> ppr cbv_args
+-- , text "spec_lam_args" <+> ppr spec_lam_args
+-- , text "spec_call_args" <+> ppr spec_call_args
+-- , text "rule_rhs" <+> ppr rule_rhs
+-- , text "adds_void_worker_arg" <+> ppr add_void_arg
+---- , text "body" <+> ppr body
+---- , text "spec_rhs" <+> ppr spec_rhs
+---- , text "how_bound" <+> ppr (sc_how_bound env) ]
+-- ]
; return (spec_usg, OS { os_pat = call_pat, os_rule = rule
, os_id = spec_id
, os_rhs = spec_rhs }) }
@@ -2328,7 +2347,9 @@ instance Outputable CallPat where
callsToNewPats :: ScEnv -> Id
-> SpecInfo
-> [ArgOcc] -> [Call]
- -> UniqSM (Bool, [CallPat])
+ -> UniqSM ( Bool -- At least one boring call
+ , Bool -- Patterns were discarded
+ , [CallPat] ) -- Patterns to specialise
-- Result has no duplicate patterns,
-- nor ones mentioned in si_specs (hence "new" patterns)
-- Bool indicates that there was at least one boring pattern
@@ -2360,12 +2381,11 @@ callsToNewPats env fn spec_info@(SI { si_specs = done_specs }) bndr_occs calls
-- Discard specialisations if there are too many of them
(pats_were_discarded, trimmed_pats) = trim_pats env fn spec_info small_pats
--- ; pprTrace "callsToPats" (vcat [ text "calls to" <+> ppr fn <> colon <+> ppr calls
--- , text "done_specs:" <+> ppr (map os_pat done_specs)
--- , text "good_pats:" <+> ppr good_pats ]) $
--- return ()
+-- ; pprTraceM "callsToPats" (vcat [ text "calls to" <+> ppr fn <> colon <+> ppr calls
+-- , text "done_specs:" <+> ppr (map os_pat done_specs)
+-- , text "trimmed_pats:" <+> ppr trimmed_pats ])
- ; return (have_boring_call || pats_were_discarded, trimmed_pats) }
+ ; return (have_boring_call, pats_were_discarded, trimmed_pats) }
-- If any of the calls does not give rise to a specialisation, either
-- because it is boring, or because there are too many specialisations,
-- return a flag to say so, so that we know to keep the original function.
@@ -2474,29 +2494,29 @@ callToPats env bndr_occs call@(Call fn args con_env)
sanitise id = updateIdTypeAndMult expandTypeSynonyms id
-- See Note [Free type variables of the qvar types]
- -- Bad coercion variables: see Note [SpecConstr and casts]
- bad_covars :: CoVarSet
+
+ -- Check for bad coercion variables: see Note [SpecConstr and casts]
+ ; let bad_covars :: CoVarSet
bad_covars = mapUnionVarSet get_bad_covars pats
get_bad_covars :: CoreArg -> CoVarSet
get_bad_covars (Type ty) = filterVarSet bad_covar (tyCoVarsOfType ty)
get_bad_covars _ = emptyVarSet
bad_covar v = isId v && not (is_in_scope v)
- ; -- pprTrace "callToPats" (ppr args $$ ppr bndr_occs) $
- warnPprTrace (not (isEmptyVarSet bad_covars))
+ ; warnPprTrace (not (isEmptyVarSet bad_covars))
"SpecConstr: bad covars"
(ppr bad_covars $$ ppr call) $
+
if interesting && isEmptyVarSet bad_covars
- then do
- -- pprTraceM "callToPatsOut" (
- -- text "fn:" <+> ppr fn $$
- -- text "args:" <+> ppr args $$
- -- text "in_scope:" <+> ppr in_scope $$
- -- -- text "in_scope:" <+> ppr in_scope $$
- -- text "pat_fvs:" <+> ppr pat_fvs
- -- )
- -- ppr (CP { cp_qvars = qvars', cp_args = pats })) >>
- return (Just (CP { cp_qvars = qvars', cp_args = pats, cp_strict_args = concat cbv_ids }))
+ then do { let cp_res = CP { cp_qvars = qvars', cp_args = pats
+ , cp_strict_args = concat cbv_ids }
+-- ; pprTraceM "callToPatsOut" $
+-- vcat [ text "fn:" <+> ppr fn
+-- , text "args:" <+> ppr args
+-- , text "bndr_occs:" <+> ppr bndr_occs
+-- , text "pat_fvs:" <+> ppr pat_fvs
+-- , text "cp_res:" <+> ppr cp_res ]
+ ; return (Just cp_res) }
else return Nothing }
-- argToPat takes an actual argument, and returns an abstracted
=====================================
compiler/GHC/Driver/Config/Core/Opt/Simplify.hs
=====================================
@@ -2,7 +2,6 @@ module GHC.Driver.Config.Core.Opt.Simplify
( initSimplifyExprOpts
, initSimplifyOpts
, initSimplMode
- , initGentleSimplMode
) where
import GHC.Prelude
@@ -27,12 +26,13 @@ import GHC.Types.Var ( Var )
initSimplifyExprOpts :: DynFlags -> InteractiveContext -> SimplifyExprOpts
initSimplifyExprOpts dflags ic = SimplifyExprOpts
{ se_fam_inst = snd $ ic_instances ic
- , se_mode = (initSimplMode dflags InitialPhase "GHCi")
- { sm_inline = False
- -- Do not do any inlining, in case we expose some
- -- unboxed tuple stuff that confuses the bytecode
+
+ , se_mode = (initSimplMode dflags) { sm_names = ["GHCi"]
+ , sm_inline = False }
+ -- sm_inline: do not do any inlining, in case we expose
+ -- some unboxed tuple stuff that confuses the bytecode
-- interpreter
- }
+
, se_top_env_cfg = TopEnvConfig
{ te_history_size = historySize dflags
, te_tick_factor = simplTickFactor dflags
@@ -57,31 +57,25 @@ initSimplifyOpts dflags extra_vars iterations mode rule_base = let
}
in opts
-initSimplMode :: DynFlags -> CompilerPhase -> String -> SimplMode
-initSimplMode dflags phase name = SimplMode
- { sm_names = [name]
- , sm_phase = phase
- , sm_rules = gopt Opt_EnableRewriteRules dflags
- , sm_eta_expand = gopt Opt_DoLambdaEtaExpansion dflags
- , sm_cast_swizzle = True
- , sm_inline = True
- , sm_uf_opts = unfoldingOpts dflags
- , sm_case_case = True
- , sm_pre_inline = gopt Opt_SimplPreInlining dflags
- , sm_float_enable = floatEnable dflags
+initSimplMode :: DynFlags -> SimplMode
+initSimplMode dflags = SimplMode
+ { sm_names = ["Unknown simplifier run"] -- Always overridden
+ , sm_phase = InitialPhase
+ , sm_rules = gopt Opt_EnableRewriteRules dflags
+ , sm_eta_expand = gopt Opt_DoLambdaEtaExpansion dflags
+ , sm_pre_inline = gopt Opt_SimplPreInlining dflags
, sm_do_eta_reduction = gopt Opt_DoEtaReduction dflags
- , sm_arity_opts = initArityOpts dflags
- , sm_rule_opts = initRuleOpts dflags
- , sm_case_folding = gopt Opt_CaseFolding dflags
- , sm_case_merge = gopt Opt_CaseMerge dflags
- , sm_co_opt_opts = initOptCoercionOpts dflags
- }
-
-initGentleSimplMode :: DynFlags -> SimplMode
-initGentleSimplMode dflags = (initSimplMode dflags InitialPhase "Gentle")
- { -- Don't do case-of-case transformations.
- -- This makes full laziness work better
- sm_case_case = False
+ , sm_uf_opts = unfoldingOpts dflags
+ , sm_float_enable = floatEnable dflags
+ , sm_arity_opts = initArityOpts dflags
+ , sm_rule_opts = initRuleOpts dflags
+ , sm_case_folding = gopt Opt_CaseFolding dflags
+ , sm_case_merge = gopt Opt_CaseMerge dflags
+ , sm_co_opt_opts = initOptCoercionOpts dflags
+ , sm_cast_swizzle = True
+ , sm_inline = True
+ , sm_case_case = True
+ , sm_keep_exits = False
}
floatEnable :: DynFlags -> FloatEnable
=====================================
compiler/GHC/Utils/Json.hs
=====================================
@@ -14,6 +14,7 @@ data JsonDoc where
JSBool :: Bool -> JsonDoc
JSInt :: Int -> JsonDoc
JSString :: String -> JsonDoc
+ -- ^ The 'String' is unescaped
JSArray :: [JsonDoc] -> JsonDoc
JSObject :: [(String, JsonDoc)] -> JsonDoc
@@ -57,7 +58,7 @@ class ToJson a where
json :: a -> JsonDoc
instance ToJson String where
- json = JSString . escapeJsonString
+ json = JSString
instance ToJson Int where
json = JSInt
=====================================
compiler/GHC/Utils/Monad.hs
=====================================
@@ -163,7 +163,10 @@ mapSndM = traverse . traverse
-- | Monadic version of concatMap
concatMapM :: (Monad m, Traversable f) => (a -> m [b]) -> f a -> m [b]
concatMapM f xs = liftM concat (mapM f xs)
-{-# SPECIALIZE concatMapM :: (Monad m) => (a -> m [b]) -> [a] -> m [b] #-}
+{-# INLINE concatMapM #-}
+-- It's better to inline this than to specialise
+-- concatMapM :: (Monad m) => (a -> m [b]) -> [a] -> m [b]
+-- Inlining cuts compiler allocation by around 1%
-- | Applicative version of mapMaybe
mapMaybeM :: Applicative m => (a -> m (Maybe b)) -> [a] -> m [b]
=====================================
testsuite/tests/lib/integer/Makefile
=====================================
@@ -11,8 +11,9 @@ CHECK2 = grep -q -- '$1' folding.simpl || \
.PHONY: integerConstantFolding
integerConstantFolding:
- '$(TEST_HC)' -Wall -v0 -O --make integerConstantFolding -fforce-recomp -ddump-simpl > folding.simpl
+ '$(TEST_HC)' -Wall -v0 -O --make integerConstantFolding -fforce-recomp -ddump-simpl -dno-debug-output > folding.simpl
# All the 100nnn values should be constant-folded away
+# -dno-debug-output suppresses a "Glomming" message
! grep -q '\<100[0-9][0-9][0-9]\>' folding.simpl || { echo "Unfolded values found"; grep '\<100[0-9][0-9][0-9]\>' folding.simpl; }
$(call CHECK,\<200007\>,plusInteger)
$(call CHECK,\<683234160\>,timesInteger)
@@ -64,8 +65,9 @@ IntegerConversionRules:
.PHONY: naturalConstantFolding
naturalConstantFolding:
- '$(TEST_HC)' -Wall -v0 -O --make naturalConstantFolding -fforce-recomp -ddump-simpl > folding.simpl
+ '$(TEST_HC)' -Wall -v0 -O --make naturalConstantFolding -fforce-recomp -ddump-simpl -dno-debug-output > folding.simpl
# All the 100nnn values should be constant-folded away
+# -dno-debug-output suppresses a "Glomming" message
! grep -q '\<100[0-9][0-9][0-9]\>' folding.simpl || { echo "Unfolded values found"; grep '\<100[0-9][0-9][0-9]\>' folding.simpl; }
# Bit arithmetic
$(call CHECK,\<532\>,andNatural)
=====================================
testsuite/tests/simplCore/should_compile/T21148.hs
=====================================
@@ -0,0 +1,12 @@
+module T211148 where
+
+-- The point of this test is that f should get a (nested)
+-- CPR property, with a worker of type
+-- $wf :: Int# -> State# RealWorld -> (# State# RealWorld, Int# #)
+
+{-# NOINLINE f #-}
+-- The NOINLINE makes GHC do a worker/wrapper split
+-- even though f is small
+f :: Int -> IO Int
+f x = return $! sum [0..x]
+
=====================================
testsuite/tests/simplCore/should_compile/T21148.stderr
=====================================
@@ -0,0 +1,126 @@
+
+==================== Tidy Core ====================
+Result size of Tidy Core
+ = {terms: 73, types: 80, coercions: 6, joins: 2/2}
+
+-- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
+T211148.$trModule4 :: GHC.Prim.Addr#
+[GblId,
+ Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
+ WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 20 0}]
+T211148.$trModule4 = "main"#
+
+-- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
+T211148.$trModule3 :: GHC.Types.TrName
+[GblId,
+ Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
+ WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
+T211148.$trModule3 = GHC.Types.TrNameS T211148.$trModule4
+
+-- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
+T211148.$trModule2 :: GHC.Prim.Addr#
+[GblId,
+ Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
+ WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 30 0}]
+T211148.$trModule2 = "T211148"#
+
+-- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
+T211148.$trModule1 :: GHC.Types.TrName
+[GblId,
+ Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
+ WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
+T211148.$trModule1 = GHC.Types.TrNameS T211148.$trModule2
+
+-- RHS size: {terms: 3, types: 0, coercions: 0, joins: 0/0}
+T211148.$trModule :: GHC.Types.Module
+[GblId,
+ Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
+ WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
+T211148.$trModule
+ = GHC.Types.Module T211148.$trModule3 T211148.$trModule1
+
+-- RHS size: {terms: 41, types: 35, coercions: 0, joins: 2/2}
+T211148.$wf [InlPrag=NOINLINE]
+ :: GHC.Prim.Int#
+ -> GHC.Prim.State# GHC.Prim.RealWorld
+ -> (# GHC.Prim.State# GHC.Prim.RealWorld, GHC.Prim.Int# #)
+[GblId, Arity=2, Str=<L><L>, Unf=OtherCon []]
+T211148.$wf
+ = \ (ww_s179 :: GHC.Prim.Int#)
+ (eta_s17b [OS=OneShot] :: GHC.Prim.State# GHC.Prim.RealWorld) ->
+ case GHC.Prim.># 0# ww_s179 of {
+ __DEFAULT ->
+ join {
+ exit_X0 [Dmd=SC(S,C(1,!P(L,L)))]
+ :: GHC.Prim.Int#
+ -> GHC.Prim.Int#
+ -> (# GHC.Prim.State# GHC.Prim.RealWorld, GHC.Prim.Int# #)
+ [LclId[JoinId(2)(Nothing)], Arity=2, Str=<L><L>]
+ exit_X0 (x_s16Z [OS=OneShot] :: GHC.Prim.Int#)
+ (ww1_s172 [OS=OneShot] :: GHC.Prim.Int#)
+ = (# eta_s17b, GHC.Prim.+# ww1_s172 x_s16Z #) } in
+ joinrec {
+ $wgo3_s175 [InlPrag=[2], Occ=LoopBreaker, Dmd=SC(S,C(1,!P(L,L)))]
+ :: GHC.Prim.Int#
+ -> GHC.Prim.Int#
+ -> (# GHC.Prim.State# GHC.Prim.RealWorld, GHC.Prim.Int# #)
+ [LclId[JoinId(2)(Nothing)], Arity=2, Str=<L><L>, Unf=OtherCon []]
+ $wgo3_s175 (x_s16Z :: GHC.Prim.Int#) (ww1_s172 :: GHC.Prim.Int#)
+ = case GHC.Prim.==# x_s16Z ww_s179 of {
+ __DEFAULT ->
+ jump $wgo3_s175
+ (GHC.Prim.+# x_s16Z 1#) (GHC.Prim.+# ww1_s172 x_s16Z);
+ 1# -> jump exit_X0 x_s16Z ww1_s172
+ }; } in
+ jump $wgo3_s175 0# 0#;
+ 1# -> (# eta_s17b, 0# #)
+ }
+
+-- RHS size: {terms: 14, types: 19, coercions: 0, joins: 0/0}
+T211148.f1 [InlPrag=NOINLINE[final]]
+ :: Int
+ -> GHC.Prim.State# GHC.Prim.RealWorld
+ -> (# GHC.Prim.State# GHC.Prim.RealWorld, Int #)
+[GblId,
+ Arity=2,
+ Str=<1!P(L)><L>,
+ Cpr=1(, 1),
+ Unf=Unf{Src=StableSystem, TopLvl=True, Value=True, ConLike=True,
+ WorkFree=True, Expandable=True,
+ Guidance=ALWAYS_IF(arity=2,unsat_ok=True,boring_ok=False)
+ Tmpl= \ (x_s177 [Occ=Once1!] :: Int)
+ (eta_s17b [Occ=Once1, OS=OneShot]
+ :: GHC.Prim.State# GHC.Prim.RealWorld) ->
+ case x_s177 of { GHC.Types.I# ww_s179 [Occ=Once1] ->
+ case T211148.$wf ww_s179 eta_s17b of
+ { (# ww1_s17e [Occ=Once1], ww2_s17j [Occ=Once1] #) ->
+ (# ww1_s17e, GHC.Types.I# ww2_s17j #)
+ }
+ }}]
+T211148.f1
+ = \ (x_s177 :: Int)
+ (eta_s17b [OS=OneShot] :: GHC.Prim.State# GHC.Prim.RealWorld) ->
+ case x_s177 of { GHC.Types.I# ww_s179 ->
+ case T211148.$wf ww_s179 eta_s17b of { (# ww1_s17e, ww2_s17j #) ->
+ (# ww1_s17e, GHC.Types.I# ww2_s17j #)
+ }
+ }
+
+-- RHS size: {terms: 1, types: 0, coercions: 6, joins: 0/0}
+f [InlPrag=NOINLINE[final]] :: Int -> IO Int
+[GblId,
+ Arity=2,
+ Str=<1!P(L)><L>,
+ Cpr=1(, 1),
+ Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
+ WorkFree=True, Expandable=True,
+ Guidance=ALWAYS_IF(arity=0,unsat_ok=True,boring_ok=True)}]
+f = T211148.f1
+ `cast` (<Int>_R %<'Many>_N ->_R Sym (GHC.Types.N:IO[0] <Int>_R)
+ :: (Int
+ -> GHC.Prim.State# GHC.Prim.RealWorld
+ -> (# GHC.Prim.State# GHC.Prim.RealWorld, Int #))
+ ~R# (Int -> IO Int))
+
+
+
=====================================
testsuite/tests/simplCore/should_compile/T21851.hs
=====================================
@@ -0,0 +1,15 @@
+{-# OPTIONS_GHC -ddump-simpl #-}
+
+module T21851 (g') where
+import T21851a
+
+g :: Num a => a -> a
+g x = fst (f x)
+{-# NOINLINE[99] g #-}
+
+g' :: Int -> Int
+g' = g
+
+-- We should see a call to a /specialised/ version of `f`,
+-- something like
+-- g' = \ (x :: Int) -> case T21851a.$w$sf x of { (# ww, ww1 #) -> ww }
=====================================
testsuite/tests/simplCore/should_compile/T21851.stderr
=====================================
@@ -0,0 +1,19 @@
+[1 of 2] Compiling T21851a ( T21851a.hs, T21851a.o )
+[2 of 2] Compiling T21851 ( T21851.hs, T21851.o )
+
+==================== Tidy Core ====================
+Result size of Tidy Core
+ = {terms: 7, types: 10, coercions: 0, joins: 0/0}
+
+-- RHS size: {terms: 6, types: 8, coercions: 0, joins: 0/0}
+g' :: Int -> Int
+[GblId,
+ Arity=1,
+ Str=<L>,
+ Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
+ WorkFree=True, Expandable=True, Guidance=IF_ARGS [0] 30 0}]
+g'
+ = \ (x :: Int) -> case T21851a.$w$sf x of { (# ww, ww1 #) -> ww }
+
+
+
=====================================
testsuite/tests/simplCore/should_compile/T21851a.hs
=====================================
@@ -0,0 +1,5 @@
+module T21851a where
+
+f :: Num b => b -> (b, b) -- note: recursive to prevent inlining
+f x = (x + 1, snd (f x)) -- on such a small example
+{-# SPECIALIZE f :: Int -> (Int, Int) #-}
=====================================
testsuite/tests/simplCore/should_compile/T22097.hs
=====================================
@@ -0,0 +1,7 @@
+{-# OPTIONS_GHC -ddump-simpl #-}
+{-# LANGUAGE TypeApplications #-}
+module T22097 where
+import T22097a ( isEven )
+
+main :: IO ()
+main = print $ isEven @Int 10
=====================================
testsuite/tests/simplCore/should_compile/T22097.stderr
=====================================
@@ -0,0 +1,46 @@
+[1 of 2] Compiling T22097a ( T22097a.hs, T22097a.o )
+[2 of 2] Compiling T22097 ( T22097.hs, T22097.o )
+
+==================== Tidy Core ====================
+Result size of Tidy Core
+ = {terms: 15, types: 14, coercions: 3, joins: 0/0}
+
+-- RHS size: {terms: 5, types: 1, coercions: 0, joins: 0/0}
+T22097.main2 :: String
+[GblId,
+ Unf=Unf{Src=<vanilla>, TopLvl=True, Value=False, ConLike=False,
+ WorkFree=False, Expandable=False, Guidance=IF_ARGS [] 30 0}]
+T22097.main2
+ = case T22097a.$wgoEven 10# of { (# #) -> GHC.Show.$fShowBool4 }
+
+-- RHS size: {terms: 6, types: 2, coercions: 0, joins: 0/0}
+T22097.main1
+ :: GHC.Prim.State# GHC.Prim.RealWorld
+ -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)
+[GblId,
+ Arity=1,
+ Str=<L>,
+ Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
+ WorkFree=True, Expandable=True, Guidance=IF_ARGS [0] 40 0}]
+T22097.main1
+ = \ (eta [OS=OneShot] :: GHC.Prim.State# GHC.Prim.RealWorld) ->
+ GHC.IO.Handle.Text.hPutStr2
+ GHC.IO.Handle.FD.stdout T22097.main2 GHC.Types.True eta
+
+-- RHS size: {terms: 1, types: 0, coercions: 3, joins: 0/0}
+main :: IO ()
+[GblId,
+ Arity=1,
+ Str=<L>,
+ Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
+ WorkFree=True, Expandable=True,
+ Guidance=ALWAYS_IF(arity=0,unsat_ok=True,boring_ok=True)}]
+main
+ = T22097.main1
+ `cast` (Sym (GHC.Types.N:IO[0] <()>_R)
+ :: (GHC.Prim.State# GHC.Prim.RealWorld
+ -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #))
+ ~R# IO ())
+
+
+
=====================================
testsuite/tests/simplCore/should_compile/T22097a.hs
=====================================
@@ -0,0 +1,23 @@
+module T22097a
+ ( isEven, isOdd )
+where
+
+{-# SPECIALIZE isEven :: Int -> Bool #-}
+isEven :: Integral a => a -> Bool
+isEven = fst evenOdd
+
+{-# SPECIALIZE isOdd :: Int -> Bool #-}
+isOdd :: Integral a => a -> Bool
+isOdd = snd evenOdd
+
+evenOdd :: Integral a => (a -> Bool, a -> Bool)
+evenOdd = (goEven, goOdd)
+ where
+ goEven n
+ | n < 0 = goEven (- n)
+ | n > 0 = goOdd (n - 1)
+ | otherwise = True
+
+ goOdd n
+ | n < 0 = goOdd n
+ | otherwise = goEven n
=====================================
testsuite/tests/simplCore/should_compile/T6056.stderr
=====================================
@@ -1,4 +1,4 @@
Rule fired: SPEC/T6056 $wsmallerAndRest @Int (T6056)
Rule fired: SPEC/T6056 $wsmallerAndRest @Int (T6056)
Rule fired: SPEC/T6056 $wsmallerAndRest @Int (T6056)
-Rule fired: SPEC/T6056 $wsmallerAndRest @Int (T6056)
+Rule fired: SPEC/T6056 smallerAndRest @Int (T6056)
=====================================
testsuite/tests/simplCore/should_compile/all.T
=====================================
@@ -429,4 +429,9 @@ test('T21763a', only_ways(['optasm']), compile, ['-O2 -ddump-rules'])
test('T22028', normal, compile, ['-O -ddump-rule-firings'])
test('T22114', normal, compile, ['-O'])
test('T21286', normal, multimod_compile, ['T21286', '-O -ddump-rule-firings'])
+test('T21148', [grep_errmsg(r'Cpr=') ], compile, ['-O -ddump-simpl'])
+# One module, T21851.hs, has OPTIONS_GHC -ddump-simpl
+test('T21851', [grep_errmsg(r'case.*w\$sf') ], multimod_compile, ['T21851', '-O -dno-typeable-binds -dsuppress-uniques'])
+# One module, T22097.hs, has OPTIONS_GHC -ddump-simpl
+test('T22097', [grep_errmsg(r'case.*wgoEven') ], multimod_compile, ['T22097', '-O -dno-typeable-binds -dsuppress-uniques'])
=====================================
testsuite/tests/stranal/should_compile/T21128.hs
=====================================
@@ -2,6 +2,10 @@ module T21128 where
import T21128a
+{- This test originally had some unnecessary reboxing of y
+in the hot path of $wtheresCrud. That reboxing should
+not happen. -}
+
theresCrud :: Int -> Int -> Int
theresCrud x y = go x
where
@@ -9,3 +13,4 @@ theresCrud x y = go x
go 1 = index x y 1
go n = go (n-1)
{-# NOINLINE theresCrud #-}
+
=====================================
testsuite/tests/stranal/should_compile/T21128.stderr
=====================================
@@ -1,7 +1,7 @@
==================== Tidy Core ====================
Result size of Tidy Core
- = {terms: 137, types: 92, coercions: 4, joins: 0/0}
+ = {terms: 125, types: 68, coercions: 4, joins: 0/0}
lvl = "error"#
@@ -29,17 +29,11 @@ lvl9 = SrcLoc lvl2 lvl3 lvl5 lvl6 lvl7 lvl6 lvl8
lvl10 = PushCallStack lvl1 lvl9 EmptyCallStack
-$windexError
- = \ @a @b ww eta eta1 eta2 ->
- error
- (lvl10 `cast` <Co:4> :: CallStack ~R# (?callStack::CallStack))
- (++ (ww eta) (++ (ww eta1) (ww eta2)))
-
indexError
= \ @a @b $dShow eta eta1 eta2 ->
- case $dShow of { C:Show ww ww1 ww2 ->
- $windexError ww1 eta eta1 eta2
- }
+ error
+ (lvl10 `cast` <Co:4> :: ...)
+ (++ (show $dShow eta) (++ (show $dShow eta1) (show $dShow eta2)))
$trModule3 = TrNameS $trModule4
@@ -48,8 +42,7 @@ $trModule1 = TrNameS $trModule2
$trModule = Module $trModule3 $trModule1
$wlvl
- = \ ww ww1 ww2 ->
- $windexError $fShowInt_$cshow (I# ww2) (I# ww1) (I# ww)
+ = \ ww ww1 ww2 -> indexError $fShowInt (I# ww2) (I# ww1) (I# ww)
index
= \ l u i ->
@@ -73,7 +66,7 @@ index
==================== Tidy Core ====================
Result size of Tidy Core
- = {terms: 108, types: 47, coercions: 0, joins: 3/4}
+ = {terms: 108, types: 46, coercions: 0, joins: 3/3}
$trModule4 = "main"#
@@ -89,35 +82,34 @@ i = I# 1#
l = I# 0#
-lvl = \ y -> $windexError $fShowInt_$cshow l y l
+lvl = \ x ww -> indexError $fShowInt x (I# ww) i
-lvl1 = \ ww y -> $windexError $fShowInt_$cshow (I# ww) y i
+lvl1 = \ ww -> indexError $fShowInt l (I# ww) l
$wtheresCrud
= \ ww ww1 ->
- let { y = I# ww1 } in
join {
- lvl2
+ exit
+ = case <# 0# ww1 of {
+ __DEFAULT -> case lvl1 ww1 of wild { };
+ 1# -> 0#
+ } } in
+ join {
+ exit1
= case <=# ww 1# of {
- __DEFAULT -> case lvl1 ww y of wild { };
+ __DEFAULT -> case lvl (I# ww) ww1 of wild { };
1# ->
case <# 1# ww1 of {
- __DEFAULT -> case lvl1 ww y of wild { };
+ __DEFAULT -> case lvl (I# ww) ww1 of wild { };
1# -> -# 1# ww
}
} } in
- join {
- lvl3
- = case <# 0# ww1 of {
- __DEFAULT -> case lvl y of wild { };
- 1# -> 0#
- } } in
joinrec {
$wgo ww2
= case ww2 of wild {
__DEFAULT -> jump $wgo (-# wild 1#);
- 0# -> jump lvl3;
- 1# -> jump lvl2
+ 0# -> jump exit;
+ 1# -> jump exit1
}; } in
jump $wgo ww
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/af740ff6ebbe2fb378f7cb2d25223a816979d892...e4cac1b86f350d837a4b02eebb4cd5c637181cfc