[Git][ghc/ghc][wip/T22084] 2 commits: Don't keep exit join points so much

Simon Peyton Jones (@simonpj) gitlab at gitlab.haskell.org
Thu Sep 29 14:18:42 UTC 2022



Simon Peyton Jones pushed to branch wip/T22084 at Glasgow Haskell Compiler / GHC


Commits:
8eaa813d by Simon Peyton Jones at 2022-09-28T09:37:42+01:00
Don't keep exit join points so much

We were religiously keeping exit join points throughout, which
had some bad effects (#21148, #22084).

This MR arranges that exit join points are inhibited from inlining
only in /one/ Simplifier pass (right after Exitification).

It's not a big deal, but it shaves 0.1% off compile times.

- - - - -
ac9987f3 by Simon Peyton Jones at 2022-09-29T15:20:01+01:00
Inline used-once non-recursive join points very aggressively

- - - - -


9 changed files:

- compiler/GHC/Core/Opt/Exitify.hs
- compiler/GHC/Core/Opt/Pipeline.hs
- compiler/GHC/Core/Opt/Simplify/Env.hs
- compiler/GHC/Core/Opt/Simplify/Utils.hs
- compiler/GHC/Core/Opt/SpecConstr.hs
- compiler/GHC/Driver/Config/Core/Opt/Simplify.hs
- + testsuite/tests/simplCore/should_compile/T21148.hs
- + testsuite/tests/simplCore/should_compile/T21148.stderr
- testsuite/tests/simplCore/should_compile/all.T


Changes:

=====================================
compiler/GHC/Core/Opt/Exitify.hs
=====================================
@@ -433,6 +433,7 @@ inlining.
 Exit join points, recognizable using `isExitJoinId` are join points with an
 occurrence in a recursive group, and can be recognized (after the occurrence
 analyzer ran!) using `isExitJoinId`.
+
 This function detects joinpoints with `occ_in_lam (idOccinfo id) == True`,
 because the lambdas of a non-recursive join point are not considered for
 `occ_in_lam`.  For example, in the following code, `j1` is /not/ marked
@@ -446,6 +447,29 @@ To prevent inlining, we check for isExitJoinId
 * In `simplLetUnfolding` we simply give exit join points no unfolding, which
   prevents inlining in `postInlineUnconditionally` and call sites.
 
+But see Note [Be selective about not-inlining exit join points]
+
+Note [Be selective about not-inlining exit join points]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+If we follow "do not inline exit join points" mantra throughout,
+some bad things happen.
+
+* We can lose CPR information: see #21148
+
+* We get useless clutter (#22084) that
+  - makes the program bigger (including duplicated code #20739), and
+  - adds extra jumps (and maybe stack saves) at runtime
+
+So instead we follow "do not inline exit join points" for a /single run/
+of the simplifier, right after Exitification.  That should give a
+sufficient chance for used-once things to inline, but subsequent runs
+will inline them back in.  (Annoyingly, as things stand, only with -O2
+is there a subsequent run, but that might change, and it's not a huge
+deal anyway.)
+
+This is controlled by the Simplifier's sm_keep_exits flag; see
+GHC.Core.Opt.Pipeline.
+
 Note [Placement of the exitification pass]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 I (Joachim) experimented with multiple positions for the Exitification pass in


=====================================
compiler/GHC/Core/Opt/Pipeline.hs
=====================================
@@ -15,7 +15,7 @@ import GHC.Driver.Plugins ( withPlugins, installCoreToDos )
 import GHC.Driver.Env
 import GHC.Driver.Config.Core.Lint ( endPass )
 import GHC.Driver.Config.Core.Opt.LiberateCase ( initLiberateCaseOpts )
-import GHC.Driver.Config.Core.Opt.Simplify ( initSimplifyOpts, initSimplMode, initGentleSimplMode )
+import GHC.Driver.Config.Core.Opt.Simplify ( initSimplifyOpts, initSimplMode )
 import GHC.Driver.Config.Core.Opt.WorkWrap ( initWorkWrapOpts )
 import GHC.Driver.Config.Core.Rules ( initRuleOpts )
 import GHC.Platform.Ways  ( hasWay, Way(WayProf) )
@@ -28,6 +28,7 @@ import GHC.Core.Utils   ( dumpIdInfoOfProgram )
 import GHC.Core.Lint    ( lintAnnots )
 import GHC.Core.Lint.Interactive ( interactiveInScope )
 import GHC.Core.Opt.Simplify ( simplifyExpr, simplifyPgm )
+import GHC.Core.Opt.Simplify.Env( SimplMode(..) )
 import GHC.Core.Opt.Simplify.Monad
 import GHC.Core.Opt.Monad
 import GHC.Core.Opt.Pipeline.Types
@@ -154,32 +155,45 @@ getCoreToDo dflags rule_base extra_vars
     maybe_strictness_before _
       = CoreDoNothing
 
-    simpl_phase phase name iter
-      = CoreDoPasses
-      $   [ maybe_strictness_before phase
-          , CoreDoSimplify $ initSimplifyOpts dflags extra_vars iter
-                             (initSimplMode dflags phase name) rule_base
-          , maybe_rule_check phase ]
+    ----------------------------
+    base_simpl_mode :: SimplMode
+    base_simpl_mode = initSimplMode dflags
 
-    -- Run GHC's internal simplification phase, after all rules have run.
-    -- See Note [Compiler phases] in GHC.Types.Basic
-    simplify name = simpl_phase FinalPhase name max_iter
-
-    -- initial simplify: mk specialiser happy: minimum effort please
+    -- gentle_mode: make specialiser happy: minimum effort please
     -- See Note [Inline in InitialPhase]
     -- See Note [RULEs enabled in InitialPhase]
-    simpl_gently = CoreDoSimplify $ initSimplifyOpts dflags extra_vars max_iter
-                                    (initGentleSimplMode dflags) rule_base
+    gentle_mode = base_simpl_mode { sm_names     = ["Gentle"]
+                                  , sm_phase     = InitialPhase
+                                  , sm_case_case = False }
+
+    simpl_mode phase name
+      = base_simpl_mode { sm_names = [name], sm_phase = phase }
+
+    keep_exits :: SimplMode -> SimplMode
+    -- See Note [Be selective about not-inlining exit join points]
+    -- in GHC.Core.Opt.Exitify
+    keep_exits mode = mode { sm_keep_exits = True }
+
+    ----------------------------
+    run_simplifier mode iter
+      = CoreDoSimplify $ initSimplifyOpts dflags extra_vars iter mode rule_base
 
+    simpl_phase phase name iter = CoreDoPasses $
+                                  [ maybe_strictness_before phase
+                                  , run_simplifier (simpl_mode phase name) iter
+                                  , maybe_rule_check phase ]
+
+    -- Run GHC's internal simplification phase, after all rules have run.
+    -- See Note [Compiler phases] in GHC.Types.Basic
+    simpl_gently          = run_simplifier gentle_mode  max_iter
+    simplify_final   name = run_simplifier (             simpl_mode FinalPhase name) max_iter
+    simpl_keep_exits name = run_simplifier (keep_exits $ simpl_mode FinalPhase name) max_iter
+
+    ----------------------------
     dmd_cpr_ww = if ww_on then [CoreDoDemand,CoreDoCpr,CoreDoWorkerWrapper]
                           else [CoreDoDemand,CoreDoCpr]
 
 
-    demand_analyser = (CoreDoPasses (
-                           dmd_cpr_ww ++
-                           [simplify "post-worker-wrapper"]
-                           ))
-
     -- Static forms are moved to the top level with the FloatOut pass.
     -- See Note [Grand plan for static forms] in GHC.Iface.Tidy.StaticPtrTable.
     static_ptrs_float_outwards =
@@ -269,14 +283,16 @@ getCoreToDo dflags rule_base extra_vars
 
         runWhen call_arity $ CoreDoPasses
             [ CoreDoCallArity
-            , simplify "post-call-arity"
+            , simplify_final "post-call-arity"
             ],
 
         -- Strictness analysis
-        runWhen strictness demand_analyser,
+        runWhen strictness $ CoreDoPasses
+            (dmd_cpr_ww ++ [simplify_final "post-worker-wrapper"]),
 
         runWhen exitification CoreDoExitify,
             -- See Note [Placement of the exitification pass]
+            -- in GHC.Core.Opt.Exitify
 
         runWhen full_laziness $
            CoreDoFloatOutwards FloatOutSwitches {
@@ -298,7 +314,17 @@ getCoreToDo dflags rule_base extra_vars
 
         runWhen do_float_in CoreDoFloatInwards,
 
-        simplify "final",  -- Final tidy-up
+        -- Final tidy-up run of the simplifier
+        simpl_keep_exits "final tidy up",
+            -- Keep exit join point because this is the first
+            -- Simplifier run after Exitify. Subsequent runs will
+            -- re-inline those exit join points; their work is done.
+            -- See Note [Be selective about not-inlining exit join points]
+            -- in GHC.Core.Opt.Exitify
+            --
+            -- Annoyingly, we only /have/ a subsequent run with -O2.  With
+            -- plain -O we'll still have those exit join points hanging around.
+            -- Oh well.
 
         maybe_rule_check FinalPhase,
 
@@ -308,31 +334,31 @@ getCoreToDo dflags rule_base extra_vars
                 -- Case-liberation for -O2.  This should be after
                 -- strictness analysis and the simplification which follows it.
         runWhen liberate_case $ CoreDoPasses
-           [ CoreLiberateCase, simplify "post-liberate-case" ],
+           [ CoreLiberateCase, simplify_final "post-liberate-case" ],
            -- Run the simplifier after LiberateCase to vastly
            -- reduce the possibility of shadowing
            -- Reason: see Note [Shadowing] in GHC.Core.Opt.SpecConstr
 
         runWhen spec_constr $ CoreDoPasses
-           [ CoreDoSpecConstr, simplify "post-spec-constr"],
+           [ CoreDoSpecConstr, simplify_final "post-spec-constr"],
            -- See Note [Simplify after SpecConstr]
 
         maybe_rule_check FinalPhase,
 
         runWhen late_specialise $ CoreDoPasses
-           [ CoreDoSpecialising, simplify "post-late-spec"],
+           [ CoreDoSpecialising, simplify_final "post-late-spec"],
 
         -- LiberateCase can yield new CSE opportunities because it peels
         -- off one layer of a recursive function (concretely, I saw this
         -- in wheel-sieve1), and I'm guessing that SpecConstr can too
         -- And CSE is a very cheap pass. So it seems worth doing here.
         runWhen ((liberate_case || spec_constr) && cse) $ CoreDoPasses
-           [ CoreCSE, simplify "post-final-cse" ],
+           [ CoreCSE, simplify_final "post-final-cse" ],
 
         ---------  End of -O2 passes --------------
 
         runWhen late_dmd_anal $ CoreDoPasses (
-            dmd_cpr_ww ++ [simplify "post-late-ww"]
+            dmd_cpr_ww ++ [simplify_final "post-late-ww"]
           ),
 
         -- Final run of the demand_analyser, ensures that one-shot thunks are


=====================================
compiler/GHC/Core/Opt/Simplify/Env.hs
=====================================
@@ -248,13 +248,16 @@ data SimplMode = SimplMode -- See comments in GHC.Core.Opt.Simplify.Monad
   , sm_uf_opts      :: !UnfoldingOpts -- ^ Unfolding options
   , sm_case_case    :: !Bool          -- ^ Whether case-of-case is enabled
   , sm_pre_inline   :: !Bool          -- ^ Whether pre-inlining is enabled
-  , sm_float_enable :: !FloatEnable   -- ^ Whether to enable floating out
+  , sm_keep_exits   :: !Bool          -- ^ True <=> keep ExitJoinIds
+                                      -- See Note [Do not inline exit join points]
+                                      --          in GHC.Core.Opt.Exitify
+  , sm_float_enable     :: !FloatEnable   -- ^ Whether to enable floating out
   , sm_do_eta_reduction :: !Bool
-  , sm_arity_opts :: !ArityOpts
-  , sm_rule_opts :: !RuleOpts
-  , sm_case_folding :: !Bool
-  , sm_case_merge :: !Bool
-  , sm_co_opt_opts :: !OptCoercionOpts -- ^ Coercion optimiser options
+  , sm_arity_opts       :: !ArityOpts
+  , sm_rule_opts        :: !RuleOpts
+  , sm_case_folding     :: !Bool
+  , sm_case_merge       :: !Bool
+  , sm_co_opt_opts      :: !OptCoercionOpts -- ^ Coercion optimiser options
   }
 
 instance Outputable SimplMode where


=====================================
compiler/GHC/Core/Opt/Simplify/Utils.hs
=====================================
@@ -1320,11 +1320,11 @@ preInlineUnconditionally
 -- Reason: we don't want to inline single uses, or discard dead bindings,
 --         for unlifted, side-effect-ful bindings
 preInlineUnconditionally env top_lvl bndr rhs rhs_env
-  | not pre_inline_unconditionally           = Nothing
+  | not pre_inline                           = Nothing
   | not active                               = Nothing
   | isTopLevel top_lvl && isDeadEndId bndr   = Nothing -- Note [Top-level bottoming Ids]
   | isCoVar bndr                             = Nothing -- Note [Do not inline CoVars unconditionally]
-  | isExitJoinId bndr                        = Nothing -- Note [Do not inline exit join points]
+  | keep_exits, isExitJoinId bndr            = Nothing -- Note [Do not inline exit join points]
                                                        -- in module Exitify
   | not (one_occ (idOccInfo bndr))           = Nothing
   | not (isStableUnfolding unf)              = Just $! (extend_subst_with rhs)
@@ -1334,19 +1334,31 @@ preInlineUnconditionally env top_lvl bndr rhs rhs_env
   , Just inl <- maybeUnfoldingTemplate unf   = Just $! (extend_subst_with inl)
   | otherwise                                = Nothing
   where
+    mode       = seMode env
+    phase      = sm_phase mode
+    keep_exits = sm_keep_exits mode
+    pre_inline = sm_pre_inline mode
+
     unf = idUnfolding bndr
     extend_subst_with inl_rhs = extendIdSubst env bndr $! (mkContEx rhs_env inl_rhs)
 
     one_occ IAmDead = True -- Happens in ((\x.1) v)
+
     one_occ OneOcc{ occ_n_br   = 1
                   , occ_in_lam = NotInsideLam }   = isNotTopLevel top_lvl || early_phase
+
     one_occ OneOcc{ occ_n_br   = 1
                   , occ_in_lam = IsInsideLam
                   , occ_int_cxt = IsInteresting } = canInlineInLam rhs
-    one_occ _                                     = False
 
-    pre_inline_unconditionally = sePreInline env
-    active = isActive (sePhase env) (inlinePragmaActivation inline_prag)
+    one_occ OneOcc{ occ_n_br = 1 } -- Inline join point that are used once, even inside
+      | isJoinId bndr = True       -- lambdas (which are presumably other join points)
+        -- Also for join points the IsInteresting doesn't matter; a nullary
+        -- join point is just the same as an arity-2 one.
+
+    one_occ _ = False
+
+    active = isActive phase (inlinePragmaActivation inline_prag)
              -- See Note [pre/postInlineUnconditionally in gentle mode]
     inline_prag = idInlinePragma bndr
 
@@ -1378,7 +1390,7 @@ preInlineUnconditionally env top_lvl bndr rhs rhs_env
       -- not ticks.  Counting ticks cannot be duplicated, and non-counting
       -- ticks around a Lam will disappear anyway.
 
-    early_phase = sePhase env /= FinalPhase
+    early_phase = phase /= FinalPhase
     -- If we don't have this early_phase test, consider
     --      x = length [1,2,3]
     -- The full laziness pass carefully floats all the cons cells to


=====================================
compiler/GHC/Core/Opt/SpecConstr.hs
=====================================
@@ -1512,8 +1512,10 @@ scExpr' env (Case scrut b ty alts)
                 scrut_occ = case con of
                                DataAlt dc -- See Note [Do not specialise evals]
                                   | not (single_alt && all deadArgOcc arg_occs)
-                                  -> ScrutOcc (unitUFM dc arg_occs)
-                               _  -> UnkOcc
+                                  -> -- pprTrace "sc_alt1" (ppr b' $$ ppr con $$ ppr bs $$ ppr arg_occs) $
+                                     ScrutOcc (unitUFM dc arg_occs)
+                               _  -> -- pprTrace "sc_alt1" (ppr b' $$ ppr con $$ ppr bs $$ ppr arg_occs) $
+                                     UnkOcc
           ; return (usg', b_occ `combineOcc` scrut_occ, Alt con bs2 rhs') }
 
 


=====================================
compiler/GHC/Driver/Config/Core/Opt/Simplify.hs
=====================================
@@ -2,7 +2,6 @@ module GHC.Driver.Config.Core.Opt.Simplify
   ( initSimplifyExprOpts
   , initSimplifyOpts
   , initSimplMode
-  , initGentleSimplMode
   ) where
 
 import GHC.Prelude
@@ -27,12 +26,13 @@ import GHC.Types.Var ( Var )
 initSimplifyExprOpts :: DynFlags -> InteractiveContext -> SimplifyExprOpts
 initSimplifyExprOpts dflags ic = SimplifyExprOpts
   { se_fam_inst = snd $ ic_instances ic
-  , se_mode = (initSimplMode dflags InitialPhase "GHCi")
-    { sm_inline = False
-      -- Do not do any inlining, in case we expose some
-      -- unboxed tuple stuff that confuses the bytecode
+
+  , se_mode = (initSimplMode dflags) { sm_names = ["GHCi"]
+                                     , sm_inline = False }
+      -- sm_inline: do not do any inlining, in case we expose
+      -- some unboxed tuple stuff that confuses the bytecode
       -- interpreter
-    }
+
   , se_top_env_cfg = TopEnvConfig
     { te_history_size = historySize dflags
     , te_tick_factor = simplTickFactor dflags
@@ -57,31 +57,25 @@ initSimplifyOpts dflags extra_vars iterations mode rule_base = let
     }
   in opts
 
-initSimplMode :: DynFlags -> CompilerPhase -> String -> SimplMode
-initSimplMode dflags phase name = SimplMode
-  { sm_names = [name]
-  , sm_phase = phase
-  , sm_rules = gopt Opt_EnableRewriteRules dflags
-  , sm_eta_expand = gopt Opt_DoLambdaEtaExpansion dflags
-  , sm_cast_swizzle = True
-  , sm_inline = True
-  , sm_uf_opts = unfoldingOpts dflags
-  , sm_case_case = True
-  , sm_pre_inline = gopt Opt_SimplPreInlining dflags
-  , sm_float_enable = floatEnable dflags
+initSimplMode :: DynFlags -> SimplMode
+initSimplMode dflags = SimplMode
+  { sm_names = ["Unknown simplifier run"]  -- Always overriden
+  , sm_phase = InitialPhase
+  , sm_rules            = gopt Opt_EnableRewriteRules dflags
+  , sm_eta_expand       = gopt Opt_DoLambdaEtaExpansion dflags
+  , sm_pre_inline       = gopt Opt_SimplPreInlining dflags
   , sm_do_eta_reduction = gopt Opt_DoEtaReduction dflags
-  , sm_arity_opts = initArityOpts dflags
-  , sm_rule_opts = initRuleOpts dflags
-  , sm_case_folding = gopt Opt_CaseFolding dflags
-  , sm_case_merge = gopt Opt_CaseMerge dflags
-  , sm_co_opt_opts = initOptCoercionOpts dflags
-  }
-
-initGentleSimplMode :: DynFlags -> SimplMode
-initGentleSimplMode dflags = (initSimplMode dflags InitialPhase "Gentle")
-  { -- Don't do case-of-case transformations.
-    -- This makes full laziness work better
-    sm_case_case = False
+  , sm_uf_opts          = unfoldingOpts dflags
+  , sm_float_enable     = floatEnable dflags
+  , sm_arity_opts       = initArityOpts dflags
+  , sm_rule_opts        = initRuleOpts dflags
+  , sm_case_folding     = gopt Opt_CaseFolding dflags
+  , sm_case_merge       = gopt Opt_CaseMerge dflags
+  , sm_co_opt_opts      = initOptCoercionOpts dflags
+  , sm_cast_swizzle = True
+  , sm_inline       = True
+  , sm_case_case    = True
+  , sm_keep_exits   = False
   }
 
 floatEnable :: DynFlags -> FloatEnable


=====================================
testsuite/tests/simplCore/should_compile/T21148.hs
=====================================
@@ -0,0 +1,12 @@
+module T211148 where
+
+-- The point of this test is that f should get a (nested)
+-- CPR property, with a worker of type
+-- $wf :: Int# -> State# RealWorld -> (# State# RealWorld, Int# #)
+
+{-# NOINLINE f #-}
+-- The NOINLINE makes GHC do a worker/wrapper split
+-- even though f is small
+f :: Int -> IO Int
+f x = return $! sum [0..x]
+


=====================================
testsuite/tests/simplCore/should_compile/T21148.stderr
=====================================
@@ -0,0 +1 @@
+ 
\ No newline at end of file


=====================================
testsuite/tests/simplCore/should_compile/all.T
=====================================
@@ -429,3 +429,4 @@ test('T21763', only_ways(['optasm']), compile, ['-O2 -ddump-rules'])
 test('T21763a', only_ways(['optasm']), compile, ['-O2 -ddump-rules'])
 test('T22028', normal, compile, ['-O -ddump-rule-firings'])
 test('T22114', normal, compile, ['-O'])
+test('T21148', [grep_errmsg(r'Cpr=') ], compile, ['-O -ddump-simpl'])



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/bef9a55228552e85e26bba5ce782bbeb9404391c...ac9987f3ee3288fd2d0e4d5f6068323f9d9ccdd8

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/bef9a55228552e85e26bba5ce782bbeb9404391c...ac9987f3ee3288fd2d0e4d5f6068323f9d9ccdd8
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20220929/40e57119/attachment-0001.html>


More information about the ghc-commits mailing list