[Git][ghc/ghc][wip/T17917] Avoid useless w/w split

Ben Gamari gitlab at gitlab.haskell.org
Fri May 8 20:00:00 UTC 2020



Ben Gamari pushed to branch wip/T17917 at Glasgow Haskell Compiler / GHC


Commits:
eb840476 by Simon Peyton Jones at 2020-05-08T15:58:47-04:00
Avoid useless w/w split

This patch is just a tidy-up for the post-strictness-analysis
worker wrapper split.  Consider

   f x = x

Strictnesss analysis does not lead to a w/w split, so the
obvious thing is to leave it 100% alone.  But actually, because
the RHS is small, we ended up adding a StableUnfolding for it.

There is some reason to do this if we choose /not/ do to w/w
on the grounds that the function is small.  See
Note [Don't w/w inline small non-loop-breaker things]

But there is no reason if we would not have done w/w anyway.

This patch just moves the conditional to later.  Easy.
This does move soem -ddump-simpl printouts around a bit.

I also discovered that the previous code was overwritten an
InlineCompulsory with InlineStable, which is utterly wrong.  That in
turn meant that some default methods (marked InlineCompulsory)
were getting their InlineCompulsory squashed. This patch fixes
that bug --- but of course that does mean a bit more inlining!

Metric Decrease:
    T9233
    T9675
Metric Increase:
    T12707
    T3064
    T4029
    T9872b
    T9872d
    haddock.Cabal

- - - - -


11 changed files:

- compiler/GHC/Core/Opt/WorkWrap.hs
- compiler/GHC/Core/Unfold.hs
- testsuite/tests/dependent/should_compile/dynamic-paper.stderr
- testsuite/tests/driver/inline-check.stderr
- testsuite/tests/numeric/should_compile/T14465.stdout
- testsuite/tests/perf/compiler/T16473.stdout
- testsuite/tests/simplCore/should_compile/T17901.stdout
- testsuite/tests/simplCore/should_compile/T17966.stdout
- testsuite/tests/simplCore/should_compile/T4201.stdout
- testsuite/tests/simplCore/should_compile/T5658b.stdout
- testsuite/tests/warnings/should_compile/T16282/T16282.stderr


Changes:

=====================================
compiler/GHC/Core/Opt/WorkWrap.hs
=====================================
@@ -26,6 +26,7 @@ import GHC.Types.Cpr
 import GHC.Core.Opt.WorkWrap.Utils
 import GHC.Utils.Misc
 import GHC.Utils.Outputable
+import GHC.Types.Unique
 import GHC.Core.FamInstEnv
 import GHC.Utils.Monad
 
@@ -203,6 +204,23 @@ unfolding to the *worker*.  So we will get something like this:
 How do we "transfer the unfolding"? Easy: by using the old one, wrapped
 in work_fn! See GHC.Core.Unfold.mkWorkerUnfolding.
 
+Note [No worker-wrapper for record selectors]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+We sometimes generate a lot of record selectors, and generally the
+don't benefit from worker/wrapper.  Yes, mkWwBodies would find a w/w split,
+but it is then suppressed by the certainlyWillInline test in splitFun.
+
+The wasted effort in mkWwBodies makes a measurable difference in
+compile time (see MR !2873), so although it's a terribly ad-hoc test,
+we just check here for record selectors, and do a no-op in that case.
+
+I did look for a generalisation, so that it's not just record
+selectors that benefit.  But you'd need a cheap test for "this
+function will definitely get a w/w split" and that's hard to predict
+in advance...the logic in mkWwBodies is complex. So I've left the
+super-simple test, with this Note to explain.
+
+
 Note [Worker-wrapper for NOINLINE functions]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 We used to disable worker/wrapper for NOINLINE things, but it turns out
@@ -316,8 +334,8 @@ Note [Don't w/w inline small non-loop-breaker things]
 In general, we refrain from w/w-ing *small* functions, which are not
 loop breakers, because they'll inline anyway.  But we must take care:
 it may look small now, but get to be big later after other inlining
-has happened.  So we take the precaution of adding an INLINE pragma to
-any such functions.
+has happened.  So we take the precaution of adding a StableUnfolding
+for any such functions.
 
 I made this change when I observed a big function at the end of
 compilation with a useful strictness signature but no w-w.  (It was
@@ -457,11 +475,6 @@ tryWW   :: DynFlags
 tryWW dflags fam_envs is_rec fn_id rhs
   -- See Note [Worker-wrapper for NOINLINE functions]
 
-  | Just stable_unf <- certainlyWillInline dflags fn_info
-  = return [ (fn_id `setIdUnfolding` stable_unf, rhs) ]
-        -- See Note [Don't w/w INLINE things]
-        -- See Note [Don't w/w inline small non-loop-breaker things]
-
   | is_fun && is_eta_exp
   = splitFun dflags fam_envs new_fn_id fn_info wrap_dmds div cpr rhs
 
@@ -567,105 +580,125 @@ See https://gitlab.haskell.org/ghc/ghc/merge_requests/312#note_192064.
 splitFun :: DynFlags -> FamInstEnvs -> Id -> IdInfo -> [Demand] -> Divergence -> CprResult -> CoreExpr
          -> UniqSM [(Id, CoreExpr)]
 splitFun dflags fam_envs fn_id fn_info wrap_dmds div cpr rhs
-  = WARN( not (wrap_dmds `lengthIs` arity), ppr fn_id <+> (ppr arity $$ ppr wrap_dmds $$ ppr cpr) ) do
-    -- The arity should match the signature
-    stuff <- mkWwBodies dflags fam_envs rhs_fvs fn_id wrap_dmds use_cpr_info
-    case stuff of
-      Just (work_demands, join_arity, wrap_fn, work_fn) -> do
-        work_uniq <- getUniqueM
-        let work_rhs = work_fn rhs
-            work_act = case fn_inline_spec of  -- See Note [Worker activation]
-                          NoInline -> fn_act
-                          _        -> wrap_act
-
-            work_prag = InlinePragma { inl_src = SourceText "{-# INLINE"
-                                     , inl_inline = fn_inline_spec
-                                     , inl_sat    = Nothing
-                                     , inl_act    = work_act
-                                     , inl_rule   = FunLike }
-              -- inl_inline: copy from fn_id; see Note [Worker-wrapper for INLINABLE functions]
-              -- inl_act:    see Note [Worker activation]
-              -- inl_rule:   it does not make sense for workers to be constructorlike.
-
-            work_join_arity | isJoinId fn_id = Just join_arity
-                            | otherwise      = Nothing
-              -- worker is join point iff wrapper is join point
-              -- (see Note [Don't w/w join points for CPR])
-
-            work_id  = mkWorkerId work_uniq fn_id (exprType work_rhs)
-                        `setIdOccInfo` occInfo fn_info
-                                -- Copy over occurrence info from parent
-                                -- Notably whether it's a loop breaker
-                                -- Doesn't matter much, since we will simplify next, but
-                                -- seems right-er to do so
-
-                        `setInlinePragma` work_prag
-
-                        `setIdUnfolding` mkWorkerUnfolding dflags work_fn fn_unfolding
-                                -- See Note [Worker-wrapper for INLINABLE functions]
-
-                        `setIdStrictness` mkClosedStrictSig work_demands div
-                                -- Even though we may not be at top level,
-                                -- it's ok to give it an empty DmdEnv
-
-                        `setIdCprInfo` mkCprSig work_arity work_cpr_info
-
-                        `setIdDemandInfo` worker_demand
-
-                        `setIdArity` work_arity
-                                -- Set the arity so that the Core Lint check that the
-                                -- arity is consistent with the demand type goes
-                                -- through
-                        `asJoinId_maybe` work_join_arity
-
-            work_arity = length work_demands
-
-            -- See Note [Demand on the Worker]
-            single_call = saturatedByOneShots arity (demandInfo fn_info)
-            worker_demand | single_call = mkWorkerDemand work_arity
-                          | otherwise   = topDmd
-
-            wrap_rhs  = wrap_fn work_id
-            wrap_act  = case fn_act of  -- See Note [Wrapper activation]
-                           ActiveAfter {} -> fn_act
-                           NeverActive    -> activeDuringFinal
-                           _              -> activeAfterInitial
-            wrap_prag = InlinePragma { inl_src    = SourceText "{-# INLINE"
-                                     , inl_inline = NoUserInline
-                                     , inl_sat    = Nothing
-                                     , inl_act    = wrap_act
-                                     , inl_rule   = rule_match_info }
-                -- inl_act:    see Note [Wrapper activation]
-                -- inl_inline: see Note [Wrapper NoUserInline]
-                -- inl_rule:   RuleMatchInfo is (and must be) unaffected
-
-            wrap_id   = fn_id `setIdUnfolding`  mkWwInlineRule dflags wrap_rhs arity
-                              `setInlinePragma` wrap_prag
-                              `setIdOccInfo`    noOccInfo
-                                -- Zap any loop-breaker-ness, to avoid bleating from Lint
-                                -- about a loop breaker with an INLINE rule
-
-
-
-        return $ [(work_id, work_rhs), (wrap_id, wrap_rhs)]
-            -- Worker first, because wrapper mentions it
-
-      Nothing -> return [(fn_id, rhs)]
+  | isRecordSelector fn_id  -- See Note [No worker/wrapper for record selectors]
+  = return [ (fn_id, rhs ) ]
+
+  | otherwise
+  = WARN( not (wrap_dmds `lengthIs` arity), ppr fn_id <+> (ppr arity $$ ppr wrap_dmds $$ ppr cpr) )
+          -- The arity should match the signature
+    do { mb_stuff <- mkWwBodies dflags fam_envs rhs_fvs fn_id wrap_dmds use_cpr_info
+       ; case mb_stuff of
+            Nothing -> return [(fn_id, rhs)]
+
+            Just stuff
+              | Just stable_unf <- certainlyWillInline dflags fn_info
+              ->  return [ (fn_id `setIdUnfolding` stable_unf, rhs) ]
+                  -- See Note [Don't w/w INLINE things]
+                  -- See Note [Don't w/w inline small non-loop-breaker things]
+
+              | otherwise
+              -> do { work_uniq <- getUniqueM
+                    ; return (mkWWBindPair dflags fn_id fn_info arity rhs
+                                           work_uniq div cpr stuff) } }
   where
-    rhs_fvs         = exprFreeVars rhs
+    rhs_fvs = exprFreeVars rhs
+    arity   = arityInfo fn_info
+            -- The arity is set by the simplifier using exprEtaExpandArity
+            -- So it may be more than the number of top-level-visible lambdas
+
+    -- use_cpr_info is the CPR we w/w for. Note that we kill it for join points,
+    -- see Note [Don't w/w join points for CPR].
+    use_cpr_info  | isJoinId fn_id = topCpr
+                  | otherwise      = cpr
+
+
+mkWWBindPair :: DynFlags -> Id -> IdInfo -> Arity
+             -> CoreExpr -> Unique -> Divergence -> CprResult
+             -> ([Demand], JoinArity, Id -> CoreExpr, Expr CoreBndr -> CoreExpr)
+             -> [(Id, CoreExpr)]
+mkWWBindPair dflags fn_id fn_info arity rhs work_uniq div cpr
+             (work_demands, join_arity, wrap_fn, work_fn)
+  = [(work_id, work_rhs), (wrap_id, wrap_rhs)]
+     -- Worker first, because wrapper mentions it
+  where
+    work_rhs = work_fn rhs
+    work_act = case fn_inline_spec of  -- See Note [Worker activation]
+                   NoInline -> fn_act
+                   _        -> wrap_act
+
+    work_prag = InlinePragma { inl_src = SourceText "{-# INLINE"
+                             , inl_inline = fn_inline_spec
+                             , inl_sat    = Nothing
+                             , inl_act    = work_act
+                             , inl_rule   = FunLike }
+      -- inl_inline: copy from fn_id; see Note [Worker-wrapper for INLINABLE functions]
+      -- inl_act:    see Note [Worker activation]
+      -- inl_rule:   it does not make sense for workers to be constructorlike.
+
+    work_join_arity | isJoinId fn_id = Just join_arity
+                    | otherwise      = Nothing
+      -- worker is join point iff wrapper is join point
+      -- (see Note [Don't w/w join points for CPR])
+
+    work_id  = mkWorkerId work_uniq fn_id (exprType work_rhs)
+                `setIdOccInfo` occInfo fn_info
+                        -- Copy over occurrence info from parent
+                        -- Notably whether it's a loop breaker
+                        -- Doesn't matter much, since we will simplify next, but
+                        -- seems right-er to do so
+
+                `setInlinePragma` work_prag
+
+                `setIdUnfolding` mkWorkerUnfolding dflags work_fn fn_unfolding
+                        -- See Note [Worker-wrapper for INLINABLE functions]
+
+                `setIdStrictness` mkClosedStrictSig work_demands div
+                        -- Even though we may not be at top level,
+                        -- it's ok to give it an empty DmdEnv
+
+                `setIdCprInfo` mkCprSig work_arity work_cpr_info
+
+                `setIdDemandInfo` worker_demand
+
+                `setIdArity` work_arity
+                        -- Set the arity so that the Core Lint check that the
+                        -- arity is consistent with the demand type goes
+                        -- through
+                `asJoinId_maybe` work_join_arity
+
+    work_arity = length work_demands
+
+    -- See Note [Demand on the Worker]
+    single_call = saturatedByOneShots arity (demandInfo fn_info)
+    worker_demand | single_call = mkWorkerDemand work_arity
+                  | otherwise   = topDmd
+
+    wrap_rhs  = wrap_fn work_id
+    wrap_act  = case fn_act of  -- See Note [Wrapper activation]
+                   ActiveAfter {} -> fn_act
+                   NeverActive    -> activeDuringFinal
+                   _              -> activeAfterInitial
+    wrap_prag = InlinePragma { inl_src    = SourceText "{-# INLINE"
+                             , inl_inline = NoUserInline
+                             , inl_sat    = Nothing
+                             , inl_act    = wrap_act
+                             , inl_rule   = rule_match_info }
+        -- inl_act:    see Note [Wrapper activation]
+        -- inl_inline: see Note [Wrapper NoUserInline]
+        -- inl_rule:   RuleMatchInfo is (and must be) unaffected
+
+    wrap_id   = fn_id `setIdUnfolding`  mkWwInlineRule dflags wrap_rhs arity
+                      `setInlinePragma` wrap_prag
+                      `setIdOccInfo`    noOccInfo
+                        -- Zap any loop-breaker-ness, to avoid bleating from Lint
+                        -- about a loop breaker with an INLINE rule
+
     fn_inl_prag     = inlinePragInfo fn_info
     fn_inline_spec  = inl_inline fn_inl_prag
     fn_act          = inl_act fn_inl_prag
     rule_match_info = inlinePragmaRuleMatchInfo fn_inl_prag
     fn_unfolding    = unfoldingInfo fn_info
-    arity           = arityInfo fn_info
-                    -- The arity is set by the simplifier using exprEtaExpandArity
-                    -- So it may be more than the number of top-level-visible lambdas
 
-    -- use_cpr_info is the CPR we w/w for. Note that we kill it for join points,
-    -- see Note [Don't w/w join points for CPR].
-    use_cpr_info  | isJoinId fn_id = topCpr
-                  | otherwise      = cpr
     -- Even if we don't w/w join points for CPR, we might still do so for
     -- strictness. In which case a join point worker keeps its original CPR
     -- property; see Note [Don't w/w join points for CPR]. Otherwise, the worker


=====================================
compiler/GHC/Core/Unfold.hs
=====================================
@@ -1121,11 +1121,21 @@ certainlyWillInline :: DynFlags -> IdInfo -> Maybe Unfolding
 -- ^ Sees if the unfolding is pretty certain to inline.
 -- If so, return a *stable* unfolding for it, that will always inline.
 certainlyWillInline dflags fn_info
-  = case unfoldingInfo fn_info of
-      CoreUnfolding { uf_tmpl = e, uf_guidance = g }
-        | loop_breaker -> Nothing      -- Won't inline, so try w/w
-        | noinline     -> Nothing      -- See Note [Worker-wrapper for NOINLINE functions]
-        | otherwise    -> do_cunf e g  -- Depends on size, so look at that
+  = case fn_unf of
+      CoreUnfolding { uf_tmpl = expr, uf_guidance = guidance, uf_src = src }
+        | loop_breaker -> Nothing       -- Won't inline, so try w/w
+        | noinline     -> Nothing       -- See Note [Worker-wrapper for NOINLINE functions]
+        | otherwise
+        -> case guidance of
+             UnfNever  -> Nothing
+             UnfWhen {} -> Just (fn_unf { uf_src = src' })
+                             -- INLINE functions have UnfWhen
+             UnfIfGoodArgs { ug_size = size, ug_args = args }
+               -> do_cunf expr size args src'
+        where
+          src' = case src of
+                   InlineRhs -> InlineStable
+                   _         -> src  -- Do not change InlineCompulsory!
 
       DFunUnfolding {} -> Just fn_unf  -- Don't w/w DFuns; it never makes sense
                                        -- to do so, and even if it is currently a
@@ -1138,17 +1148,12 @@ certainlyWillInline dflags fn_info
     noinline     = inlinePragmaSpec (inlinePragInfo fn_info) == NoInline
     fn_unf       = unfoldingInfo fn_info
 
-    do_cunf :: CoreExpr -> UnfoldingGuidance -> Maybe Unfolding
-    do_cunf _ UnfNever     = Nothing
-    do_cunf _ (UnfWhen {}) = Just (fn_unf { uf_src = InlineStable })
-                             -- INLINE functions have UnfWhen
-
         -- The UnfIfGoodArgs case seems important.  If we w/w small functions
         -- binary sizes go up by 10%!  (This is with SplitObjs.)
         -- I'm not totally sure why.
         -- INLINABLE functions come via this path
         --    See Note [certainlyWillInline: INLINABLE]
-    do_cunf expr (UnfIfGoodArgs { ug_size = size, ug_args = args })
+    do_cunf expr size args src'
       | arityInfo fn_info > 0  -- See Note [certainlyWillInline: be careful of thunks]
       , not (isBottomingSig (strictnessInfo fn_info))
               -- Do not unconditionally inline a bottoming functions even if
@@ -1156,7 +1161,7 @@ certainlyWillInline dflags fn_info
               -- so we don't want to re-inline it.
       , let unf_arity = length args
       , size - (10 * (unf_arity + 1)) <= ufUseThreshold dflags
-      = Just (fn_unf { uf_src      = InlineStable
+      = Just (fn_unf { uf_src      = src'
                      , uf_guidance = UnfWhen { ug_arity     = unf_arity
                                              , ug_unsat_ok  = unSaturatedOk
                                              , ug_boring_ok = inlineBoringOk expr } })


=====================================
testsuite/tests/dependent/should_compile/dynamic-paper.stderr
=====================================
@@ -12,4 +12,4 @@ Simplifier ticks exhausted
   simplifier non-termination has been judged acceptable.
    
   To see detailed counts use -ddump-simpl-stats
-  Total ticks: 138082
+  Total ticks: 157721


=====================================
testsuite/tests/driver/inline-check.stderr
=====================================
@@ -21,6 +21,7 @@ Considering inlining: foo
   some_benefit False
   is exp: True
   is work-free: True
-  guidance ALWAYS_IF(arity=1,unsat_ok=True,boring_ok=False)
+  guidance IF_ARGS [0] 30 0
+  discounted size = 20
   ANSWER = NO
 Inactive unfolding: foo1


=====================================
testsuite/tests/numeric/should_compile/T14465.stdout
=====================================
@@ -82,10 +82,8 @@ plusOne :: Natural -> Natural
 [GblId,
  Arity=1,
  Str=<S,U>,
- Unf=Unf{Src=InlineStable, TopLvl=True, Value=True, ConLike=True,
-         WorkFree=True, Expandable=True,
-         Guidance=ALWAYS_IF(arity=1,unsat_ok=True,boring_ok=False)
-         Tmpl= \ (n [Occ=Once] :: Natural) -> plusNatural n M.minusOne1}]
+ Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
+         WorkFree=True, Expandable=True, Guidance=IF_ARGS [0] 30 0}]
 plusOne = \ (n :: Natural) -> plusNatural n M.minusOne1
 
 -- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}


=====================================
testsuite/tests/perf/compiler/T16473.stdout
=====================================
@@ -2,10 +2,10 @@ Rule fired: Class op fmap (BUILTIN)
 Rule fired: Class op liftA2 (BUILTIN)
 Rule fired: Class op $p1Applicative (BUILTIN)
 Rule fired: Class op <*> (BUILTIN)
-Rule fired: Class op <$ (BUILTIN)
+Rule fired: Class op fmap (BUILTIN)
 Rule fired: Class op $p1Applicative (BUILTIN)
 Rule fired: Class op <*> (BUILTIN)
-Rule fired: Class op fmap (BUILTIN)
+Rule fired: Class op <$ (BUILTIN)
 Rule fired: Class op pure (BUILTIN)
 Rule fired: Class op pure (BUILTIN)
 Rule fired: Class op >>= (BUILTIN)


=====================================
testsuite/tests/simplCore/should_compile/T17901.stdout
=====================================
@@ -1,14 +1,6 @@
-                 (wombat1 [Occ=Once*!] :: T -> p)
-                   A -> wombat1 T17901.A;
-                   B -> wombat1 T17901.B;
-                   C -> wombat1 T17901.C
   = \ (@p) (wombat1 :: T -> p) (x :: T) ->
       case x of wild { __DEFAULT -> wombat1 wild }
-         Tmpl= \ (@p) (wombat2 [Occ=Once!] :: S -> p) (x [Occ=Once] :: S) ->
-                 case x of wild [Occ=Once] { __DEFAULT -> wombat2 wild }}]
   = \ (@p) (wombat2 :: S -> p) (x :: S) ->
       case x of wild { __DEFAULT -> wombat2 wild }
-         Tmpl= \ (@p) (wombat3 [Occ=Once!] :: W -> p) (x [Occ=Once] :: W) ->
-                 case x of wild [Occ=Once] { __DEFAULT -> wombat3 wild }}]
   = \ (@p) (wombat3 :: W -> p) (x :: W) ->
       case x of wild { __DEFAULT -> wombat3 wild }


=====================================
testsuite/tests/simplCore/should_compile/T17966.stdout
=====================================
@@ -1,5 +1,3 @@
  RULES: "SPEC $cm @()" [0]
  RULES: "SPEC f @Bool @() @(Maybe Integer)" [0]
-"SPEC/T17966 $fShowMaybe_$cshow @Integer"
-"SPEC/T17966 $fShowMaybe_$cshowList @Integer"
 "SPEC/T17966 $fShowMaybe @Integer"


=====================================
testsuite/tests/simplCore/should_compile/T4201.stdout
=====================================
@@ -1,3 +1,3 @@
+  lift :: Foo -> T
   [HasNoCafRefs, Arity: 1, Strictness: <S,1*U>,
-   Unfolding: InlineRule (0, True, True)
-              bof `cast` (Sym (N:Foo[0]) ->_R <T>_R)]
+   Unfolding: (bof `cast` (Sym (N:Foo[0]) ->_R <T>_R))]


=====================================
testsuite/tests/simplCore/should_compile/T5658b.stdout
=====================================
@@ -1 +1 @@
-4
+2


=====================================
testsuite/tests/warnings/should_compile/T16282/T16282.stderr
=====================================
@@ -6,5 +6,4 @@ T16282.hs: warning: [-Wall-missed-specialisations]
 
 T16282.hs: warning: [-Wall-missed-specialisations]
     Could not specialise imported function ‘Data.Map.Internal.$w$cshowsPrec’
-      when specialising ‘Data.Map.Internal.$fShowMap_$cshowsPrec’
     Probable fix: add INLINABLE pragma on ‘Data.Map.Internal.$w$cshowsPrec’



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/eb840476520070f0adfcdd44f931ef921ba04f17

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/eb840476520070f0adfcdd44f931ef921ba04f17
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20200508/812cb552/attachment-0001.html>


More information about the ghc-commits mailing list