[Git][ghc/ghc][wip/spj-unf-size] Wibbles

Simon Peyton Jones (@simonpj) gitlab at gitlab.haskell.org
Sun Oct 22 22:05:47 UTC 2023



Simon Peyton Jones pushed to branch wip/spj-unf-size at Glasgow Haskell Compiler / GHC


Commits:
f5fbeeb2 by Simon Peyton Jones at 2023-10-22T23:05:29+01:00
Wibbles

- - - - -


4 changed files:

- compiler/GHC/Core/Opt/Simplify/Inline.hs
- compiler/GHC/Core/Opt/Simplify/Utils.hs
- compiler/GHC/Core/Unfold.hs
- compiler/GHC/Core/Unfold/Make.hs


Changes:

=====================================
compiler/GHC/Core/Opt/Simplify/Inline.hs
=====================================
@@ -10,7 +10,7 @@ This module contains inlining logic used by the simplifier.
 
 module GHC.Core.Opt.Simplify.Inline (
         -- * The smart inlining decisions are made by callSiteInline
-        callSiteInline, CallCtxt(..),
+        callSiteInline,
 
         exprSummary
     ) where
@@ -40,7 +40,7 @@ import Data.List (isPrefixOf)
 {-
 ************************************************************************
 *                                                                      *
-\subsection{callSiteInline}
+                  callSiteInline
 *                                                                      *
 ************************************************************************
 
@@ -510,55 +510,14 @@ This kind of thing can occur if you have
         foo = let x = e in (x,x)
 
 which Roman did.
-
-
 -}
 
-{-
-computeDiscount :: [Int] -> Int -> [ArgSummary] -> CallCtxt
-                -> Int
-computeDiscount arg_discounts res_discount arg_infos cont_info
-
-  = 10          -- Discount of 10 because the result replaces the call
-                -- so we count 10 for the function itself
-
-    + 10 * length actual_arg_discounts
-               -- Discount of 10 for each arg supplied,
-               -- because the result replaces the call
-
-    + total_arg_discount + res_discount'
-  where
-    actual_arg_discounts = zipWith mk_arg_discount arg_discounts arg_infos
-    total_arg_discount   = sum actual_arg_discounts
-
-    mk_arg_discount _        TrivArg    = 0
-    mk_arg_discount _        NonTrivArg = 10
-    mk_arg_discount discount ValueArg   = discount
 
-    res_discount'
-      | LT <- arg_discounts `compareLength` arg_infos
-      = res_discount   -- Over-saturated
-      | otherwise
-      = case cont_info of
-           BoringCtxt  -> 0
-           CaseCtxt    -> res_discount  -- Presumably a constructor
-           ValAppCtxt  -> res_discount  -- Presumably a function
-           _           -> 40 `min` res_discount
-                -- ToDo: this 40 `min` res_discount doesn't seem right
-                --   for DiscArgCtxt it shouldn't matter because the function will
-                --       get the arg discount for any non-triv arg
-                --   for RuleArgCtxt we do want to be keener to inline; but not only
-                --       constructor results
-                --   for RhsCtxt I suppose that exposing a data con is good in general
-                --   And 40 seems very arbitrary
-                --
-                -- res_discount can be very large when a function returns
-                -- constructors; but we only want to invoke that large discount
-                -- when there's a case continuation.
-                -- Otherwise we, rather arbitrarily, threshold it.  Yuk.
-                -- But we want to avoid inlining large functions that return
-                -- constructors into contexts that are simply "interesting"
--}
+{- *********************************************************************
+*                                                                      *
+                  Computing ArgSummary
+*                                                                      *
+********************************************************************* -}
 
 {- Note [Interesting arguments]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


=====================================
compiler/GHC/Core/Opt/Simplify/Utils.hs
=====================================
@@ -29,6 +29,9 @@ module GHC.Core.Opt.Simplify.Utils (
         mkBoringStop, mkRhsStop, mkLazyArgStop,
         interestingCallContext,
 
+        -- The CallCtxt type
+        CallCtxt(..),
+        
         -- ArgInfo
         ArgInfo(..), ArgSpec(..), RewriteCall(..), mkArgInfo,
         addValArgTo, addCastTo, addTyArgTo,
@@ -516,6 +519,38 @@ contHoleType (Select { sc_dup = d, sc_bndr =  b, sc_env = se })
   = perhapsSubstTy d se (idType b)
 
 
+contHasRules :: SimplCont -> Bool
+-- If the argument has form (f x y), where x,y are boring,
+-- and f is marked INLINE, then we don't want to inline f.
+-- But if the context of the argument is
+--      g (f x y)
+-- where g has rules, then we *do* want to inline f, in case it
+-- exposes a rule that might fire.  Similarly, if the context is
+--      h (g (f x x))
+-- where h has rules, then we do want to inline f.  So contHasRules
+-- tries to see if the context of the f-call is a call to a function
+-- with rules.
+--
+-- The ai_encl flag makes this happen; if it's
+-- set, the inliner gets just enough keener to inline f
+-- regardless of how boring f's arguments are, if it's marked INLINE
+--
+-- The alternative would be to *always* inline an INLINE function,
+-- regardless of how boring its context is; but that seems overkill
+-- For example, it'd mean that wrapper functions were always inlined
+contHasRules cont
+  = go cont
+  where
+    go (ApplyToVal { sc_cont = cont }) = go cont
+    go (ApplyToTy  { sc_cont = cont }) = go cont
+    go (CastIt _ cont)                 = go cont
+    go (StrictArg { sc_fun = fun })    = ai_encl fun
+    go (Stop _ RuleArgCtxt _)          = True
+    go (TickIt _ c)                    = go c
+    go (Select {})                     = False
+    go (StrictBind {})                 = False      -- ??
+    go (Stop _ _ _)                    = False
+
 -- Computes the multiplicity scaling factor at the hole. That is, in (case [] of
 -- x ::(p) _ { … }) (respectively for arguments of functions), the scaling
 -- factor is p. And in E[G[]], the scaling factor is the product of the scaling
@@ -709,15 +744,36 @@ make use of the strictness info for the function.
 -}
 
 
-{-
-************************************************************************
+{- *********************************************************************
 *                                                                      *
-        Interesting arguments
+             CallCtxt: the context of a call
 *                                                                      *
-************************************************************************
+********************************************************************* -}
 
-Note [Interesting call context]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+data CallCtxt
+  = BoringCtxt
+  | RhsCtxt RecFlag     -- Rhs of a let-binding; see Note [RHS of lets]
+  | DiscArgCtxt         -- Argument of a function with non-zero arg discount
+  | RuleArgCtxt         -- We are somewhere in the argument of a function with rules
+
+  | ValAppCtxt          -- We're applied to at least one value arg
+                        -- This arises when we have ((f x |> co) y)
+                        -- Then the (f x) has argument 'x' but in a ValAppCtxt
+
+  | CaseCtxt            -- We're the scrutinee of a case
+                        -- that decomposes its scrutinee
+
+instance Outputable CallCtxt where
+  ppr CaseCtxt    = text "CaseCtxt"
+  ppr ValAppCtxt  = text "ValAppCtxt"
+  ppr BoringCtxt  = text "BoringCtxt"
+  ppr (RhsCtxt ir)= text "RhsCtxt" <> parens (ppr ir)
+  ppr DiscArgCtxt = text "DiscArgCtxt"
+  ppr RuleArgCtxt = text "RuleArgCtxt"
+
+
+{- Note [Interesting call context]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 We want to avoid inlining an expression where there can't possibly be
 any gain, such as in an argument position.  Hence, if the continuation
 is interesting (eg. a case scrutinee that isn't just a seq, application etc.)
@@ -873,38 +929,6 @@ interestingCallContext env cont
         -- a build it's *great* to inline it here.  So we must ensure that
         -- the context for (f x) is not totally uninteresting.
 
-contHasRules :: SimplCont -> Bool
--- If the argument has form (f x y), where x,y are boring,
--- and f is marked INLINE, then we don't want to inline f.
--- But if the context of the argument is
---      g (f x y)
--- where g has rules, then we *do* want to inline f, in case it
--- exposes a rule that might fire.  Similarly, if the context is
---      h (g (f x x))
--- where h has rules, then we do want to inline f.  So contHasRules
--- tries to see if the context of the f-call is a call to a function
--- with rules.
---
--- The ai_encl flag makes this happen; if it's
--- set, the inliner gets just enough keener to inline f
--- regardless of how boring f's arguments are, if it's marked INLINE
---
--- The alternative would be to *always* inline an INLINE function,
--- regardless of how boring its context is; but that seems overkill
--- For example, it'd mean that wrapper functions were always inlined
-contHasRules cont
-  = go cont
-  where
-    go (ApplyToVal { sc_cont = cont }) = go cont
-    go (ApplyToTy  { sc_cont = cont }) = go cont
-    go (CastIt _ cont)                 = go cont
-    go (StrictArg { sc_fun = fun })    = ai_encl fun
-    go (Stop _ RuleArgCtxt _)          = True
-    go (TickIt _ c)                    = go c
-    go (Select {})                     = False
-    go (StrictBind {})                 = False      -- ??
-    go (Stop _ _ _)                    = False
-
 
 {-
 ************************************************************************


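As a reading aid for the relocated `contHasRules`, here is a minimal standalone sketch of the same traversal over a toy continuation type. It assumes a heavily simplified model: `Cont`, its constructors and the `Bool` field standing in for `ai_encl` are hypothetical reductions of GHC's `SimplCont` and `ArgInfo`, not the real definitions.

```haskell
-- Toy model only: these types are hypothetical simplifications of
-- GHC's SimplCont / CallCtxt, kept just large enough to run.
data CallCtxt = BoringCtxt | RuleArgCtxt | CaseCtxt
  deriving (Eq, Show)

data Cont
  = Stop CallCtxt        -- end of the continuation, with its call context
  | ApplyToVal Cont      -- the hole is applied to a value argument
  | CastIt Cont          -- the hole is wrapped in a cast
  | StrictArg Bool Cont  -- the hole is a strict argument of a function;
                         -- the Bool stands in for the ai_encl flag
  | Select Cont          -- the hole is the scrutinee of a case

-- Walk outwards through applications and casts, and answer True
-- only when we reach a context that is known to involve rules.
contHasRules :: Cont -> Bool
contHasRules (ApplyToVal k)     = contHasRules k
contHasRules (CastIt k)         = contHasRules k
contHasRules (StrictArg encl _) = encl
contHasRules (Stop RuleArgCtxt) = True
contHasRules (Select _)         = False
contHasRules (Stop _)           = False

main :: IO ()
main = do
  -- (f x y) as a strict argument of g, where g has rules
  print (contHasRules (StrictArg True (Stop BoringCtxt)))
  -- ((f x |> co) y) sitting somewhere inside a rule argument
  print (contHasRules (ApplyToVal (CastIt (Stop RuleArgCtxt))))
  -- (f x y) scrutinised by a case
  print (contHasRules (Select (Stop BoringCtxt)))
```

The first two calls yield `True` and the third `False`, matching the comment on `contHasRules`: the inliner is made keener only when the enclosing context may expose a rule.
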
=====================================
compiler/GHC/Core/Unfold.hs
=====================================
@@ -2,7 +2,6 @@
 (c) The University of Glasgow 2006
 (c) The AQUA Project, Glasgow University, 1994-1998
 
-
 Core-syntax unfoldings
 
 Unfoldings (which can travel across module boundaries) are in Core
@@ -23,7 +22,7 @@ module GHC.Core.Unfold (
 
         ExprTree, exprTree, exprTreeSize,
         exprTreeWillInline, couldBeSmallEnoughToInline,
-        ArgSummary(..), CallCtxt(..), hasArgInfo,
+        ArgSummary(..), hasArgInfo,
         Size, leqSize, addSizeN, adjustSize,
         InlineContext(..),
 
@@ -49,7 +48,7 @@ import GHC.Types.Var.Env
 import GHC.Types.Literal
 import GHC.Types.Id.Info
 import GHC.Types.RepType ( isZeroBitTy )
-import GHC.Types.Basic  ( Arity, RecFlag )
+import GHC.Types.Basic  ( Arity )
 import GHC.Types.ForeignCall
 import GHC.Types.Tickish
 
@@ -65,6 +64,67 @@ import GHC.Data.Bag
 import qualified Data.ByteString as BS
 
 
+{- Note [Overview of inlining heuristics]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Key examples
+------------
+Example 1:
+
+   let f x = case x of
+               A -> True
+               B -> <big>
+   in ...(f A)....(f B)...
+
+Even though f's entire RHS is big, it collapses to something small when applied
+to A.  We'd like to spot this.
+
+Example 2:
+
+   let f x = case x of
+               (p,q) -> case p of
+                           A -> True
+                           B -> <big>
+   in ...(f (A,3))....
+
+This is similar to Example 1, but nested.
+
+Example 3:
+
+   let j x = case y of
+               A -> True
+               B -> <big>
+   in case y of
+         A -> ..(j 3)...(j 4)....
+         B -> ...
+
+Here we want to spot that although the free variable `y` is unknown at j's definition
+site, we know that y=A at the two calls in the A-alternative of the body. If `y`
+had been an argument we'd have spotted this; we'd like to get the same goodness
+when `y` is a free variable.
+
+This kind of thing can occur a lot with join points.
+
+Design overview
+---------------
+The question is whether or not to inline f = rhs.
+The key idea is to abstract `rhs` to an ExprTree, which gives a measure of
+size, but records structure for case-expressions.
+
+
+The moving parts
+-----------------
+* An unfolding is accompanied (in its UnfoldingGuidance) with its GHC.Core.ExprTree,
+  computed by GHC.Core.Unfold.exprTree.
+
+* At a call site, GHC.Core.Opt.Simplify.Inline.contArgs constructs an ArgSummary
+  for each value argument. This reflects any nested data constructors.
+
+* Then GHC.Core.Unfold.exprTreeSize takes information about the context of the
+  call (particularly the ArgSummary for each argument) and computes a final size
+  for the inlined body, taking account of case-of-known-constructor.
+
+-}
+
 {- *********************************************************************
 *                                                                      *
                      UnfoldingOpts
@@ -160,73 +220,7 @@ updateReportPrefix :: Maybe String -> UnfoldingOpts -> UnfoldingOpts
 updateReportPrefix n opts = opts { unfoldingReportPrefix = n }
 
 
-{- *********************************************************************
-*                                                                      *
-                    Argument summary
-*                                                                      *
-********************************************************************* -}
-
-data ArgSummary = ArgNoInfo
-                | ArgIsCon AltCon [ArgSummary]  -- Includes type args
-                | ArgIsNot [AltCon]
-                | ArgIsLam
-
-hasArgInfo :: ArgSummary -> Bool
-hasArgInfo ArgNoInfo = False
-hasArgInfo _         = True
-
-instance Outputable ArgSummary where
-  ppr ArgNoInfo       = text "ArgNoInfo"
-  ppr ArgIsLam        = text "ArgIsLam"
-  ppr (ArgIsCon c as) = ppr c <> ppr as
-  ppr (ArgIsNot cs)   = text "ArgIsNot" <> ppr cs
-
-data CallCtxt
-  = BoringCtxt
-  | RhsCtxt RecFlag     -- Rhs of a let-binding; see Note [RHS of lets]
-  | DiscArgCtxt         -- Argument of a function with non-zero arg discount
-  | RuleArgCtxt         -- We are somewhere in the argument of a function with rules
-
-  | ValAppCtxt          -- We're applied to at least one value arg
-                        -- This arises when we have ((f x |> co) y)
-                        -- Then the (f x) has argument 'x' but in a ValAppCtxt
-
-  | CaseCtxt            -- We're the scrutinee of a case
-                        -- that decomposes its scrutinee
-
-instance Outputable CallCtxt where
-  ppr CaseCtxt    = text "CaseCtxt"
-  ppr ValAppCtxt  = text "ValAppCtxt"
-  ppr BoringCtxt  = text "BoringCtxt"
-  ppr (RhsCtxt ir)= text "RhsCtxt" <> parens (ppr ir)
-  ppr DiscArgCtxt = text "DiscArgCtxt"
-  ppr RuleArgCtxt = text "RuleArgCtxt"
-
 {-
-Note [Calculate unfolding guidance on the non-occ-anal'd expression]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Notice that we give the non-occur-analysed expression to
-calcUnfoldingGuidance.  In some ways it'd be better to occur-analyse
-first; for example, sometimes during simplification, there's a large
-let-bound thing which has been substituted, and so is now dead; so
-'expr' contains two copies of the thing while the occurrence-analysed
-expression doesn't.
-
-Nevertheless, we *don't* and *must not* occ-analyse before computing
-the size because
-
-a) The size computation bales out after a while, whereas occurrence
-   analysis does not.
-
-b) Residency increases sharply if you occ-anal first.  I'm not
-   100% sure why, but it's a large effect.  Compiling Cabal went
-   from residency of 534M to over 800M with this one change.
-
-This can occasionally mean that the guidance is very pessimistic;
-it gets fixed up next round.  And it should be rare, because large
-let-bound things that are dead are usually caught by preInlineUnconditionally
-
-
 ************************************************************************
 *                                                                      *
 \subsection{The UnfoldingGuidance type}
@@ -295,21 +289,18 @@ calcUnfoldingGuidance opts is_top_bottoming expr
     is_case (CaseOf {})  = True
     is_case (ScrutOf {}) = False
 
-{- We use 'couldBeSmallEnoughToInline' to avoid exporting inlinings that
-   we ``couldn't possibly use'' on the other side.  Can be overridden w/
-   flaggery.  Just the same as smallEnoughToInline, except that it has no
-   actual arguments.
--}
 
 couldBeSmallEnoughToInline :: UnfoldingOpts -> Int -> CoreExpr -> Bool
+-- We use 'couldBeSmallEnoughToInline' to avoid exporting inlinings that
+-- we ``couldn't possibly use'' on the other side.  Can be overridden
+-- w/flaggery.  Just the same as smallEnoughToInline, except that it has no
+-- actual arguments.
 couldBeSmallEnoughToInline opts threshold rhs
   = exprTreeWillInline threshold $
     exprTree opts [] body
   where
     (_, body) = collectBinders rhs
 
-----------------
-
 
 {- Note [Inline unsafeCoerce]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1035,6 +1026,22 @@ data InlineContext
                                         --          so apply result discount
      }
 
+data ArgSummary = ArgNoInfo
+                | ArgIsCon AltCon [ArgSummary]  -- Includes type args
+                | ArgIsNot [AltCon]
+                | ArgIsLam
+
+hasArgInfo :: ArgSummary -> Bool
+hasArgInfo ArgNoInfo = False
+hasArgInfo _         = True
+
+instance Outputable ArgSummary where
+  ppr ArgNoInfo       = text "ArgNoInfo"
+  ppr ArgIsLam        = text "ArgIsLam"
+  ppr (ArgIsCon c as) = ppr c <> ppr as
+  ppr (ArgIsNot cs)   = text "ArgIsNot" <> ppr cs
+
+
 -------------------------
 exprTreeWillInline :: Int -> ExprTree -> Bool
 -- (cheapExprTreeSize limit et) takes an upper bound `n` on the
@@ -1057,6 +1064,8 @@ exprTreeWillInline limit et
     go_alt :: AltTree -> (Int -> Bool) -> Int -> Bool
     go_alt (AltTree _ _ et) k n = go et k (n+10)
 
+
+-------------------------
 exprTreeSize :: InlineContext -> ExprTree -> Size
 exprTreeSize _    TooBig = STooBig
 exprTreeSize !ic (SizeIs { et_size  = size


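The key idea in Note [Overview of inlining heuristics] above (abstract the RHS to a tree that records case structure, then size it against what the call site knows about each argument) can be sketched in isolation. This is a toy model under stated assumptions: the `ExprTree`, `Alt` and `ArgSummary` types below are hypothetical simplifications, variables are plain `Int`s, constructors are `String`s, and the costs of 10 per case and per alternative are arbitrary illustration values rather than GHC's actual accounting.

```haskell
-- Toy model only: hypothetical, simplified stand-ins for GHC's
-- ExprTree and ArgSummary.
type Var = Int

-- What the call site knows about a value.
data ArgSummary
  = ArgNoInfo                     -- nothing known
  | ArgIsCon String [ArgSummary]  -- a known constructor and its fields
  deriving (Show)

-- A size measure that keeps case-expressions on variables as structure.
data ExprTree
  = Leaf Int          -- ordinary code of the given size
  | CaseOf Var [Alt]  -- case on a variable, with one subtree per alternative

data Alt = Alt String [Var] ExprTree  -- constructor, field binders, body

type Env = [(Var, ArgSummary)]

-- Size of the tree given what is known about its variables.
-- A case on a known constructor costs only the matching branch.
treeSize :: Env -> ExprTree -> Int
treeSize _   (Leaf n) = n
treeSize env (CaseOf v alts) =
  case lookup v env of
    Just (ArgIsCon con fields)
      | (Alt _ bndrs body : _) <- [ a | a@(Alt c _ _) <- alts, c == con ]
      -> treeSize (zip bndrs fields ++ env) body
    _ -> caseCost + sum [ altCost + treeSize env body | Alt _ _ body <- alts ]
  where
    caseCost = 10  -- arbitrary illustration value
    altCost  = 10  -- arbitrary illustration value

-- Example 1:  f x = case x of { A -> True; B -> <big> }
ex1 :: ExprTree
ex1 = CaseOf 0 [Alt "A" [] (Leaf 1), Alt "B" [] (Leaf 200)]

-- Example 2:  f x = case x of (p,q) -> case p of { A -> True; B -> <big> }
ex2 :: ExprTree
ex2 = CaseOf 0 [Alt "(,)" [1, 2] (CaseOf 1 [Alt "A" [] (Leaf 1), Alt "B" [] (Leaf 200)])]

main :: IO ()
main = do
  print (treeSize [] ex1)                         -- argument unknown
  print (treeSize [(0, ArgIsCon "A" [])] ex1)     -- call (f A)
  print (treeSize [] ex2)                         -- argument unknown
  print (treeSize [(0, ArgIsCon "(,)" [ArgIsCon "A" [], ArgNoInfo])] ex2)  -- call (f (A,3))
```

With these illustration costs the program prints 231, 1, 251 and 1: the body looks large in isolation but collapses once the call site supplies a known constructor, including a nested one. Example 3 in the Note would correspond to adding an entry for the free variable `y` to the environment at the call site.
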
=====================================
compiler/GHC/Core/Unfold/Make.hs
=====================================
@@ -312,6 +312,29 @@ Note [Honour INLINE on 0-ary bindings].
 
 I'm a bit worried that it's possible for the same kind of problem
 to arise for non-0-ary functions too, but let's wait and see.
+
+Note [Calculate unfolding guidance on the non-occ-anal'd expression]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Notice that we give the non-occur-analysed expression to
+calcUnfoldingGuidance.  In some ways it'd be better to occur-analyse
+first; for example, sometimes during simplification, there's a large
+let-bound thing which has been substituted, and so is now dead; so
+'expr' contains two copies of the thing while the occurrence-analysed
+expression doesn't.
+
+Nevertheless, we *don't* and *must not* occ-analyse before computing
+the size because
+
+a) The size computation bales out after a while, whereas occurrence
+   analysis does not.
+
+b) Residency increases sharply if you occ-anal first.  I'm not
+   100% sure why, but it's a large effect.  Compiling Cabal went
+   from residency of 534M to over 800M with this one change.
+
+This can occasionally mean that the guidance is very pessimistic;
+it gets fixed up next round.  And it should be rare, because large
+let-bound things that are dead are usually caught by preInlineUnconditionally
 -}
 
 mkUnfolding :: UnfoldingOpts



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/f5fbeeb2fd05518f1afb2ee3225bc083b960a81a
