[Git][ghc/ghc][wip/spj-unf-size] Wibbles
Simon Peyton Jones (@simonpj)
gitlab at gitlab.haskell.org
Sun Oct 22 22:05:47 UTC 2023
Simon Peyton Jones pushed to branch wip/spj-unf-size at Glasgow Haskell Compiler / GHC
Commits:
f5fbeeb2 by Simon Peyton Jones at 2023-10-22T23:05:29+01:00
Wibbles
- - - - -
4 changed files:
- compiler/GHC/Core/Opt/Simplify/Inline.hs
- compiler/GHC/Core/Opt/Simplify/Utils.hs
- compiler/GHC/Core/Unfold.hs
- compiler/GHC/Core/Unfold/Make.hs
Changes:
=====================================
compiler/GHC/Core/Opt/Simplify/Inline.hs
=====================================
@@ -10,7 +10,7 @@ This module contains inlining logic used by the simplifier.
module GHC.Core.Opt.Simplify.Inline (
-- * The smart inlining decisions are made by callSiteInline
- callSiteInline, CallCtxt(..),
+ callSiteInline,
exprSummary
) where
@@ -40,7 +40,7 @@ import Data.List (isPrefixOf)
{-
************************************************************************
* *
-\subsection{callSiteInline}
+ callSiteInline
* *
************************************************************************
@@ -510,55 +510,14 @@ This kind of thing can occur if you have
foo = let x = e in (x,x)
which Roman did.
-
-
-}
-{-
-computeDiscount :: [Int] -> Int -> [ArgSummary] -> CallCtxt
- -> Int
-computeDiscount arg_discounts res_discount arg_infos cont_info
-
- = 10 -- Discount of 10 because the result replaces the call
- -- so we count 10 for the function itself
-
- + 10 * length actual_arg_discounts
- -- Discount of 10 for each arg supplied,
- -- because the result replaces the call
-
- + total_arg_discount + res_discount'
- where
- actual_arg_discounts = zipWith mk_arg_discount arg_discounts arg_infos
- total_arg_discount = sum actual_arg_discounts
-
- mk_arg_discount _ TrivArg = 0
- mk_arg_discount _ NonTrivArg = 10
- mk_arg_discount discount ValueArg = discount
- res_discount'
- | LT <- arg_discounts `compareLength` arg_infos
- = res_discount -- Over-saturated
- | otherwise
- = case cont_info of
- BoringCtxt -> 0
- CaseCtxt -> res_discount -- Presumably a constructor
- ValAppCtxt -> res_discount -- Presumably a function
- _ -> 40 `min` res_discount
- -- ToDo: this 40 `min` res_discount doesn't seem right
- -- for DiscArgCtxt it shouldn't matter because the function will
- -- get the arg discount for any non-triv arg
- -- for RuleArgCtxt we do want to be keener to inline; but not only
- -- constructor results
- -- for RhsCtxt I suppose that exposing a data con is good in general
- -- And 40 seems very arbitrary
- --
- -- res_discount can be very large when a function returns
- -- constructors; but we only want to invoke that large discount
- -- when there's a case continuation.
- -- Otherwise we, rather arbitrarily, threshold it. Yuk.
- -- But we want to avoid inlining large functions that return
- -- constructors into contexts that are simply "interesting"
--}
+{- *********************************************************************
+* *
+ Computing ArgSummary
+* *
+********************************************************************* -}
{- Note [Interesting arguments]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
=====================================
compiler/GHC/Core/Opt/Simplify/Utils.hs
=====================================
@@ -29,6 +29,9 @@ module GHC.Core.Opt.Simplify.Utils (
mkBoringStop, mkRhsStop, mkLazyArgStop,
interestingCallContext,
+ -- The CallCtxt type
+ CallCtxt(..),
+
-- ArgInfo
ArgInfo(..), ArgSpec(..), RewriteCall(..), mkArgInfo,
addValArgTo, addCastTo, addTyArgTo,
@@ -516,6 +519,38 @@ contHoleType (Select { sc_dup = d, sc_bndr = b, sc_env = se })
= perhapsSubstTy d se (idType b)
+contHasRules :: SimplCont -> Bool
+-- If the argument has form (f x y), where x,y are boring,
+-- and f is marked INLINE, then we don't want to inline f.
+-- But if the context of the argument is
+-- g (f x y)
+-- where g has rules, then we *do* want to inline f, in case it
+-- exposes a rule that might fire. Similarly, if the context is
+-- h (g (f x x))
+-- where h has rules, then we do want to inline f. So contHasRules
+-- tries to see if the context of the f-call is a call to a function
+-- with rules.
+--
+-- The ai_encl flag makes this happen; if it's
+-- set, the inliner gets just enough keener to inline f
+-- regardless of how boring f's arguments are, if it's marked INLINE
+--
+-- The alternative would be to *always* inline an INLINE function,
+-- regardless of how boring its context is; but that seems overkill
+-- For example, it'd mean that wrapper functions were always inlined
+contHasRules cont
+ = go cont
+ where
+ go (ApplyToVal { sc_cont = cont }) = go cont
+ go (ApplyToTy { sc_cont = cont }) = go cont
+ go (CastIt _ cont) = go cont
+ go (StrictArg { sc_fun = fun }) = ai_encl fun
+ go (Stop _ RuleArgCtxt _) = True
+ go (TickIt _ c) = go c
+ go (Select {}) = False
+ go (StrictBind {}) = False -- ??
+ go (Stop _ _ _) = False
+
-- Computes the multiplicity scaling factor at the hole. That is, in (case [] of
-- x ::(p) _ { … }) (respectively for arguments of functions), the scaling
-- factor is p. And in E[G[]], the scaling factor is the product of the scaling
@@ -709,15 +744,36 @@ make use of the strictness info for the function.
-}
-{-
-************************************************************************
+{- *********************************************************************
* *
- Interesting arguments
+ CallCtxt: the context of a call
* *
-************************************************************************
+********************************************************************* -}
-Note [Interesting call context]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+data CallCtxt
+ = BoringCtxt
+ | RhsCtxt RecFlag -- Rhs of a let-binding; see Note [RHS of lets]
+ | DiscArgCtxt -- Argument of a function with non-zero arg discount
+ | RuleArgCtxt -- We are somewhere in the argument of a function with rules
+
+ | ValAppCtxt -- We're applied to at least one value arg
+ -- This arises when we have ((f x |> co) y)
+ -- Then the (f x) has argument 'x' but in a ValAppCtxt
+
+ | CaseCtxt -- We're the scrutinee of a case
+ -- that decomposes its scrutinee
+
+instance Outputable CallCtxt where
+ ppr CaseCtxt = text "CaseCtxt"
+ ppr ValAppCtxt = text "ValAppCtxt"
+ ppr BoringCtxt = text "BoringCtxt"
+ ppr (RhsCtxt ir)= text "RhsCtxt" <> parens (ppr ir)
+ ppr DiscArgCtxt = text "DiscArgCtxt"
+ ppr RuleArgCtxt = text "RuleArgCtxt"
+
+
+{- Note [Interesting call context]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We want to avoid inlining an expression where there can't possibly be
any gain, such as in an argument position. Hence, if the continuation
is interesting (eg. a case scrutinee that isn't just a seq, application etc.)
@@ -873,38 +929,6 @@ interestingCallContext env cont
-- a build it's *great* to inline it here. So we must ensure that
-- the context for (f x) is not totally uninteresting.
-contHasRules :: SimplCont -> Bool
--- If the argument has form (f x y), where x,y are boring,
--- and f is marked INLINE, then we don't want to inline f.
--- But if the context of the argument is
--- g (f x y)
--- where g has rules, then we *do* want to inline f, in case it
--- exposes a rule that might fire. Similarly, if the context is
--- h (g (f x x))
--- where h has rules, then we do want to inline f. So contHasRules
--- tries to see if the context of the f-call is a call to a function
--- with rules.
---
--- The ai_encl flag makes this happen; if it's
--- set, the inliner gets just enough keener to inline f
--- regardless of how boring f's arguments are, if it's marked INLINE
---
--- The alternative would be to *always* inline an INLINE function,
--- regardless of how boring its context is; but that seems overkill
--- For example, it'd mean that wrapper functions were always inlined
-contHasRules cont
- = go cont
- where
- go (ApplyToVal { sc_cont = cont }) = go cont
- go (ApplyToTy { sc_cont = cont }) = go cont
- go (CastIt _ cont) = go cont
- go (StrictArg { sc_fun = fun }) = ai_encl fun
- go (Stop _ RuleArgCtxt _) = True
- go (TickIt _ c) = go c
- go (Select {}) = False
- go (StrictBind {}) = False -- ??
- go (Stop _ _ _) = False
-
{-
************************************************************************
=====================================
compiler/GHC/Core/Unfold.hs
=====================================
@@ -2,7 +2,6 @@
(c) The University of Glasgow 2006
(c) The AQUA Project, Glasgow University, 1994-1998
-
Core-syntax unfoldings
Unfoldings (which can travel across module boundaries) are in Core
@@ -23,7 +22,7 @@ module GHC.Core.Unfold (
ExprTree, exprTree, exprTreeSize,
exprTreeWillInline, couldBeSmallEnoughToInline,
- ArgSummary(..), CallCtxt(..), hasArgInfo,
+ ArgSummary(..), hasArgInfo,
Size, leqSize, addSizeN, adjustSize,
InlineContext(..),
@@ -49,7 +48,7 @@ import GHC.Types.Var.Env
import GHC.Types.Literal
import GHC.Types.Id.Info
import GHC.Types.RepType ( isZeroBitTy )
-import GHC.Types.Basic ( Arity, RecFlag )
+import GHC.Types.Basic ( Arity )
import GHC.Types.ForeignCall
import GHC.Types.Tickish
@@ -65,6 +64,67 @@ import GHC.Data.Bag
import qualified Data.ByteString as BS
+{- Note [Overview of inlining heuristics]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Key examples
+------------
+Example 1:
+
+ let f x = case x of
+ A -> True
+ B -> <big>
+ in ...(f A)....(f B)...
+
+Even though f's entire RHS is big, it collapses to something small when applied
+to A. We'd like to spot this.
+
+Example 2:
+
+ let f x = case x of
+ (p,q) -> case p of
+ A -> True
+ B -> <big>
+ in ...(f (A,3))....
+
+This is similar to Example 1, but nested.
+
+Example 3:
+
+ let j x = case y of
+ A -> True
+ B -> <big>
+ in case y of
+ A -> ..(j 3)...(j 4)....
+ B -> ...
+
+Here we want to spot that although the free variable `y` is unknown at j's definition
+site, we know that y=A at the two calls in the A-alternative of the body. If `y`
+had been an argument we'd have spotted this; we'd like to get the same goodness
+when `y` is a free variable.
+
+This kind of thing can occur a lot with join points.
+
+Design overview
+---------------
+The question is whether or not to inline f = rhs.
+The key idea is to abstract `rhs` to an ExprTree, which gives a measure of
+size, but records structure for case-expressions.
+
+
+The moving parts
+-----------------
+* An unfolding is accompanied (in its UnfoldingGuidance) by its GHC.Core.ExprTree,
+ computed by GHC.Core.Unfold.exprTree.
+
+* At a call site, GHC.Core.Opt.Simplify.Inline.contArgs constructs an ArgSummary
+ for each value argument. This reflects any nested data constructors.
+
+* Then GHC.Core.Unfold.exprTreeSize takes information about the context of the
+ call (particularly the ArgSummary for each argument) and computes a final size
+ for the inlined body, taking account of case-of-known-constructor.
+
+-}
+
{- *********************************************************************
* *
UnfoldingOpts
@@ -160,73 +220,7 @@ updateReportPrefix :: Maybe String -> UnfoldingOpts -> UnfoldingOpts
updateReportPrefix n opts = opts { unfoldingReportPrefix = n }
-{- *********************************************************************
-* *
- Argument summary
-* *
-********************************************************************* -}
-
-data ArgSummary = ArgNoInfo
- | ArgIsCon AltCon [ArgSummary] -- Includes type args
- | ArgIsNot [AltCon]
- | ArgIsLam
-
-hasArgInfo :: ArgSummary -> Bool
-hasArgInfo ArgNoInfo = False
-hasArgInfo _ = True
-
-instance Outputable ArgSummary where
- ppr ArgNoInfo = text "ArgNoInfo"
- ppr ArgIsLam = text "ArgIsLam"
- ppr (ArgIsCon c as) = ppr c <> ppr as
- ppr (ArgIsNot cs) = text "ArgIsNot" <> ppr cs
-
-data CallCtxt
- = BoringCtxt
- | RhsCtxt RecFlag -- Rhs of a let-binding; see Note [RHS of lets]
- | DiscArgCtxt -- Argument of a function with non-zero arg discount
- | RuleArgCtxt -- We are somewhere in the argument of a function with rules
-
- | ValAppCtxt -- We're applied to at least one value arg
- -- This arises when we have ((f x |> co) y)
- -- Then the (f x) has argument 'x' but in a ValAppCtxt
-
- | CaseCtxt -- We're the scrutinee of a case
- -- that decomposes its scrutinee
-
-instance Outputable CallCtxt where
- ppr CaseCtxt = text "CaseCtxt"
- ppr ValAppCtxt = text "ValAppCtxt"
- ppr BoringCtxt = text "BoringCtxt"
- ppr (RhsCtxt ir)= text "RhsCtxt" <> parens (ppr ir)
- ppr DiscArgCtxt = text "DiscArgCtxt"
- ppr RuleArgCtxt = text "RuleArgCtxt"
-
{-
-Note [Calculate unfolding guidance on the non-occ-anal'd expression]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Notice that we give the non-occur-analysed expression to
-calcUnfoldingGuidance. In some ways it'd be better to occur-analyse
-first; for example, sometimes during simplification, there's a large
-let-bound thing which has been substituted, and so is now dead; so
-'expr' contains two copies of the thing while the occurrence-analysed
-expression doesn't.
-
-Nevertheless, we *don't* and *must not* occ-analyse before computing
-the size because
-
-a) The size computation bales out after a while, whereas occurrence
- analysis does not.
-
-b) Residency increases sharply if you occ-anal first. I'm not
- 100% sure why, but it's a large effect. Compiling Cabal went
- from residency of 534M to over 800M with this one change.
-
-This can occasionally mean that the guidance is very pessimistic;
-it gets fixed up next round. And it should be rare, because large
-let-bound things that are dead are usually caught by preInlineUnconditionally
-
-
************************************************************************
* *
\subsection{The UnfoldingGuidance type}
@@ -295,21 +289,18 @@ calcUnfoldingGuidance opts is_top_bottoming expr
is_case (CaseOf {}) = True
is_case (ScrutOf {}) = False
-{- We use 'couldBeSmallEnoughToInline' to avoid exporting inlinings that
- we ``couldn't possibly use'' on the other side. Can be overridden w/
- flaggery. Just the same as smallEnoughToInline, except that it has no
- actual arguments.
--}
couldBeSmallEnoughToInline :: UnfoldingOpts -> Int -> CoreExpr -> Bool
+-- We use 'couldBeSmallEnoughToInline' to avoid exporting inlinings that
+-- we ``couldn't possibly use'' on the other side. Can be overridden
+-- w/flaggery. Just the same as smallEnoughToInline, except that it has no
+-- actual arguments.
couldBeSmallEnoughToInline opts threshold rhs
= exprTreeWillInline threshold $
exprTree opts [] body
where
(_, body) = collectBinders rhs
-----------------
-
{- Note [Inline unsafeCoerce]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1035,6 +1026,22 @@ data InlineContext
-- so apply result discount
}
+data ArgSummary = ArgNoInfo
+ | ArgIsCon AltCon [ArgSummary] -- Includes type args
+ | ArgIsNot [AltCon]
+ | ArgIsLam
+
+hasArgInfo :: ArgSummary -> Bool
+hasArgInfo ArgNoInfo = False
+hasArgInfo _ = True
+
+instance Outputable ArgSummary where
+ ppr ArgNoInfo = text "ArgNoInfo"
+ ppr ArgIsLam = text "ArgIsLam"
+ ppr (ArgIsCon c as) = ppr c <> ppr as
+ ppr (ArgIsNot cs) = text "ArgIsNot" <> ppr cs
+
+
-------------------------
exprTreeWillInline :: Int -> ExprTree -> Bool
-- (cheapExprTreeSize limit et) takes an upper bound `n` on the
@@ -1057,6 +1064,8 @@ exprTreeWillInline limit et
go_alt :: AltTree -> (Int -> Bool) -> Int -> Bool
go_alt (AltTree _ _ et) k n = go et k (n+10)
+
+-------------------------
exprTreeSize :: InlineContext -> ExprTree -> Size
exprTreeSize _ TooBig = STooBig
exprTreeSize !ic (SizeIs { et_size = size
=====================================
compiler/GHC/Core/Unfold/Make.hs
=====================================
@@ -312,6 +312,29 @@ Note [Honour INLINE on 0-ary bindings].
I'm a bit worried that it's possible for the same kind of problem
to arise for non-0-ary functions too, but let's wait and see.
+
+Note [Calculate unfolding guidance on the non-occ-anal'd expression]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Notice that we give the non-occur-analysed expression to
+calcUnfoldingGuidance. In some ways it'd be better to occur-analyse
+first; for example, sometimes during simplification, there's a large
+let-bound thing which has been substituted, and so is now dead; so
+'expr' contains two copies of the thing while the occurrence-analysed
+expression doesn't.
+
+Nevertheless, we *don't* and *must not* occ-analyse before computing
+the size because
+
+a) The size computation bales out after a while, whereas occurrence
+ analysis does not.
+
+b) Residency increases sharply if you occ-anal first. I'm not
+ 100% sure why, but it's a large effect. Compiling Cabal went
+ from residency of 534M to over 800M with this one change.
+
+This can occasionally mean that the guidance is very pessimistic;
+it gets fixed up next round. And it should be rare, because large
+let-bound things that are dead are usually caught by preInlineUnconditionally
-}
mkUnfolding :: UnfoldingOpts
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/f5fbeeb2fd05518f1afb2ee3225bc083b960a81a
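
The idea described in Note [Overview of inlining heuristics] (keep the case structure of the RHS, then work out a size at the call site once the arguments are known) can be illustrated with a toy model. This is only a sketch under stated assumptions, not GHC's code: `Con`, `ToyArg`, `ToyTree`, `lookupArg`, `toySize` and `fTree` are hypothetical names, the sizes 1 and 100 are made up, and the real `ExprTree`, `ArgSummary` and `exprTreeSize` carry more cases, nested summaries and discounts.

```haskell
-- Toy stand-ins only; none of these are GHC's real definitions.
data Con = A | B deriving (Eq, Show)

-- What a call site knows about one argument
-- (a much-simplified stand-in for ArgSummary).
data ToyArg
  = NoInfo        -- nothing is known about the argument
  | IsCon Con     -- the argument is known to be this constructor
  deriving (Eq, Show)

-- A size measure that still remembers case-expressions on arguments
-- (a much-simplified stand-in for ExprTree).
data ToyTree
  = Leaf Int                     -- ordinary code of the given size
  | CaseOn Int [(Con, ToyTree)]  -- case on argument number i (0-based)
  deriving Show

-- Look up the summary of argument i; missing arguments carry no info.
lookupArg :: Int -> [ToyArg] -> ToyArg
lookupArg i args = case drop i args of
  (a : _) -> a
  []      -> NoInfo

-- Size of the body once it is specialised to the given argument summaries.
toySize :: [ToyArg] -> ToyTree -> Int
toySize _ (Leaf n) = n
toySize args (CaseOn i alts) =
  case lookupArg i args of
    -- Case-of-known-constructor: only the matching alternative survives.
    IsCon c | Just t <- lookup c alts -> toySize args t
    -- Otherwise pay for the case itself plus every alternative.
    _ -> 1 + sum (map (toySize args . snd) alts)

-- Example 1 from the Note: f x = case x of { A -> True; B -> <big> }
fTree :: ToyTree
fTree = CaseOn 0 [(A, Leaf 1), (B, Leaf 100)]
```

In this model `toySize [IsCon A] fTree` is small while `toySize [NoInfo] fTree` includes the big branch, which is the distinction Example 1 asks the inliner to spot. Handling the nested Example 2 would additionally need `IsCon` to carry summaries of the constructor's fields, as the `[ArgSummary]` field of `ArgIsCon` does in the diff.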