[Git][ghc/ghc][wip/T24124] Lower seq# early, in CorePrep (#24124)
Sebastian Graf (@sgraf812)
gitlab at gitlab.haskell.org
Sun Dec 10 10:32:42 UTC 2023
Sebastian Graf pushed to branch wip/T24124 at Glasgow Haskell Compiler / GHC
Commits:
97c2463f by Sebastian Graf at 2023-12-10T11:32:27+01:00
Lower seq# early, in CorePrep (#24124)
We can save many explanations in Tag Inference and StgToCmm in doing so.
See the updated `Note [seq# magic]`.
I also implemented a new `Note [Flatten case-bind]`
to get better code for otherwise nested case scrutinees.
Fixes #24124.
- - - - -
6 changed files:
- compiler/GHC/Builtin/primops.txt.pp
- compiler/GHC/Core/Opt/ConstantFold.hs
- compiler/GHC/CoreToStg/Prep.hs
- compiler/GHC/Stg/InferTags.hs
- compiler/GHC/StgToCmm/Expr.hs
- testsuite/tests/simplStg/should_compile/T15226b.stderr
Changes:
=====================================
compiler/GHC/Builtin/primops.txt.pp
=====================================
@@ -3646,6 +3646,7 @@ primop SeqOp "seq#" GenPrimOp
with
effect = ThrowsException
work_free = True -- seq# does work iff its lifted arg does work
+ -- no strictness signature: See Note [seq# magic], (SEQ2)
primop GetSparkOp "getSpark#" GenPrimOp
State# s -> (# State# s, Int#, a #)
=====================================
compiler/GHC/Core/Opt/ConstantFold.hs
=====================================
@@ -2054,7 +2054,8 @@ The semantics of seq# is
Things to note
-* Why do we need a primop at all? That is, instead of
+(SEQ1)
+ Why do we need a primop at all? That is, instead of
case seq# x s of (# x, s #) -> blah
why not instead say this?
case x of { DEFAULT -> blah }
@@ -2069,7 +2070,16 @@ Things to note
In short, we /always/ evaluate the first argument and never
just discard it.
-* Why return the value? So that we can control sharing of seq'd
+(SEQ2)
+ `seq#` evaluates its argument, but does /not/ expose that strictness
+ in its strictness signature. Why not? Because `seq#` is intended to mean
+ "evaluate this argument now -- not earlier". For example:
+ do { evaluate x; evaluate y }
+ should evaluate `x` and then `y`. If `seq#` was visibly strict, they
+ might be evaluated in the opposite order.
+
+(SEQ3)
+ Why return the value? So that we can control sharing of seq'd
values: in
let x = e in x `seq` ... x ...
We don't want to inline x, so better to represent it as
@@ -2080,14 +2090,35 @@ Implementing seq#. The compiler has magic for SeqOp in
- GHC.Core.Opt.ConstantFold.seqRule: eliminate (seq# <whnf> s)
-- GHC.StgToCmm.Expr.cgExpr, and cgCase: special case for seq#
-
- Simplify.addEvals records evaluated-ness for the result; see
Note [Adding evaluatedness info to pattern-bound variables]
in GHC.Core.Opt.Simplify.Iteration
-- Likewise, GHC.Stg.InferTags.inferTagExpr knows that seq# returns a
- properly-tagged pointer inside of its unboxed-tuple result.
+- GHC.CoreToStg.Prep: Lower seq# to a Case, e.g.,
+
+ case seq# (f 13) s of (# s', r #) -> rhs
+ ==>
+ case f 13 of sat of __DEFAULT -> rhs[sat/r,s/s']
+
+ this is implemented in two steps, not unlike Note [runRW magic], but
+ unfortunately not entirely local to `cpeApp`:
+
+ 1. In `cpeApp`, lower the application
+ seq# (f 13) s
+ ==>
+ case f 13 of sat __DEFAULT -> (# s, sat #)
+ 2. In `cpeRhsE Case{}`, catch the opportunity for beta reducing
+ case (# s, sat #) of (# s', r #) -> rhs
+ ==>
+ rhs[sat/r,s/s']
+
+ While (2) would be done by Unarise, it is not optional, because
+ substituting here allows us to carry over demand info and evaluatedness
+ to detect more values in `rhs`; see Note [Pin demand info on floats].
+
+ Note that CorePrep really allocates a strict Float for `f 13`.
+ That's OK, because the telescope of Floats always stays in the same order,
+ so all guarantees of evaluation order provided by seq# are upheld.
-}
seqRule :: RuleM CoreExpr
=====================================
compiler/GHC/CoreToStg/Prep.hs
=====================================
@@ -30,6 +30,7 @@ import GHC.Unit
import GHC.Builtin.Names
import GHC.Builtin.Types
+import GHC.Builtin.PrimOps
import GHC.Core.Utils
import GHC.Core.Opt.Arity
@@ -159,7 +160,7 @@ Here is the syntax of the Core produced by CorePrep:
Trivial expressions
arg ::= lit | var
| arg ty | /\a. arg
- | truv co | /\c. arg | arg |> co
+ | arg co | /\c. arg | arg |> co
Applications
app ::= lit | var | app arg | app ty | app co | app |> co
@@ -179,7 +180,7 @@ with the corresponding name produce a result in that syntax.
-}
type CpeArg = CoreExpr -- Non-terminal 'arg'
-type CpeApp = CoreExpr -- Non-terminal 'app'
+type AIApp = CoreExpr -- Non-terminal 'app'
type CpeBody = CoreExpr -- Non-terminal 'body'
type CpeRhs = CoreExpr -- Non-terminal 'rhs'
@@ -839,16 +840,38 @@ cpeRhsE env (Case scrut bndr _ alts@[Alt con bs _])
cpeRhsE env (Case scrut bndr ty alts)
= do { (floats, scrut') <- cpeBody env scrut
+ -- See Note [seq# magic]. This is step (2) for CorePrep
+ ; case alts of
+ [Alt (DataAlt dc) [token,thing] rhs]
+ | isTupleDataCon dc
+ , isDeadBinder bndr
+ , Var v `App` Type{} `App` Type{} `App` Type{} `App` Type{} `App` Var token' `App` Var thing' <- scrut'
+ , Just dc' <- isDataConWorkId_maybe v, dc' == dc
+ -> do { rhs' <- cpeBodyNF (extendCorePrepEnvList env [(token,token'), (thing,thing')]) rhs
+ ; return (floats, rhs') }
+ _ -> do {
+ -- End of seq# magic
; (env', bndr2) <- cpCloneBndr env bndr
; let alts'
| cp_catchNonexhaustiveCases $ cpe_config env
+ -- Suppose the alternatives do not cover all the data constructors of the type.
+ -- That may be fine: perhaps an earlier case has dealt with the missing cases.
+ -- But this is a relatively sophisticated property, so we provide a GHC-debugging flag
+ -- `-fcatch-nonexhaustive-cases` which adds a DEFAULT alternative to such cases
+ -- (This alternative will only be taken if there is a bug in GHC.)
, not (altsAreExhaustive alts)
= addDefault alts (Just err)
| otherwise = alts
where err = mkImpossibleExpr ty "cpeRhsE: missing case alternative"
; alts'' <- mapM (sat_alt env') alts'
- ; return (floats, Case scrut' bndr2 ty alts'') }
+ ; case alts'' of
+ [Alt DEFAULT _ rhs] -- See Note [Flatten case-binds]
+ | let is_unlifted = mightBeUnliftedType (idType bndr2)
+ , let float = mkCaseFloat is_unlifted bndr2 scrut'
+ -- evalDmd states that this is a strict float
+ -> return (snocFloat floats float, rhs)
+ _ -> return (floats, Case scrut' bndr2 ty alts'') }}
where
sat_alt env (Alt con bs rhs)
= do { (env2, bs') <- cpCloneBndrs env bs
@@ -937,14 +960,14 @@ and it's extra work.
-- CpeApp: produces a result satisfying CpeApp
-- ---------------------------------------------------------------------------
-data ArgInfo = CpeApp CoreArg
- | CpeCast Coercion
- | CpeTick CoreTickish
+data ArgInfo = AIApp CoreArg -- NB: Not a CpeApp yet
+ | AICast Coercion
+ | AITick CoreTickish
instance Outputable ArgInfo where
- ppr (CpeApp arg) = text "app" <+> ppr arg
- ppr (CpeCast co) = text "cast" <+> ppr co
- ppr (CpeTick tick) = text "tick" <+> ppr tick
+ ppr (AIApp arg) = text "app" <+> ppr arg
+ ppr (AICast co) = text "cast" <+> ppr co
+ ppr (AITick tick) = text "tick" <+> ppr tick
{- Note [Ticks and mandatory eta expansion]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1005,9 +1028,9 @@ cpeApp top_env expr
collect_args e = go e []
where
go (App fun arg) as
- = go fun (CpeApp arg : as)
+ = go fun (AIApp arg : as)
go (Cast fun co) as
- = go fun (CpeCast co : as)
+ = go fun (AICast co : as)
go (Tick tickish fun) as
-- Profiling ticks are slightly less strict so we expand their scope
-- if they cover partial applications of things like primOps.
@@ -1020,7 +1043,7 @@ cpeApp top_env expr
, etaExpansionTick head' tickish
= (head,as')
where
- (head,as') = go fun (CpeTick tickish : as)
+ (head,as') = go fun (AITick tickish : as)
-- Terminal could still be an app if it's wrapped by a tick.
-- E.g. Tick<foo> (f x) can give us (f x) as terminal.
@@ -1030,7 +1053,7 @@ cpeApp top_env expr
-> CoreExpr -- The thing we are calling
-> [ArgInfo]
-> UniqSM (Floats, CpeRhs)
- cpe_app env (Var f) (CpeApp Type{} : CpeApp arg : args)
+ cpe_app env (Var f) (AIApp Type{} : AIApp arg : args)
| f `hasKey` lazyIdKey -- Replace (lazy a) with a, and
-- See Note [lazyId magic] in GHC.Types.Id.Make
|| f `hasKey` noinlineIdKey || f `hasKey` noinlineConstraintIdKey
@@ -1056,24 +1079,36 @@ cpeApp top_env expr
in cpe_app env terminal (args' ++ args)
-- runRW# magic
- cpe_app env (Var f) (CpeApp _runtimeRep at Type{} : CpeApp _type at Type{} : CpeApp arg : rest)
+ cpe_app env (Var f) (AIApp _runtimeRep at Type{} : AIApp _type at Type{} : AIApp arg : rest)
| f `hasKey` runRWKey
-- N.B. While it may appear that n == 1 in the case of runRW#
-- applications, keep in mind that we may have applications that return
- , has_value_arg (CpeApp arg : rest)
+ , has_value_arg (AIApp arg : rest)
-- See Note [runRW magic]
-- Replace (runRW# f) by (f realWorld#), beta reducing if possible (this
-- is why we return a CorePrepEnv as well)
= case arg of
Lam s body -> cpe_app (extendCorePrepEnv env s realWorldPrimId) body rest
- _ -> cpe_app env arg (CpeApp (Var realWorldPrimId) : rest)
+ _ -> cpe_app env arg (AIApp (Var realWorldPrimId) : rest)
-- TODO: What about casts?
where
has_value_arg [] = False
- has_value_arg (CpeApp arg:_rest)
+ has_value_arg (AIApp arg:_rest)
| not (isTyCoArg arg) = True
has_value_arg (_:rest) = has_value_arg rest
+ -- See Note [seq# magic]. This is step (1) for CorePrep
+ cpe_app env (Var f) [AIApp (Type ty), AIApp _st_ty at Type{}, AIApp thing, AIApp (Var token)]
+ | PrimOpId SeqOp _ <- idDetails f
+ -- seq# thing token ==> case thing of res { __DEFAULT -> (# token, res#) },
+ -- allocating a Float for (case thing of res { __DEFAULT -> _ })
+ = do { (floats, thing) <- cpeBody env thing
+ ; case_bndr <- newVar ty
+ ; let tup = mkCoreUnboxedTuple [lookupCorePrepEnv env token, Var case_bndr]
+ ; let is_unlifted = False -- otherwise seq# would not type-check
+ ; let float = mkCaseFloat is_unlifted case_bndr thing
+ ; return (floats `snocFloat` float, tup) }
+
cpe_app env (Var v) args
= do { v1 <- fiddleCCall v
; let e2 = lookupCorePrepEnv env v1
@@ -1120,13 +1155,13 @@ cpeApp top_env expr
go [] !n = n
go (info:infos) n =
case info of
- CpeCast {} -> go infos n
- CpeTick tickish
+ AICast {} -> go infos n
+ AITick tickish
| tickishFloatable tickish -> go infos n
-- If we can't guarantee a tick will be floated out of the application
-- we can't guarantee the value args following it will be applied.
| otherwise -> n
- CpeApp e -> go infos n'
+ AIApp e -> go infos n'
where
!n'
| isTypeArg e = n
@@ -1148,12 +1183,12 @@ cpeApp top_env expr
rebuild_app
:: CorePrepEnv
-> [ArgInfo] -- The arguments (inner to outer)
- -> CpeApp -- The function
+ -> AIApp -- The function
-> Floats -- INVARIANT: These floats don't bind anything that is in the CpeApp!
-- Just stuff floated out from the head of the application.
-> [Demand]
-> Maybe Arity
- -> UniqSM (CpeApp
+ -> UniqSM (AIApp
,Floats
,[CoreTickish] -- Underscoped ticks. See Note [Ticks and mandatory eta expansion]
)
@@ -1163,12 +1198,12 @@ cpeApp top_env expr
rebuild_app'
:: CorePrepEnv
-> [ArgInfo] -- The arguments (inner to outer)
- -> CpeApp
+ -> AIApp
-> Floats
-> [Demand]
-> [CoreTickish]
-> Int -- Number of arguments required to satisfy minimal tick scopes.
- -> UniqSM (CpeApp, Floats, [CoreTickish])
+ -> UniqSM (AIApp, Floats, [CoreTickish])
rebuild_app' _ [] app floats ss rt_ticks !_req_depth
= assertPpr (null ss) (ppr ss)-- make sure we used all the strictness info
return (app, floats, rt_ticks)
@@ -1182,13 +1217,13 @@ cpeApp top_env expr
let tick_fun = foldr mkTick fun' rt_ticks
in rebuild_app' env (a : as) tick_fun floats ss rt_ticks req_depth
- CpeApp (Type arg_ty)
+ AIApp (Type arg_ty)
-> rebuild_app' env as (App fun' (Type arg_ty)) floats ss rt_ticks req_depth
- CpeApp (Coercion co)
+ AIApp (Coercion co)
-> rebuild_app' env as (App fun' (Coercion co)) floats (drop 1 ss) rt_ticks req_depth
- CpeApp arg -> do
+ AIApp arg -> do
let (ss1, ss_rest) -- See Note [lazyId magic] in GHC.Types.Id.Make
= case (ss, isLazyExpr arg) of
(_ : ss_rest, True) -> (topDmd, ss_rest)
@@ -1197,10 +1232,10 @@ cpeApp top_env expr
(fs, arg') <- cpeArg top_env ss1 arg
rebuild_app' env as (App fun' arg') (fs `zipFloats` floats) ss_rest rt_ticks (req_depth-1)
- CpeCast co
+ AICast co
-> rebuild_app' env as (Cast fun' co) floats ss rt_ticks req_depth
-- See Note [Ticks and mandatory eta expansion]
- CpeTick tickish
+ AITick tickish
| tickishPlace tickish == PlaceRuntime
, req_depth > 0
-> assert (isProfTick tickish) $
@@ -1536,7 +1571,7 @@ applications here as well but due to this fragility (see #16846) we now deal
with this another way, as described in Note [Primop wrappers] in GHC.Builtin.PrimOps.
-}
-maybeSaturate :: Id -> CpeApp -> Int -> [CoreTickish] -> UniqSM CpeRhs
+maybeSaturate :: Id -> AIApp -> Int -> [CoreTickish] -> UniqSM CpeRhs
maybeSaturate fn expr n_args unsat_ticks
| hasNoBinding fn -- There's no binding
= return $ wrapLamBody (\body -> foldr mkTick body unsat_ticks) sat_expr
@@ -1704,6 +1739,27 @@ Note [Pin demand info on floats]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We pin demand info on floated lets, so that we can see the one-shot thunks.
+Note [Flatten case-binds]
+~~~~~~~~~~~~~~~~~~~~~~~~~
+Consider the following program involving seq#:
+
+ data T a = T !a
+ ... case seq# (case x of y { __DEFAULT -> T y }) s of (# s', x' #) -> rhs
+ ==> {ANFise, lowering seq# as in Note [seq# magic]}
+ ... case (case x of y { __DEFAULT -> T y }) of sat { __DEFAULT -> rhs[s/s',sat/x'] }
+
+(Why didn't the Simplifier float out `case x of y`? Because `seq#` is lazy;
+see Note [seq# magic].)
+Note the case-of-case. This is not bad per sé, but we can easily flatten
+this situation by calling `mkNonRecFloat` to create strict binding `y=x`:
+
+ ... case x of y { __DEFAULT -> let sat = T y in rhs[s/s',sat/x'] } ...
+
+where `T y` is simply let-bound, thus far less likely to confuse passes
+downstream. We simply achieve this by calling `mkNonRecFloat` in the `Case`
+equation of `cpeRhsE` to create a strict float (`evalDmd`). This mirrors what we
+do for let-bindings, when we create a LetBound float: see `cpeBind`.
+
Note [Speculative evaluation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Since call-by-value is much cheaper than call-by-need, we case-bind arguments
@@ -1982,12 +2038,27 @@ zipFloats = appFloats
zipManyFloats :: [Floats] -> Floats
zipManyFloats = foldr zipFloats emptyFloats
+mkCaseFloat :: Bool -> Id -> CpeRhs -> FloatingBind
+mkCaseFloat is_unlifted bndr scrut = Float (NonRec bndr scrut) bound info
+ where
+ (bound, info)
+ -- See the comments in mkNonRecFloat for the classification
+ | is_lifted, is_hnf = (LetBound, TopLvlFloatable)
+ | is_data_con bndr = (LetBound, TopLvlFloatable)
+ | exprIsTickedString scrut = (CaseBound, TopLvlFloatable)
+ | otherwise = (CaseBound, StrictContextFloatable)
+ -- For a Case, we never want to drop the eval; hence no need to test
+ -- for ok-for-spec-eval
+ is_lifted = not is_unlifted
+ is_hnf = exprIsHNF scrut
+ is_data_con = isJust . isDataConId_maybe
+
mkNonRecFloat :: CorePrepEnv -> Demand -> Bool -> Id -> CpeRhs -> FloatingBind
mkNonRecFloat env dmd is_unlifted bndr rhs = -- pprTraceWith "mkNonRecFloat" ppr $
Float (NonRec bndr' rhs) bound info
where
bndr' = setIdDemandInfo bndr dmd -- See Note [Pin demand info on floats]
- (bound,info)
+ (bound, info)
| is_lifted, is_hnf = (LetBound, TopLvlFloatable)
-- is_lifted: We currently don't allow unlifted values at the
-- top-level or inside letrecs
@@ -2012,7 +2083,7 @@ mkNonRecFloat env dmd is_unlifted bndr rhs = -- pprTraceWith "mkNonRecFloat" ppr
is_lifted = not is_unlifted
is_hnf = exprIsHNF rhs
- is_strict = isStrUsedDmd dmd
+ is_strict = isStrUsedDmd dmd || isEvaldUnfolding (idUnfolding bndr)
ok_for_spec = exprOkForSpecEval (not . is_rec_call) rhs
is_rec_call = (`elemUnVarSet` cpe_rec_ids env)
is_data_con = isJust . isDataConId_maybe
=====================================
compiler/GHC/Stg/InferTags.hs
=====================================
@@ -19,7 +19,6 @@ import GHC.Types.Basic ( CbvMark (..) )
import GHC.Types.Unique.Supply (mkSplitUniqSupply)
import GHC.Types.RepType (dataConRuntimeRepStrictness)
import GHC.Core (AltCon(..))
-import GHC.Builtin.PrimOps ( PrimOp(..) )
import Data.List (mapAccumL)
import GHC.Utils.Outputable
import GHC.Utils.Misc( zipWithEqual, zipEqual, notNull )
@@ -333,21 +332,7 @@ inferTagExpr env (StgTick tick body)
(info, body') = inferTagExpr env body
inferTagExpr _ (StgOpApp op args ty)
- | StgPrimOp SeqOp <- op
- -- Recall seq# :: a -> State# s -> (# State# s, a #)
- -- However the output State# token has been unarised away,
- -- so we now effectively have
- -- seq# :: a -> State# s -> (# a #)
- -- The key point is the result of `seq#` is guaranteed evaluated and properly
- -- tagged (because that result comes directly from evaluating the arg),
- -- and we want tag inference to reflect that knowledge (#15226).
- -- Hence `TagTuple [TagProper]`.
- -- See Note [seq# magic] in GHC.Core.Opt.ConstantFold
- = (TagTuple [TagProper], StgOpApp op args ty)
- -- Do any other primops guarantee to return a properly tagged value?
- -- Probably not, and that is the conservative assumption anyway.
- -- (And foreign calls definitely need not make promises.)
- | otherwise = (TagDunno, StgOpApp op args ty)
+ = (TagDunno, StgOpApp op args ty)
inferTagExpr env (StgLet ext bind body)
= (info, StgLet ext bind' body')
=====================================
compiler/GHC/StgToCmm/Expr.hs
=====================================
@@ -68,11 +68,6 @@ cgExpr :: CgStgExpr -> FCode ReturnKind
cgExpr (StgApp fun args) = cgIdApp fun args
--- seq# a s ==> a
--- See Note [seq# magic] in GHC.Core.Opt.ConstantFold
-cgExpr (StgOpApp (StgPrimOp SeqOp) [StgVarArg a, _] _res_ty) =
- cgIdApp a []
-
-- dataToTagLarge# :: a_levpoly -> Int#
-- See Note [DataToTag overview] in GHC.Tc.Instance.Class
-- TODO: There are some more optimization ideas for this code path
@@ -553,27 +548,6 @@ cgCase scrut@(StgApp v []) _ (PrimAlt _) _
; return AssignedDirectly
}
-{- Note [Handle seq#]
-~~~~~~~~~~~~~~~~~~~~~
-See Note [seq# magic] in GHC.Core.Opt.ConstantFold.
-The special case for seq# in cgCase does this:
-
- case seq# a s of v
- (# s', a' #) -> e
-==>
- case a of v
- (# s', a' #) -> e
-
-(taking advantage of the fact that the return convention for (# State#, a #)
-is the same as the return convention for just 'a')
--}
-
-cgCase (StgOpApp (StgPrimOp SeqOp) [StgVarArg a, _] _) bndr alt_type alts
- = -- Note [Handle seq#]
- -- And see Note [seq# magic] in GHC.Core.Opt.ConstantFold
- -- Use the same return convention as vanilla 'a'.
- cgCase (StgApp a []) bndr alt_type alts
-
cgCase scrut bndr alt_type alts
= -- the general case
do { platform <- getPlatform
=====================================
testsuite/tests/simplStg/should_compile/T15226b.stderr
=====================================
@@ -19,16 +19,16 @@ T15226b.testFun1
-> (# GHC.Prim.State# GHC.Prim.RealWorld, T15226b.StrictPair a b #)
[GblId, Arity=3, Str=<L><ML><L>, Unf=OtherCon []] =
{} \r [x y void]
- case seq# [x GHC.Prim.void#] of ds1 {
- Solo# ipv1 [Occ=Once1] ->
+ case x of sat {
+ __DEFAULT ->
+ case y of conrep {
+ __DEFAULT ->
let {
sat [Occ=Once1] :: T15226b.StrictPair a b
[LclId] =
- {ipv1, y} \u []
- case y of conrep {
- __DEFAULT -> T15226b.MkStrictPair [ipv1 conrep];
- };
- } in seq# [sat GHC.Prim.void#];
+ T15226b.MkStrictPair! [sat conrep];
+ } in Solo# [sat];
+ };
};
T15226b.testFun
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/97c2463f2499cdff416bb59f4a02029cb481229e
--
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/97c2463f2499cdff416bb59f4a02029cb481229e
You're receiving this email because of your account on gitlab.haskell.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20231210/391eaeae/attachment-0001.html>
More information about the ghc-commits
mailing list