[Git][ghc/ghc][wip/T24124] Lower seq# early, in CorePrep (#24124)

Sun Dec 10 10:42:03 UTC 2023

Sebastian Graf pushed to branch wip/T24124 at Glasgow Haskell Compiler / GHC


Commits:
041da341 by Sebastian Graf at 2023-12-10T11:34:07+01:00
Lower seq# early, in CorePrep (#24124)

We can save many explanations in Tag Inference and StgToCmm in doing so.
See the updated `Note [seq# magic]`.
I also implemented a new `Note [Flatten case-bind]`
to get better code for otherwise nested case scrutinees.

I renamed the contructors of `ArgInfo` to use an `AI` prefix in order to
resolve the clash between `type CpeApp = CoreExpr` and the data constructor of
`ArgInfo`. Fixes #24252.

Fixes #24124.

- - - - -


6 changed files:

- compiler/GHC/Builtin/primops.txt.pp
- compiler/GHC/Core/Opt/ConstantFold.hs
- compiler/GHC/CoreToStg/Prep.hs
- compiler/GHC/Stg/InferTags.hs
- compiler/GHC/StgToCmm/Expr.hs
- testsuite/tests/simplStg/should_compile/T15226b.stderr


Changes:

=====================================
compiler/GHC/Builtin/primops.txt.pp
=====================================
@@ -3646,6 +3646,7 @@ primop SeqOp "seq#" GenPrimOp
    with
    effect = ThrowsException
    work_free = True -- seq# does work iff its lifted arg does work
+   -- no strictness signature: See Note [seq# magic], (SEQ2)
 
 primop GetSparkOp "getSpark#" GenPrimOp
    State# s -> (# State# s, Int#, a #)


=====================================
compiler/GHC/Core/Opt/ConstantFold.hs
=====================================
@@ -2054,7 +2054,8 @@ The semantics of seq# is
 
 Things to note
 
-* Why do we need a primop at all?  That is, instead of
+(SEQ1)
+  Why do we need a primop at all?  That is, instead of
       case seq# x s of (# x, s #) -> blah
   why not instead say this?
       case x of { DEFAULT -> blah }
@@ -2069,7 +2070,16 @@ Things to note
   In short, we /always/ evaluate the first argument and never
   just discard it.
 
-* Why return the value?  So that we can control sharing of seq'd
+(SEQ2)
+  `seq#` evaluates its argument, but does /not/ expose that strictness
+  in its strictness signature. Why not? Because `seq#` is intended to mean
+  "evaluate this argument now -- not earlier". For example:
+    do { evaluate x; evaluate y }
+  should evaluate `x` and then `y`.  If `seq#` was visibly strict, they
+  might be evaluated in the opposite order.
+
+(SEQ3)
+  Why return the value?  So that we can control sharing of seq'd
   values: in
      let x = e in x `seq` ... x ...
   We don't want to inline x, so better to represent it as
@@ -2080,14 +2090,35 @@ Implementing seq#.  The compiler has magic for SeqOp in
 
 - GHC.Core.Opt.ConstantFold.seqRule: eliminate (seq# <whnf> s)
 
-- GHC.StgToCmm.Expr.cgExpr, and cgCase: special case for seq#
-
 - Simplify.addEvals records evaluated-ness for the result; see
   Note [Adding evaluatedness info to pattern-bound variables]
   in GHC.Core.Opt.Simplify.Iteration
 
-- Likewise, GHC.Stg.InferTags.inferTagExpr knows that seq# returns a
-  properly-tagged pointer inside of its unboxed-tuple result.
+- GHC.CoreToStg.Prep: Lower seq# to a Case, e.g.,
+
+    case seq# (f 13) s of (# s', r #) -> rhs
+    ==>
+    case f 13 of sat of __DEFAULT -> rhs[sat/r,s/s']
+
+  this is implemented in two steps, not unlike Note [runRW magic], but
+  unfortunately not entirely local to `cpeApp`:
+
+    1. In `cpeApp`, lower the application
+         seq# (f 13) s
+         ==>
+         case f 13 of sat __DEFAULT -> (# s, sat #)
+    2. In `cpeRhsE Case{}`, catch the opportunity for beta reducing
+         case (# s, sat #) of (# s', r #) -> rhs
+         ==>
+         rhs[sat/r,s/s']
+
+  While (2) would be done by Unarise, it is not optional, because
+  substituting here allows us to carry over demand info and evaluatedness
+  to detect more values in `rhs`; see Note [Pin demand info on floats].
+
+  Note that CorePrep really allocates a strict Float for `f 13`.
+  That's OK, because the telescope of Floats always stays in the same order,
+  so all guarantees of evaluation order provided by seq# are upheld.
 -}
 
 seqRule :: RuleM CoreExpr


=====================================
compiler/GHC/CoreToStg/Prep.hs
=====================================
@@ -30,6 +30,7 @@ import GHC.Unit
 
 import GHC.Builtin.Names
 import GHC.Builtin.Types
+import GHC.Builtin.PrimOps
 
 import GHC.Core.Utils
 import GHC.Core.Opt.Arity
@@ -159,7 +160,7 @@ Here is the syntax of the Core produced by CorePrep:
     Trivial expressions
        arg ::= lit |  var
               | arg ty  |  /\a. arg
-              | truv co  |  /\c. arg  |  arg |> co
+              | arg co  |  /\c. arg  |  arg |> co
 
     Applications
        app ::= lit  |  var  |  app arg  |  app ty  | app co | app |> co
@@ -167,7 +168,7 @@ Here is the syntax of the Core produced by CorePrep:
     Expressions
        body ::= app
               | let(rec) x = rhs in body     -- Boxed only
-              | case app of pat -> body
+              | case body of pat -> body
               | /\a. body | /\c. body
               | body |> co
 
@@ -839,16 +840,38 @@ cpeRhsE env (Case scrut bndr _ alts@[Alt con bs _])
 
 cpeRhsE env (Case scrut bndr ty alts)
   = do { (floats, scrut') <- cpeBody env scrut
+       -- See Note [seq# magic]. This is step (2) for CorePrep
+       ; case alts of
+           [Alt (DataAlt dc) [token,thing] rhs]
+             | isTupleDataCon dc
+             , isDeadBinder bndr
+             , Var v `App` Type{} `App` Type{} `App` Type{} `App` Type{} `App` Var token' `App` Var thing' <- scrut'
+             , Just dc' <- isDataConWorkId_maybe v, dc' == dc
+             -> do { rhs' <- cpeBodyNF (extendCorePrepEnvList env [(token,token'), (thing,thing')]) rhs
+                   ; return (floats, rhs') }
+           _ -> do {
+       -- End of seq# magic
        ; (env', bndr2) <- cpCloneBndr env bndr
        ; let alts'
                | cp_catchNonexhaustiveCases $ cpe_config env
+                 -- Suppose the alternatives do not cover all the data constructors of the type.
+                 -- That may be fine: perhaps an earlier case has dealt with the missing cases.
+                 -- But this is a relatively sophisticated property, so we provide a GHC-debugging flag
+                 -- `-fcatch-nonexhaustive-cases` which adds a DEFAULT alternative to such cases
+                 -- (This alternative will only be taken if there is a bug in GHC.)
                , not (altsAreExhaustive alts)
                = addDefault alts (Just err)
                | otherwise = alts
                where err = mkImpossibleExpr ty "cpeRhsE: missing case alternative"
        ; alts'' <- mapM (sat_alt env') alts'
 
-       ; return (floats, Case scrut' bndr2 ty alts'') }
+       ; case alts'' of
+           [Alt DEFAULT _ rhs] -- See Note [Flatten case-binds]
+             | let is_unlifted = mightBeUnliftedType (idType bndr2)
+             , let float = mkCaseFloat is_unlifted bndr2 scrut'
+                 -- evalDmd states that this is a strict float
+             -> return (snocFloat floats float, rhs)
+           _ -> return (floats, Case scrut' bndr2 ty alts'') }}
   where
     sat_alt env (Alt con bs rhs)
        = do { (env2, bs') <- cpCloneBndrs env bs
@@ -937,14 +960,14 @@ and it's extra work.
 --              CpeApp: produces a result satisfying CpeApp
 -- ---------------------------------------------------------------------------
 
-data ArgInfo = CpeApp  CoreArg
-             | CpeCast Coercion
-             | CpeTick CoreTickish
+data ArgInfo = AIApp  CoreArg -- NB: Not a CpeApp yet
+             | AICast Coercion
+             | AITick CoreTickish
 
 instance Outputable ArgInfo where
-  ppr (CpeApp arg) = text "app" <+> ppr arg
-  ppr (CpeCast co) = text "cast" <+> ppr co
-  ppr (CpeTick tick) = text "tick" <+> ppr tick
+  ppr (AIApp arg) = text "app" <+> ppr arg
+  ppr (AICast co) = text "cast" <+> ppr co
+  ppr (AITick tick) = text "tick" <+> ppr tick
 
 {- Note [Ticks and mandatory eta expansion]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -986,7 +1009,7 @@ cpe_app filters out the tick as a underscoped tick on the expression
 body of the eta-expansion lambdas. Giving us `\x -> Tick<foo> (tagToEnum# @Bool x)`.
 -}
 cpeApp :: CorePrepEnv -> CoreExpr -> UniqSM (Floats, CpeRhs)

--- May return a CpeRhs because of saturating primops
+-- May return a CpeRhs (instead of CpeApp) because of saturating primops
 cpeApp top_env expr
   = do { let (terminal, args) = collect_args expr
       --  ; pprTraceM "cpeApp" $ (ppr expr)
@@ -1005,9 +1028,9 @@ cpeApp top_env expr
     collect_args e = go e []
       where
         go (App fun arg)      as
-            = go fun (CpeApp arg : as)
+            = go fun (AIApp arg : as)
         go (Cast fun co)      as
-            = go fun (CpeCast co : as)
+            = go fun (AICast co : as)
         go (Tick tickish fun) as
             -- Profiling ticks are slightly less strict so we expand their scope
             -- if they cover partial applications of things like primOps.
@@ -1020,7 +1043,7 @@ cpeApp top_env expr
             , etaExpansionTick head' tickish
             = (head,as')
             where
-              (head,as') = go fun (CpeTick tickish : as)
+              (head,as') = go fun (AITick tickish : as)
 
         -- Terminal could still be an app if it's wrapped by a tick.
         -- E.g. Tick<foo> (f x) can give us (f x) as terminal.
@@ -1030,7 +1053,7 @@ cpeApp top_env expr
             -> CoreExpr -- The thing we are calling
             -> [ArgInfo]
             -> UniqSM (Floats, CpeRhs)
-    cpe_app env (Var f) (CpeApp Type{} : CpeApp arg : args)
+    cpe_app env (Var f) (AIApp Type{} : AIApp arg : args)
         | f `hasKey` lazyIdKey          -- Replace (lazy a) with a, and
             -- See Note [lazyId magic] in GHC.Types.Id.Make
        || f `hasKey` noinlineIdKey || f `hasKey` noinlineConstraintIdKey
@@ -1056,24 +1079,36 @@ cpeApp top_env expr
           in cpe_app env terminal (args' ++ args)
 
     -- runRW# magic
-    cpe_app env (Var f) (CpeApp _runtimeRep at Type{} : CpeApp _type at Type{} : CpeApp arg : rest)
+    cpe_app env (Var f) (AIApp _runtimeRep at Type{} : AIApp _type at Type{} : AIApp arg : rest)
         | f `hasKey` runRWKey
         -- N.B. While it may appear that n == 1 in the case of runRW#
         -- applications, keep in mind that we may have applications that return
-        , has_value_arg (CpeApp arg : rest)
+        , has_value_arg (AIApp arg : rest)
         -- See Note [runRW magic]
         -- Replace (runRW# f) by (f realWorld#), beta reducing if possible (this
         -- is why we return a CorePrepEnv as well)
         = case arg of
             Lam s body -> cpe_app (extendCorePrepEnv env s realWorldPrimId) body rest
-            _          -> cpe_app env arg (CpeApp (Var realWorldPrimId) : rest)
+            _          -> cpe_app env arg (AIApp (Var realWorldPrimId) : rest)
              -- TODO: What about casts?
         where
           has_value_arg [] = False
-          has_value_arg (CpeApp arg:_rest)
+          has_value_arg (AIApp arg:_rest)
             | not (isTyCoArg arg) = True
           has_value_arg (_:rest) = has_value_arg rest
 
+    -- See Note [seq# magic]. This is step (1) for CorePrep
+    cpe_app env (Var f) [AIApp (Type ty), AIApp _st_ty at Type{}, AIApp thing, AIApp (Var token)]
+        | PrimOpId SeqOp _ <- idDetails f
+        -- seq# thing token   ==>   case thing of res { __DEFAULT -> (# token, res#) },
+        -- allocating a Float for (case thing of res { __DEFAULT -> _ })
+        = do { (floats, thing) <- cpeBody env thing
+             ; case_bndr <- newVar ty
+             ; let tup = mkCoreUnboxedTuple [lookupCorePrepEnv env token, Var case_bndr]
+             ; let is_unlifted = False -- otherwise seq# would not type-check
+             ; let float = mkCaseFloat is_unlifted case_bndr thing
+             ; return (floats `snocFloat` float, tup) }
+
     cpe_app env (Var v) args
       = do { v1 <- fiddleCCall v
            ; let e2 = lookupCorePrepEnv env v1
@@ -1120,13 +1155,13 @@ cpeApp top_env expr
         go [] !n = n
         go (info:infos) n =
           case info of
-            CpeCast {} -> go infos n
-            CpeTick tickish
+            AICast {} -> go infos n
+            AITick tickish
               | tickishFloatable tickish                 -> go infos n
               -- If we can't guarantee a tick will be floated out of the application
               -- we can't guarantee the value args following it will be applied.
               | otherwise                             -> n
-            CpeApp e                                  -> go infos n'
+            AIApp e                                  -> go infos n'
               where
                 !n'
                   | isTypeArg e = n
@@ -1182,13 +1217,13 @@ cpeApp top_env expr
             let tick_fun = foldr mkTick fun' rt_ticks
             in rebuild_app' env (a : as) tick_fun floats ss rt_ticks req_depth
 
-      CpeApp (Type arg_ty)
+      AIApp (Type arg_ty)
         -> rebuild_app' env as (App fun' (Type arg_ty)) floats ss rt_ticks req_depth
 
-      CpeApp (Coercion co)
+      AIApp (Coercion co)
         -> rebuild_app' env as (App fun' (Coercion co)) floats (drop 1 ss) rt_ticks req_depth
 
-      CpeApp arg -> do
+      AIApp arg -> do
         let (ss1, ss_rest)  -- See Note [lazyId magic] in GHC.Types.Id.Make
                = case (ss, isLazyExpr arg) of
                    (_   : ss_rest, True)  -> (topDmd, ss_rest)
@@ -1197,10 +1232,10 @@ cpeApp top_env expr
         (fs, arg') <- cpeArg top_env ss1 arg
         rebuild_app' env as (App fun' arg') (fs `zipFloats` floats) ss_rest rt_ticks (req_depth-1)
 
-      CpeCast co
+      AICast co
         -> rebuild_app' env as (Cast fun' co) floats ss rt_ticks req_depth
       -- See Note [Ticks and mandatory eta expansion]
-      CpeTick tickish
+      AITick tickish
         | tickishPlace tickish == PlaceRuntime
         , req_depth > 0
         -> assert (isProfTick tickish) $
@@ -1704,6 +1739,27 @@ Note [Pin demand info on floats]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 We pin demand info on floated lets, so that we can see the one-shot thunks.
 
+Note [Flatten case-binds]
+~~~~~~~~~~~~~~~~~~~~~~~~~
+Consider the following program involving seq#:
+
+  data T a = T !a
+  ... case seq# (case x of y { __DEFAULT -> T y }) s of (# s', x' #) -> rhs
+  ==> {ANFise, lowering seq# as in Note [seq# magic]}
+  ... case (case x of y { __DEFAULT -> T y }) of sat { __DEFAULT -> rhs[s/s',sat/x'] }
+
+(Why didn't the Simplifier float out `case x of y`? Because `seq#` is lazy;
+see Note [seq# magic].)
+Note the case-of-case. This is not bad per sé, but we can easily flatten
+this situation by calling `mkNonRecFloat` to create strict binding `y=x`:
+
+  ... case x of y { __DEFAULT -> let sat = T y in rhs[s/s',sat/x'] } ...
+
+where `T y` is simply let-bound, thus far less likely to confuse passes
+downstream. We simply achieve this by calling `mkNonRecFloat` in the `Case`
+equation of `cpeRhsE` to create a strict float (`evalDmd`). This mirrors what we
+do for let-bindings, when we create a LetBound float: see `cpeBind`.
+
 Note [Speculative evaluation]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Since call-by-value is much cheaper than call-by-need, we case-bind arguments
@@ -1982,12 +2038,27 @@ zipFloats = appFloats
 zipManyFloats :: [Floats] -> Floats
 zipManyFloats = foldr zipFloats emptyFloats
 
+mkCaseFloat :: Bool -> Id -> CpeRhs -> FloatingBind
+mkCaseFloat is_unlifted bndr scrut = Float (NonRec bndr scrut) bound info
+  where
+    (bound, info)
+      -- See the comments in mkNonRecFloat for the classification
+      | is_lifted, is_hnf        = (LetBound,  TopLvlFloatable)
+      | is_data_con bndr         = (LetBound,  TopLvlFloatable)
+      | exprIsTickedString scrut = (CaseBound, TopLvlFloatable)
+      | otherwise                = (CaseBound, StrictContextFloatable)
+         -- For a Case, we never want to drop the eval; hence no need to test
+         -- for ok-for-spec-eval
+    is_lifted   = not is_unlifted
+    is_hnf      = exprIsHNF scrut
+    is_data_con = isJust . isDataConId_maybe
+
 mkNonRecFloat :: CorePrepEnv -> Demand -> Bool -> Id -> CpeRhs -> FloatingBind
 mkNonRecFloat env dmd is_unlifted bndr rhs = -- pprTraceWith "mkNonRecFloat" ppr $
   Float (NonRec bndr' rhs) bound info
   where
     bndr' = setIdDemandInfo bndr dmd -- See Note [Pin demand info on floats]
-    (bound,info)
+    (bound, info)
       | is_lifted, is_hnf        = (LetBound, TopLvlFloatable)
           -- is_lifted: We currently don't allow unlifted values at the
           --            top-level or inside letrecs
@@ -2012,7 +2083,7 @@ mkNonRecFloat env dmd is_unlifted bndr rhs = -- pprTraceWith "mkNonRecFloat" ppr
 
     is_lifted   = not is_unlifted
     is_hnf      = exprIsHNF rhs
-    is_strict   = isStrUsedDmd dmd
+    is_strict   = isStrUsedDmd dmd || isEvaldUnfolding (idUnfolding bndr)
     ok_for_spec = exprOkForSpecEval (not . is_rec_call) rhs
     is_rec_call = (`elemUnVarSet` cpe_rec_ids env)
     is_data_con = isJust . isDataConId_maybe


=====================================
compiler/GHC/Stg/InferTags.hs
=====================================
@@ -19,7 +19,6 @@ import GHC.Types.Basic ( CbvMark (..) )
 import GHC.Types.Unique.Supply (mkSplitUniqSupply)
 import GHC.Types.RepType (dataConRuntimeRepStrictness)
 import GHC.Core (AltCon(..))
-import GHC.Builtin.PrimOps ( PrimOp(..) )
 import Data.List (mapAccumL)
 import GHC.Utils.Outputable
 import GHC.Utils.Misc( zipWithEqual, zipEqual, notNull )
@@ -333,21 +332,7 @@ inferTagExpr env (StgTick tick body)
     (info, body') = inferTagExpr env body
 
 inferTagExpr _ (StgOpApp op args ty)
-  | StgPrimOp SeqOp <- op
-  -- Recall seq# :: a -> State# s -> (# State# s, a #)
-  -- However the output State# token has been unarised away,
-  -- so we now effectively have
-  --    seq# :: a -> State# s -> (# a #)
-  -- The key point is the result of `seq#` is guaranteed evaluated and properly
-  -- tagged (because that result comes directly from evaluating the arg),
-  -- and we want tag inference to reflect that knowledge (#15226).
-  -- Hence `TagTuple [TagProper]`.
-  -- See Note [seq# magic] in GHC.Core.Opt.ConstantFold
-  = (TagTuple [TagProper], StgOpApp op args ty)
-  -- Do any other primops guarantee to return a properly tagged value?
-  -- Probably not, and that is the conservative assumption anyway.
-  -- (And foreign calls definitely need not make promises.)
-  | otherwise = (TagDunno, StgOpApp op args ty)
+  = (TagDunno, StgOpApp op args ty)
 
 inferTagExpr env (StgLet ext bind body)
   = (info, StgLet ext bind' body')


=====================================
compiler/GHC/StgToCmm/Expr.hs
=====================================
@@ -68,11 +68,6 @@ cgExpr  :: CgStgExpr -> FCode ReturnKind
 
 cgExpr (StgApp fun args)     = cgIdApp fun args
 
--- seq# a s ==> a
--- See Note [seq# magic] in GHC.Core.Opt.ConstantFold
-cgExpr (StgOpApp (StgPrimOp SeqOp) [StgVarArg a, _] _res_ty) =
-  cgIdApp a []
-
 -- dataToTagLarge# :: a_levpoly -> Int#
 -- See Note [DataToTag overview] in GHC.Tc.Instance.Class
 -- TODO: There are some more optimization ideas for this code path
@@ -553,27 +548,6 @@ cgCase scrut@(StgApp v []) _ (PrimAlt _) _
        ; return AssignedDirectly
        }
 
-{- Note [Handle seq#]
-~~~~~~~~~~~~~~~~~~~~~
-See Note [seq# magic] in GHC.Core.Opt.ConstantFold.
-The special case for seq# in cgCase does this:
-
-  case seq# a s of v
-    (# s', a' #) -> e
-==>
-  case a of v
-    (# s', a' #) -> e
-
-(taking advantage of the fact that the return convention for (# State#, a #)
-is the same as the return convention for just 'a')
--}
-
-cgCase (StgOpApp (StgPrimOp SeqOp) [StgVarArg a, _] _) bndr alt_type alts
-  = -- Note [Handle seq#]
-    -- And see Note [seq# magic] in GHC.Core.Opt.ConstantFold
-    -- Use the same return convention as vanilla 'a'.
-    cgCase (StgApp a []) bndr alt_type alts
-
 cgCase scrut bndr alt_type alts
   = -- the general case
     do { platform <- getPlatform


=====================================
testsuite/tests/simplStg/should_compile/T15226b.stderr
=====================================
@@ -19,16 +19,16 @@ T15226b.testFun1
      -> (# GHC.Prim.State# GHC.Prim.RealWorld, T15226b.StrictPair a b #)
 [GblId, Arity=3, Str=<L><ML><L>, Unf=OtherCon []] =
     {} \r [x y void]
-        case seq# [x GHC.Prim.void#] of ds1 {
-        Solo# ipv1 [Occ=Once1] ->
+        case x of sat {
+        __DEFAULT ->
+        case y of conrep {
+        __DEFAULT ->
         let {
           sat [Occ=Once1] :: T15226b.StrictPair a b
           [LclId] =
-              {ipv1, y} \u []
-                  case y of conrep {
-                  __DEFAULT -> T15226b.MkStrictPair [ipv1 conrep];
-                  };
-        } in  seq# [sat GHC.Prim.void#];
+              T15226b.MkStrictPair! [sat conrep];
+        } in  Solo# [sat];
+        };
         };
 
 T15226b.testFun



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/041da341e39d9ae762c80895f5a9728e1f5688b8

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/041da341e39d9ae762c80895f5a9728e1f5688b8
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20231210/8b33a3d8/attachment-0001.html>