[Git][ghc/ghc][wip/T20749] Make DataCon workers strict in strict fields (#20749)

Sebastian Graf (@sgraf812) gitlab at gitlab.haskell.org
Tue Jan 31 17:35:26 UTC 2023



Sebastian Graf pushed to branch wip/T20749 at Glasgow Haskell Compiler / GHC


Commits:
07e6142f by Sebastian Graf at 2023-01-31T18:34:46+01:00
Make DataCon workers strict in strict fields (#20749)

This patch tweaks `exprIsConApp_maybe`, `exprIsHNF` and frieds, and Demand
Analysis so that they exploit and maintain strictness of DataCon workers. See
`Note [Strict fields in Core]` for details.

Very little needed to change, and it puts field seq insertion done by Tag
Inference into a new perspective: That of *implementing* strict field semantics.
Before Tag Inference, DataCon workers are strict. Afterwards they are
effectively lazy and field seqs happen around use sites. History has shown
that there is no other way to guarantee taggedness and thus the STG Strict Field
Invariant.

Unfortunately, T21392 does no longer reproduce after this patch, so I marked it
as "not broken" in order to track whether we regress again in the future.

Fixes #20749, the satisfying conclusion of an annoying saga (cf. the ideas
in #21497 and #22475).

- - - - -


14 changed files:

- compiler/GHC/Core.hs
- compiler/GHC/Core/DataCon.hs
- compiler/GHC/Core/Opt/DmdAnal.hs
- compiler/GHC/Core/Opt/Simplify/Env.hs
- compiler/GHC/Core/Opt/Simplify/Iteration.hs
- compiler/GHC/Core/SimpleOpt.hs
- compiler/GHC/Core/Utils.hs
- compiler/GHC/Stg/InferTags.hs
- compiler/GHC/Stg/InferTags/Rewrite.hs
- compiler/GHC/Types/Demand.hs
- compiler/GHC/Types/Id/Info.hs
- compiler/GHC/Types/Id/Make.hs
- testsuite/tests/simplCore/should_compile/T18013.stderr
- testsuite/tests/simplCore/should_compile/all.T


Changes:

=====================================
compiler/GHC/Core.hs
=====================================
@@ -954,6 +954,58 @@ tail position: A cast changes the type, but the type must be the same. But
 operationally, casts are vacuous, so this is a bit unfortunate! See #14610 for
 ideas how to fix this.
 
+Note [Strict fields in Core]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Evaluating a data constructor worker evaluates its strict fields.
+
+In other words, if `MkT` is strict in its first field and `xs` reduces to
+`error "boom"`, then `MkT xs b` will throw that error.
+Conversely, it is sound to seq the field before the call to the constructor,
+e.g., with `case xs of xs' { __DEFAULT -> MkT xs' b }`.
+Let's call this transformation "field seq insertion".
+
+Note in particular that the data constructor application `MkT xs b` above is
+*not* a value, unless `xs` is!
+
+This has pervasive effect on the Core pipeline:
+
+  * `exprIsHNF`/`exprIsConLike`/`exprOkForSpeculation` need to assert that the
+    strict arguments of a DataCon worker are values/ok-for-spec themselves.
+
+  * `exprIsConApp_maybe` inserts field seqs in the `FloatBind`s it returns, so
+    that the Simplifier, Constant-folding, the pattern-match checker, etc.
+    all see the insert field seqs when they match on strict workers. Often this
+    is just to emphasise strict semantics, but for case-of-known constructor
+    and case-to-let field insertion is *vital*, otherwise these transformations
+    would lose field seqs.
+
+  * The demand signature of a data constructor is strict in strict field
+    position, whereas is it's normally lazy. Likewise the demand *transformer*
+    of a DataCon worker can add stricten up demands on strict field args.
+    See Note [Demand transformer for data constructors].
+
+  * In the absence of `-fpedantic-bottoms`, it is still possible that some seqs
+    are ultimately dropped or delayed due to eta-expansion.
+    See Note [Dealing with Bottom].
+
+Strict field semantics is exploited in STG by Note [Tag inference]:
+It performs field seq insertion to statically guarantee *taggedness* of strict
+fields, establishing the Note [STG Strict Field Invariant].
+(Happily, most of those seqs are immediately detected as redundant by tag
+inference and are omitted.)
+
+Historical Note:
+The delightfully simple description of strict field semantics is the result of
+a long saga (#20749, the bits about strict data constructors in #21497, #22475),
+where we tried a more lenient (but actually not) semantics first that would
+allow both strict and lazy implementations of DataCon workers. This was favoured
+because the "pervasive effect" throughout the compiler was deemed too large
+(when it really turned out to be very modest).
+Alas, this semantics would require us to implement `exprIsHNF` in *exactly* the
+same way as above, otherwise the analysis would not be conservative wrt. the
+lenient semantics (which includes the strict one). It is also much harder to
+explain and maintain, as it turned out.
+
 ************************************************************************
 *                                                                      *
             In/Out type synonyms


=====================================
compiler/GHC/Core/DataCon.hs
=====================================
@@ -48,7 +48,8 @@ module GHC.Core.DataCon (
         dataConIsInfix,
         dataConWorkId, dataConWrapId, dataConWrapId_maybe,
         dataConImplicitTyThings,
-        dataConRepStrictness, dataConImplBangs, dataConBoxer,
+        dataConRepStrictness, dataConRepStrictnessWithTyArgs,
+        dataConImplBangs, dataConBoxer,
 
         splitDataProductType_maybe,
 
@@ -777,7 +778,8 @@ data DataConRep
                                           -- and *including* all evidence args
 
         , dcr_stricts :: [StrictnessMark]  -- 1-1 with dcr_arg_tys
-                -- See also Note [Data-con worker strictness]
+                -- See also Note [Strict fields in Core] in GHC.Core
+                -- for the effect on the strictness signature
 
         , dcr_bangs :: [HsImplBang]  -- The actual decisions made (including failures)
                                      -- about the original arguments; 1-1 with orig_arg_tys
@@ -848,43 +850,8 @@ eqSpecPreds spec = [ mkPrimEqPred (mkTyVarTy tv) ty
 instance Outputable EqSpec where
   ppr (EqSpec tv ty) = ppr (tv, ty)
 
-{- Note [Data-con worker strictness]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Notice that we do *not* say the worker Id is strict even if the data
-constructor is declared strict
-     e.g.    data T = MkT ![Int] Bool
-Even though most often the evals are done by the *wrapper* $WMkT, there are
-situations in which tag inference will re-insert evals around the worker.
-So for all intents and purposes the *worker* MkT is strict, too!
-
-Unfortunately, if we exposed accurate strictness of DataCon workers, we'd
-see the following transformation:
-
-  f xs = case xs of xs' { __DEFAULT -> ... case MkT xs b of x { __DEFAULT -> [x] } } -- DmdAnal: Strict in xs
-  ==> { drop-seq, binder swap on xs' }
-  f xs = case MkT xs b of x { __DEFAULT -> [x] } -- DmdAnal: Still strict in xs
-  ==> { case-to-let }
-  f xs = let x = MkT xs' b in [x] -- DmdAnal: No longer strict in xs!
-
-I.e., we are ironically losing strictness in `xs` by dropping the eval on `xs`
-and then doing case-to-let. The issue is that `exprIsHNF` currently says that
-every DataCon worker app is a value. The implicit assumption is that surrounding
-evals will have evaluated strict fields like `xs` before! But now that we had
-just dropped the eval on `xs`, that assumption is no longer valid.
-
-Long story short: By keeping the demand signature lazy, the Simplifier will not
-drop the eval on `xs` and using `exprIsHNF` to decide case-to-let and others
-remains sound.
-
-Similarly, during demand analysis in dmdTransformDataConSig, we bump up the
-field demand with `C_01`, *not* `C_11`, because the latter exposes too much
-strictness that will drop the eval on `xs` above.
-
-This issue is discussed at length in
-"Failed idea: no wrappers for strict data constructors" in #21497 and #22475.
-
-Note [Bangs on data constructor arguments]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+{- Note [Bangs on data constructor arguments]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Consider
   data T = MkT !Int {-# UNPACK #-} !Int Bool
 
@@ -1373,12 +1340,24 @@ isNullaryRepDataCon :: DataCon -> Bool
 isNullaryRepDataCon dc = dataConRepArity dc == 0
 
 dataConRepStrictness :: DataCon -> [StrictnessMark]
--- ^ Give the demands on the arguments of a
--- Core constructor application (Con dc args)
+-- ^ Give the demands on the runtime arguments of a Core constructor
+-- application.
+-- NB: In a DataCon app `C @ty_args other_args` you have to prepend the
+-- 'NotMarkedStrict' marks for `ty_args` yourself!
+-- Use 'dataConRepStrictnessWithTyArgs' if you want to have a prefix matching
+-- `ty_args`.
 dataConRepStrictness dc = case dcRep dc of
                             NoDataConRep -> [NotMarkedStrict | _ <- dataConRepArgTys dc]
                             DCR { dcr_stricts = strs } -> strs
 
+dataConRepStrictnessWithTyArgs :: DataCon -> [StrictnessMark]
+-- ^ Give the demands on the runtime arguments of a Core constructor
+-- application, including a prefix of 'NotMarkedStrict's corresponding
+-- to universal and existential type arguments.
+dataConRepStrictnessWithTyArgs dc =
+  map (const NotMarkedStrict) (dataConUnivAndExTyCoVars dc)
+  ++ dataConRepStrictness dc
+
 dataConImplBangs :: DataCon -> [HsImplBang]
 -- The implementation decisions about the strictness/unpack of each
 -- source program argument to the data constructor


=====================================
compiler/GHC/Core/Opt/DmdAnal.hs
=====================================
@@ -814,6 +814,10 @@ to the Divergence lattice, but in practice it turned out to be hard to untaint
 from 'topDiv' to 'conDiv', leading to bugs, performance regressions and
 complexity that didn't justify the single fixed testcase T13380c.
 
+You might think that we should check for side-effects rather than just for
+precise exceptions. Right you are! See Note [Side-effects and strictness]
+for why we unfortunately do not.
+
 Note [Demand analysis for recursive data constructors]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 T11545 features a single-product, recursive data type


=====================================
compiler/GHC/Core/Opt/Simplify/Env.hs
=====================================
@@ -8,14 +8,13 @@
 
 module GHC.Core.Opt.Simplify.Env (
         -- * The simplifier mode
-        SimplMode(..), updMode,
-        smPedanticBottoms, smPlatform,
+        SimplMode(..), updMode, smPlatform,
 
         -- * Environments
         SimplEnv(..), pprSimplEnv,   -- Temp not abstract
         seArityOpts, seCaseCase, seCaseFolding, seCaseMerge, seCastSwizzle,
         seDoEtaReduction, seEtaExpand, seFloatEnable, seInline, seNames,
-        seOptCoercionOpts, sePedanticBottoms, sePhase, sePlatform, sePreInline,
+        seOptCoercionOpts, sePhase, sePlatform, sePreInline,
         seRuleOpts, seRules, seUnfoldingOpts,
         mkSimplEnv, extendIdSubst,
         extendTvSubst, extendCvSubst,
@@ -216,9 +215,6 @@ seNames env = sm_names (seMode env)
 seOptCoercionOpts :: SimplEnv -> OptCoercionOpts
 seOptCoercionOpts env = sm_co_opt_opts (seMode env)
 
-sePedanticBottoms :: SimplEnv -> Bool
-sePedanticBottoms env = smPedanticBottoms (seMode env)
-
 sePhase :: SimplEnv -> CompilerPhase
 sePhase env = sm_phase (seMode env)
 
@@ -276,9 +272,6 @@ instance Outputable SimplMode where
          where
            pp_flag f s = ppUnless f (text "no") <+> s
 
-smPedanticBottoms :: SimplMode -> Bool
-smPedanticBottoms opts = ao_ped_bot (sm_arity_opts opts)
-
 smPlatform :: SimplMode -> Platform
 smPlatform opts = roPlatform (sm_rule_opts opts)
 


=====================================
compiler/GHC/Core/Opt/Simplify/Iteration.hs
=====================================
@@ -2063,8 +2063,8 @@ For applications of a data constructor worker, the full glory of
 rebuildCall is a waste of effort;
 * They never inline, obviously
 * They have no rewrite rules
-* They are not strict (see Note [Data-con worker strictness]
-  in GHC.Core.DataCon)
+* Though they might be strict (see Note [Strict fields in Core] in GHC.Core),
+  we will exploit that strictness through their demand signature
 So it's fine to zoom straight to `rebuild` which just rebuilds the
 call in a very straightforward way.
 
@@ -3273,7 +3273,7 @@ a case pattern.  This is *important*.  Consider
 
 We really must record that b is already evaluated so that we don't
 go and re-evaluate it when constructing the result.
-See Note [Data-con worker strictness] in GHC.Core.DataCon
+See Note [Strict fields in Core] in GHC.Core.
 
 NB: simplLamBndrs preserves this eval info
 


=====================================
compiler/GHC/Core/SimpleOpt.hs
=====================================
@@ -1224,11 +1224,8 @@ exprIsConApp_maybe (in_scope, id_unf) expr
        --       simplifier produces rhs[exp/a], changing semantics if exp is not ok-for-spec
        -- Good: returning (Mk#, [x]) with a float of  case exp of x { DEFAULT -> [] }
        --       simplifier produces case exp of a { DEFAULT -> exp[x/a] }
-       = let arg' = subst_expr subst arg
-             bndr = uniqAway (subst_in_scope subst) (mkWildValBinder ManyTy arg_type)
-             float = FloatCase arg' bndr DEFAULT []
-             subst' = subst_extend_in_scope subst bndr
-         in go subst' (float:floats) fun (CC (Var bndr : args) co)
+       , (subst', float, bndr) <- case_bind subst arg arg_type
+       = go subst' (float:floats) fun (CC (Var bndr : args) co)
        | otherwise
        = go subst floats fun (CC (subst_expr subst arg : args) co)
 
@@ -1267,8 +1264,9 @@ exprIsConApp_maybe (in_scope, id_unf) expr
 
         | Just con <- isDataConWorkId_maybe fun
         , count isValArg args == idArity fun
-        = succeedWith in_scope floats $
-          pushCoDataCon con args co
+        , (Left in_scope', seq_floats, args') <- mkFieldSeqFloats (Left in_scope) con args
+        = succeedWith in_scope' (seq_floats ++ floats) $
+          pushCoDataCon con args' co
 
         -- Look through data constructor wrappers: they inline late (See Note
         -- [Activation for data constructor wrappers]) but we want to do
@@ -1354,6 +1352,24 @@ exprIsConApp_maybe (in_scope, id_unf) expr
     extend (Left in_scope) v e = Right (extendSubst (mkEmptySubst in_scope) v e)
     extend (Right s)       v e = Right (extendSubst s v e)
 
+    case_bind :: Either InScopeSet Subst -> CoreExpr -> Type -> (Either InScopeSet Subst, FloatBind, Id)
+    case_bind subst expr expr_ty = (subst', float, bndr)
+      where
+        bndr   = uniqAway (subst_in_scope subst) (mkWildValBinder ManyTy expr_ty)
+        subst' = subst_extend_in_scope subst bndr
+        expr'  = subst_expr subst expr
+        float  = FloatCase expr' bndr DEFAULT []
+
+    mkFieldSeqFloats :: Either InScopeSet Subst -> DataCon -> [CoreExpr] -> (Either InScopeSet Subst, [FloatBind], [CoreExpr])
+    mkFieldSeqFloats subst dc args
+      = foldr do_one (subst, [], []) $ zipEqual "mkFieldSeqFloats" str_marks args
+      where
+        str_marks = dataConRepStrictnessWithTyArgs dc
+        do_one (str, arg) (subst,floats,args) = case str of
+          NotMarkedStrict -> (subst, floats, arg:args)
+          MarkedStrict    -> (subst', float:floats, Var bndr:args)
+            where
+              (subst', float, bndr) = case_bind subst arg (exprType arg)
 
 -- See Note [exprIsConApp_maybe on literal strings]
 dealWithStringLiteral :: Var -> BS.ByteString -> Coercion


=====================================
compiler/GHC/Core/Utils.hs
=====================================
@@ -1569,10 +1569,8 @@ app_ok fun_ok primop_ok fun args
          -- DFuns terminate, unless the dict is implemented
          -- with a newtype in which case they may not
 
-      DataConWorkId {} -> True
-                -- The strictness of the constructor has already
-                -- been expressed by its "wrapper", so we don't need
-                -- to take the arguments into account
+      DataConWorkId dc -> and (zipWith3 field_ok str_marks arg_tys args)
+        where str_marks = dataConRepStrictnessWithTyArgs dc
 
       PrimOpId op _
         | primOpIsDiv op
@@ -1630,6 +1628,16 @@ app_ok fun_ok primop_ok fun args
        | otherwise
        = expr_ok fun_ok primop_ok arg
 
+    -- Used for DataCon worker arguments
+    field_ok :: StrictnessMark -> PiTyVarBinder -> CoreExpr -> Bool
+    field_ok _   (Named _) _ = True   -- A type argument
+    field_ok str (Anon ty _) arg      -- A term argument
+       | NotMarkedStrict <- str                           -- iff it's a lazy field
+       , Just Lifted <- typeLevity_maybe (scaledThing ty) -- and its type is lifted
+       = True                                             -- then the worker app does not eval
+       | otherwise
+       = expr_ok fun_ok primop_ok arg
+
 -----------------------------
 altsAreExhaustive :: [Alt b] -> Bool
 -- True  <=> the case alternatives are definitely exhaustive
@@ -1871,9 +1879,9 @@ exprIsHNFlike :: HasDebugCallStack => (Var -> Bool) -> (Unfolding -> Bool) -> Co
 exprIsHNFlike is_con is_con_unf = is_hnf_like
   where
     is_hnf_like (Var v) -- NB: There are no value args at this point
-      =  id_app_is_value v 0 -- Catches nullary constructors,
-                             --      so that [] and () are values, for example
-                             -- and (e.g.) primops that don't have unfoldings
+      =  id_app_is_value v [] -- Catches nullary constructors,
+                              --      so that [] and () are values, for example
+                              -- and (e.g.) primops that don't have unfoldings
       || is_con_unf (idUnfolding v)
         -- Check the thing's unfolding; it might be bound to a value
         --   or to a guaranteed-evaluated variable (isEvaldUnfolding)
@@ -1896,31 +1904,49 @@ exprIsHNFlike is_con is_con_unf = is_hnf_like
                                       -- See Note [exprIsHNF Tick]
     is_hnf_like (Cast e _)       = is_hnf_like e
     is_hnf_like (App e a)
-      | isValArg a               = app_is_value e 1
+      | isValArg a               = app_is_value e [a]
       | otherwise                = is_hnf_like e
     is_hnf_like (Let _ e)        = is_hnf_like e  -- Lazy let(rec)s don't affect us
     is_hnf_like _                = False
 
-    -- 'n' is the number of value args to which the expression is applied
-    -- And n>0: there is at least one value argument
-    app_is_value :: CoreExpr -> Int -> Bool
-    app_is_value (Var f)    nva = id_app_is_value f nva
-    app_is_value (Tick _ f) nva = app_is_value f nva
-    app_is_value (Cast f _) nva = app_is_value f nva
-    app_is_value (App f a)  nva
-      | isValArg a              =
-        app_is_value f (nva + 1) &&
-        not (needsCaseBinding (exprType a) a)
-          -- For example  f (x /# y)  where f has arity two, and the first
-          -- argument is unboxed. This is not a value!
-          -- But  f 34#  is a value.
-          -- NB: Check app_is_value first, the arity check is cheaper
-      | otherwise               = app_is_value f nva
-    app_is_value _          _   = False
-
-    id_app_is_value id n_val_args
-       = is_con id
-       || idArity id > n_val_args
+    -- Collect arguments through Casts and Ticks and call id_app_is_value
+    app_is_value :: CoreExpr -> [CoreArg] -> Bool
+    app_is_value (Var f)    as = id_app_is_value f as
+    app_is_value (Tick _ f) as = app_is_value f as
+    app_is_value (Cast f _) as = app_is_value f as
+    app_is_value (App f a)  as = app_is_value f (a:as)
+    app_is_value _          _  = False
+
+    id_app_is_value id args
+       | Just dc <- isDataConWorkId_maybe id
+       = and $ zipWithEqual "id_app_is_value" check_field
+                 (dataConRepStrictnessWithTyArgs dc) args
+
+       |  idArity id > count isValArg args
+            -- PAP: Check unlifted args
+       || is_con id && not (isDataConWorkId id)
+            -- The CONLIKE case: Assume the CONLIKE is lazy in its args,
+            -- hence only need to check unlifted args
+       = all check_arg args
+
+       | otherwise
+       = False
+       where
+         check_arg a = unlifted a ==> is_hnf_like a
+         check_field str a = str == MarkedStrict || unlifted a ==> is_hnf_like a
+         unlifted a = isValArg a && mightBeUnliftedType (exprType a)
+          -- Check unliftedness; for example f (x /# y) where f has arity two,
+          -- and the first argument is unboxed. This is not a value!
+          -- But  f 34#  is a value, so check args for HNFs.
+          -- NB: We check arity (and CONLIKEness) first because it's cheaper
+          --     and we reject quickly on saturated apps.
+          --
+          -- TODO: we used to check ok-for-spec through `needsCaseBinding`, not
+          -- HNF-ness, so we'd consider f (12# +# n#) a value (I think) and
+          -- duplciate freely. That seems a bit bogus?? Hence the is_hnf_like
+          -- check.
+         a ==> b = not a || b
+         infixr 1 ==>
 
 {-
 Note [exprIsHNF Tick]
@@ -2482,7 +2508,7 @@ This means the seqs on x and y both become no-ops and compared to the first vers
 
 The downside is that the caller of $wfoo potentially has to evaluate `y` once if we can't prove it isn't already evaluated.
 But y coming out of a strict field is in WHNF so safe to evaluated. And most of the time it will be properly tagged+evaluated
-already at the call site because of the Strict Field Invariant! See Note [Strict Field Invariant] for more in this.
+already at the call site because of the Strict Field Invariant! See Note [STG Strict Field Invariant] for more in this.
 This makes GHC itself around 1% faster despite doing slightly more work! So this is generally quite good.
 
 We only apply this when we think there is a benefit in doing so however. There are a number of cases in which


=====================================
compiler/GHC/Stg/InferTags.hs
=====================================
@@ -64,8 +64,8 @@ With nofib being ~0.3% faster as well.
 
 See Note [Tag inference passes] for how we proceed to generate and use this information.
 
-Note [Strict Field Invariant]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Note [STG Strict Field Invariant]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 As part of tag inference we introduce the Strict Field Invariant.
 Which consists of us saying that:
 
@@ -124,15 +124,33 @@ Note that there are similar constraints around Note [CBV Function Ids].
 
 Note [How untagged pointers can end up in strict fields]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Since the resolution of #20749 where Core passes assume that DataCon workers
+evaluate their strict fields, it is pretty simple to see how the Simplifier
+might exploit that knowledge to drop evals. Example:
+
+  data MkT a = MkT !a
+  f :: [Int] -> T [Int]
+  f xs = xs `seq` MkT xs
+
+in Core we will have
+
+  f = \xs -> MkT @[Int] xs
+
+No eval left there.
+
 Consider
   data Set a = Tip | Bin !a (Set a) (Set a)
 
 We make a wrapper for Bin that evaluates its arguments
   $WBin x a b = case x of xv -> Bin xv a b
 Here `xv` will always be evaluated and properly tagged, just as the
-Strict Field Invariant requires.
+Note [STG Strict Field Invariant] requires.
+
+But alas, the Simplifier can destroy the invariant: see #15696.
+Indeed, as Note [Strict fields in Core] explains, Core passes
+assume that Data constructor workers evaluate their strict fields,
+so the Simplifier will drop seqs freely.
 
-But alas the Simplifier can destroy the invariant: see #15696.
 We start with
   thk = f ()
   g x = ...(case thk of xv -> Bin xv Tip Tip)...
@@ -153,7 +171,7 @@ Now you can see that the argument of Bin, namely thk, points to the
 thunk, not to the value as it did before.
 
 In short, although it may be rare, the output of optimisation passes
-cannot guarantee to obey the Strict Field Invariant. For this reason
+cannot guarantee to obey the Note [STG Strict Field Invariant]. For this reason
 we run tag inference. See Note [Tag inference passes].
 
 Note [Tag inference passes]
@@ -163,7 +181,7 @@ Tag inference proceeds in two passes:
   The result is then attached to /binders/.
   This is implemented by `inferTagsAnal` in GHC.Stg.InferTags
 * The second pass walks over the AST checking if the Strict Field Invariant is upheld.
-  See Note [Strict Field Invariant].
+  See Note [STG Strict Field Invariant].
   If required this pass modifies the program to uphold this invariant.
   Tag information is also moved from /binders/ to /occurrences/ during this pass.
   This is done by `GHC.Stg.InferTags.Rewrite (rewriteTopBinds)`.


=====================================
compiler/GHC/Stg/InferTags/Rewrite.hs
=====================================
@@ -65,7 +65,7 @@ The work of this pass is simple:
 * For any strict field we check if the argument is known to be properly tagged.
 * If it's not known to be properly tagged, we wrap the whole thing in a case,
   which will force the argument before allocation.
-This is described in detail in Note [Strict Field Invariant].
+This is described in detail in Note [STG Strict Field Invariant].
 
 The only slight complication is that we have to make sure not to invalidate free
 variable analysis in the process.
@@ -218,7 +218,7 @@ When compiling bytecode we call myCoreToStg to get STG code first.
 myCoreToStg in turn calls out to stg2stg which runs the STG to STG
 passes followed by free variables analysis and the tag inference pass including
 it's rewriting phase at the end.
-Running tag inference is important as it upholds Note [Strict Field Invariant].
+Running tag inference is important as it upholds Note [STG Strict Field Invariant].
 While code executed by GHCi doesn't take advantage of the SFI it can call into
 compiled code which does. So it must still make sure that the SFI is upheld.
 See also #21083 and #22042.


=====================================
compiler/GHC/Types/Demand.hs
=====================================
@@ -1390,33 +1390,8 @@ arguments.  That is the job of dmdTransformDataConSig.  More precisely,
  * it returns the demands on the arguments;
    in the above example that is [SL, A]
 
-Nasty wrinkle. Consider this code (#22475 has more realistic examples but
-assume this is what the demand analyser sees)
-
-   data T = MkT !Int Bool
-   get :: T -> Bool
-   get (MkT _ b) = b
-
-   foo = let v::Int = I# 7
-             t::T   = MkT v True
-         in get t
-
-Now `v` is unused by `get`, /but/ we can't give `v` an Absent demand,
-else we'll drop the binding and replace it with an error thunk.
-Then the code generator (more specifically GHC.Stg.InferTags.Rewrite)
-will add an extra eval of MkT's argument to give
-   foo = let v::Int = error "absent"
-             t::T   = case v of v' -> MkT v' True
-         in get t
-
-Boo!  Because of this extra eval (added in STG-land), the truth is that `MkT`
-may (or may not) evaluate its arguments (as established in #21497). Hence the
-use of `bump` in dmdTransformDataConSig, which adds in a `C_01` eval. The
-`C_01` says "may or may not evaluate" which is absolutely faithful to what
-InferTags.Rewrite does.
-
-In particular it is very important /not/ to make that a `C_11` eval,
-see Note [Data-con worker strictness].
+When the data constructor worker has strict fields, they act as additional
+seqs; hence we add an additional `C_11` eval.
 -}
 
 {- *********************************************************************
@@ -1616,6 +1591,29 @@ a bad fit because
    expression may not throw a precise exception (increasing precision of the
    analysis), but that's just a favourable guess.
 
+Note [Side-effects and strictness]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Due to historic reasons and the continued effort not to cause performance
+regressions downstream, Strictness Analysis is currently prone to discarding
+observable side-effects (other than precise exceptions, see
+Note [Precise exceptions and strictness]) in some cases. For example,
+  f :: MVar () -> Int -> IO Int
+  f mv x = putMVar mv () >> (x `seq` return x)
+The call to `putMVar` is an observable side-effect. Yet, Strictness Analysis
+currently concludes that `f` is strict in `x` and uses call-by-value.
+That means `f mv (error "boom")` will error out with the imprecise exception
+rather performing the side-effect.
+
+This is a conscious violation of the semantics described in the paper
+"a semantics for imprecise exceptions"; so it would be great if we could
+identify the offending primops and extend the idea in
+Note [Which scrutinees may throw precise exceptions] to general side-effects.
+
+Unfortunately, the existing has-side-effects classification for primops is
+too conservative, listing `writeMutVar#` and even `readMutVar#` as
+side-effecting. That is due to #3207. A possible way forward is described in
+#17900, but no effort has been so far towards a resolution.
+
 Note [Exceptions and strictness]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 We used to smart about catching exceptions, but we aren't anymore.
@@ -2326,7 +2324,7 @@ dmdTransformDataConSig str_marks sd = case viewProd arity body_sd of
     mk_body_ty n dmds = DmdType emptyDmdEnv (zipWith (bump n) str_marks dmds) topDiv
     bump n str dmd | isMarkedStrict str = multDmd n (plusDmd str_field_dmd dmd)
                    | otherwise          = multDmd n dmd
-    str_field_dmd = C_01 :* seqSubDmd -- Why not C_11? See Note [Data-con worker strictness]
+    str_field_dmd = C_11 :* seqSubDmd -- See Note [Strict fields in Core]
 
 -- | A special 'DmdTransformer' for dictionary selectors that feeds the demand
 -- on the result into the indicated dictionary component (if saturated).


=====================================
compiler/GHC/Types/Id/Info.hs
=====================================
@@ -215,7 +215,7 @@ The invariants around the arguments of call by value function like Ids are then:
   * Any `WorkerLikeId`
   * Some `JoinId` bindings.
 
-This works analogous to the Strict Field Invariant. See also Note [Strict Field Invariant].
+This works analogous to the Strict Field Invariant. See also Note [STG Strict Field Invariant].
 
 To make this work what we do is:
 * During W/W and SpecConstr any worker/specialized binding we introduce


=====================================
compiler/GHC/Types/Id/Make.hs
=====================================
@@ -581,8 +581,12 @@ mkDataConWorkId wkr_name data_con
   = mkGlobalId (DataConWorkId data_con) wkr_name wkr_ty alg_wkr_info
 
   where
-    tycon  = dataConTyCon data_con  -- The representation TyCon
-    wkr_ty = dataConRepType data_con
+    tycon     = dataConTyCon data_con  -- The representation TyCon
+    wkr_ty    = dataConRepType data_con
+    univ_tvs  = dataConUnivTyVars data_con
+    ex_tcvs   = dataConExTyCoVars data_con
+    arg_tys   = dataConRepArgTys  data_con  -- Should be same as dataConOrigArgTys
+    str_marks = dataConRepStrictness data_con
 
     ----------- Workers for data types --------------
     alg_wkr_info = noCafIdInfo
@@ -590,15 +594,19 @@ mkDataConWorkId wkr_name data_con
                    `setInlinePragInfo`     wkr_inline_prag
                    `setUnfoldingInfo`      evaldUnfolding  -- Record that it's evaluated,
                                                            -- even if arity = 0
-          -- No strictness: see Note [Data-con worker strictness] in GHC.Core.DataCon
+                   `setDmdSigInfo`         wkr_sig
+                      -- Workers eval their strict fields
+                      -- See Note [Strict fields in Core]
 
     wkr_inline_prag = defaultInlinePragma { inl_rule = ConLike }
     wkr_arity = dataConRepArity data_con
 
+    wkr_sig = mkClosedDmdSig wkr_dmds topDiv
+    wkr_dmds = map mk_dmd str_marks
+    mk_dmd MarkedStrict    = evalDmd
+    mk_dmd NotMarkedStrict = topDmd
+
     ----------- Workers for newtypes --------------
-    univ_tvs = dataConUnivTyVars data_con
-    ex_tcvs  = dataConExTyCoVars data_con
-    arg_tys  = dataConRepArgTys  data_con  -- Should be same as dataConOrigArgTys
     nt_work_info = noCafIdInfo          -- The NoCaf-ness is set by noCafIdInfo
                   `setArityInfo` 1      -- Arity 1
                   `setInlinePragInfo`     dataConWrapperInlinePragma


=====================================
testsuite/tests/simplCore/should_compile/T18013.stderr
=====================================
@@ -138,9 +138,9 @@ mapMaybeRule [InlPrag=[2]]
   :: forall a b. Rule IO a b -> Rule IO (Maybe a) (Maybe b)
 [GblId,
  Arity=1,
- Str=<1!P(L,LC(S,C(1,C(1,P(L,1L)))))>,
- Unf=Unf{Src=StableSystem, TopLvl=True, Value=True, ConLike=True,
-         WorkFree=True, Expandable=True,
+ Str=<1!P(SL,LC(S,C(1,C(1,P(L,1L)))))>,
+ Unf=Unf{Src=StableSystem, TopLvl=True,
+         Value=True, ConLike=True, WorkFree=True, Expandable=True,
          Guidance=ALWAYS_IF(arity=1,unsat_ok=True,boring_ok=False)
          Tmpl= \ (@a) (@b) (f [Occ=Once1!] :: Rule IO a b) ->
                  case f of { Rule @s ww ww1 [Occ=OnceL1!] ->
@@ -219,36 +219,41 @@ mapMaybeRule
 -- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
 T18013.$trModule4 :: GHC.Prim.Addr#
 [GblId,
- Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
-         WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 20 0}]
+ Unf=Unf{Src=<vanilla>, TopLvl=True,
+         Value=True, ConLike=True, WorkFree=True, Expandable=True,
+         Guidance=IF_ARGS [] 20 0}]
 T18013.$trModule4 = "main"#
 
 -- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
 T18013.$trModule3 :: GHC.Types.TrName
 [GblId,
- Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
-         WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
+ Unf=Unf{Src=<vanilla>, TopLvl=True,
+         Value=True, ConLike=True, WorkFree=True, Expandable=True,
+         Guidance=IF_ARGS [] 10 10}]
 T18013.$trModule3 = GHC.Types.TrNameS T18013.$trModule4
 
 -- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
 T18013.$trModule2 :: GHC.Prim.Addr#
 [GblId,
- Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
-         WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 30 0}]
+ Unf=Unf{Src=<vanilla>, TopLvl=True,
+         Value=True, ConLike=True, WorkFree=True, Expandable=True,
+         Guidance=IF_ARGS [] 30 0}]
 T18013.$trModule2 = "T18013"#
 
 -- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
 T18013.$trModule1 :: GHC.Types.TrName
 [GblId,
- Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
-         WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
+ Unf=Unf{Src=<vanilla>, TopLvl=True,
+         Value=True, ConLike=True, WorkFree=True, Expandable=True,
+         Guidance=IF_ARGS [] 10 10}]
 T18013.$trModule1 = GHC.Types.TrNameS T18013.$trModule2
 
 -- RHS size: {terms: 3, types: 0, coercions: 0, joins: 0/0}
 T18013.$trModule :: GHC.Types.Module
 [GblId,
- Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
-         WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
+ Unf=Unf{Src=<vanilla>, TopLvl=True,
+         Value=True, ConLike=True, WorkFree=True, Expandable=True,
+         Guidance=IF_ARGS [] 10 10}]
 T18013.$trModule
   = GHC.Types.Module T18013.$trModule3 T18013.$trModule1
 


=====================================
testsuite/tests/simplCore/should_compile/all.T
=====================================
@@ -417,7 +417,10 @@ test('T21391', normal, compile, ['-O -dcore-lint'])
 test('T22112', normal, compile, ['-O -dsuppress-uniques -dno-typeable-binds -ddump-simpl'])
 test('T21391a', normal, compile, ['-O -dcore-lint'])
 # We don't want to see a thunk allocation for the insertBy expression after CorePrep.
-test('T21392', [ grep_errmsg(r'sat.* :: \[\(.*Unique, .*Int\)\]'), expect_broken(21392) ], compile, ['-O -ddump-prep -dno-typeable-binds -dsuppress-uniques'])
+# Unfortunately, this test is no longer broken after we made workers strict in strict fields,
+# so it is no longer a reproducer for T21392. Still, it doesn't hurt if we test that we don't
+# regress again.
+test('T21392', [ grep_errmsg(r'sat.* :: \[\(.*Unique, .*Int\)\]') ], compile, ['-O -ddump-prep -dno-typeable-binds -dsuppress-uniques'])
 test('T21689', [extra_files(['T21689a.hs'])], multimod_compile, ['T21689', '-v0 -O'])
 test('T21801', normal, compile, ['-O -dcore-lint'])
 test('T21848', [grep_errmsg(r'SPEC wombat') ], compile, ['-O -ddump-spec'])



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/07e6142fb15bac5a4059635c70639bc104e67d6b

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/07e6142fb15bac5a4059635c70639bc104e67d6b
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20230131/0bbfb26e/attachment-0001.html>


More information about the ghc-commits mailing list