[Git][ghc/ghc][wip/T22475] DmdAnal: Reflect the `seq` of strict fields of a DataCon worker (#22475)

Sebastian Graf (@sgraf812) gitlab at gitlab.haskell.org
Mon Nov 28 12:27:56 UTC 2022



Sebastian Graf pushed to branch wip/T22475 at Glasgow Haskell Compiler / GHC


Commits:
08108154 by Sebastian Graf at 2022-11-28T13:26:47+01:00
DmdAnal: Reflect the `seq` of strict fields of a DataCon worker (#22475)

See the updated `Note [Data-con worker strictness]`
and the new `Note [Demand transformer for data constructors]`.

Fixes #22475.

- - - - -


8 changed files:

- compiler/GHC/Core/DataCon.hs
- compiler/GHC/Core/Opt/DmdAnal.hs
- compiler/GHC/Types/Demand.hs
- compiler/GHC/Types/Id/Make.hs
- + testsuite/tests/stranal/should_run/T22475.hs
- + testsuite/tests/stranal/should_run/T22475b.hs
- + testsuite/tests/stranal/should_run/T22475b.stdout
- testsuite/tests/stranal/should_run/all.T


Changes:

=====================================
compiler/GHC/Core/DataCon.hs
=====================================
@@ -901,19 +901,36 @@ instance Outputable EqSpec where
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Notice that we do *not* say the worker Id is strict even if the data
 constructor is declared strict
-     e.g.    data T = MkT !(Int,Int)
-Why?  Because the *wrapper* $WMkT is strict (and its unfolding has case
-expressions that do the evals) but the *worker* MkT itself is not. If we
-pretend it is strict then when we see
-     case x of y -> MkT y
-the simplifier thinks that y is "sure to be evaluated" (because the worker MkT
-is strict) and drops the case.  No, the workerId MkT is not strict.
-
-However, the worker does have StrictnessMarks.  When the simplifier sees a
-pattern
-     case e of MkT x -> ...
-it uses the dataConRepStrictness of MkT to mark x as evaluated; but that's
-fine... dataConRepStrictness comes from the data con not from the worker Id.
+     e.g.    data T = MkT ![Int] Bool
+Even though most often the evals are done by the *wrapper* $WMkT, there are
+situations in which tag inference will re-insert evals around the worker.
+So for all intents and purposes the *worker* MkT is strict, too!
+
+Unfortunately, if we exposed accurate strictness of DataCon workers, we'd
+see the following transformation:
+
+  f xs = case xs of xs' { __DEFAULT -> ... case MkT xs b of x { __DEFAULT -> [x] } } -- DmdAnal: Strict in xs
+  ==> { drop-seq, binder swap on xs' }
+  f xs = case MkT xs b of x { __DEFAULT -> [x] } -- DmdAnal: Still strict in xs
+  ==> { case-to-let }
+  f xs = let x = MkT xs' b in [x] -- DmdAnal: No longer strict in xs!
+
+I.e., we are ironically losing strictness in `xs` by dropping the eval on `xs`
+and then doing case-to-let. The issue is that `exprIsHNF` currently says that
+every DataCon worker app is a value. The implicit assumption is that surrounding
+evals will have evaluated strict fields like `xs` before! But now that we had
+just dropped the eval on `xs`, that assumption is no longer valid.
+
+Long story short: By keeping the demand signature lazy, the Simplifier will not
+drop the eval on `xs` and using `exprIsHNF` to decide case-to-let and others
+remains sound.
+
+Similarly, during demand analysis in dmdTransformDataConSig, we bump up the
+field demand with `C_01`, *not* `C_11`, because the latter exposes too much
+strictness that will drop the eval on `xs` above.
+
+This issue is discussed at length in
+"Failed idea: no wrappers for strict data constructors" in #21497 and #22475.
 
 Note [Bangs on data constructor arguments]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


=====================================
compiler/GHC/Core/Opt/DmdAnal.hs
=====================================
@@ -982,9 +982,9 @@ dmdTransform :: AnalEnv   -- ^ The analysis environment
 -- See Note [What are demand signatures?] in "GHC.Types.Demand"
 dmdTransform env var sd
   -- Data constructors
-  | isDataConWorkId var
-  = -- pprTraceWith "dmdTransform:DataCon" (\ty -> ppr var $$ ppr sd $$ ppr ty) $
-    dmdTransformDataConSig (idArity var) sd
+  | Just con <- isDataConWorkId_maybe var
+  = -- pprTraceWith "dmdTransform:DataCon" (\ty -> ppr con $$ ppr sd $$ ppr ty) $
+    dmdTransformDataConSig (dataConRepStrictness con) sd
   -- See Note [DmdAnal for DataCon wrappers]
   | isDataConWrapId var, let rhs = uf_tmpl (realIdUnfolding var)
   , WithDmdType dmd_ty _rhs' <- dmdAnal env sd rhs


=====================================
compiler/GHC/Types/Demand.hs
=====================================
@@ -94,7 +94,7 @@ import GHC.Data.Maybe   ( orElse )
 
 import GHC.Core.Type    ( Type )
 import GHC.Core.TyCon   ( isNewTyCon, isClassTyCon )
-import GHC.Core.DataCon ( splitDataProductType_maybe )
+import GHC.Core.DataCon ( splitDataProductType_maybe, StrictnessMark, isMarkedStrict )
 import GHC.Core.Multiplicity    ( scaledThing )
 
 import GHC.Utils.Binary
@@ -1032,10 +1032,13 @@ peelCallDmd sd = viewCall sd `orElse` (topCard, topSubDmd)
 -- whether it was unsaturated in the form of a 'Card'inality, denoting
 -- how many times the lambda body was entered.
 -- See Note [Demands from unsaturated function calls].
-peelManyCalls :: Int -> SubDemand -> Card
-peelManyCalls 0 _                          = C_11
-peelManyCalls n (viewCall -> Just (m, sd)) = m `multCard` peelManyCalls (n-1) sd
-peelManyCalls _ _                          = C_0N
+peelManyCalls :: Arity -> SubDemand -> (Card, SubDemand)
+peelManyCalls k sd = go k C_11 sd
+  where
+    go 0 !n !sd                        = (n, sd)
+    go k !n (viewCall -> Just (m, sd)) = go (k-1) (n `multCard` m) sd
+    go _ _  _                          = (topCard, topSubDmd)
+{-# INLINE peelManyCalls #-} -- so that the pair cancels away in a `fst _` context
 
 -- | Extract the 'SubDemand' of a 'Demand'.
 -- PRECONDITION: The SubDemand must be used in a context where the expression
@@ -1085,7 +1088,7 @@ argOneShots (_ :* sd) = go sd
 saturatedByOneShots :: Int -> Demand -> Bool
 saturatedByOneShots _ AbsDmd    = True
 saturatedByOneShots _ BotDmd    = True
-saturatedByOneShots n (_ :* sd) = isUsedOnce (peelManyCalls n sd)
+saturatedByOneShots n (_ :* sd) = isUsedOnce $ fst $ peelManyCalls n sd
 
 {- Note [Strict demands]
 ~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1374,6 +1377,46 @@ Note [Demand transformer for a dictionary selector] explains. Annoyingly,
 the boxity info has to be stored in the *sub-demand* `sd`! There's no demand
 to store the boxity in. So we bit the bullet and now we store Boxity in
 'SubDemand', both in 'Prod' *and* 'Poly'. See also Note [Boxity in Poly].
+
+Note [Demand transformer for data constructors]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Consider the expression (x,y) with sub-demand P(SL,A).  What is the demand on
+x,y?  Obviously `x` is used strictly, and `y` not at all. So we want to
+decompose a product demand, and feed its components demands into the
+arguments.  That is the job of dmdTransformDataConSig.  More precisely,
+
+ * it gets the demand on the data constructor itself;
+   in the above example that is C(1,C(1,P(SL,A)))
+ * it returns the demands on the argumetns;
+   in the above example that is [SL, A]
+
+Nasty wrinkle. Consider this code (#22475 has more realistic examples but
+assume this is what the demand analyser sees)
+
+   data T = MkT !Int Bool
+   get :: T -> Bool
+   get (MkT _ b) = b
+
+   foo = let v::Int = I# 7
+             t::T   = MkT v True
+         in get t
+
+Now `v` is unused by `get`, /but/ we can't give `v` an Absent demand,
+else we'll drop the binding and replace it with an error thunk.
+Then the code generator (more specifically GHC.Stg.InferTags.Rewrite)
+will add an extra eval of MkT's argument to give
+   foo = let v::Int = error "absent"
+             t::T   = case v of v' -> MkT v' True
+         in get t
+
+Boo!  Because of this extra eval (added in STG-land), the truth is that `MkT`
+may (or may not) evaluate its arguments. Hence the use of `bump` in
+dmdTransformDataConSig, which adds in a `C_01` eval.  The `C_01` says "may or
+may not evaluate" which is absolutely faithful to what InferTags.Rewrite
+does.
+
+In particular it is very important /not/ to make that a `C_11` eval,
+see Note [Data-con worker strictness].
 -}
 
 {- *********************************************************************
@@ -2266,20 +2309,26 @@ type DmdTransformer = SubDemand -> DmdType
 -- return how the function evaluates its free variables and arguments.
 dmdTransformSig :: DmdSig -> DmdTransformer
 dmdTransformSig (DmdSig dmd_ty@(DmdType _ arg_ds _)) sd
-  = multDmdType (peelManyCalls (length arg_ds) sd) dmd_ty
+  = multDmdType (fst $ peelManyCalls (length arg_ds) sd) dmd_ty
     -- see Note [Demands from unsaturated function calls]
     -- and Note [What are demand signatures?]
 
 -- | A special 'DmdTransformer' for data constructors that feeds product
 -- demands into the constructor arguments.
-dmdTransformDataConSig :: Arity -> DmdTransformer
-dmdTransformDataConSig arity sd = case go arity sd of
-  Just dmds -> DmdType emptyDmdEnv dmds topDiv
-  Nothing   -> nopDmdType -- Not saturated
+dmdTransformDataConSig :: [StrictnessMark] -> DmdTransformer
+dmdTransformDataConSig str_marks sd = case viewProd arity body_sd of
+  Just (_, dmds) -> multDmdType n (body_ty dmds)
+  Nothing        -> nopDmdType
   where
-    go 0 sd             = snd <$> viewProd arity sd
-    go n (Call C_11 sd) = go (n-1) sd  -- strict calls only!
-    go _ _              = Nothing
+    arity = length str_marks
+    (n, body_sd) = peelManyCalls arity sd
+    body_ty dmds = DmdType emptyDmdEnv (zipWith bump str_marks dmds) topDiv
+    bump str dmd | isMarkedStrict str = plusDmd str_field_dmd dmd
+                 | otherwise          = dmd
+    str_field_dmd = C_01 :* seqSubDmd -- Better would be C_11 (seqDmd), but
+                                      -- that breaks other passes such as the
+                                      -- Simplifier that assume DataCon workers
+                                      -- to be lazy. See #21497.
 
 -- | A special 'DmdTransformer' for dictionary selectors that feeds the demand
 -- on the result into the indicated dictionary component (if saturated).


=====================================
compiler/GHC/Types/Id/Make.hs
=====================================
@@ -700,7 +700,7 @@ mkDataConRep dc_bang_opts fam_envs wrap_name data_con
                              -- applications are treated as values
                          `setInlinePragInfo`    wrap_prag
                          `setUnfoldingInfo`     wrap_unf
-                         `setDmdSigInfo`    wrap_sig
+                         `setDmdSigInfo`        wrap_sig
                              -- We need to get the CAF info right here because GHC.Iface.Tidy
                              -- does not tidy the IdInfo of implicit bindings (like the wrapper)
                              -- so it not make sure that the CAF info is sane


=====================================
testsuite/tests/stranal/should_run/T22475.hs
=====================================
@@ -0,0 +1,47 @@
+{-# OPTIONS_GHC -O -fforce-recomp #-}
+{-# OPTIONS_GHC -fmax-worker-args=0 #-}
+{-# LANGUAGE StrictData #-}
+
+import Control.Monad
+import Control.Exception
+import GHC.Conc
+import System.Timeout
+
+twiceIO :: (Int -> IO ()) -> IO ()
+twiceIO f = f 0 >> f 1
+{-# NOINLINE twiceIO #-}
+
+data Config
+  = Cfg
+  { a :: Integer
+  , b :: Integer
+  , c :: Integer
+  , params :: (Maybe Int)
+  , e :: Integer
+  , f :: Integer
+  , g :: Integer
+  , h :: Integer
+  , i :: Integer
+  , j :: Integer
+  , k :: Integer
+  }
+
+rnf :: Config -> ()
+rnf (Cfg a b c _ e f g h i j k) = a + b + c + e + f + g + h + i + j + k `seq` ()
+
+handshakeServer' :: Config -> Int -> IO ()
+handshakeServer' cfg 0 = rnf cfg `seq` return ()
+handshakeServer' _   _ = return ()
+{-# NOINLINE handshakeServer' #-}
+
+run :: Config -> Int -> IO ()
+run conf n = do
+  tv <- rnf conf `seq` params conf `seq` newTVarIO 0
+  forever $ do
+    acc <- rnf conf `seq` readTVarIO tv
+    let conf' = conf{params=Just acc}
+    forkIO $ twiceIO (\eta -> handshakeServer' conf' (eta+acc))
+{-# NOINLINE run #-}
+
+-- The crash should happen instantly, within the first 10ms. 100ms is a safe bet
+main = timeout 100 $ run (Cfg 0 1 2 (Just 3) 4 5 6 7 8 9 10) 13


=====================================
testsuite/tests/stranal/should_run/T22475b.hs
=====================================
@@ -0,0 +1,21 @@
+{-# OPTIONS_GHC -O -fforce-recomp #-}
+{-# OPTIONS_GHC -fmax-worker-args=0 #-}
+
+data Config
+  = Cfg
+  { a      :: Integer
+  , params :: !(Maybe Int)
+  }
+
+use :: Bool -> Config -> Int
+use True cfg = a cfg `seq` 42
+use _    _   = 0
+{-# NOINLINE use #-}
+
+run :: Config -> Int -> Int
+run conf n =
+  let !conf' = conf{params=Just n}
+  in use True conf' + use False conf'
+{-# NOINLINE run #-}
+
+main = print $ run (Cfg 0 (Just 1)) 13


=====================================
testsuite/tests/stranal/should_run/T22475b.stdout
=====================================
@@ -0,0 +1 @@
+42


=====================================
testsuite/tests/stranal/should_run/all.T
=====================================
@@ -28,3 +28,5 @@ test('T14285', normal, multimod_compile_and_run, ['T14285', ''])
 test('T17676', normal, compile_and_run, [''])
 test('T19053', normal, compile_and_run, [''])
 test('T21717b', normal, compile_and_run, [''])
+test('T22475', normal, compile_and_run, [''])
+test('T22475b', normal, compile_and_run, [''])



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/08108154acd79762f3ebedfa48e1f4f1ae34908a

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/08108154acd79762f3ebedfa48e1f4f1ae34908a
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20221128/f7ba8778/attachment-0001.html>


More information about the ghc-commits mailing list