[Git][ghc/ghc][wip/T22241] 2 commits: DmdAnal: Look through unfoldings of DataCon wrappers (#22241)

Sebastian Graf (@sgraf812) gitlab at gitlab.haskell.org
Fri Oct 14 15:28:33 UTC 2022



Sebastian Graf pushed to branch wip/T22241 at Glasgow Haskell Compiler / GHC


Commits:
9c5882cf by Sebastian Graf at 2022-10-14T17:28:24+02:00
DmdAnal: Look through unfoldings of DataCon wrappers (#22241)

Previously, we failed to detect that `f` can unbox `y`
in the following program (T22241)
```hs
data D = D !Int

f :: Bool -> Int -> D
f x y = D (go x)
  where
    go False = y
    go True  = go False
{-# NOINLINE f #-}
```
That is because the demand signature for `$WD` is computed upfront and didn't
include any boxity information, yet provides useful information to other
passes such as the Simplifier.

In this patch we adopt the solution to look through unfoldings of DataCon
wrappers during Demand Analysis, but keep the signatures for other passes.
See `Note [DmdAnal for DataCon wrappers]` for more details.

Fixes #22241.

- - - - -
fba8a684 by Sebastian Graf at 2022-10-14T17:28:24+02:00
Tweaks to postIU

- - - - -


7 changed files:

- compiler/GHC/Core/Opt/CprAnal.hs
- compiler/GHC/Core/Opt/DmdAnal.hs
- compiler/GHC/Core/Opt/Simplify/Utils.hs
- compiler/GHC/Types/Id/Make.hs
- + testsuite/tests/stranal/sigs/T22241.hs
- + testsuite/tests/stranal/sigs/T22241.stderr
- testsuite/tests/stranal/sigs/all.T


Changes:

=====================================
compiler/GHC/Core/Opt/CprAnal.hs
=====================================
@@ -553,13 +553,16 @@ analysing their unfolding. A few reasons for the change:
      *workers*, because their transformers need to adapt to CPR for their
      arguments in 'cprTransformDataConWork' to enable Note [Nested CPR].
      Better keep it all in this module! The alternative would be that
-     'GHC.Types.Id.Make' depends on DmdAnal.
+     'GHC.Types.Id.Make' depends on CprAnal.
   3. In the future, Nested CPR could take a better account of incoming args
      in cprAnalApp and do some beta-reduction on the fly, like !1866 did. If
      any of those args had the CPR property, then we'd even get Nested CPR for
      DataCon wrapper calls, for free. Not so if we simply give the wrapper a
      single CPR sig in 'GHC.Types.Id.Make.mkDataConRep'!
 
+DmdAnal also looks through the wrapper's unfolding:
+See Note [DmdAnal for DataCon wrappers].
+
 Note [Trimming to mAX_CPR_SIZE]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 We do not treat very big tuples as CPR-ish:


=====================================
compiler/GHC/Core/Opt/DmdAnal.hs
=====================================
@@ -985,6 +985,10 @@ dmdTransform env var sd
   | isDataConWorkId var
   = -- pprTraceWith "dmdTransform:DataCon" (\ty -> ppr var $$ ppr sd $$ ppr ty) $
     dmdTransformDataConSig (idArity var) sd
+  -- See Note [CPR for DataCon wrappers]
+  | isDataConWrapId var, let rhs = uf_tmpl (realIdUnfolding var)
+  , WithDmdType dmd_ty _rhs' <- dmdAnal env sd rhs
+  = dmd_ty
   -- Dictionary component selectors
   -- Used to be controlled by a flag.
   -- See #18429 for some perf measurements.
@@ -1388,6 +1392,37 @@ Now f's optimised RHS will be \x.a, but if we change g to (error "..")
 disaster.  But regardless, #18638 was a more complicated version of
 this, that actually happened in practice.
 
+Note [DmdAnal for DataCon wrappers]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+We give DataCon wrappers a (necessarily flat) demand signature in
+'GHC.Types.Id.Make.mkDataConRep', so that passes such as the Simplifier can
+exploit it. But during DmdAnal, we *ignore* the demand signature of DataCon
+wrappers and analyse their unfolding instead.
+
+Part of the reason is that the ad-hoc signatures in mkDataConRep don't have
+accurate Boxity information.
+It's pretty similar to the situation in Note [CPR for DataCon wrappers].
+
+Another reason is that DataCon *worker*s have very precise demand transformers,
+computed by 'dmdTransformDataConSig'. It would be awkward if DataCon *wrappers*
+would behave much less precisely during DmdAnal. Example:
+
+  data T = MkT { get_x :: Int, get_y :: !Bool }
+  let t = join j x = $WMkT x True in ... j (f 13) ... j (f 14) ...
+  in ... get_y t ... get_y t ...
+
+Here, we would like to detect that `x` is absent, so that we can completely
+discard the `f 13` and `f 14` arguments of `j`. With a single-point demand
+signature computed for top sub-demand L, we would be out of luck. But if we push
+the P(A,1L) sub-demand into the unfolding of `$WMkT`, we see its action on the
+DataCon worker's demand transformer and see that `get_x` is absent.
+
+Besides, DataCon wrappers are generally inlined in the Final phase (so before
+DmdAnal), all leftover occurrences are in a boring context like
+`f x y = $WMkT y x`. The arguments will be pretty simple, so analysing the
+unfolding will be cheap to analyse, too. Also DataCon wrappers occur seldom
+enough that performance-wise it doesn't matter.
+
 Note [Boxity for bottoming functions]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Consider (A)


=====================================
compiler/GHC/Core/Opt/Simplify/Utils.hs
=====================================
@@ -1454,44 +1454,43 @@ postInlineUnconditionally env bind_cxt bndr occ_info rhs
   | exprIsTrivial rhs           = True
   | BC_Join {} <- bind_cxt              -- See point (1) of Note [Duplicating join points]
   , not (phase == FinalPhase)   = False -- in Simplify.hs
-  | otherwise
-  = case occ_info of
-      OneOcc { occ_in_lam = in_lam, occ_int_cxt = int_cxt, occ_n_br = n_br }
-        -- See Note [Inline small things to avoid creating a thunk]
-
-        -> n_br < 100  -- See Note [Suppress exponential blowup]
+  | IAmDead <- occ_info         = True  -- This happens; for example, the case_bndr during case of
+                                        -- known constructor:  case (a,b) of x { (p,q) -> ... }
+                                        -- Here x isn't mentioned in the RHS, so we don't want to
+                                        -- create the (dead) let-binding  let x = (a,b) in ...
+
+  -- See Note [Inline small things to avoid creating a thunk]
+  | OneOcc{occ_in_lam=in_lam, occ_int_cxt=int_cxt, occ_n_br=n_br} <- occ_info
+  , n_br < 100 -- See Note [Suppress exponential blowup]
+  , BC_Let {} <- bind_cxt
+      -- Inlining a join point does not avoid a thunk
+  , smallEnoughToInline uf_opts unfolding
+      -- Small enough to dup
+      -- ToDo: consider discount on smallEnoughToInline if int_cxt is true
+      --
+      -- NB: Do NOT inline arbitrarily big things, even if occ_n_br=1
+      -- Reason: doing so risks exponential behaviour.  We simplify a big
+      --         expression, inline it, and simplify it again.  But if the
+      --         very same thing happens in the big expression, we get
+      --         exponential cost!
+      -- PRINCIPLE: when we've already simplified an expression once,
+      -- make sure that we only inline it if it's reasonably small.
+  , in_lam == NotInsideLam
+      -- Outside a lambda, we want to be reasonably aggressive
+      -- about inlining into multiple branches of case
+      -- e.g. let x = <non-value>
+      --      in case y of { C1 -> ..x..; C2 -> ..x..; C3 -> ... }
+      -- Inlining can be a big win if C3 is the hot-spot, even if
+      -- the uses in C1, C2 are not 'interesting'
+      -- An example that gets worse if you add int_cxt here is 'clausify'
+  || (isCheapUnfolding unfolding && int_cxt == IsInteresting)
+      -- isCheap => acceptable work duplication; in_lam may be true
+      -- int_cxt to prevent us inlining inside a lambda without some
+      -- good reason.  See the notes on int_cxt in preInlineUnconditionally
+  = True
 
-           && smallEnoughToInline uf_opts unfolding     -- Small enough to dup
-                        -- ToDo: consider discount on smallEnoughToInline if int_cxt is true
-                        --
-                        -- NB: Do NOT inline arbitrarily big things, even if occ_n_br=1
-                        -- Reason: doing so risks exponential behaviour.  We simplify a big
-                        --         expression, inline it, and simplify it again.  But if the
-                        --         very same thing happens in the big expression, we get
-                        --         exponential cost!
-                        -- PRINCIPLE: when we've already simplified an expression once,
-                        -- make sure that we only inline it if it's reasonably small.
-
-           && (in_lam == NotInsideLam ||
-                        -- Outside a lambda, we want to be reasonably aggressive
-                        -- about inlining into multiple branches of case
-                        -- e.g. let x = <non-value>
-                        --      in case y of { C1 -> ..x..; C2 -> ..x..; C3 -> ... }
-                        -- Inlining can be a big win if C3 is the hot-spot, even if
-                        -- the uses in C1, C2 are not 'interesting'
-                        -- An example that gets worse if you add int_cxt here is 'clausify'
-
-                (isCheapUnfolding unfolding && int_cxt == IsInteresting))
-                        -- isCheap => acceptable work duplication; in_lam may be true
-                        -- int_cxt to prevent us inlining inside a lambda without some
-                        -- good reason.  See the notes on int_cxt in preInlineUnconditionally
-
-      IAmDead -> True   -- This happens; for example, the case_bndr during case of
-                        -- known constructor:  case (a,b) of x { (p,q) -> ... }
-                        -- Here x isn't mentioned in the RHS, so we don't want to
-                        -- create the (dead) let-binding  let x = (a,b) in ...
-
-      _ -> False
+  | otherwise
+  = False
 
 -- Here's an example that we don't handle well:
 --      let f = if b then Left (\x.BIG) else Right (\y.BIG)


=====================================
compiler/GHC/Types/Id/Make.hs
=====================================
@@ -477,9 +477,9 @@ mkDictSelId name clas
                -- See Note [Type classes and linear types]
 
     base_info = noCafIdInfo
-                `setArityInfo`          1
-                `setDmdSigInfo`     strict_sig
-                `setCprSigInfo`            topCprSig
+                `setArityInfo`  1
+                `setDmdSigInfo` strict_sig
+                `setCprSigInfo` topCprSig
 
     info | new_tycon
          = base_info `setInlinePragInfo` alwaysInlinePragma
@@ -697,6 +697,8 @@ mkDataConRep dc_bang_opts fam_envs wrap_name data_con
                              -- does not tidy the IdInfo of implicit bindings (like the wrapper)
                              -- so it not make sure that the CAF info is sane
 
+             -- The signature is purely for passes like the Simplifier, not for
+             -- DmdAnal itself; see Note [DmdAnal for DataCon wrappers].
              wrap_sig = mkClosedDmdSig wrap_arg_dmds topDiv
 
              wrap_arg_dmds =
@@ -1321,9 +1323,9 @@ mkFCallId uniq fcall ty
     name = mkFCallName uniq occ_str
 
     info = noCafIdInfo
-           `setArityInfo`          arity
-           `setDmdSigInfo`     strict_sig
-           `setCprSigInfo`            topCprSig
+           `setArityInfo`  arity
+           `setDmdSigInfo` strict_sig
+           `setCprSigInfo` topCprSig
 
     (bndrs, _) = tcSplitPiTys ty
     arity      = count isAnonTyCoBinder bndrs


=====================================
testsuite/tests/stranal/sigs/T22241.hs
=====================================
@@ -0,0 +1,31 @@
+module T22241 where
+
+data D = D { unD :: !Int }
+
+-- We should unbox y here, which only happens if DmdAnal sees that $WD will
+-- unbox it.
+f :: Bool -> Int -> D
+f x y = D (go x)
+  where
+    go False = y
+    go True  = go False
+{-# NOINLINE f #-}
+
+
+
+data T a = T Int !a
+get (T _ x) = x
+
+-- Here, the goal is to discard `unD (f True z)` and thus `z` as absent by
+-- looking through $WT in `j` *during the first pass of DmdAnal*!
+g :: Bool -> Int -> Int -> Bool
+g x y z | even y    = get (fst t)
+        | y > 13    = not (get (fst t))
+        | otherwise = False
+  where
+    t | x         = j (unD (f True z))
+      | otherwise = j (unD (f False z))
+      where
+        j a = (T a x, True)
+        {-# NOINLINE j #-}
+{-# NOINLINE g #-}


=====================================
testsuite/tests/stranal/sigs/T22241.stderr
=====================================
@@ -0,0 +1,24 @@
+
+==================== Strictness signatures ====================
+T22241.f: <1L><S!P(L)>
+T22241.g: <L><1!P(L)><A>
+T22241.get: <1!P(A,1L)>
+T22241.unD: <1!P(L)>
+
+
+
+==================== Cpr signatures ====================
+T22241.f: 1
+T22241.g:
+T22241.get:
+T22241.unD: 1
+
+
+
+==================== Strictness signatures ====================
+T22241.f: <1L><1!P(SL)>
+T22241.g: <ML><1!P(L)><A>
+T22241.get: <1!P(A,1L)>
+T22241.unD: <1!P(L)>
+
+


=====================================
testsuite/tests/stranal/sigs/all.T
=====================================
@@ -37,3 +37,4 @@ test('T21717', normal, compile, [''])
 test('T21754', normal, compile, [''])
 test('T21888', normal, compile, [''])
 test('T21888a', normal, compile, [''])
+test('T22241', normal, compile, [''])



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/88955c825dba0b546d91110ec35366548a4b586b...fba8a684b348cec5565d06c26479b2f6e3a58cd8

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/88955c825dba0b546d91110ec35366548a4b586b...fba8a684b348cec5565d06c26479b2f6e3a58cd8
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20221014/8056102b/attachment-0001.html>


More information about the ghc-commits mailing list