[Git][ghc/ghc][master] Fix demand signatures for join points

Thu Jun 20 11:23:53 UTC 2024

Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC


Commits:
9a757a27 by Simon Peyton Jones at 2024-06-20T07:22:40-04:00
Fix demand signatures for join points

This MR tackles #24623 and #23113

The main change is to give a clearer notion of "worker/wrapper arity", esp
for join points. See GHC.Core.Opt.DmdAnal
     Note [Worker/wrapper arity and join points]
This Note is a good summary of what this MR does:

(1) The "worker/wrapper arity" of an Id is
    * For non-join-points: idArity
    * The join points: the join arity (Id part only of course)
    This is the number of args we will use in worker/wrapper.
    See `ww_arity` in `dmdAnalRhsSig`, and the function `workWrapArity`.

(2) A join point's demand-signature arity may exceed the Id's worker/wrapper
    arity.  See the `arity_ok` assertion in `mkWwBodies`.

(3) In `finaliseArgBoxities`, do trimBoxity on any argument demands beyond
    the worker/wrapper arity.

(4) In WorkWrap.splitFun, make sure we split based on the worker/wrapper
    arity (re)-computed by workWrapArity.

- - - - -


6 changed files:

- compiler/GHC/Core/Opt/DmdAnal.hs
- compiler/GHC/Core/Opt/WorkWrap.hs
- compiler/GHC/Core/Opt/WorkWrap/Utils.hs
- compiler/GHC/Types/Demand.hs
- + testsuite/tests/dmdanal/should_compile/T24623.hs
- testsuite/tests/dmdanal/should_compile/all.T


Changes:

=====================================
compiler/GHC/Core/Opt/DmdAnal.hs
=====================================
@@ -1008,7 +1008,7 @@ dmdTransform :: AnalEnv   -- ^ The analysis environment
              -> DmdType   -- ^ The demand type unleashed by the variable in this
                           -- context. The returned DmdEnv includes the demand on
                           -- this function plus demand on its free variables

--- See Note [What are demand signatures?] in "GHC.Types.Demand"
+-- See Note [DmdSig: demand signatures, and demand-sig arity] in "GHC.Types.Demand"
 dmdTransform env var sd
   -- Data constructors
   | Just con <- isDataConWorkId_maybe var
@@ -1081,31 +1081,33 @@ dmdAnalRhsSig
 -- Process the RHS of the binding, add the strictness signature
 -- to the Id, and augment the environment with the signature as well.
 -- See Note [NOINLINE and strictness]
-dmdAnalRhsSig top_lvl rec_flag env let_dmd id rhs
+dmdAnalRhsSig top_lvl rec_flag env let_sd id rhs
   = -- pprTrace "dmdAnalRhsSig" (ppr id $$ ppr let_dmd $$ ppr rhs_dmds $$ ppr sig $$ ppr weak_fvs) $
     (final_env, weak_fvs, final_id, final_rhs)
   where
-    threshold_arity = thresholdArity id rhs
+    ww_arity = workWrapArity id rhs
+      -- See Note [Worker/wrapper arity and join points] point (1)
 
-    rhs_dmd = mkCalledOnceDmds threshold_arity body_dmd
-
-    body_dmd
-      | isJoinId id
+    body_sd | isJoinId id = let_sd
+            | otherwise   = topSubDmd
       -- See Note [Demand analysis for join points]
       -- See Note [Invariants on join points] invariant 2b, in GHC.Core
-      --     threshold_arity matches the join arity of the join point
-      -- See Note [Unboxed demand on function bodies returning small products]
-      = unboxedWhenSmall env rec_flag (resultType_maybe id) let_dmd
-      | otherwise
+      --     ww_arity matches the join arity of the join point
+
+    adjusted_body_sd = unboxedWhenSmall env rec_flag (resultType_maybe id) body_sd
       -- See Note [Unboxed demand on function bodies returning small products]
-      = unboxedWhenSmall env rec_flag (resultType_maybe id) topSubDmd
 
-    WithDmdType rhs_dmd_ty rhs' = dmdAnal env rhs_dmd rhs
+    rhs_sd = mkCalledOnceDmds ww_arity adjusted_body_sd
+
+    WithDmdType rhs_dmd_ty rhs' = dmdAnal env rhs_sd rhs
     DmdType rhs_env rhs_dmds = rhs_dmd_ty
-    (final_rhs_dmds, final_rhs) = finaliseArgBoxities env id threshold_arity
+    (final_rhs_dmds, final_rhs) = finaliseArgBoxities env id ww_arity
                                                       rhs_dmds (de_div rhs_env) rhs'
 
-    sig = mkDmdSigForArity threshold_arity (DmdType sig_env final_rhs_dmds)
+    dmd_sig_arity = ww_arity + strictCallArity body_sd
+    sig = mkDmdSigForArity dmd_sig_arity (DmdType sig_env final_rhs_dmds)
+          -- strictCallArity is > 0 only for join points
+          -- See Note [mkDmdSigForArity]
 
     opts       = ae_opts env
     final_id   = setIdDmdAndBoxSig opts id sig
@@ -1137,13 +1139,6 @@ splitWeakDmds :: DmdEnv -> (DmdEnv, WeakDmds)
 splitWeakDmds (DE fvs div) = (DE sig_fvs div, weak_fvs)
   where (!weak_fvs, !sig_fvs) = partitionVarEnv isWeakDmd fvs
 
-thresholdArity :: Id -> CoreExpr -> Arity
--- See Note [Demand signatures are computed for a threshold arity based on idArity]
-thresholdArity fn rhs
-  = case idJoinPointHood fn of
-      JoinPoint join_arity -> count isId $ fst $ collectNBinders join_arity rhs
-      NotJoinPoint         -> idArity fn
-
 -- | The result type after applying 'idArity' many arguments. Returns 'Nothing'
 -- when the type doesn't have exactly 'idArity' many arrows.
 resultType_maybe :: Id -> Maybe Type
@@ -1243,47 +1238,97 @@ Consider
                    B -> j 4
                    C -> (p,7))
 
-If j was a vanilla function definition, we'd analyse its body with
-evalDmd, and think that it was lazy in p.  But for join points we can
-do better!  We know that j's body will (if called at all) be evaluated
-with the demand that consumes the entire join-binding, in this case
-the argument demand from g.  Whizzo!  g evaluates both components of
-its argument pair, so p will certainly be evaluated if j is called.
+If j was a vanilla function definition, we'd analyse its body with evalDmd, and
+think that it was lazy in p.  But for join points we can do better!  We know
+that j's body will (if called at all) be evaluated with the demand that consumes
+the entire join-binding, in this case the argument demand from g.  Whizzo!  g
+evaluates both components of its argument pair, so p will certainly be evaluated
+if j is called.
 
-For f to be strict in p, we need /all/ paths to evaluate p; in this
-case the C branch does so too, so we are fine.  So, as usual, we need
-to transport demands on free variables to the call site(s).  Compare
-Note [Lazy and unleashable free variables].
+For f to be strict in p, we need /all/ paths to evaluate p; in this case the C
+branch does so too, so we are fine.  So, as usual, we need to transport demands
+on free variables to the call site(s).  Compare Note [Lazy and unleashable free
+variables].
 
-The implementation is easy.  When analysing a join point, we can
-analyse its body with the demand from the entire join-binding (written
-let_dmd here).
+The implementation is easy: see `body_sd` in`dmdAnalRhsSig`.  When analysing
+a join point, we can analyse its body (after stripping off the join binders,
+here just 'y') with the demand from the entire join-binding (written `let_sd`
+here).
 
 Another win for join points!  #13543.
 
-However, note that the strictness signature for a join point can
-look a little puzzling.  E.g.
+BUT see Note [Worker/wrapper arity and join points].
+
+Note we may analyse the rhs of a join point with a demand that is either
+bigger than, or smaller than, the number of lambdas syntactically visible.
+* More lambdas than call demands:
+       join j x = \p q r -> blah in ...
+  in a context with demand Top.
+
+* More call demands than lambdas:
+       (join j x = h in ..(j 2)..(j 3)) a b c
 
+Note [Worker/wrapper arity and join points]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Consider
     (join j x = \y. error "urk")
     (in case v of              )
     (     A -> j 3             )  x
     (     B -> j 4             )
     (     C -> \y. blah        )
 
-The entire thing is in a C(1,L) context, so j's strictness signature
-will be    [A]b
-meaning one absent argument, returns bottom.  That seems odd because
-there's a \y inside.  But it's right because when consumed in a C(1,L)
-context the RHS of the join point is indeed bottom.
+The entire thing is in a C(1,L) context, so we will analyse j's body, namely
+   \y. error "urk"
+with demand C(C(1,L)).  See `rhs_sd` in `dmdAnalRhsSig`.  That will produce
+a demand signature of <A><A>b: and indeed `j` diverges when given two arguments.
+
+BUT we do /not/ want to worker/wrapper `j` with two arguments.  Suppose we have
+     join j2 :: Int -> Int -> blah
+          j2 x = rhs
+     in ...(j2 3)...(j2 4)...
+
+where j2's join-arity is 1, so calls to `j` will all have /one/ argument.
+Suppose the entire expression is in a called context (like `j` above) and `j2`
+gets the demand signature <1!P(L)><1!P(L)>, that is, strict in both arguments.
+
+we worker/wrapper'd `j2` with two args we'd get
+     join $wj2 x# y# = let x = I# x#; y = I# y# in rhs
+          j2 x = \y. case x of I# x# -> case y of I# y# -> $wj2 x# y#
+     in ...(j2 3)...(j2 4)...
+But now `$wj2`is no longer a join point. Boo.
+
+Instead if we w/w at all, we want to do so only with /one/ argument:
+     join $wj2 x# = let x = I# x# in rhs
+          j2 x = case x of I# x# -> $wj2 x#
+     in ...(j2 3)...(j2 4)...
+Now all is fine.  BUT in `finaliseArgBoxities` we should trim y's boxity,
+to reflect the fact tta we aren't going to unbox `y` at all.
+
+Conclusion:
+
+(1) The "worker/wrapper arity" of an Id is
+    * For non-join-points: idArity
+    * The join points: the join arity (Id part only of course)
+    This is the number of args we will use in worker/wrapper.
+    See `ww_arity` in `dmdAnalRhsSig`, and the function `workWrapArity`.
 
-Note [Demand signatures are computed for a threshold arity based on idArity]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Given a binding { f = rhs }, we compute a "theshold arity", and do demand
-analysis based on a call with that many value arguments.
+(2) A join point's demand-signature arity may exceed the Id's worker/wrapper
+    arity.  See the `arity_ok` assertion in `mkWwBodies`.
 
-The threshold we use is
+(3) In `finaliseArgBoxities`, do trimBoxity on any argument demands beyond
+    the worker/wrapper arity.
 
-* Ordinary bindings: idArity f.
+(4) In WorkWrap.splitFun, make sure we split based on the worker/wrapper
+    arity (re)-computed by workWrapArity.
+
+Note [The demand for the RHS of a binding]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Given a binding { f = rhs }, in `dmdAnalRhsSig` we compute a `rhs_sd` in
+which to analyse `rhs`.
+
+The demand we use is:
+
+* Ordinary bindings: a call-demand of depth (idArity f).
   Why idArity arguments? Because that's a conservative estimate of how many
   arguments we must feed a function before it does anything interesting with
   them.  Also it elegantly subsumes the trivial RHS and PAP case.  E.g. for
@@ -1293,22 +1338,17 @@ The threshold we use is
   idArity is /at least/ the number of manifest lambdas, but might be higher for
   PAPs and trivial RHS (see Note [Demand analysis for trivial right-hand sides]).
 
-* Join points: the value-binder subset of the JoinArity.  This can
-  be less than the number of visible lambdas; e.g.
-     join j x = \y. blah
-     in ...(jump j 2)....(jump j 3)....
-  We know that j will never be applied to more than 1 arg (its join
-  arity, and we don't eta-expand join points, so here a threshold
-  of 1 is the best we can do.
+* Join points: a call-demand of depth (value-binder subset of JoinArity),
+  wrapped around the incoming demand for the entire expression; see
+  Note [Demand analysis for join points]
 
 Note that the idArity of a function varies independently of its cardinality
 properties (cf. Note [idArity varies independently of dmdTypeDepth]), so we
-implicitly encode the arity for when a demand signature is sound to unleash
-in its 'dmdTypeDepth', not in its idArity (cf. Note [Understanding DmdType
-and DmdSig] in GHC.Types.Demand). It is unsound to unleash a demand
-signature when the incoming number of arguments is less than that. See
-GHC.Types.Demand Note [What are demand signatures?]  for more details on
-soundness.
+implicitly encode the arity for when a demand signature is sound to unleash in
+its 'dmdTypeDepth', not in its idArity (cf. Note [Understanding DmdType and
+DmdSig] in GHC.Types.Demand). It is unsound to unleash a demand signature when
+the incoming number of arguments is less than that. See GHC.Types.Demand
+Note [DmdSig: demand signatures, and demand-sig arity].
 
 Note that there might, in principle, be functions for which we might want to
 analyse for more incoming arguments than idArity. Example:
@@ -1339,6 +1379,30 @@ signatures for different arities (i.e., polyvariance) would be entirely
 possible, if it weren't for the additional runtime and implementation
 complexity.
 
+Note [mkDmdSigForArity]
+~~~~~~~~~~~~~~~~~~~~~~~
+Consider
+   f x = if expensive x
+         then \y. blah1
+         else \y. blah2
+We will analyse the body with demand C(1L), reflecting the single visible
+argument x.  But dmdAnal will return a DmdType looking like
+    DmdType fvs [x-dmd, y-dmd]
+because it has seen two lambdas, \x and \y. Since the length of the argument
+demands in a DmdSig gives the "threshold" for applying the signature
+(see Note [DmdSig: demand signatures, and demand-sig arity] in GHC.Types.Demand)
+we must trim that DmdType to just
+    DmdSig (DmdTypte fvs [x-dmd])
+when making that DmdType into the DmdSig for f.  This trimming is the job of
+`mkDmdSigForArity`.
+
+Alternative.  An alternative would be be to ensure that if
+    (dmd_ty, e') = dmdAnal env subdmd e
+then the length dmds in dmd_ty is always less than (or maybe equal to?) the
+call-depth of subdmd.  To do that we'd need to adjust the Lam case of dmdAnal.
+Probably not hard, but a job for another day; see discussion on !12873, #23113,
+and #21392.
+
 Note [idArity varies independently of dmdTypeDepth]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 In general, an Id `f` has two independently varying attributes:
@@ -1932,30 +1996,35 @@ positiveTopBudget (MkB n _) = n >= 0
 finaliseArgBoxities :: AnalEnv -> Id -> Arity
                     -> [Demand] -> Divergence
                     -> CoreExpr -> ([Demand], CoreExpr)
-finaliseArgBoxities env fn threshold_arity rhs_dmds div rhs
+-- POSTCONDITION:
+-- If:    (dmds', rhs') = finaliseArgBoxitities ... dmds .. rhs
+-- Then:
+--     dmds' is the same as dmds (including length), except for boxity info
+--     rhs'  is the same as rhs, except for dmd info on lambda binders
+-- NB: For join points, length dmds might be greater than ww_arity
+finaliseArgBoxities env fn ww_arity arg_dmds div rhs
 
   -- Check for an OPAQUE function: see Note [OPAQUE pragma]
   -- In that case, trim off all boxity info from argument demands
   -- and demand info on lambda binders
   -- See Note [The OPAQUE pragma and avoiding the reboxing of arguments]
   | isOpaquePragma (idInlinePragma fn)
-  , let trimmed_rhs_dmds = map trimBoxity rhs_dmds
-  = (trimmed_rhs_dmds, set_lam_dmds trimmed_rhs_dmds rhs)
+  , let trimmed_arg_dmds = map trimBoxity arg_dmds
+  = (trimmed_arg_dmds, set_lam_dmds trimmed_arg_dmds rhs)
 
   -- Check that we have enough visible binders to match the
-  -- threshold arity; if not, we won't do worker/wrapper
+  -- ww arity; if not, we won't do worker/wrapper
   -- This happens if we have simply  {f = g} or a PAP {f = h 13}
   -- we simply want to give f the same demand signature as g
   -- How can such bindings arise?  Perhaps from {-# NOLINE[2] f #-},
   -- or if the call to `f` is currently not-applied (map f xs).
   -- It's a bit of a corner case.  Anyway for now we pass on the
   -- unadulterated demands from the RHS, without any boxity trimming.
-  | threshold_arity > count isId bndrs
-  = (rhs_dmds, rhs)
+  | ww_arity > count isId bndrs
+  = (arg_dmds, rhs)
 
   -- The normal case
-  | otherwise -- NB: threshold_arity might be less than
-              -- manifest arity for join points
+  | otherwise
   = -- pprTrace "finaliseArgBoxities" (
     --   vcat [text "function:" <+> ppr fn
     --        , text "max" <+> ppr max_wkr_args
@@ -1966,23 +2035,29 @@ finaliseArgBoxities env fn threshold_arity rhs_dmds div rhs
     -- of the function, both because that's kosher, and because CPR analysis
     -- uses the info on the binders directly.
   where
-    opts            = ae_opts env
-    (bndrs, _body)  = collectBinders rhs
-    unarise_arity   = sum [ unariseArity (idType b) | b <- bndrs, isId b ]
-    max_wkr_args    = dmd_max_worker_args opts `max` unarise_arity
-                      -- This is the budget initialisation step of
-                      -- Note [Worker argument budget]
-
-    -- This is the key line, which uses almost-circular programming
-    -- The remaining budget from one layer becomes the initial
-    -- budget for the next layer down.  See Note [Worker argument budget]
-    (remaining_budget, arg_dmds') = go_args (MkB max_wkr_args remaining_budget) arg_triples
+    opts           = ae_opts env
+    (bndrs, _body) = collectBinders rhs
+       -- NB: in the interesting code path, count isId bndrs >= ww_arity
 
     arg_triples :: [(Type, StrictnessMark, Demand)]
-    arg_triples = take threshold_arity $
+    arg_triples = take ww_arity $
                   [ (idType bndr, NotMarkedStrict, get_dmd bndr)
                   | bndr <- bndrs, isRuntimeVar bndr ]
 
+    arg_dmds' = ww_arg_dmds ++ map trimBoxity (drop ww_arity arg_dmds)
+                -- If ww_arity < length arg_dmds, the leftover ones
+                -- will not be w/w'd, so trimBoxity them
+                -- See Note [Worker/wrapper arity and join points] point (3)
+
+    -- This is the key line, which uses almost-circular programming
+    -- The remaining budget from one layer becomes the initial
+    -- budget for the next layer down.  See Note [Worker argument budget]
+    (remaining_budget, ww_arg_dmds) = go_args (MkB max_wkr_args remaining_budget) arg_triples
+    unarise_arity = sum [ unariseArity (idType b) | b <- bndrs, isId b ]
+    max_wkr_args  = dmd_max_worker_args opts `max` unarise_arity
+                    -- This is the budget initialisation step of
+                    -- Note [Worker argument budget]
+
     get_dmd :: Id -> Demand
     get_dmd bndr
       | is_bot_fn = unboxDeeplyDmd dmd -- See Note [Boxity for bottoming functions],


=====================================
compiler/GHC/Core/Opt/WorkWrap.hs
=====================================
@@ -758,11 +758,11 @@ by LitRubbish (see Note [Drop absent bindings]) so there is no great harm.
 ---------------------
 splitFun :: WwOpts -> Id -> CoreExpr -> UniqSM [(Id, CoreExpr)]
 splitFun ww_opts fn_id rhs
-  | Just (arg_vars, body) <- collectNValBinders_maybe (length wrap_dmds) rhs
+  | Just (arg_vars, body) <- collectNValBinders_maybe ww_arity rhs
   = warnPprTrace (not (wrap_dmds `lengthIs` (arityInfo fn_info)))
                  "splitFun"
                  (ppr fn_id <+> (ppr wrap_dmds $$ ppr cpr)) $
-    do { mb_stuff <- mkWwBodies ww_opts fn_id arg_vars (exprType body) wrap_dmds cpr
+    do { mb_stuff <- mkWwBodies ww_opts fn_id ww_arity arg_vars (exprType body) wrap_dmds cpr
        ; case mb_stuff of
             Nothing -> -- No useful wrapper; leave the binding alone
                        return [(fn_id, rhs)]
@@ -794,8 +794,10 @@ splitFun ww_opts fn_id rhs
   = return [(fn_id, rhs)]
 
   where
-    uf_opts = so_uf_opts (wo_simple_opts ww_opts)
-    fn_info = idInfo fn_id
+    uf_opts  = so_uf_opts (wo_simple_opts ww_opts)
+    fn_info  = idInfo fn_id
+    ww_arity = workWrapArity fn_id rhs
+      -- workWrapArity: see (4) in Note [Worker/wrapper arity and join points] in DmdAnal
 
     (wrap_dmds, div) = splitDmdSig (dmdSigInfo fn_info)
 


=====================================
compiler/GHC/Core/Opt/WorkWrap/Utils.hs
=====================================
@@ -15,7 +15,7 @@ module GHC.Core.Opt.WorkWrap.Utils
    , findTypeShape, IsRecDataConResult(..), isRecDataCon
    , mkAbsentFiller
    , isWorkerSmallEnough, dubiousDataConInstArgTys
-   , boringSplit , usefulSplit
+   , boringSplit, usefulSplit, workWrapArity
    )
 where
 
@@ -159,6 +159,7 @@ nop_fn body = body
 
 mkWwBodies :: WwOpts
            -> Id             -- ^ The original function
+           -> Arity          -- ^ Worker/wrapper arity
            -> [Var]          -- ^ Manifest args of original function
            -> Type           -- ^ Result type of the original function,
                              --   after being stripped of args
@@ -205,8 +206,8 @@ mkWwBodies :: WwOpts
 -- and beta-redexes]), which allows us to apply the same split to function body
 -- and its unfolding(s) alike.
 --
-mkWwBodies opts fun_id arg_vars res_ty demands res_cpr
-  = do  { massertPpr (filter isId arg_vars `equalLength` demands)
+mkWwBodies opts fun_id ww_arity arg_vars res_ty demands res_cpr
+  = do  { massertPpr arity_ok
                      (text "wrong wrapper arity" $$ ppr fun_id $$ ppr arg_vars $$ ppr res_ty $$ ppr demands)
 
         -- Clone and prepare arg_vars of the original fun RHS
@@ -271,6 +272,10 @@ mkWwBodies opts fun_id arg_vars res_ty demands res_cpr
       | otherwise
       = False
 
+    n_dmds = length demands
+    arity_ok | isJoinId fun_id = ww_arity <= n_dmds
+             | otherwise       = ww_arity == n_dmds
+
 -- | Version of 'GHC.Core.mkApps' that does beta reduction on-the-fly.
 -- PRECONDITION: The arg expressions are not free in any of the lambdas binders.
 mkAppsBeta :: CoreExpr -> [CoreArg] -> CoreExpr
@@ -288,6 +293,13 @@ isWorkerSmallEnough max_worker_args old_n_args vars
     -- Also if the function took 82 arguments before (old_n_args), it's fine if
     -- it takes <= 82 arguments afterwards.
 
+workWrapArity :: Id -> CoreExpr -> Arity
+-- See Note [Worker/wrapper arity and join points] in DmdAnal
+workWrapArity fn rhs
+  = case idJoinPointHood fn of
+      JoinPoint join_arity -> count isId $ fst $ collectNBinders join_arity rhs
+      NotJoinPoint         -> idArity fn
+
 {-
 Note [Always do CPR w/w]
 ~~~~~~~~~~~~~~~~~~~~~~~~


=====================================
compiler/GHC/Types/Demand.hs
=====================================
@@ -1,3 +1,4 @@
+
 {-# LANGUAGE ViewPatterns #-}
 {-# LANGUAGE BinaryLiterals #-}
 {-# LANGUAGE PatternSynonyms #-}
@@ -38,7 +39,7 @@ module GHC.Types.Demand (
     lazyApply1Dmd, lazyApply2Dmd, strictOnceApply1Dmd, strictManyApply1Dmd,
     -- ** Other @Demand@ operations
     oneifyCard, oneifyDmd, strictifyDmd, strictifyDictDmd, lazifyDmd,
-    peelCallDmd, peelManyCalls, mkCalledOnceDmd, mkCalledOnceDmds,
+    peelCallDmd, peelManyCalls, mkCalledOnceDmd, mkCalledOnceDmds, strictCallArity,
     mkWorkerDemand, subDemandIfEvaluated,
     -- ** Extracting one-shot information
     callCards, argOneShots, argsOneShots, saturatedByOneShots,
@@ -1037,6 +1038,12 @@ peelManyCalls k sd = go k C_11 sd
     go _ _  _                          = (topCard, topSubDmd)
 {-# INLINE peelManyCalls #-} -- so that the pair cancels away in a `fst _` context
 
+strictCallArity :: SubDemand -> Arity
+strictCallArity sd = go 0 sd
+  where
+    go n (Call card sd) | isStrict card = go (n+1) sd
+    go n _                              = n
+
 -- | Extract the 'SubDemand' of a 'Demand'.
 -- PRECONDITION: The SubDemand must be used in a context where the expression
 -- denoted by the Demand is under evaluation.
@@ -2073,6 +2080,12 @@ body of the function.
 *                                                                      *
 ************************************************************************
 
+Note [DmdSig: demand signatures, and demand-sig arity]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+See also
+  * Note [Demand signatures semantically]
+  * Note [Understanding DmdType and DmdSig]
+
 In a let-bound Id we record its demand signature.
 In principle, this demand signature is a demand transformer, mapping
 a demand on the Id into a DmdType, which gives
@@ -2083,20 +2096,22 @@ a demand on the Id into a DmdType, which gives
 
 However, in fact we store in the Id an extremely emasculated demand
 transformer, namely
-
-                a single DmdType
+        a single DmdType
 (Nevertheless we dignify DmdSig as a distinct type.)
 
-This DmdType gives the demands unleashed by the Id when it is applied
-to as many arguments as are given in by the arg demands in the DmdType.
+The DmdSig for an Id is a semantic thing.  Suppose a function `f` has a DmdSig of
+  DmdSig (DmdType (fv_dmds,res) [d1..dn])
+Here `n` is called the "demand-sig arity" of the DmdSig.  The signature means:
+  * If you apply `f` to n arguments (the demand-sig-arity)
+  * then you can unleash demands d1..dn on the arguments
+  * and demands fv_dmds on the free variables.
 Also see Note [Demand type Divergence] for the meaning of a Divergence in a
-strictness signature.
+demand signature.
 
-If an Id is applied to less arguments than its arity, it means that
-the demand on the function at a call site is weaker than the vanilla
-call demand, used for signature inference. Therefore we place a top
-demand on all arguments. Otherwise, the demand is specified by Id's
-signature.
+If `f` is applied to fewer value arguments than its demand-sig arity, it means
+that the demand on the function at a call site is weaker than the vanilla call
+demand, used for signature inference. Therefore we place a top demand on all
+arguments.
 
 For example, the demand transformer described by the demand signature
         DmdSig (DmdType {x -> <1L>} <A><1P(L,L)>)
@@ -2107,6 +2122,61 @@ and 1P(L,L) on the second.
 If this same function is applied to one arg, all we can say is that it
 uses x with 1L, and its arg with demand 1P(L,L).
 
+Note [Demand signatures semantically]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Demand analysis interprets expressions in the abstract domain of demand
+transformers. Given a (sub-)demand that denotes the evaluation context, the
+abstract transformer of an expression gives us back a demand type denoting
+how other things (like arguments and free vars) were used when the expression
+was evaluated. Here's an example:
+
+  f x y =
+    if x + expensive
+      then \z -> z + y * ...
+      else \z -> z * ...
+
+The abstract transformer (let's call it F_e) of the if expression (let's
+call it e) would transform an incoming (undersaturated!) head sub-demand A
+into a demand type like {x-><1L>,y-><L>}<L>. In pictures:
+
+     SubDemand ---F_e---> DmdType
+     <A>                  {x-><1L>,y-><L>}<L>
+
+Let's assume that the demand transformers we compute for an expression are
+correct wrt. to some concrete semantics for Core. How do demand signatures fit
+in? They are strange beasts, given that they come with strict rules when to
+it's sound to unleash them.
+
+Fortunately, we can formalise the rules with Galois connections. Consider
+f's strictness signature, {}<1L><L>. It's a single-point approximation of
+the actual abstract transformer of f's RHS for arity 2. So, what happens is that
+we abstract *once more* from the abstract domain we already are in, replacing
+the incoming Demand by a simple lattice with two elements denoting incoming
+arity: A_2 = {<2, >=2} (where '<2' is the top element and >=2 the bottom
+element). Here's the diagram:
+
+     A_2 -----f_f----> DmdType
+      ^                   |
+      | α               γ |
+      |                   v
+  SubDemand --F_f----> DmdType
+
+With
+  α(C(1,C(1,_))) = >=2
+  α(_)         =  <2
+  γ(ty)        =  ty
+and F_f being the abstract transformer of f's RHS and f_f being the abstracted
+abstract transformer computable from our demand signature simply by
+
+  f_f(>=2) = {}<1L><L>
+  f_f(<2)  = multDmdType C_0N {}<1L><L>
+
+where multDmdType makes a proper top element out of the given demand type.
+
+In practice, the A_n domain is not just a simple Bool, but a Card, which is
+exactly the Card with which we have to multDmdType. The Card for arity n
+is computed by calling @peelManyCalls n@, which corresponds to α above.
+
 Note [Understanding DmdType and DmdSig]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Demand types are sound approximations of an expression's semantics relative to
@@ -2119,10 +2189,10 @@ Here is a table with demand types resulting from different incoming demands we
 put that expression under. Note the monotonicity; a stronger incoming demand
 yields a more precise demand type:
 
-    incoming demand   |  demand type
+    incoming sub-demand   |  demand type
     --------------------------------
-    1A                  |  <L><L>{}
-    C(1,C(1,L))           |  <1P(L)><L>{}
+    P(A)                  |  <L><L>{}
+    C(1,C(1,P(L)))        |  <1P(L)><L>{}
     C(1,C(1,1P(1P(L),A))) |  <1P(A)><A>{}
 
 Note that in the first example, the depth of the demand type was *higher* than
@@ -2143,11 +2213,11 @@ being a newtype wrapper around DmdType, it actually encodes two things:
   * A demand type that is sound to unleash when the minimum arity requirement is
     met.
 
-Here comes the subtle part: The threshold is encoded in the wrapped demand
-type's depth! So in mkDmdSigForArity we make sure to trim the list of
-argument demands to the given threshold arity. Call sites will make sure that
-this corresponds to the arity of the call demand that elicited the wrapped
-demand type. See also Note [What are demand signatures?].
+Here comes the subtle part: The threshold is encoded in the demand-sig arity!
+So in mkDmdSigForArity we make sure to trim the list of argument demands to the
+given threshold arity. Call sites will make sure that this corresponds to the
+arity of the call demand that elicited the wrapped demand type. See also
+Note [DmdSig: demand signatures, and demand-sig arity]
 -}
 
 -- | The depth of the wrapped 'DmdType' encodes the arity at which it is safe
@@ -2160,9 +2230,11 @@ newtype DmdSig
 -- | Turns a 'DmdType' computed for the particular 'Arity' into a 'DmdSig'
 -- unleashable at that arity. See Note [Understanding DmdType and DmdSig].
 mkDmdSigForArity :: Arity -> DmdType -> DmdSig
-mkDmdSigForArity arity dmd_ty@(DmdType fvs args)
-  | arity < dmdTypeDepth dmd_ty = DmdSig $ DmdType fvs (take arity args)
-  | otherwise                   = DmdSig (etaExpandDmdType arity dmd_ty)
+mkDmdSigForArity threshold_arity dmd_ty@(DmdType fvs args)
+  | threshold_arity < dmdTypeDepth dmd_ty
+  = DmdSig $ DmdType (fvs { de_div = topDiv }) (take threshold_arity args)
+  | otherwise
+  = DmdSig (etaExpandDmdType threshold_arity dmd_ty)
 
 mkClosedDmdSig :: [Demand] -> Divergence -> DmdSig
 mkClosedDmdSig ds div = mkDmdSigForArity (length ds) (DmdType (mkEmptyDmdEnv div) ds)
@@ -2307,7 +2379,7 @@ etaConvertDmdSig arity (DmdSig dmd_ty)
 -- whether it diverges.
 --
 -- See Note [Understanding DmdType and DmdSig]
--- and Note [What are demand signatures?].
+-- and Note [DmdSig: demand signatures, and demand-sig arity]
 type DmdTransformer = SubDemand -> DmdType
 
 -- | Extrapolate a demand signature ('DmdSig') into a 'DmdTransformer'.
@@ -2318,7 +2390,7 @@ dmdTransformSig :: DmdSig -> DmdTransformer
 dmdTransformSig (DmdSig dmd_ty@(DmdType _ arg_ds)) sd
   = multDmdType (fst $ peelManyCalls (length arg_ds) sd) dmd_ty
     -- see Note [Demands from unsaturated function calls]
-    -- and Note [What are demand signatures?]
+    -- and Note [DmdSig: demand signatures, and demand-sig arity]
 
 -- | A special 'DmdTransformer' for data constructors that feeds product
 -- demands into the constructor arguments.
@@ -2356,61 +2428,6 @@ dmdTransformDictSelSig (DmdSig (DmdType _ [_ :* prod])) call_sd
 dmdTransformDictSelSig sig sd = pprPanic "dmdTransformDictSelSig: no args" (ppr sig $$ ppr sd)
 
 {-
-Note [What are demand signatures?]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Demand analysis interprets expressions in the abstract domain of demand
-transformers. Given a (sub-)demand that denotes the evaluation context, the
-abstract transformer of an expression gives us back a demand type denoting
-how other things (like arguments and free vars) were used when the expression
-was evaluated. Here's an example:
-
-  f x y =
-    if x + expensive
-      then \z -> z + y * ...
-      else \z -> z * ...
-
-The abstract transformer (let's call it F_e) of the if expression (let's
-call it e) would transform an incoming (undersaturated!) head demand 1A into
-a demand type like {x-><1L>,y-><L>}<L>. In pictures:
-
-     Demand ---F_e---> DmdType
-     <1A>              {x-><1L>,y-><L>}<L>
-
-Let's assume that the demand transformers we compute for an expression are
-correct wrt. to some concrete semantics for Core. How do demand signatures fit
-in? They are strange beasts, given that they come with strict rules when to
-it's sound to unleash them.
-
-Fortunately, we can formalise the rules with Galois connections. Consider
-f's strictness signature, {}<1L><L>. It's a single-point approximation of
-the actual abstract transformer of f's RHS for arity 2. So, what happens is that
-we abstract *once more* from the abstract domain we already are in, replacing
-the incoming Demand by a simple lattice with two elements denoting incoming
-arity: A_2 = {<2, >=2} (where '<2' is the top element and >=2 the bottom
-element). Here's the diagram:
-
-     A_2 -----f_f----> DmdType
-      ^                   |
-      | α               γ |
-      |                   v
-  SubDemand --F_f----> DmdType
-
-With
-  α(C(1,C(1,_))) = >=2
-  α(_)         =  <2
-  γ(ty)        =  ty
-and F_f being the abstract transformer of f's RHS and f_f being the abstracted
-abstract transformer computable from our demand signature simply by
-
-  f_f(>=2) = {}<1L><L>
-  f_f(<2)  = multDmdType C_0N {}<1L><L>
-
-where multDmdType makes a proper top element out of the given demand type.
-
-In practice, the A_n domain is not just a simple Bool, but a Card, which is
-exactly the Card with which we have to multDmdType. The Card for arity n
-is computed by calling @peelManyCalls n@, which corresponds to α above.
-
 Note [Demand transformer for a dictionary selector]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Suppose we have a superclass selector 'sc_sel' and a class method


=====================================
testsuite/tests/dmdanal/should_compile/T24623.hs
=====================================
@@ -0,0 +1,16 @@
+-- This gave a Lint error in HEAD (Jun 24)
+module T24623 where
+
+{-# NOINLINE app #-}
+app :: Int -> (Int -> (Int,Int)) -> (Int,Int)
+app x f = if x>0 then f x else (0,0)
+
+foo :: String -> Bool -> Bool -> Int -> (Int,Int)
+foo s b b2 y = app y (let {-# NOINLINE j #-}
+                          j :: Int -> (Int,Int)
+                          j = \z -> error s
+                      in case b of
+                         True -> j
+                         False -> case b2 of
+                                    True -> \x -> (x-1, x+1)
+                                    False -> j)


=====================================
testsuite/tests/dmdanal/should_compile/all.T
=====================================
@@ -97,3 +97,4 @@ test('T22388', [ grep_errmsg(r'^\S+\$w\S+') ], compile, ['-dsuppress-uniques -dd
 # T22997: Just a panic that should not happen
 test('T22997', normal, compile, [''])
 test('T23398', normal, compile, ['-dsuppress-uniques -ddump-simpl -dno-typeable-binds'])
+test('T24623', normal, compile, [''])



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/9a757a27ee95afa092899b404cb95881c7578202

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/9a757a27ee95afa092899b404cb95881c7578202
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240620/4f75ab29/attachment-0001.html>