[Git][ghc/ghc][wip/hnf-spec] 3 commits: Make exprIsHNF imply exprOkForSpeculation

Sebastian Graf (@sgraf812) gitlab at gitlab.haskell.org
Mon Nov 11 11:39:39 UTC 2024



Sebastian Graf pushed to branch wip/hnf-spec at Glasgow Haskell Compiler / GHC


Commits:
35076bb4 by Sebastian Graf at 2024-11-11T12:38:08+01:00
Make exprIsHNF imply exprOkForSpeculation

Fixes #23256 because SetLevels and CSE no longer do a reverse binder
swap.
Furthermore, we no longer need to check for exprIsHNF when
exprOkForSpeculation fails.

Also fixes #25423.

I had a brief look at `-v` in T5642 (which regresses ghc/alloc by 2-3%).
It is essentially a benchmark on `Generic` code and uses `exprOkForSpeculation`
quite a lot, which now maintains a substitution, explaining the regression.
No changes to any intermediate programs.

I also had a look at `-v` in T9961 (which regresses ghc/alloc by 1-2%)

In T16577 (regresses ghc/alloc by 2-3%) we end up with a marginally smaller
program, but the benefit is offset by an additional Simplifier iteration.

Metric Decrease:
    T9872b_defer
    T9872d
Metric Increase:
    T5642
    T9961
    T16577

- - - - -
e2838bff by Sebastian Graf at 2024-11-11T12:38:08+01:00
SetLevels: Use `exprOkForSpeculation` instead of `exprIsHNF`

- - - - -
be75e30f by Sebastian Graf at 2024-11-11T12:38:08+01:00
Prep: Fix outcommented debug pretty-printing logic

- - - - -


14 changed files:

- compiler/GHC/Core/Lint.hs
- compiler/GHC/Core/Opt/CSE.hs
- compiler/GHC/Core/Opt/CprAnal.hs
- compiler/GHC/Core/Opt/SetLevels.hs
- compiler/GHC/Core/Opt/WorkWrap/Utils.hs
- compiler/GHC/Core/Subst.hs
- compiler/GHC/Core/TyCo/Subst.hs
- compiler/GHC/Core/Type.hs
- compiler/GHC/Core/Utils.hs
- compiler/GHC/CoreToStg/Prep.hs
- compiler/GHC/Utils/Misc.hs
- + testsuite/tests/simplCore/should_compile/T25423.hs
- + testsuite/tests/simplCore/should_compile/T25423.stderr
- testsuite/tests/simplCore/should_compile/all.T


Changes:

=====================================
compiler/GHC/Core/Lint.hs
=====================================
@@ -592,7 +592,7 @@ lintLetBind top_lvl rec_flag binder rhs rhs_ty
                || (isNonRec rec_flag && exprOkForSpeculation rhs)
                || isDataConWorkId binder || isDataConWrapId binder -- until #17521 is fixed
                || exprIsTickedString rhs)
-           (badBndrTyMsg binder (text "unlifted"))
+           (badUnliftedBndrTyMsg binder rhs)
 
         -- Check that if the binder is at the top level and has type Addr#,
         -- that it is a string literal.
@@ -3741,10 +3741,11 @@ mkRhsMsg binder what ty
      hsep [text "Binder's type:", ppr (idType binder)],
      hsep [text "Rhs type:", ppr ty]]
 
-badBndrTyMsg :: Id -> SDoc -> SDoc
-badBndrTyMsg binder what
-  = vcat [ text "The type of this binder is" <+> what <> colon <+> ppr binder
-         , text "Binder's type:" <+> ppr (idType binder) ]
+badUnliftedBndrTyMsg :: Id -> CoreExpr -> SDoc
+badUnliftedBndrTyMsg bndr rhs
+  = vcat [ text "The let binder" <+> ppr bndr <+> text "does not satisfy the let-can-float invariant"
+         , text "Type:" <+> ppr (idType bndr)
+         , text "RHS:" <+> ppr rhs ]
 
 mkNonTopExportedMsg :: Id -> SDoc
 mkNonTopExportedMsg binder


=====================================
compiler/GHC/Core/Opt/CSE.hs
=====================================
@@ -106,7 +106,11 @@ Let-bindings have two cases, implemented by extendCSEnvWithBinding.
       the substitution so that we can CSE the binding for y2.
 
     - Second, we use extendCSEnvWithBinding for case expression scrutinees too;
-      see Note [CSE for case expressions]
+      see Note [CSE for case expressions].
+      Although we must be careful not to substitute an unevaluated variable for
+      an evaluated case binder, because that may degrade ok-for-spec-ness and
+      hence cause violations of the Note [Core let-can-float invariant].
+      Hence we only substitute variables of same evaluatedness.
 
 * EXTEND THE REVERSE MAPPING: applies in all other cases
 
@@ -497,8 +501,11 @@ extendCSEnvWithBinding env in_id out_id rhs' cse_done
        -- analysis
 
     -- Should we use SUBSTITUTE or EXTEND?
-    -- See Note [CSE for bindings]
-    use_subst | Var {} <- rhs' = True
+    -- See Note [CSE for bindings], in particular the caveat about evaluatedness
+    use_subst | Var v <- rhs'
+              ,    isEvaldUnfolding (idUnfolding v)     -- NB: Must have same eval'ness
+                == isEvaldUnfolding (idUnfolding in_id) --     See Note [CSE for bindings]
+              = True
               | otherwise      = False
 
 -- | Given a binder `let x = e`, this function


=====================================
compiler/GHC/Core/Opt/CprAnal.hs
=====================================
@@ -298,14 +298,8 @@ data TermFlag -- Better than using a Bool
 exprTerminates :: CoreExpr -> TermFlag
 -- ^ A /very/ simple termination analysis.
 exprTerminates e
-  | exprIsHNF e            = Terminates
   | exprOkForSpeculation e = Terminates
   | otherwise              = MightDiverge
-  -- Annoyingly, we have to check both for HNF and ok-for-spec.
-  --   * `I# (x# *# 2#)` is ok-for-spec, but not in HNF. Still worth CPR'ing!
-  --   * `lvl` is an HNF if its unfolding is evaluated
-  --     (perhaps `lvl = I# 0#` at top-level). But, tiresomely, it is never
-  --     ok-for-spec due to Note [exprOkForSpeculation and evaluated variables].
 
 cprAnalApp :: AnalEnv -> CoreExpr -> [(CprType, CoreArg)] -> (CprType, CoreExpr)
 -- Main function that takes care of /nested/ CPR. See Note [Nested CPR]


=====================================
compiler/GHC/Core/Opt/SetLevels.hs
=====================================
@@ -38,39 +38,58 @@
    We do *not* clone top-level bindings, because some of them must not change,
    but we *do* clone bindings that are heading for the top level
 
-4. Note [Binder-swap during float-out]
-   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+4. Note [Duplicate evals into float]
+   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    In the expression
-        case x of wild { p -> ...wild... }
-   we substitute x for wild in the RHS of the case alternatives:
-        case x of wild { p -> ...x... }
-   This means that a sub-expression involving x is not "trapped" inside the RHS
-   (i.e. it can now be floated out, whereas if it mentioned wild it could not).
-   And it's not inconvenient because we already have a substitution.
+        case x of wild { p -> ...f wild... }
+   the sub-expression `f wild` appears "trapped" inside the Case.
+   (i.e. it cannot be floated out because it mentions wild).
+   In order to "free" it, we float out the expression
+   `case x of wild { __DEFAULT -> f wild }`, thus duplicating the eval.
 
    For example, consider:
 
       f x = letrec go y = case x of z { (a,b) -> ...(expensive z)... }
               in ...
 
-   If we do the reverse binder-swap we get
+   If we dup the eval we get
 
-      f x = letrec go y = case x of z { (a,b) -> ...(expensive x)... }
+      f x = letrec go y = case x of z { (a,b) -> ...(case x of z { __DEFAULT -> expensive z })... }
               in ...
 
    and now we can float out:
 
-      f x = let t = expensive x
-              in letrec go y = case x of z { (a,b) -> ...(t)... }
+      f x = let t = case x of wild { __DEFAULT -> expensive wild }
+              in letrec go y = case x of wild { (a,b) -> ...(t)... }
               in ...
 
-   Now (expensive x) is computed once, rather than once each time around the 'go' loop.
-
-   Note that this is EXACTLY BACKWARDS from the what the simplifier does.
-   The simplifier tries to get rid of occurrences of x, in favour of wild,
-   in the hope that there will only be one remaining occurrence of x, namely
-   the scrutinee of the case, and we can inline it.
-
+   Now (expensive wild) is computed once, rather than once each time around the 'go' loop.
+
+   Implementation:
+
+     1. When entering an eval `case x of wild { __DEFAULT -> ...}`, remember
+        that `wild` is the result of evaluating `x`.
+
+     2. After figuring out the variables over which the expression needs to be
+        abstracted in order to float (abstractVars), go over these variables
+        once more to see if `wild` occurs among them. If so, and if the
+        scrutinee variable `x` is bound at a lower level than the destination
+        level of the float, then remove `wild` as an abs_var and record the
+        obligation to eval `x` into `wild` instead (i.e., add a new `eval_vars`
+        pairing). This is implemented in eliminateAbsCaseBndrs.
+        Of course, it only makes sense to duplicate the eval if we end up
+        with a thunk in doing so; hence eliminateAbsCaseBndrs checks whether
+        the abs_vars become empty after duplicating evals.
+
+
+   Historically, we implemented a reverse binder-swap
+   (see Note [The binder-swap substitution] for the forward direction),
+   which would float `expensive x` and thus drop the eval on `x`. This caused
+   trouble for expressions such as
+     case x of wild { __DEFAULT -> let y = dataToTagLarge# wild in ... }
+   When the case binder `wild` is swapped for the unevaluated scrutinee
+   `x`, `dataToTagLarge# x` no longer is ok-for-spec and thus the unlifted let
+   violates the let-can-float invariant. Hence we duplicate the eval nowadays.
 -}
 
 module GHC.Core.Opt.SetLevels (
@@ -127,6 +146,7 @@ import GHC.Utils.Outputable
 import GHC.Utils.Panic
 
 import Data.Maybe
+import Data.Either
 
 {-
 ************************************************************************
@@ -445,21 +465,21 @@ lvlCase :: LevelEnv             -- Level of in-scope names/tyvars
 lvlCase env scrut_fvs scrut' case_bndr ty alts
   -- See Note [Floating single-alternative cases]
   | [AnnAlt con@(DataAlt {}) bs body] <- alts
-  , exprIsHNF (deTagExpr scrut')  -- See Note [Check the output scrutinee for exprIsHNF]
-  , not (isTopLvl dest_lvl)       -- Can't have top-level cases
-  , not (floatTopLvlOnly env)     -- Can float anywhere
-  , ManyTy <- idMult case_bndr     -- See Note [Floating linear case]
+  , exprOkForSpeculation (deTagExpr scrut')  -- See Note [Check the output scrutinee for exprOkForSpeculation]
+  , not (isTopLvl dest_lvl)                  -- Can't have top-level cases
+  , not (floatTopLvlOnly env)                -- Can float anywhere
+  , ManyTy <- idMult case_bndr               -- See Note [Floating linear case]
   =     -- Always float the case if possible
         -- Unlike lets we don't insist that it escapes a value lambda
     do { (env1, (case_bndr' : bs')) <- cloneCaseBndrs env dest_lvl (case_bndr : bs)
-       ; let rhs_env = extendCaseBndrEnv env1 case_bndr scrut'
+       ; let rhs_env = rememberEval env1 case_bndr' scrut'
        ; body' <- lvlMFE rhs_env True body
        ; let alt' = Alt con (map (stayPut dest_lvl) bs') body'
        ; return (Case scrut' (TB case_bndr' (FloatMe dest_lvl)) ty' [alt']) }
 
   | otherwise     -- Stays put
-  = do { let (alts_env1, [case_bndr']) = substAndLvlBndrs NonRecursive env incd_lvl [case_bndr]
-             alts_env = extendCaseBndrEnv alts_env1 case_bndr scrut'
+  = do { let (alts_env1, [case_bndr'@(TB case_bndrr _)]) = substAndLvlBndrs NonRecursive env incd_lvl [case_bndr]
+             alts_env = rememberEval alts_env1 case_bndrr scrut'
        ; alts' <- mapM (lvl_alt alts_env) alts
        ; return (Case scrut' case_bndr' ty' alts') }
   where
@@ -497,17 +517,8 @@ the inner loop.
 
 Things to note:
 
- * The test we perform is exprIsHNF, and /not/ exprOkForSpeculation.
-
-     - exprIsHNF catches the key case of an evaluated variable
-
-     - exprOkForSpeculation is /false/ of an evaluated variable;
-       See Note [exprOkForSpeculation and evaluated variables] in GHC.Core.Utils
-       So we'd actually miss the key case!
-
-     - Nothing is gained from the extra generality of exprOkForSpeculation
-       since we only consider floating a case whose single alternative
-       is a DataAlt   K a b -> rhs
+ * The test we perform is exprOkForSpeculation, because speculating the case is
+   exactly what we do.
 
  * We can't float a case to top level
 
@@ -557,8 +568,8 @@ needed to quantify over some of its free variables (e.g. z), resulting
 in shadowing and very confusing Core Lint failures.
 
 
-Note [Check the output scrutinee for exprIsHNF]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Note [Check the output scrutinee for exprOkForSpeculation]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Consider this:
   case x of y {
     A -> ....(case y of alts)....
@@ -572,10 +583,10 @@ evaluated), but the former is not -- and indeed we can't float the
 inner case out, at least not unless x is also evaluated at its binding
 site.  See #5453.
 
-That's why we apply exprIsHNF to scrut' and not to scrut.
+That's why we apply exprOkForSpeculation to scrut' and not to scrut.
 
 See Note [Floating single-alternative cases] for why
-we use exprIsHNF in the first place.
+we use exprOkForSpeculation in the first place.
 -}
 
 lvlNonTailMFE :: LevelEnv             -- Level of in-scope names/tyvars
@@ -632,7 +643,7 @@ lvlMFE env strict_ctxt ann_expr
   |  float_is_new_lam || exprIsTopLevelBindable expr expr_ty
          -- No wrapping needed if the type is lifted, or is a literal string
          -- or if we are wrapping it in one or more value lambdas
-  = do { expr1 <- lvlFloatRhs abs_vars dest_lvl rhs_env NonRecursive
+  = do { expr1 <- lvlFloatRhs abs_vars eval_vars dest_lvl rhs_env NonRecursive
                               is_bot_lam NotJoinPoint ann_expr
                   -- Treat the expr just like a right-hand side
        ; var <- newLvlVar expr1 NotJoinPoint is_mk_static
@@ -653,6 +664,7 @@ lvlMFE env strict_ctxt ann_expr
   = do { expr1 <- lvlExpr rhs_env ann_expr
        ; let l1r       = incMinorLvlFrom rhs_env
              float_rhs = mkLams abs_vars_w_lvls $
+                         flip (foldr (wrapEval l1r)) eval_vars $
                          Case expr1 (stayPut l1r ubx_bndr) box_ty
                              [Alt DEFAULT [] (App boxing_expr (Var ubx_bndr))]
 
@@ -678,7 +690,8 @@ lvlMFE env strict_ctxt ann_expr
                            -- See Note [Bottoming floats]
                            -- esp Bottoming floats (2)
     expr_ok_for_spec = exprOkForSpeculation expr
-    abs_vars = abstractVars dest_lvl env fvs
+    (abs_vars, eval_vars) = eliminateAbsCaseBndrs dest_lvl env $
+                            abstractVars dest_lvl env fvs
     dest_lvl = destLevel env fvs fvs_ty is_function is_bot_lam
                -- NB: is_bot_lam not is_bot; see (3) in
                --     Note [Bottoming floats]
@@ -701,16 +714,16 @@ lvlMFE env strict_ctxt ann_expr
     float_me = saves_work || saves_alloc || is_mk_static
 
     -- See Note [Saving work]
-    saves_work = escapes_value_lam        -- (a)
-                 && not (exprIsHNF expr)  -- (b)
-                 && not float_is_new_lam  -- (c)
+    saves_work = escapes_value_lam                   -- (a)
+                 && not (exprOkForSpeculation expr)  -- (b)
+                 && not float_is_new_lam             -- (c)
     escapes_value_lam = dest_lvl `ltMajLvl` (le_ctxt_lvl env)
 
     -- See Note [Saving allocation] and Note [Floating to the top]
     saves_alloc =  isTopLvl dest_lvl
                 && floatConsts env
                 && (   not strict_ctxt                     -- (a)
-                    || exprIsHNF expr                      -- (b)
+                    || exprOkForSpeculation expr           -- (b)
                     || (is_bot_lam && escapes_value_lam))  -- (c)
 
 hasFreeJoin :: LevelEnv -> DVarSet -> Bool
@@ -730,14 +743,14 @@ But see also Note [Saving allocation].
 
 So we definitely float an expression out if
 (a) It will escape a value lambda (escapes_value_lam)
-(b) The expression is not a head-normal form (exprIsHNF); see (SW1, SW2).
+(b) The expression is not a head-normal form (exprOkForSpeculation); see (SW1, SW2).
 (c) Floating does not require wrapping it in value lambdas (float_is_new_lam).
     See (SW3) below
 
 Wrinkles:
 
 (SW1) Concerning (b) I experimented with using `exprIsCheap` rather than
-      `exprIsHNF` but the latter seems better, according to nofib
+      `exprOkForSpeculation` but the latter seems better, according to nofib
       (`spectral/mate` got 10% worse with exprIsCheap).  It's really a bit of a
       heuristic.
 
@@ -755,7 +768,7 @@ Wrinkles:
       See !12410 for some data comparing the effect of omitting (b) altogether,
       This doesn't apply, though, if we float the thing to the top level; see
       Note [Floating to the top].  Bottom line (data from !12410): adding the
-      not.exprIsHNF test to `saves_work`:
+      not.exprOkForSpeculation test to `saves_work`:
        - Decreases compiler allocations by 0.5%
        - Occasionally decreases runtime allocation (T12996 -2.5%)
        - Slightly mixed effect on nofib: (puzzle -10%, mate -5%, cichelli +5%)
@@ -1233,7 +1246,7 @@ lvlBind env (AnnNonRec bndr rhs)
   -- Otherwise we are going to float
   | null abs_vars
   = do {  -- No type abstraction; clone existing binder
-         rhs' <- lvlFloatRhs [] dest_lvl env NonRecursive
+         rhs' <- lvlFloatRhs [] [] dest_lvl env NonRecursive
                              is_bot_lam NotJoinPoint rhs
        ; (env', [bndr']) <- cloneLetVars NonRecursive env dest_lvl [bndr]
        ; let bndr2 = annotateBotStr bndr' 0 mb_bot_str
@@ -1241,7 +1254,7 @@ lvlBind env (AnnNonRec bndr rhs)
 
   | otherwise
   = do {  -- Yes, type abstraction; create a new binder, extend substitution, etc
-         rhs' <- lvlFloatRhs abs_vars dest_lvl env NonRecursive
+         rhs' <- lvlFloatRhs abs_vars eval_vars dest_lvl env NonRecursive
                              is_bot_lam NotJoinPoint rhs
        ; (env', [bndr']) <- newPolyBndrs dest_lvl env abs_vars [bndr]
        ; let bndr2 = annotateBotStr bndr' n_extra mb_bot_str
@@ -1252,7 +1265,8 @@ lvlBind env (AnnNonRec bndr rhs)
     ty_fvs     = tyCoVarsOfType bndr_ty
     rhs_fvs    = freeVarsOf rhs
     bind_fvs   = rhs_fvs `unionDVarSet` dIdFreeVars bndr
-    abs_vars   = abstractVars dest_lvl env bind_fvs
+    (abs_vars, eval_vars) = eliminateAbsCaseBndrs dest_lvl env $
+                            abstractVars dest_lvl env bind_fvs
     dest_lvl   = destLevel env bind_fvs ty_fvs (isFunction rhs) is_bot_lam
 
     deann_rhs  = deAnnotate rhs
@@ -1332,9 +1346,8 @@ lvlBind env (AnnRec pairs)
                       -- function in a Rec, and we don't much care what
                       -- happens to it.  False is simple!
 
-    do_rhs env (_,rhs) = lvlFloatRhs abs_vars dest_lvl env Recursive
-                                     is_bot NotJoinPoint
-                                     rhs
+    do_rhs env (_,rhs) = lvlFloatRhs abs_vars eval_vars dest_lvl env Recursive
+                                     is_bot NotJoinPoint rhs
 
         -- Finding the free vars of the binding group is annoying
     bind_fvs = ((unionDVarSets [ freeVarsOf rhs | (_, rhs) <- pairs])
@@ -1346,7 +1359,8 @@ lvlBind env (AnnRec pairs)
 
     ty_fvs   = foldr (unionVarSet . tyCoVarsOfType . idType) emptyVarSet bndrs
     dest_lvl = destLevel env bind_fvs ty_fvs is_fun is_bot
-    abs_vars = abstractVars dest_lvl env bind_fvs
+    (abs_vars, eval_vars) = eliminateAbsCaseBndrs dest_lvl env $
+                            abstractVars dest_lvl env bind_fvs
 
     is_top_bindable = not (any (mightBeUnliftedType . idType) bndrs)
        -- This mightBeUnliftedType stuff is the same test as in the non-rec case
@@ -1396,21 +1410,28 @@ lvlRhs :: LevelEnv
        -> CoreExprWithFVs
        -> LvlM LevelledExpr
 lvlRhs env rec_flag is_bot mb_join_arity expr
-  = lvlFloatRhs [] (le_ctxt_lvl env) env
+  = lvlFloatRhs [] [] (le_ctxt_lvl env) env
                 rec_flag is_bot mb_join_arity expr
 
-lvlFloatRhs :: [OutVar] -> Level -> LevelEnv -> RecFlag
+wrapEval :: Level -> (OutId, OutId) -> Expr LevelledBndr -> Expr LevelledBndr
+-- A bit like GHC.Core.Utils.mkDefaultCase, but `Expr LevelledBndr`
+wrapEval dest_lvl (scrut_v, case_bndr) body
+  = Case (Var scrut_v) (TB case_bndr (StayPut dest_lvl))
+         (exprType $ deTagExpr body) [Alt DEFAULT [] body]
+
+lvlFloatRhs :: [OutVar] -> [(OutId,OutId)] -> Level -> LevelEnv -> RecFlag
             -> Bool   -- Binding is for a bottoming function
             -> JoinPointHood
             -> CoreExprWithFVs
             -> LvlM (Expr LevelledBndr)
 -- Ignores the le_ctxt_lvl in env; treats dest_lvl as the baseline
-lvlFloatRhs abs_vars dest_lvl env rec is_bot mb_join_arity rhs
-  = do { body' <- if not is_bot  -- See Note [Floating from a RHS]
+lvlFloatRhs abs_vars eval_vars dest_lvl env rec is_bot mb_join_arity rhs
+  = do { body1 <- if not is_bot  -- See Note [Floating from a RHS]
                      && any isId bndrs
                   then lvlMFE  body_env True body
                   else lvlExpr body_env      body
-       ; return (mkLams bndrs' body') }
+       ; let body2 = foldr (wrapEval dest_lvl) body1 eval_vars
+       ; return (mkLams bndrs' body2) }
   where
     (bndrs, body)     | JoinPoint join_arity <- mb_join_arity
                       = collectNAnnBndrs join_arity rhs
@@ -1633,14 +1654,20 @@ countFreeIds = nonDetStrictFoldUDFM add 0 . getUniqDSet
 
 data LevelEnv
   = LE { le_switches :: FloatOutSwitches
-       , le_ctxt_lvl :: Level           -- The current level
-       , le_lvl_env  :: VarEnv Level    -- Domain is *post-cloned* TyVars and Ids
+       , le_ctxt_lvl :: Level
+       -- ^ The current level
+       , le_lvl_env  :: VarEnv (Level, Maybe OutId)
+       -- ^ Domain is *post-cloned* TyVars and Ids.
+       -- If `Just scrut`, then the var mapping belongs to a case binder with
+       -- variable scrutinee `scrut`.
+       -- This is to support Note [Duplicate evals into float].
 
        -- See Note [le_subst and le_env]
-       , le_subst    :: Subst           -- Domain is pre-cloned TyVars and Ids
-                                        -- The Id -> CoreExpr in the Subst is ignored
-                                        -- (since we want to substitute a LevelledExpr for
-                                        -- an Id via le_env) but we do use the Co/TyVar substs
+       , le_subst    :: Subst
+       -- ^ Domain is pre-cloned TyVars and Ids.
+       -- The Id -> CoreExpr in the Subst is ignored
+       -- (since we want to substitute a LevelledExpr for
+       -- an Id via le_env) but we do use the Co/TyVar substs
        , le_env      :: IdEnv ([OutVar], LevelledExpr)  -- Domain is pre-cloned Ids
     }
 
@@ -1690,10 +1717,10 @@ initialEnv float_lams binds
       -- to a later one.  So here we put all the top-level binders in scope before
       -- we start, to satisfy the lookupIdSubst invariants (#20200 and #20294)
 
-addLvl :: Level -> VarEnv Level -> OutVar -> VarEnv Level
-addLvl dest_lvl env v' = extendVarEnv env v' dest_lvl
+addLvl :: Level -> VarEnv (Level, Maybe OutId) -> OutVar -> VarEnv (Level, Maybe OutId)
+addLvl dest_lvl env v' = extendVarEnv env v' (dest_lvl, Nothing)
 
-addLvls :: Level -> VarEnv Level -> [OutVar] -> VarEnv Level
+addLvls :: Level -> VarEnv (Level, Maybe OutId) -> [OutVar] -> VarEnv (Level, Maybe OutId)
 addLvls dest_lvl env vs = foldl' (addLvl dest_lvl) env vs
 
 floatLams :: LevelEnv -> Maybe Int
@@ -1711,20 +1738,19 @@ floatTopLvlOnly le = floatToTopLevelOnly (le_switches le)
 incMinorLvlFrom :: LevelEnv -> Level
 incMinorLvlFrom env = incMinorLvl (le_ctxt_lvl env)
 
--- extendCaseBndrEnv adds the mapping case-bndr->scrut-var if it can
--- See Note [Binder-swap during float-out]
-extendCaseBndrEnv :: LevelEnv
-                  -> Id                 -- Pre-cloned case binder
-                  -> Expr LevelledBndr  -- Post-cloned scrutinee
-                  -> LevelEnv
-extendCaseBndrEnv le@(LE { le_subst = subst, le_env = id_env })
+-- rememberEval adds the mapping case-bndr->scrut-var if it can
+-- See Note [Duplicate evals into float]
+rememberEval :: LevelEnv
+             -> OutId              -- Post-cloned case binder
+             -> Expr LevelledBndr  -- Post-cloned scrutinee
+             -> LevelEnv
+rememberEval le@(LE { le_lvl_env = lvl_env })
                   case_bndr (Var scrut_var)
   -- We could use OccurAnal. scrutOkForBinderSwap here, and perhaps
   -- get a bit more floating.  But we didn't in the past and it's
   -- an unforced change, so I'm leaving it.
-  = le { le_subst   = extendSubstWithVar subst case_bndr scrut_var
-       , le_env     = add_id id_env (case_bndr, scrut_var) }
-extendCaseBndrEnv env _ _ = env
+  = le { le_lvl_env = modifyVarEnv (\(lvl,_) -> (lvl, Just scrut_var)) lvl_env case_bndr }
+rememberEval env _ _ = env
 
 maxFvLevel :: (Var -> Bool) -> LevelEnv -> DVarSet -> Level
 maxFvLevel max_me env var_set
@@ -1745,8 +1771,8 @@ maxIn max_me (LE { le_lvl_env = lvl_env, le_env = id_env }) in_var lvl
   where
     max_out out_var lvl
         | max_me out_var = case lookupVarEnv lvl_env out_var of
-                                Just lvl' -> maxLvl lvl' lvl
-                                Nothing   -> lvl
+                                Just (lvl',_) -> maxLvl lvl' lvl
+                                Nothing       -> lvl
         | otherwise = lvl       -- Ignore some vars depending on max_me
 
 lookupVar :: LevelEnv -> Id -> LevelledExpr
@@ -1765,20 +1791,16 @@ abstractVars :: Level -> LevelEnv -> DVarSet -> [OutVar]
         -- variable computation and deterministic sort.
         -- See Note [Unique Determinism] in GHC.Types.Unique for explanation of why
         -- Uniques are not deterministic.
-abstractVars dest_lvl (LE { le_subst = subst, le_lvl_env = lvl_env }) in_fvs
+abstractVars dest_lvl le@(LE { le_subst = subst }) in_fvs
   =  -- NB: sortQuantVars might not put duplicates next to each other
-    map zap $ sortQuantVars $
-    filter abstract_me      $
-    dVarSetElems            $
-    closeOverKindsDSet      $
+    map zap $ sortQuantVars         $
+    filter (abstractMe dest_lvl le) $
+    dVarSetElems                    $
+    closeOverKindsDSet              $
     substDVarSet subst in_fvs
         -- NB: it's important to call abstract_me only on the OutIds the
         -- come from substDVarSet (not on fv, which is an InId)
   where
-    abstract_me v = case lookupVarEnv lvl_env v of
-                        Just lvl -> dest_lvl `ltLvl` lvl
-                        Nothing  -> False
-
         -- We are going to lambda-abstract, so nuke any IdInfo,
         -- and add the tyvars of the Id (if necessary)
     zap v | isId v = warnPprTrace (isStableUnfolding (idUnfolding v) ||
@@ -1787,6 +1809,31 @@ abstractVars dest_lvl (LE { le_subst = subst, le_lvl_env = lvl_env }) in_fvs
                      setIdInfo v vanillaIdInfo
           | otherwise = v
 
+abstractMe :: Level -> LevelEnv -> Var -> Bool
+abstractMe dest_lvl (LE { le_lvl_env = lvl_env }) v
+  | Just (lvl, _) <- lookupVarEnv lvl_env v
+  = dest_lvl `ltLvl` lvl
+  | otherwise
+  = False
+
+eliminateAbsCaseBndrs :: Level -> LevelEnv -> [OutVar] -> ([OutVar], [(OutId,OutId)])
+-- Try turning all runtime abs_vars into evals.
+-- See Note [Duplicate evals into float]
+eliminateAbsCaseBndrs dest_lvl le@(LE { le_lvl_env = lvl_env }) abs_vars
+  | enables_thunking -- otherwise we do not get to safe work anyway!
+  = res
+  | otherwise
+  = (abs_vars, [])
+  where
+    res@(abs_vars',_eval_vars') = partitionEithers (map try_elim abs_vars)
+    enables_thunking = not (any isRuntimeVar abs_vars')
+
+    try_elim v = case lookupVarEnv lvl_env v of
+      Just (_, Just scrut_id)
+        | not (abstractMe dest_lvl le scrut_id) -- would not abstract scrut_id
+        -> Right (scrut_id, v) -- turn abs_var v into an eval on scrut_id!
+      _ -> Left v              -- retain as an abs_var
+
 type LvlM result = UniqSM result
 
 initLvl :: UniqSupply -> UniqSM a -> a
@@ -1835,9 +1882,12 @@ newLvlVar :: LevelledExpr        -- The RHS of the new binding
           -> LvlM Id
 newLvlVar lvld_rhs join_arity_maybe is_mk_static
   = do { uniq <- getUniqueM
-       ; return (add_join_info (mk_id uniq rhs_ty))
+       ; return (add_evald $ add_join_info $ mk_id uniq rhs_ty)
        }
   where
+    add_evald var
+      | exprIsHNF de_tagged_rhs = var `setIdUnfolding` evaldUnfolding
+      | otherwise               = var
     add_join_info var = var `asJoinId_maybe` join_arity_maybe
     de_tagged_rhs = deTagExpr lvld_rhs
     rhs_ty        = exprType de_tagged_rhs


=====================================
compiler/GHC/Core/Opt/WorkWrap/Utils.hs
=====================================
@@ -1598,7 +1598,7 @@ move_transit_vars :: [Id] -> (CoreExpr -> CoreExpr -> CoreExpr, CoreExpr)
 move_transit_vars vars
   | [var] <- vars
   , let var_ty = idType var
-  , isUnliftedType var_ty || exprIsHNF (Var var)
+  , isUnliftedType var_ty || isEvaldUnfolding (idUnfolding var)
   -- See Note [No unboxed tuple for single, unlifted transit var]
   --   * Wrapper: `unbox scrut alt = (case <scrut> of a -> <alt>)`
   --   * Worker:  `tup = a`
@@ -1702,9 +1702,7 @@ return unboxed instead of in an unboxed singleton tuple:
     We want  `$wh :: Int# -> [Int]`.
     We'd get `$wh :: Int# -> (# [Int] #)`.
 
-By considering vars as unlifted that satisfy 'exprIsHNF', we catch (3).
-Why not check for 'exprOkForSpeculation'? Quite perplexingly, evaluated vars
-are not ok-for-spec, see Note [exprOkForSpeculation and evaluated variables].
+By considering vars as evaluated that have an evald unfolding, we catch (3).
 For (1) and (2) we would have to look at the term. WW only looks at the
 type and the CPR signature, so the only way to fix (1) and (2) would be to
 have a nested termination signature, like in MR !1866.


=====================================
compiler/GHC/Core/Subst.hs
=====================================
@@ -519,7 +519,9 @@ substUnfolding subst df@(DFunUnfolding { df_bndrs = bndrs, df_args = args })
 substUnfolding subst unf@(CoreUnfolding { uf_tmpl = tmpl, uf_src = src })
   -- Retain stable unfoldings
   | not (isStableSource src)  -- Zap an unstable unfolding, to save substitution work
-  = NoUnfolding
+  = if isEvaldUnfolding unf
+    then evaldUnfolding
+    else NoUnfolding
   | otherwise                 -- But keep a stable one!
   = seqExpr new_tmpl `seq`
     unf { uf_tmpl = new_tmpl }


=====================================
compiler/GHC/Core/TyCo/Subst.hs
=====================================
@@ -22,6 +22,7 @@ module GHC.Core.TyCo.Subst
         extendCvSubst, extendCvSubstWithClone,
         extendTvSubst, extendTvSubstWithClone,
         extendTvSubstList, extendTvSubstAndInScope,
+        extendCvSubstAndInScope,
         extendTCvSubstList,
         unionSubst, zipTyEnv, zipCoEnv,
         zipTvSubst, zipCvSubst,
@@ -408,6 +409,14 @@ extendTvSubstAndInScope (Subst in_scope ids tenv cenv) tv ty
              (extendVarEnv tenv tv ty)
              cenv
 
+extendCvSubstAndInScope :: Subst -> CoVar -> Coercion -> Subst
+-- Also extends the in-scope set
+extendCvSubstAndInScope (Subst in_scope ids tenv cenv) cv co
+  = Subst (in_scope `extendInScopeSetSet` tyCoVarsOfCo co)
+             ids
+             tenv
+             (extendVarEnv cenv cv co)
+
 -- | Adds multiple 'TyVar' substitutions to the 'Subst': see also 'extendTvSubst'
 extendTvSubstList :: Subst -> [(TyVar,Type)] -> Subst
 extendTvSubstList subst vrs


=====================================
compiler/GHC/Core/Type.hs
=====================================
@@ -55,7 +55,7 @@ module GHC.Core.Type (
         splitForAllForAllTyBinders, splitForAllForAllTyBinder_maybe,
         splitForAllTyCoVar_maybe, splitForAllTyCoVar,
         splitForAllTyVar_maybe, splitForAllCoVar_maybe,
-        splitPiTy_maybe, splitPiTy, splitPiTys, collectPiTyBinders,
+        splitPiTy_maybe, splitPiTy, splitPiTys,
         getRuntimeArgTys,
         mkTyConBindersPreferAnon,
         mkPiTy, mkPiTys,
@@ -290,7 +290,6 @@ import GHC.Utils.Panic
 import GHC.Data.FastString
 
 import GHC.Data.Maybe   ( orElse, isJust, firstJust )
-import GHC.List (build)
 
 -- $type_classification
 -- #type_classification#
@@ -2056,18 +2055,6 @@ splitPiTys ty = split ty ty []
     split orig_ty ty bs | Just ty' <- coreView ty = split orig_ty ty' bs
     split orig_ty _                bs = (reverse bs, orig_ty)
 
-collectPiTyBinders :: Type -> [PiTyBinder]
-collectPiTyBinders ty = build $ \c n ->
-  let
-    split (ForAllTy b res) = Named b `c` split res
-    split (FunTy { ft_af = af, ft_mult = w, ft_arg = arg, ft_res = res })
-                           = Anon (Scaled w arg) af `c` split res
-    split ty | Just ty' <- coreView ty = split ty'
-    split _                = n
-  in
-    split ty
-{-# INLINE collectPiTyBinders #-}
-
 -- | Extracts a list of run-time arguments from a function type,
 -- looking through newtypes to the right of arrows.
 --


=====================================
compiler/GHC/Core/Utils.hs
=====================================
@@ -74,9 +74,11 @@ import GHC.Core.Ppr
 import GHC.Core.FVs( bindFreeVars )
 import GHC.Core.DataCon
 import GHC.Core.Type as Type
+import GHC.Core.TyCo.Rep as Type
 import GHC.Core.Predicate( isCoVarType )
 import GHC.Core.FamInstEnv
 import GHC.Core.TyCo.Compare( eqType, eqTypeX )
+import GHC.Core.TyCo.Subst
 import GHC.Core.Coercion
 import GHC.Core.Reduction
 import GHC.Core.TyCon
@@ -1714,7 +1716,8 @@ it's applied only to dictionaries.
 
 -----------------------------
 -- | To a first approximation, 'exprOkForSpeculation' returns True of
--- an expression that is:
+-- an expression that is nearly a value (i.e., it implies 'exprIsHNF').
+-- That is, the expression is
 --
 --  * Safe to evaluate even if normal order eval might not
 --    evaluate the expression at all, and
@@ -1740,29 +1743,19 @@ it's applied only to dictionaries.
 -- But in fact both functions are a bit more conservative than the above,
 -- in at least the following ways:
 --
---  * W1: We do not take advantage of already-evaluated lifted variables.
---        As a result, 'exprIsHNF' DOES NOT imply 'exprOkForSpeculation';
---        if @y@ is a case-binder of lifted type, then @exprIsHNF y@ is
---        'True', while @exprOkForSpeculation y@ is 'False'.
---        See Note [exprOkForSpeculation and evaluated variables] for why.
---  * W2: Read-effects on mutable variables are currently also included.
+--  * W1: Read-effects on mutable variables are currently also included.
 --        See Note [Classifying primop effects] "GHC.Builtin.PrimOps".
---  * W3: Currently, 'exprOkForSpeculation' always returns 'False' for
---        let-expressions.  Lets can be stacked deeply, so we just give up.
---        In any case, the argument of 'exprOkForSpeculation' is usually in
---        a strict context, so any lets will have been floated away.
---
 --
 -- As an example of the considerations in this test, consider:
 --
 -- > let x = case y# +# 1# of { r# -> I# r# }
--- > in E
+-- > in e
 --
 -- being translated to:
 --
 -- > case y# +# 1# of { r# ->
 -- >    let x = I# r#
--- >    in E
+-- >    in e
 -- > }
 --
 -- We can only do this if the @y# +# 1#@ is ok for speculation: it has no
@@ -1811,14 +1804,12 @@ expr_ok fun_ok primop_ok (Tick tickish e)
    | tickishCounts tickish = False
    | otherwise             = expr_ok fun_ok primop_ok e
 
-expr_ok _ _ (Let {}) = False
--- See W3 in the Haddock comment for exprOkForSpeculation
+expr_ok fun_ok primop_ok (Let _ e) = expr_ok fun_ok primop_ok e
+  -- Ignore binds becaue of Note [Core let-can-float invariant] in GHC.Core
 
-expr_ok fun_ok primop_ok (Case scrut bndr _ alts)
+expr_ok fun_ok primop_ok (Case scrut _ _ alts)
   =  -- See Note [exprOkForSpeculation: case expressions]
      expr_ok fun_ok primop_ok scrut
-  && isUnliftedType (idType bndr)
-      -- OK to call isUnliftedType: binders always have a fixed RuntimeRep
   && all (\(Alt _ _ rhs) -> expr_ok fun_ok primop_ok rhs) alts
   && altsAreExhaustive alts
 
@@ -1847,7 +1838,7 @@ app_ok fun_ok primop_ok fun args
 
   | idArity fun > n_val_args
   -- Partial application: just check passing the arguments is OK
-  = args_ok
+  = args_ok notCBV
 
   | otherwise
   = case idDetails fun of
@@ -1857,9 +1848,10 @@ app_ok fun_ok primop_ok fun args
 
       DataConWorkId dc
         | isLazyDataConRep dc
-        -> args_ok
+        -> args_ok notCBV
         | otherwise
-        -> fields_ok (dataConRepStrictness dc)
+        -> args_ok (dataConRepStrictness dc)
+            -- See (SFC1) of Note [Strict fields in Core]
 
       ClassOpId _ is_terminating_result
         | is_terminating_result -- See Note [exprOkForSpeculation and type classes]
@@ -1881,7 +1873,7 @@ app_ok fun_ok primop_ok fun args
               -- Often there is a literal divisor, and this
               -- can get rid of a thunk in an inner loop
 
-        | otherwise -> primop_ok op && args_ok
+        | otherwise -> primop_ok op && args_ok notCBV
 
       _other  -- Unlifted and terminating types;
               -- Also c.f. the Var case of exprIsHNF
@@ -1898,37 +1890,54 @@ app_ok fun_ok primop_ok fun args
          -- See (U12) of Note [Implementing unsafeCoerce]
          | fun `hasKey` unsafeEqualityProofIdKey -> True
 
-         | otherwise -> False
-             -- NB: even in the nullary case, do /not/ check
-             --     for evaluated-ness of the fun;
-             --     see Note [exprOkForSpeculation and evaluated variables]
+         | 0 <- n_val_args
+         , isEvaldUnfolding (idUnfolding fun)
+         -> -- pprTrace "yes" (ppr fun)
+            True
+
+         | otherwise -> -- pprTrace "no" (ppr fun <+> ppr args)
+                        False
   where
     fun_ty       = idType fun
     n_val_args   = valArgCount args
-    (arg_tys, _) = splitPiTys fun_ty
 
     -- Even if a function call itself is OK, any unlifted
-    -- args are still evaluated eagerly and must be checked
-    args_ok = all2Prefix arg_ok arg_tys args
-    arg_ok :: PiTyVarBinder -> CoreExpr -> Bool
-    arg_ok (Named _) _ = True   -- A type argument
-    arg_ok (Anon ty _) arg      -- A term argument
-       | definitelyLiftedType (scaledThing ty)
-       = True -- lifted args are not evaluated eagerly
+    -- args are still evaluated eagerly and must be checked.
+    -- Furthermore, for saturated calls, we must check CBV args (strict fields
+    -- of DataCons!)
+    args_ok str_marks = relevantAppArgsSatisfy arg_ok fun_ty args str_marks
+    arg_ok :: Type -> CoreExpr -> StrictnessMark -> Bool
+    arg_ok ty arg str
+       | NotMarkedStrict <- str   -- iff it's not a CBV arg
+       , definitelyLiftedType ty  -- and its type is lifted
+       = True                     -- then the worker app does not eval
        | otherwise
        = expr_ok fun_ok primop_ok arg
 
-    -- Used for strict DataCon worker arguments
-    -- See (SFC1) of Note [Strict fields in Core]
-    fields_ok str_marks = all3Prefix field_ok arg_tys str_marks args
-    field_ok :: PiTyVarBinder -> StrictnessMark -> CoreExpr -> Bool
-    field_ok (Named _)   _   _ = True
-    field_ok (Anon ty _) str arg
-       | NotMarkedStrict <- str                 -- iff it's a lazy field
-       , definitelyLiftedType (scaledThing ty)  -- and its type is lifted
-       = True                                   -- then the worker app does not eval
-       | otherwise
-       = expr_ok fun_ok primop_ok arg
+notCBV :: [StrictnessMark]
+notCBV = repeat NotMarkedStrict
+
+relevantAppArgsSatisfy :: (Type -> CoreExpr -> StrictnessMark -> Bool) -> Type -> [CoreExpr] -> [StrictnessMark] -> Bool
+-- This calls the predicate on every non-Type CoreExpr arg.
+-- We could just do `exprType arg` for every such arg, but carrying around a
+-- substitution is much more efficient.
+-- The obvious definition regresses T16577 by 30% so we don't do it.
+relevantAppArgsSatisfy p ty = go (mkEmptySubst in_scope) ty
+  where
+    in_scope = mkInScopeSet (tyCoVarsOfType ty)
+    go subst (ForAllTy b res) (Type ty : args) strs
+      = go (extendTvSubstAndInScope subst (binderVar b) ty) res args strs
+    go subst (ForAllTy b res) (Coercion co : args) strs
+      = go (extendCvSubstAndInScope subst (binderVar b) co) res args strs
+    go subst (FunTy { ft_arg = arg, ft_res = res }) (e : args) (str : strs)
+      = p (substTy subst arg) e str && go subst res args strs
+    go subst ty args@(_:_) strs@(_:_)
+      | Just ty' <- coreView (substTy subst ty)
+      = go (mkEmptySubst (getSubstInScope subst)) ty' args strs
+      | otherwise
+      = pprPanic "Should see more argument tys" (ppr ty $$ ppr subst $$ ppr args $$ ppr (take 10 strs))
+    go _ _ _ _ = True
+{-# INLINE relevantAppArgsSatisfy #-}
 
 -----------------------------
 altsAreExhaustive :: [Alt b] -> Bool
@@ -1985,6 +1994,8 @@ GHC.Types.Id.Make.mkDictSelId for where this field is initialised.
 
 Note [exprOkForSpeculation: case expressions]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+SG: I don't get what is so "very special" about this case. I find it very reasonable.
+
 exprOkForSpeculation accepts very special case expressions.
 Reason: (a ==# b) is ok-for-speculation, but the litEq rules
 in GHC.Core.Opt.ConstantFold convert it (a ==# 3#) to
@@ -1994,39 +2005,22 @@ for excellent reasons described in
 So, annoyingly, we want that case expression to be
 ok-for-speculation too. Bother.
 
-But we restrict it sharply:
-
-* We restrict it to unlifted scrutinees. Consider this:
-     case x of y {
-       DEFAULT -> ... (let v::Int# = case y of { True  -> e1
-                                               ; False -> e2 }
-                       in ...) ...
-
-  Does the RHS of v satisfy the let-can-float invariant?  Previously we said
-  yes, on the grounds that y is evaluated.  But the binder-swap done
-  by GHC.Core.Opt.SetLevels would transform the inner alternative to
-     DEFAULT -> ... (let v::Int# = case x of { ... }
-                     in ...) ....
-  which does /not/ satisfy the let-can-float invariant, because x is
-  not evaluated. See Note [Binder-swap during float-out]
-  in GHC.Core.Opt.SetLevels.  To avoid this awkwardness it seems simpler
-  to stick to unlifted scrutinees where the issue does not
-  arise.
-
-* We restrict it to exhaustive alternatives. A non-exhaustive
-  case manifestly isn't ok-for-speculation. for example,
-  this is a valid program (albeit a slightly dodgy one)
-    let v = case x of { B -> ...; C -> ... }
-    in case x of
-         A -> ...
-         _ ->  ...v...v....
-  Should v be considered ok-for-speculation?  Its scrutinee may be
-  evaluated, but the alternatives are incomplete so we should not
-  evaluate it strictly.
-
-  Now, all this is for lifted types, but it'd be the same for any
-  finite unlifted type. We don't have many of them, but we might
-  add unlifted algebraic types in due course.
+SG: Again, I don't see why we need to list a specific example.
+Clearly, we want to detect as many expressions as ok-for-spec as possible!
+
+But we restrict it to exhaustive alternatives. A non-exhaustive
+case manifestly isn't ok-for-speculation. For example,
+this is a valid program (albeit a slightly dodgy one)
+  let v = case x of { B -> ...; C -> ... }
+  in case x of
+       A -> ...
+       _ ->  ...v...v....
+Should v be considered ok-for-speculation?  Its scrutinee may be
+evaluated, but the alternatives are incomplete so we should not
+evaluate it strictly.
+
+Now, all this is for lifted types, but it'd be the same for any
+finite unlifted type.
 
 
 ----- Historical note: #15696: --------
@@ -2073,37 +2067,6 @@ points do the job nicely.
 ------- End of historical note ------------
 
 
-Note [exprOkForSpeculation and evaluated variables]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Consider these examples:
- * case x of y { DEFAULT -> ....y.... }
-   Should 'y' (alone) be considered ok-for-speculation?
-
- * case x of y { DEFAULT -> ....let z = dataToTagLarge# y... }
-   Should (dataToTagLarge# y) be considered ok-for-spec? Recall that
-     dataToTagLarge# :: forall a. a -> Int#
-   must always evaluate its argument. (See also Note [DataToTag overview].)
-
-You could argue 'yes', because in the case alternative we know that
-'y' is evaluated.  But the binder-swap transformation, which is
-extremely useful for float-out, changes these expressions to
-   case x of y { DEFAULT -> ....x.... }
-   case x of y { DEFAULT -> ....let z = dataToTagLarge# x... }
-
-And now the expression does not obey the let-can-float invariant!  Yikes!
-Moreover we really might float (dataToTagLarge# x) outside the case,
-and then it really, really doesn't obey the let-can-float invariant.
-
-The solution is simple: exprOkForSpeculation does not try to take
-advantage of the evaluated-ness of (lifted) variables.  And it returns
-False (always) for primops that perform evaluation.  We achieve the latter
-by marking the relevant primops as "ThrowsException" or
-"ReadWriteEffect"; see also Note [Classifying primop effects] in
-GHC.Builtin.PrimOps.
-
-Note that exprIsHNF /can/ and does take advantage of evaluated-ness;
-it doesn't have the trickiness of the let-can-float invariant to worry about.
-
 ************************************************************************
 *                                                                      *
              exprIsHNF, exprIsConLike
@@ -2166,7 +2129,6 @@ exprIsHNFlike is_con is_con_unf e
       || is_con_unf (idUnfolding v)
         -- Check the thing's unfolding; it might be bound to a value
         --   or to a guaranteed-evaluated variable (isEvaldUnfolding)
-        --   Contrast with Note [exprOkForSpeculation and evaluated variables]
         -- We don't look through loop breakers here, which is a bit conservative
         -- but otherwise I worry that if an Id's unfolding is just itself,
         -- we could get an infinite loop
@@ -2185,66 +2147,59 @@ exprIsHNFlike is_con is_con_unf e
                                    && is_hnf_like e
                                       -- See Note [exprIsHNF Tick]
     is_hnf_like (Cast e _)       = is_hnf_like e
-    is_hnf_like (App e a)
-      | isValArg a               = app_is_value e [a]
-      | otherwise                = is_hnf_like e
     is_hnf_like (Let _ e)        = is_hnf_like e  -- Lazy let(rec)s don't affect us
     is_hnf_like (Case e b _ as)
       | Just rhs <- isUnsafeEqualityCase e b as
       = is_hnf_like rhs
-    is_hnf_like _                = False
-
-    -- Collect arguments through Casts and Ticks and call id_app_is_value
-    app_is_value :: CoreExpr -> [CoreArg] -> Bool
-    app_is_value (Var f)    as = id_app_is_value f as
-    app_is_value (Tick _ f) as = app_is_value f as
-    app_is_value (Cast f _) as = app_is_value f as
-    app_is_value (App f a)  as | isValArg a = app_is_value f (a:as)
-                               | otherwise  = app_is_value f as
-    app_is_value _          _  = False
-
-    id_app_is_value id val_args =
+    is_hnf_like e
+      | (fun, args) <- collectArgs e
+      = case stripTicksTopE (not . tickishCounts) fun of
+            Var f -> id_app_is_value f args
+
+            -- 'LitRubbish' is the only literal that can occur in the head of an
+            -- application and will not be matched by the above case (Var /= Lit).
+            -- See Note [How a rubbish literal can be the head of an application]
+            -- in GHC.Types.Literal
+            Lit lit | debugIsOn, not (isLitRubbish lit)
+                     -> pprPanic "Non-rubbish lit in app head" (ppr lit)
+                     | otherwise
+                     -> True
+
+            _ -> False
+
+    id_app_is_value id args =
       -- See Note [exprIsHNF for function applications]
       --   for the specification and examples
-      case compare (idArity id) (length val_args) of
+      case compare (idArity id) n_val_args of
         EQ | is_con id ->      -- Saturated app of a DataCon/CONLIKE Id
           case mb_str_marks id of
             Just str_marks ->  -- with strict fields; see (SFC1) of Note [Strict fields in Core]
-              assert (val_args `equalLength` str_marks) $
-              fields_hnf str_marks
+              args_hnf str_marks
             Nothing ->         -- without strict fields: like PAP
-              args_hnf         -- NB: CONLIKEs are lazy!
+              args_hnf notCBV  -- NB: CONLIKEs are lazy!
 
-        GT ->                  -- PAP: Check unlifted val_args
-          args_hnf
+        GT ->                  -- PAP: Check unlifted args
+          args_hnf notCBV
 
         _  -> False
 
       where
-        -- Saturated, Strict DataCon: Check unlifted val_args and strict fields
-        fields_hnf str_marks = all3Prefix check_field val_arg_tys str_marks val_args
-
-        -- PAP: Check unlifted val_args
-        args_hnf             = all2Prefix check_arg   val_arg_tys           val_args
-
         fun_ty = idType id
-        val_arg_tys = mapMaybe anonPiTyBinderType_maybe (collectPiTyBinders fun_ty)
-          -- val_arg_tys = map exprType val_args, but much less costly.
-          -- The obvious definition regresses T16577 by 30% so we don't do it.
-
-        check_arg a_ty a
+        n_val_args   = valArgCount args
+        -- Check the args for HNFness.
+        args_hnf str_marks = relevantAppArgsSatisfy check_arg fun_ty args str_marks
+        --   * Unlifted args must always be HNF
+        --   * CBV args (strict fields!) must be HNF for saturated calls,
+        --     indicated by str_marks
+        check_arg a_ty a str
           | mightBeUnliftedType a_ty = is_hnf_like a
+          | isMarkedStrict str       = is_hnf_like a
           | otherwise                = True
          -- Check unliftedness; for example f (x /# 12#) where f has arity two,
          -- and the first argument is unboxed. This is not a value!
          -- But  f 34#  is a value, so check args for HNFs.
          -- NB: We check arity (and CONLIKEness) first because it's cheaper
          --     and we reject quickly on saturated apps.
-        check_field a_ty str a
-          | mightBeUnliftedType a_ty = is_hnf_like a
-          | isMarkedStrict str       = is_hnf_like a
-          | otherwise                = True
-          -- isMarkedStrict: Respect Note [Strict fields in Core]
 
         mb_str_marks id
           | Just dc <- isDataConWorkId_maybe id


=====================================
compiler/GHC/CoreToStg/Prep.hs
=====================================
@@ -2241,6 +2241,20 @@ decideFloatInfo FIA{fia_levity=lev, fia_demand=dmd, fia_is_hnf=is_hnf,
   | Lifted   <- lev       = (LetBound, TopLvlFloatable)
       -- And these float freely but can't be speculated, hence LetBound
 
+instance Outputable FloatInfoArgs where
+  ppr FIA{ fia_levity=lev, fia_demand=dmd, fia_is_hnf=hnf, fia_is_triv=triv
+         , fia_is_string=string, fia_is_dc_worker=dcw, fia_ok_for_spec=ok_spec }
+    = brackets (pprWithCommas id traits)
+    where
+      traits = concat $
+        [ [ text "string" | string ]
+        , [ text "DataCon worker binding" | dcw ]
+        , [ text "trivial" | triv ]
+        , [ text "unlifted" | Unlifted <- pure lev ]
+        , [ text "strict" | isStrUsedDmd dmd ]
+        , [ if hnf then text "hnf" else if ok_spec then text "ok-for-spec" else empty ]
+        ]
+
 mkCaseFloat :: Id -> CpeRhs -> FloatingBind
 mkCaseFloat bndr scrut
   = -- pprTrace "mkCaseFloat" (ppr bndr <+> ppr (bound,info)
@@ -2262,13 +2276,12 @@ mkCaseFloat bndr scrut
 mkNonRecFloat :: CorePrepEnv -> Levity -> Id -> CpeRhs -> (FloatingBind, Id)
 mkNonRecFloat env lev bndr rhs
   = -- pprTrace "mkNonRecFloat" (ppr bndr <+> ppr (bound,info)
-    --                             <+> if is_strict then text "strict" else if is_lifted then text "lazy" else text "unlifted"
-    --                             <+> if ok_for_spec then text "ok-for-spec" else empty
-    --                             <+> if evald then text "evald" else empty
+    --                           $$ ppr float_info_args
     --                           $$ ppr rhs) $
     (Float (NonRec bndr' rhs) bound info, bndr')
   where
-    !(bound, info) = decideFloatInfo $ (defFloatInfoArgs bndr rhs)
+    !(bound, info) = decideFloatInfo float_info_args
+    float_info_args = (defFloatInfoArgs bndr rhs)
       { fia_levity = lev
       , fia_is_hnf = is_hnf
       , fia_ok_for_spec = ok_for_spec


=====================================
compiler/GHC/Utils/Misc.hs
=====================================
@@ -23,7 +23,7 @@ module GHC.Utils.Misc (
 
         dropWhileEndLE, spanEnd, last2, lastMaybe, onJust,
 
-        List.foldl1', foldl2, count, countWhile, all2, any2, all2Prefix, all3Prefix,
+        List.foldl1', foldl2, count, countWhile, all2, any2,
 
         lengthExceeds, lengthIs, lengthIsNot,
         lengthAtLeast, lengthAtMost, lengthLessThan,
@@ -658,36 +658,6 @@ any2 :: (a -> b -> Bool) -> [a] -> [b] -> Bool
 any2 p (x:xs) (y:ys) = p x y || any2 p xs ys
 any2 _ _      _      = False
 
-all2Prefix :: forall a b. (a -> b -> Bool) -> [a] -> [b] -> Bool
--- ^ `all2Prefix p xs ys` is a fused version of `and $ zipWith2 p xs ys`.
--- (It generates good code nonetheless.)
--- So if one list is shorter than the other, `p` is assumed to be `True` for the
--- suffix.
-all2Prefix p = foldr k z
-  where
-    k :: a -> ([b] -> Bool) -> [b] -> Bool
-    k x go ys' = case ys' of
-      (y:ys'') -> p x y && go ys''
-      _ -> True
-    z :: [b] -> Bool
-    z _ = True
-{-# INLINE all2Prefix #-}
-
-all3Prefix :: forall a b c. (a -> b -> c -> Bool) -> [a] -> [b] -> [c] -> Bool
--- ^ `all3Prefix p xs ys zs` is a fused version of `and $ zipWith3 p xs ys zs`.
--- (It generates good code nonetheless.)
--- So if one list is shorter than the others, `p` is assumed to be `True` for
--- the suffix.
-all3Prefix p = foldr k z
-  where
-    k :: a -> ([b] -> [c] -> Bool) -> [b] -> [c] -> Bool
-    k x go ys' zs' = case (ys',zs') of
-      (y:ys'',z:zs'') -> p x y z && go ys'' zs''
-      _ -> False
-    z :: [b] -> [c] -> Bool
-    z _ _ = True
-{-# INLINE all3Prefix #-}
-
 -- Count the number of times a predicate is true
 
 count :: (a -> Bool) -> [a] -> Int


=====================================
testsuite/tests/simplCore/should_compile/T25423.hs
=====================================
@@ -0,0 +1,8 @@
+{-# LANGUAGE BangPatterns #-}
+{-# LANGUAGE MagicHash #-}
+{-# LANGUAGE UnboxedTuples #-}
+
+module T25423 where
+
+f :: Int -> Int -> (# (# Int, Int #), (# Int, Int #) #)
+f x y = let !p = (# x, y #) in (# p, p #)


=====================================
testsuite/tests/simplCore/should_compile/T25423.stderr
=====================================
@@ -0,0 +1,41 @@
+
+==================== Tidy Core ====================
+Result size of Tidy Core = {terms: 24, types: 75, coercions: 0, joins: 0/1}
+
+-- RHS size: {terms: 9, types: 39, coercions: 0, joins: 0/1}
+f :: Int -> Int -> (# (# Int, Int #), (# Int, Int #) #)
+[GblId, Arity=2, Str=<L><L>, Cpr=1(1, 1), Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=ALWAYS_IF(arity=2,unsat_ok=True,boring_ok=True)}]
+f = \ (x :: Int) (y :: Int) ->
+      let {
+        p :: (# Int, Int #)
+        [LclId, Unf=OtherCon []]
+        p = (# x, y #) } in
+      (# p, p #)
+
+-- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
+T25423.$trModule4 :: GHC.Prim.Addr#
+[GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 20 0}]
+T25423.$trModule4 = "main"#
+
+-- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
+T25423.$trModule3 :: GHC.Types.TrName
+[GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
+T25423.$trModule3 = GHC.Types.TrNameS T25423.$trModule4
+
+-- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
+T25423.$trModule2 :: GHC.Prim.Addr#
+[GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 30 0}]
+T25423.$trModule2 = "T25423"#
+
+-- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
+T25423.$trModule1 :: GHC.Types.TrName
+[GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
+T25423.$trModule1 = GHC.Types.TrNameS T25423.$trModule2
+
+-- RHS size: {terms: 3, types: 0, coercions: 0, joins: 0/0}
+T25423.$trModule :: GHC.Types.Module
+[GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
+T25423.$trModule = GHC.Types.Module T25423.$trModule3 T25423.$trModule1
+
+
+


=====================================
testsuite/tests/simplCore/should_compile/all.T
=====================================
@@ -532,3 +532,4 @@ test('T24725a', [ grep_errmsg(r'testedRule')], compile, ['-O -ddump-rule-firings
 test('T25033', normal, compile, ['-O'])
 test('T25160', normal, compile, ['-O -ddump-rules'])
 test('T25197', [req_th, extra_files(["T25197_TH.hs"]), only_ways(['optasm'])], multimod_compile, ['T25197', '-O2 -v0'])
+test('T25423', [ grep_errmsg(r'let') ], compile, ['-O -ddump-simpl -dsuppress-uniques'])



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/38b99494093a16390405b1b2830ecb5802c30a71...be75e30fe7569bbb81d7a2ca5d66f663a99bd178

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/38b99494093a16390405b1b2830ecb5802c30a71...be75e30fe7569bbb81d7a2ca5d66f663a99bd178
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20241111/1be1fe71/attachment-0001.html>


More information about the ghc-commits mailing list