[Git][ghc/ghc][wip/hnf-spec] 3 commits: Make exprIsHNF imply exprOkForSpeculation

Tue Dec 3 17:32:31 UTC 2024

Sebastian Graf pushed to branch wip/hnf-spec at Glasgow Haskell Compiler / GHC


Commits:
8b4b6447 by Sebastian Graf at 2024-12-03T18:32:09+01:00
Make exprIsHNF imply exprOkForSpeculation

Fixes #23256 because SetLevels and CSE no longer do a reverse binder
swap.
Furthermore, we no longer need to check for exprIsHNF when
exprOkForSpeculation fails.

Also fixes #25423.

I had a brief look at `-v` in T5642 (which regresses ghc/alloc by 2-3%).
It is essentially a benchmark on `Generic` code and uses `exprOkForSpeculation`
quite a lot, which now maintains a substitution, explaining the regression.
No changes to any intermediate programs.

I also had a look at `-v` in T9961 (which regresses ghc/alloc by 1-2%)

In T16577 (regresses ghc/alloc by 2-3%) we end up with a marginally smaller
program, but the benefit is offset by an additional Simplifier iteration.

Metric Decrease:
    T9872b_defer
    T9872d
Metric Increase:
    T5642
    T9961
    T16577

- - - - -
3622268e by Sebastian Graf at 2024-12-03T18:32:09+01:00
SetLevels: Use `exprOkForSpeculation` instead of `exprIsHNF`

- - - - -
05287d4f by Sebastian Graf at 2024-12-03T18:32:09+01:00
Prep: Fix outcommented debug pretty-printing logic

- - - - -


14 changed files:

- compiler/GHC/Core/Lint.hs
- compiler/GHC/Core/Opt/CSE.hs
- compiler/GHC/Core/Opt/CprAnal.hs
- compiler/GHC/Core/Opt/SetLevels.hs
- compiler/GHC/Core/Opt/WorkWrap/Utils.hs
- compiler/GHC/Core/Subst.hs
- compiler/GHC/Core/TyCo/Subst.hs
- compiler/GHC/Core/Type.hs
- compiler/GHC/Core/Utils.hs
- compiler/GHC/CoreToStg/Prep.hs
- compiler/GHC/Utils/Misc.hs
- + testsuite/tests/simplCore/should_compile/T25423.hs
- + testsuite/tests/simplCore/should_compile/T25423.stderr
- testsuite/tests/simplCore/should_compile/all.T


Changes:

=====================================
compiler/GHC/Core/Lint.hs
=====================================
@@ -592,7 +592,7 @@ lintLetBind top_lvl rec_flag binder rhs rhs_ty
                || (isNonRec rec_flag && exprOkForSpeculation rhs)
                || isDataConWorkId binder || isDataConWrapId binder -- until #17521 is fixed
                || exprIsTickedString rhs)
-           (badBndrTyMsg binder (text "unlifted"))
+           (badUnliftedBndrTyMsg binder rhs)
 
         -- Check that if the binder is at the top level and has type Addr#,
         -- that it is a string literal.
@@ -3741,10 +3741,11 @@ mkRhsMsg binder what ty
      hsep [text "Binder's type:", ppr (idType binder)],
      hsep [text "Rhs type:", ppr ty]]
 
-badBndrTyMsg :: Id -> SDoc -> SDoc
-badBndrTyMsg binder what
-  = vcat [ text "The type of this binder is" <+> what <> colon <+> ppr binder
-         , text "Binder's type:" <+> ppr (idType binder) ]
+badUnliftedBndrTyMsg :: Id -> CoreExpr -> SDoc
+badUnliftedBndrTyMsg bndr rhs
+  = vcat [ text "The let binder" <+> ppr bndr <+> text "does not satisfy the let-can-float invariant"
+         , text "Type:" <+> ppr (idType bndr)
+         , text "RHS:" <+> ppr rhs ]
 
 mkNonTopExportedMsg :: Id -> SDoc
 mkNonTopExportedMsg binder


=====================================
compiler/GHC/Core/Opt/CSE.hs
=====================================
@@ -30,24 +30,28 @@ import Data.List        ( mapAccumL )
                         Simple common sub-expression
                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 When we see
-        x1 = C a b
-        x2 = C x1 b
-we build up a reverse mapping:   C a b  -> x1
-                                 C x1 b -> x2
+  x1 = C a b
+  x2 = C x1 b
+we build up a reverse mapping:
+  C a b  :-> x1
+  C x1 b :-> x2
 and apply that to the rest of the program.
 
 When we then see
-        y1 = C a b
-        y2 = C y1 b
+  y1 = C a b
+  y2 = C y1 b
 we replace the C a b with x1.  But then we *don't* want to
 add   x1 -> y1  to the mapping.  Rather, we want the reverse, y1 -> x1
 so that a subsequent binding
         y2 = C y1 b
 will get transformed to C x1 b, and then to x2.
 
-So we carry an extra var->var substitution which we apply *before* looking up in the
-reverse mapping.
+So we carry an extra var->var substitution cs_canon which we apply *before*
+looking up in the reverse mapping. We call this step "canonicalisation", because
+it makes α-equivalent expressions /syntactically/ equal by choosing the names
+consistently.
 
+Note [CSE for bindings] explains both cases (EXTEND and CANONICALISE) in detail.
 
 Note [Shadowing in CSE]
 ~~~~~~~~~~~~~~~~~~~~~~~
@@ -59,7 +63,6 @@ For example, consider
 
 Here we must *not* do CSE on the inner x+x!  The simplifier used to guarantee no
 shadowing, but it doesn't any more (it proved too hard), so we clone as we go.
-We can simply add clones to the substitution already described.
 
 A similar tricky situation is this, with x_123 and y_123 sharing the same unique:
 
@@ -80,18 +83,31 @@ why we have to substitute binders as we go so we will properly get:
     let x2 = e2 in
     let foo = x1
 
+It may be tempting to do the cloning using the same substitution cs_canon that
+does the canonicalisation.
+We may not do so; see Note [Canonicalisation reverts binder swap transformation].
+
+Hence we maintain two substitutions: cs_subst to implement cloning, and cs_canon
+to implement the var->var substitution for canonicalising keys prior to lookup
+in the reverse mapping.
+
 Note [CSE for bindings]
 ~~~~~~~~~~~~~~~~~~~~~~~
 Let-bindings have two cases, implemented by extendCSEnvWithBinding.
 
-* SUBSTITUTE: applies when the RHS is a variable
+* CANONICALISE: applies when the RHS is a variable
 
      let x = y in ...(h x)....
 
-  Here we want to extend the /substitution/ with x -> y, so that the
-  (h x) in the body might CSE with an enclosing (let v = h y in ...).
-  NB: the substitution maps InIds, so we extend the substitution with
-      a binding for the original InId 'x'
+  Here we want to extend the canon. substitution cs_canon with x :-> y, so that
+  the (h x) in the body might CSE with an enclosing (let v = h y in ...).
+
+  NB: cs_canon is distinct from cs_subst, which clones InIds into OutIds in
+      order to handle Note [Shadowing in CSE].
+      cs_canon maps from OutIds to a subset of OutIds, so it uses the same
+      InScopeSet as cs_subst.
+  Couldn't we merge cs_subst and cs_canon? In theory yes, in practice no;
+  see Note [Canonicalisation reverts binder swap transformation].
 
   How can we have a variable on the RHS? Doesn't the simplifier inline them?
 
@@ -102,11 +118,14 @@ Let-bindings have two cases, implemented by extendCSEnvWithBinding.
          x2 = C x1 b
          y1 = C a b
          y2 = C y1 b
-      Here we CSE y1's rhs to 'x1', and then we must add (y1->x1) to
+      Here we CSE y1's rhs to 'x1', and then we must add (y1:->x1) to
       the substitution so that we can CSE the binding for y2.
 
     - Second, we use extendCSEnvWithBinding for case expression scrutinees too;
-      see Note [CSE for case expressions]
+      see Note [CSE for case expressions].
+
+  Do note that adding a mapping (x:->y) to cs_canon by itself does not
+  substitute any occurrence of x in the program.
 
 * EXTEND THE REVERSE MAPPING: applies in all other cases
 
@@ -123,23 +142,9 @@ Let-bindings have two cases, implemented by extendCSEnvWithBinding.
   Here we want to common-up the two uses of (f @ Int) so we can
   remove one of the case expressions.
 
-  See also Note [Corner case for case expressions] for another
-  reason not to use SUBSTITUTE for all trivial expressions.
-
-Notice that
-  - The SUBSTITUTE situation extends the substitution (cs_subst)
+  - The CANONICALISE situation extends the canon. substitution (cs_canon)
   - The EXTEND situation extends the reverse mapping (cs_map)
 
-Notice also that in the SUBSTITUTE case we leave behind a binding
-  x = y
-even though we /also/ carry a substitution x -> y.  Can we just drop
-the binding instead?  Well, not at top level! See Note [Top level and
-postInlineUnconditionally] in GHC.Core.Opt.Simplify.Utils; and in any
-case CSE applies only to the /bindings/ of the program, and we leave
-it to the simplifier to propagate effects to the RULES. Finally, it
-doesn't seem worth the effort to discard the nested bindings because
-the simplifier will do it next.
-
 Note [CSE for case expressions]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Consider
@@ -270,31 +275,13 @@ Wrinkles
   where the unfolding was added by strictness analysis, say.  Then
   CSE goes ahead, so we get
      bar = foo
-  and probably use SUBSTITUTE that will make 'bar' dead.  But just
-  possibly not -- see Note [Dealing with ticks].  In that case we might
-  be left with
+  or possibly (due to Note [Dealing with ticks])
      bar = tick t1 (tick t2 foo)
-  in which case we would really like to get rid of the stable unfolding
-  (generated by the strictness analyser, say).
+  in both cases we would really like to get rid of the stable unfolding so
+  that the Simplifier inlines the possibly trivial RHS rather than the stable
+  unfolding, which would in turn keep alive other bindings.
 
-  Hence the zapStableUnfolding in cse_bind.  Not a big deal, and only
-  makes a difference when ticks get into the picture.
-
-Note [Corner case for case expressions]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Here is another reason that we do not use SUBSTITUTE for
-all trivial expressions. Consider
-   case x |> co of (y::Array# Int) { ... }
-
-We do not want to extend the substitution with (y -> x |> co); since y
-is of unlifted type, this would destroy the let-can-float invariant if
-(x |> co) was not ok-for-speculation.
-
-But surely (x |> co) is ok-for-speculation, because it's a trivial
-expression, and x's type is also unlifted, presumably.  Well, maybe
-not if you are using unsafe casts.  I actually found a case where we
-had
-   (x :: HValue) |> (UnsafeCo :: HValue ~ Array# Int)
+  Hence the zapStableUnfolding in cse_bind.  Not a big deal.
 
 Note [CSE for join points?]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -371,6 +358,39 @@ To fix this we pass two different cse envs to cse_bind. One we use the cse the r
 And one we update with the result of cseing the rhs which we then use going forward for the
 body/rest of the module.
 
+Note [Canonicalisation reverts binder swap transformation]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The CSEnv maintains two substitutions:
+
+  * cs_subst clones every binder so that CSE does not need to worry about
+    shadowing. See Note [Shadowing in CSE].
+  * cs_canon implements the CANONICALISE case of Note [CSE for bindings], but
+    is never used to actually substitute an expression in the resulting program.
+
+These substitutions must stay distinct. Consider
+
+  data T a :: UnliftedType where MkT :: !a -> T a
+  case x of x' { __DEFAULT -> let y = MkT x' in ... }
+
+Note that the RHS of y satisfies the Note [Core let-can-float invariant] because
+x' is a case binder and thus evaluated.
+Note that the scrutinee x is trivial; hence case CANONICALISE applies and we
+extend cs_canon with
+
+  x' :-> x
+
+Now, /if/ we were to merge cs_canon into cs_subst, then we would apply this
+"reverse binder swap" substitution to the final program and we'd get
+
+  case x of x' { __DEFAULT -> let y = MkT x in ... }
+
+now `MkT x` is no longer ok-for-spec and the program violates the let-can-float
+invariant. This is only temporary, because the next run of the occurrence
+analyser will perform a Note [Binder swap] again, however it will trip up
+CoreLint nonetheless.
+Hence cs_canon is distinct from cs_subst, and the former is only applied before
+looking up a canonicalised key in the reverse mapping.
+
 ************************************************************************
 *                                                                      *
 \section{Common subexpression}
@@ -398,7 +418,7 @@ cseBind toplevel env (Rec [(in_id, rhs)])
   , let previous' = mkTicks ticks previous
         out_id'   = delayInlining toplevel out_id
   = -- We have a hit in the recursive-binding cache
-    (extendCSSubst env1 in_id previous', NonRec out_id' previous')
+    (env1, NonRec out_id' previous')
 
   | otherwise
   = (extendCSRecEnv env1 out_id rhs'' id_expr', Rec [(zapped_id, rhs')])
@@ -427,7 +447,7 @@ cseBind toplevel env (Rec pairs)
 -- We use a different env for cse on the rhs and for extendCSEnvWithBinding
 -- for reasons explain in See Note [Separate envs for let rhs and body]
 cse_bind :: TopLevelFlag -> CSEnv -> CSEnv -> (InId, InExpr) -> OutId -> (CSEnv, (OutId, OutExpr))
-cse_bind toplevel env_rhs env_body (in_id, in_rhs) out_id
+cse_bind toplevel env_rhs env_body (_in_id, in_rhs) out_id
   | isTopLevel toplevel, exprIsTickedString in_rhs
       -- See Note [Take care with literal strings]
   = (env_body', (out_id', in_rhs))
@@ -442,7 +462,7 @@ cse_bind toplevel env_rhs env_body (in_id, in_rhs) out_id
   | otherwise
   = (env_body', (out_id'', out_rhs))
   where
-    (env_body', out_id') = extendCSEnvWithBinding env_body  in_id out_id out_rhs cse_done
+    (env_body', out_id') = extendCSEnvWithBinding env_body out_id out_rhs cse_done
     (cse_done, out_rhs)  = try_for_cse env_rhs in_rhs
     out_id'' | cse_done  = zapStableUnfolding $
                            delayInlining toplevel out_id'
@@ -464,28 +484,28 @@ delayInlining top_lvl bndr
   = bndr
 
 extendCSEnvWithBinding
-           :: CSEnv            -- Includes InId->OutId cloning
-           -> InVar            -- Could be a let-bound type
-           -> OutId -> OutExpr -- Processed binding
-           -> Bool             -- True <=> RHS was CSE'd and is a variable
-                               --          or maybe (Tick t variable)
-           -> (CSEnv, OutId)   -- Final env, final bndr
+           :: CSEnv             -- Includes InId->OutId cloning
+           -> OutVar -> OutExpr -- Processed binding
+           -> Bool              -- True <=> RHS was CSE'd and is a variable
+                                --          or maybe (Tick t variable)
+           -> (CSEnv, OutVar)    -- Final env, final bndr
 -- Extend the CSE env with a mapping [rhs -> out-id]

--- unless we can instead just substitute [in-id -> rhs]
+-- unless we can instead just canonicalise [out-id -> rhs-id]
 --
 -- It's possible for the binder to be a type variable,
--- in which case we can just substitute.
+-- in which case we can just CANONICALISE.
 -- See Note [CSE for bindings]
-extendCSEnvWithBinding env in_id out_id rhs' cse_done
-  | not (isId out_id) = (extendCSSubst env in_id rhs',     out_id)
-  | noCSE out_id      = (env,                              out_id)
-  | use_subst         = (extendCSSubst env in_id rhs',     out_id)
-  | cse_done          = (env,                              out_id)
-                       -- See Note [Dealing with ticks]
-  | otherwise         = (extendCSEnv env rhs' id_expr', zapped_id)
+extendCSEnvWithBinding env v rhs' cse_done
+  -- Should we use CANONICALISE or EXTEND? See Note [CSE for bindings]
+  | not (isId v)   = (extendCSCanon env v rhs', v)              -- CANONICALISE
+  | noCSE v        = (env,                      v)
+  | Var{} <- rhs'  = (extendCSCanon env v rhs', v)              -- CANONICALISE
+  | cse_done       = (env,                      v)
+                    -- See Note [Dealing with ticks]
+  | otherwise      = (extendCSEnv env rhs' id_expr', zapped_id) -- EXTEND
   where
-    id_expr'  = varToCoreExpr out_id
-    zapped_id = zapIdUsageInfo out_id
+    id_expr'  = varToCoreExpr v
+    zapped_id = zapIdUsageInfo v
        -- Putting the Id into the cs_map makes it possible that
        -- it'll become shared more than it is now, which would
        -- invalidate (the usage part of) its demand info.
@@ -496,11 +516,6 @@ extendCSEnvWithBinding env in_id out_id rhs' cse_done
        -- it is bad for performance if you don't do late demand
        -- analysis
 
-    -- Should we use SUBSTITUTE or EXTEND?
-    -- See Note [CSE for bindings]
-    use_subst | Var {} <- rhs' = True
-              | otherwise      = False
-
 -- | Given a binder `let x = e`, this function
 -- determines whether we should add `e -> x` to the cs_map
 noCSE :: InId -> Bool
@@ -544,7 +559,7 @@ the original RHS unmodified. This produces:
 Now 'y' will be discarded as dead code, and we are done.
 
 The net effect is that for the y-binding we want to
-  - Use SUBSTITUTE, by extending the substitution with  y :-> x
+  - Use CANONICALISE, by extending the canon. substitution with  y :-> x
   - but leave the original binding for y undisturbed
 
 This is done by cse_bind.  I got it wrong the first time (#13367).
@@ -587,7 +602,7 @@ with
 where 'y' is the variable that 'e' maps to.  Now consider extendCSEnvWithBinding for
 the binding for 'x':
 
-* We can't use SUBSTITUTE because those ticks might not be trivial (we
+* We can't use CANONICALISE because those ticks might not be trivial (we
   use tickishIsCode in exprIsTrivial)
 
 * We should not use EXTEND, because we definitely don't want to
@@ -721,7 +736,7 @@ cseCase env scrut bndr ty alts
       -- in cse_alt may mean that a dead case binder
       -- becomes alive, and Lint rejects that
     (env1, bndr2)    = addBinder env bndr1
-    (alt_env, bndr3) = extendCSEnvWithBinding env1 bndr bndr2 scrut1 cse_done
+    (alt_env, bndr3) = extendCSEnvWithBinding env1 bndr2 scrut1 cse_done
          -- extendCSEnvWithBinding: see Note [CSE for case expressions]
 
     con_target :: OutExpr
@@ -844,15 +859,21 @@ the case binder is alive; see Note [DataAlt occ info] in GHC.Core.Opt.Simplify.
 
 data CSEnv
   = CS { cs_subst :: Subst  -- Maps InBndrs to OutExprs
-            -- The substitution variables to
+            -- The cloning substitution; maps variables to
             -- /trivial/ OutExprs, not arbitrary expressions
 
+       , cs_canon :: Subst  -- Maps OutBndrs to OutExprs
+            -- The canonicalising substitution to apply before applying the
+            -- reverse mapping cs_subst.
+            -- Maps to /trivial/ OutExprs.
+
        , cs_map   :: CoreMap OutExpr
             -- The "reverse" mapping.
             -- Maps a OutExpr to a /trivial/ OutExpr
-            -- The key of cs_map is stripped of all Ticks
+            -- The key of cs_subst is stripped of all Ticks
             -- It maps arbitrary expressions to trivial expressions
             -- representing the same value. E.g @C a b@ to @x1 at .
+            -- Canonicalise key with cs_canon before looking up in here.
 
        , cs_rec_map :: CoreMap OutExpr
             -- See Note [CSE for recursive bindings]
@@ -860,18 +881,18 @@ data CSEnv
 
 emptyCSEnv :: CSEnv
 emptyCSEnv = CS { cs_map = emptyCoreMap, cs_rec_map = emptyCoreMap
-                , cs_subst = emptySubst }
+                , cs_subst = emptySubst, cs_canon = emptySubst }
 
 lookupCSEnv :: CSEnv -> OutExpr -> Maybe OutExpr
-lookupCSEnv (CS { cs_map = csmap }) expr
-  = lookupCoreMap csmap expr
+lookupCSEnv cse expr
+  = lookupCoreMap (cs_map cse) (canonCSEnv cse expr)
 
 -- | @extendCSEnv env e triv_expr@ will replace any occurrence of @e@ with @triv_expr@ going forward.
 extendCSEnv :: CSEnv -> OutExpr -> OutExpr -> CSEnv
 extendCSEnv cse expr triv_expr
   = cse { cs_map = extendCoreMap (cs_map cse) sexpr triv_expr }
   where
-    sexpr = stripTicksE tickishFloatable expr
+    sexpr = canonCSEnv cse $ stripTicksE tickishFloatable expr
 
 extendCSRecEnv :: CSEnv -> OutId -> OutExpr -> OutExpr -> CSEnv
 -- See Note [CSE for recursive bindings]
@@ -889,8 +910,17 @@ csEnvSubst = cs_subst
 lookupSubst :: CSEnv -> Id -> OutExpr
 lookupSubst (CS { cs_subst = sub}) x = lookupIdSubst sub x
 
-extendCSSubst :: CSEnv -> Id  -> CoreExpr -> CSEnv
-extendCSSubst cse x rhs = cse { cs_subst = extendSubst (cs_subst cse) x rhs }
+extendCSCanon :: CSEnv -> OutVar -> OutExpr -> CSEnv
+extendCSCanon cse x y = cse { cs_canon = extendSubst (cs_canon cse) x y' }
+  where
+    y' = canonCSEnv cse y -- canonicalise y first!
+
+canonCSEnv :: CSEnv -> OutExpr -> OutExpr
+canonCSEnv cse@(CS { cs_canon = sub }) e = substExpr (sub `setInScope` is) e
+  where
+    is = getSubstInScope (cs_subst cse)
+    -- We do not separately maintain the in-scope set of cs_canon; it's just
+    -- the one from the substitution used for cloning.
 
 -- | Add clones to the substitution to deal with shadowing.  See
 -- Note [Shadowing in CSE] for more details.  You should call this whenever


=====================================
compiler/GHC/Core/Opt/CprAnal.hs
=====================================
@@ -298,14 +298,8 @@ data TermFlag -- Better than using a Bool
 exprTerminates :: CoreExpr -> TermFlag
 -- ^ A /very/ simple termination analysis.
 exprTerminates e
-  | exprIsHNF e            = Terminates
   | exprOkForSpeculation e = Terminates
   | otherwise              = MightDiverge
-  -- Annoyingly, we have to check both for HNF and ok-for-spec.
-  --   * `I# (x# *# 2#)` is ok-for-spec, but not in HNF. Still worth CPR'ing!
-  --   * `lvl` is an HNF if its unfolding is evaluated
-  --     (perhaps `lvl = I# 0#` at top-level). But, tiresomely, it is never
-  --     ok-for-spec due to Note [exprOkForSpeculation and evaluated variables].
 
 cprAnalApp :: AnalEnv -> CoreExpr -> [(CprType, CoreArg)] -> (CprType, CoreExpr)
 -- Main function that takes care of /nested/ CPR. See Note [Nested CPR]


=====================================
compiler/GHC/Core/Opt/SetLevels.hs
=====================================
@@ -38,39 +38,58 @@
    We do *not* clone top-level bindings, because some of them must not change,
    but we *do* clone bindings that are heading for the top level
 
-4. Note [Binder-swap during float-out]
-   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+4. Note [Duplicate evals into float]
+   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    In the expression
-        case x of wild { p -> ...wild... }
-   we substitute x for wild in the RHS of the case alternatives:
-        case x of wild { p -> ...x... }
-   This means that a sub-expression involving x is not "trapped" inside the RHS
-   (i.e. it can now be floated out, whereas if it mentioned wild it could not).
-   And it's not inconvenient because we already have a substitution.
+        case x of wild { p -> ...f wild... }
+   the sub-expression `f wild` appears "trapped" inside the Case.
+   (i.e. it cannot be floated out because it mentions wild).
+   In order to "free" it, we float out the expression
+   `case x of wild { __DEFAULT -> f wild }`, thus duplicating the eval.
 
    For example, consider:
 
       f x = letrec go y = case x of z { (a,b) -> ...(expensive z)... }
               in ...
 
-   If we do the reverse binder-swap we get
+   If we dup the eval we get
 
-      f x = letrec go y = case x of z { (a,b) -> ...(expensive x)... }
+      f x = letrec go y = case x of z { (a,b) -> ...(case x of z { __DEFAULT -> expensive z })... }
               in ...
 
    and now we can float out:
 
-      f x = let t = expensive x
-              in letrec go y = case x of z { (a,b) -> ...(t)... }
+      f x = let t = case x of wild { __DEFAULT -> expensive wild }
+              in letrec go y = case x of wild { (a,b) -> ...(t)... }
               in ...
 
-   Now (expensive x) is computed once, rather than once each time around the 'go' loop.
-
-   Note that this is EXACTLY BACKWARDS from the what the simplifier does.
-   The simplifier tries to get rid of occurrences of x, in favour of wild,
-   in the hope that there will only be one remaining occurrence of x, namely
-   the scrutinee of the case, and we can inline it.
-
+   Now (expensive wild) is computed once, rather than once each time around the 'go' loop.
+
+   Implementation:
+
+     1. When entering an eval `case x of wild { __DEFAULT -> ...}`, remember
+        that `wild` is the result of evaluating `x`.
+
+     2. After figuring out the variables over which the expression needs to be
+        abstracted in order to float (abstractVars), go over these variables
+        once more to see if `wild` occurs among them. If so, and if the
+        scrutinee variable `x` is bound at a lower level than the destination
+        level of the float, then remove `wild` as an abs_var and record the
+        obligation to eval `x` into `wild` instead (i.e., add a new `eval_vars`
+        pairing). This is implemented in eliminateAbsCaseBndrs.
+        Of course, it only makes sense to duplicate the eval if we end up
+        with a thunk in doing so; hence eliminateAbsCaseBndrs checks whether
+        the abs_vars become empty after duplicating evals.
+
+
+   Historically, we implemented a reverse binder-swap
+   (see Note [The binder-swap substitution] for the forward direction),
+   which would float `expensive x` and thus drop the eval on `x`. This caused
+   trouble for expressions such as
+     case x of wild { __DEFAULT -> let y = dataToTagLarge# wild in ... }
+   When the case binder `wild` is swapped for the unevaluated scrutinee
+   `x`, `dataToTagLarge# x` no longer is ok-for-spec and thus the unlifted let
+   violates the let-can-float invariant. Hence we duplicate the eval nowadays.
 -}
 
 module GHC.Core.Opt.SetLevels (
@@ -127,6 +146,7 @@ import GHC.Utils.Outputable
 import GHC.Utils.Panic
 
 import Data.Maybe
+import Data.Either
 
 {-
 ************************************************************************
@@ -445,21 +465,21 @@ lvlCase :: LevelEnv             -- Level of in-scope names/tyvars
 lvlCase env scrut_fvs scrut' case_bndr ty alts
   -- See Note [Floating single-alternative cases]
   | [AnnAlt con@(DataAlt {}) bs body] <- alts
-  , exprIsHNF (deTagExpr scrut')  -- See Note [Check the output scrutinee for exprIsHNF]
-  , not (isTopLvl dest_lvl)       -- Can't have top-level cases
-  , not (floatTopLvlOnly env)     -- Can float anywhere
-  , ManyTy <- idMult case_bndr     -- See Note [Floating linear case]
+  , exprOkForSpeculation (deTagExpr scrut')  -- See Note [Check the output scrutinee for exprOkForSpeculation]
+  , not (isTopLvl dest_lvl)                  -- Can't have top-level cases
+  , not (floatTopLvlOnly env)                -- Can float anywhere
+  , ManyTy <- idMult case_bndr               -- See Note [Floating linear case]
   =     -- Always float the case if possible
         -- Unlike lets we don't insist that it escapes a value lambda
     do { (env1, (case_bndr' : bs')) <- cloneCaseBndrs env dest_lvl (case_bndr : bs)
-       ; let rhs_env = extendCaseBndrEnv env1 case_bndr scrut'
+       ; let rhs_env = rememberEval env1 case_bndr' scrut'
        ; body' <- lvlMFE rhs_env True body
        ; let alt' = Alt con (map (stayPut dest_lvl) bs') body'
        ; return (Case scrut' (TB case_bndr' (FloatMe dest_lvl)) ty' [alt']) }
 
   | otherwise     -- Stays put
-  = do { let (alts_env1, [case_bndr']) = substAndLvlBndrs NonRecursive env incd_lvl [case_bndr]
-             alts_env = extendCaseBndrEnv alts_env1 case_bndr scrut'
+  = do { let (alts_env1, [case_bndr'@(TB case_bndrr _)]) = substAndLvlBndrs NonRecursive env incd_lvl [case_bndr]
+             alts_env = rememberEval alts_env1 case_bndrr scrut'
        ; alts' <- mapM (lvl_alt alts_env) alts
        ; return (Case scrut' case_bndr' ty' alts') }
   where
@@ -497,17 +517,8 @@ the inner loop.
 
 Things to note:
 
- * The test we perform is exprIsHNF, and /not/ exprOkForSpeculation.
-
-     - exprIsHNF catches the key case of an evaluated variable
-
-     - exprOkForSpeculation is /false/ of an evaluated variable;
-       See Note [exprOkForSpeculation and evaluated variables] in GHC.Core.Utils
-       So we'd actually miss the key case!
-
-     - Nothing is gained from the extra generality of exprOkForSpeculation
-       since we only consider floating a case whose single alternative
-       is a DataAlt   K a b -> rhs
+ * The test we perform is exprOkForSpeculation, because speculating the case is
+   exactly what we do.
 
  * We can't float a case to top level
 
@@ -557,8 +568,8 @@ needed to quantify over some of its free variables (e.g. z), resulting
 in shadowing and very confusing Core Lint failures.
 
 
-Note [Check the output scrutinee for exprIsHNF]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Note [Check the output scrutinee for exprOkForSpeculation]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Consider this:
   case x of y {
     A -> ....(case y of alts)....
@@ -572,10 +583,10 @@ evaluated), but the former is not -- and indeed we can't float the
 inner case out, at least not unless x is also evaluated at its binding
 site.  See #5453.
 
-That's why we apply exprIsHNF to scrut' and not to scrut.
+That's why we apply exprOkForSpeculation to scrut' and not to scrut.
 
 See Note [Floating single-alternative cases] for why
-we use exprIsHNF in the first place.
+we use exprOkForSpeculation in the first place.
 -}
 
 lvlNonTailMFE :: LevelEnv             -- Level of in-scope names/tyvars
@@ -632,7 +643,7 @@ lvlMFE env strict_ctxt ann_expr
   |  float_is_new_lam || exprIsTopLevelBindable expr expr_ty
          -- No wrapping needed if the type is lifted, or is a literal string
          -- or if we are wrapping it in one or more value lambdas
-  = do { expr1 <- lvlFloatRhs abs_vars dest_lvl rhs_env NonRecursive
+  = do { expr1 <- lvlFloatRhs abs_vars eval_vars dest_lvl rhs_env NonRecursive
                               is_bot_lam NotJoinPoint ann_expr
                   -- Treat the expr just like a right-hand side
        ; var <- newLvlVar expr1 NotJoinPoint is_mk_static
@@ -653,6 +664,7 @@ lvlMFE env strict_ctxt ann_expr
   = do { expr1 <- lvlExpr rhs_env ann_expr
        ; let l1r       = incMinorLvlFrom rhs_env
              float_rhs = mkLams abs_vars_w_lvls $
+                         flip (foldr (wrapEval l1r)) eval_vars $
                          Case expr1 (stayPut l1r ubx_bndr) box_ty
                              [Alt DEFAULT [] (App boxing_expr (Var ubx_bndr))]
 
@@ -678,7 +690,8 @@ lvlMFE env strict_ctxt ann_expr
                            -- See Note [Bottoming floats]
                            -- esp Bottoming floats (2)
     expr_ok_for_spec = exprOkForSpeculation expr
-    abs_vars = abstractVars dest_lvl env fvs
+    (abs_vars, eval_vars) = eliminateAbsCaseBndrs dest_lvl env $
+                            abstractVars dest_lvl env fvs
     dest_lvl = destLevel env fvs fvs_ty is_function is_bot_lam
                -- NB: is_bot_lam not is_bot; see (3) in
                --     Note [Bottoming floats]
@@ -701,16 +714,16 @@ lvlMFE env strict_ctxt ann_expr
     float_me = saves_work || saves_alloc || is_mk_static
 
     -- See Note [Saving work]
-    saves_work = escapes_value_lam        -- (a)
-                 && not (exprIsHNF expr)  -- (b)
-                 && not float_is_new_lam  -- (c)
+    saves_work = escapes_value_lam                   -- (a)
+                 && not (exprOkForSpeculation expr)  -- (b)
+                 && not float_is_new_lam             -- (c)
     escapes_value_lam = dest_lvl `ltMajLvl` (le_ctxt_lvl env)
 
     -- See Note [Saving allocation] and Note [Floating to the top]
     saves_alloc =  isTopLvl dest_lvl
                 && floatConsts env
                 && (   not strict_ctxt                     -- (a)
-                    || exprIsHNF expr                      -- (b)
+                    || exprOkForSpeculation expr           -- (b)
                     || (is_bot_lam && escapes_value_lam))  -- (c)
 
 hasFreeJoin :: LevelEnv -> DVarSet -> Bool
@@ -730,14 +743,14 @@ But see also Note [Saving allocation].
 
 So we definitely float an expression out if
 (a) It will escape a value lambda (escapes_value_lam)
-(b) The expression is not a head-normal form (exprIsHNF); see (SW1, SW2).
+(b) The expression is not a head-normal form (exprOkForSpeculation); see (SW1, SW2).
 (c) Floating does not require wrapping it in value lambdas (float_is_new_lam).
     See (SW3) below
 
 Wrinkles:
 
 (SW1) Concerning (b) I experimented with using `exprIsCheap` rather than
-      `exprIsHNF` but the latter seems better, according to nofib
+      `exprOkForSpeculation` but the latter seems better, according to nofib
       (`spectral/mate` got 10% worse with exprIsCheap).  It's really a bit of a
       heuristic.
 
@@ -755,7 +768,7 @@ Wrinkles:
       See !12410 for some data comparing the effect of omitting (b) altogether,
       This doesn't apply, though, if we float the thing to the top level; see
       Note [Floating to the top].  Bottom line (data from !12410): adding the
-      not.exprIsHNF test to `saves_work`:
+      not.exprOkForSpeculation test to `saves_work`:
        - Decreases compiler allocations by 0.5%
        - Occasionally decreases runtime allocation (T12996 -2.5%)
        - Slightly mixed effect on nofib: (puzzle -10%, mate -5%, cichelli +5%)
@@ -1233,7 +1246,7 @@ lvlBind env (AnnNonRec bndr rhs)
   -- Otherwise we are going to float
   | null abs_vars
   = do {  -- No type abstraction; clone existing binder
-         rhs' <- lvlFloatRhs [] dest_lvl env NonRecursive
+         rhs' <- lvlFloatRhs [] [] dest_lvl env NonRecursive
                              is_bot_lam NotJoinPoint rhs
        ; (env', [bndr']) <- cloneLetVars NonRecursive env dest_lvl [bndr]
        ; let bndr2 = annotateBotStr bndr' 0 mb_bot_str
@@ -1241,7 +1254,7 @@ lvlBind env (AnnNonRec bndr rhs)
 
   | otherwise
   = do {  -- Yes, type abstraction; create a new binder, extend substitution, etc
-         rhs' <- lvlFloatRhs abs_vars dest_lvl env NonRecursive
+         rhs' <- lvlFloatRhs abs_vars eval_vars dest_lvl env NonRecursive
                              is_bot_lam NotJoinPoint rhs
        ; (env', [bndr']) <- newPolyBndrs dest_lvl env abs_vars [bndr]
        ; let bndr2 = annotateBotStr bndr' n_extra mb_bot_str
@@ -1252,7 +1265,8 @@ lvlBind env (AnnNonRec bndr rhs)
     ty_fvs     = tyCoVarsOfType bndr_ty
     rhs_fvs    = freeVarsOf rhs
     bind_fvs   = rhs_fvs `unionDVarSet` dIdFreeVars bndr
-    abs_vars   = abstractVars dest_lvl env bind_fvs
+    (abs_vars, eval_vars) = eliminateAbsCaseBndrs dest_lvl env $
+                            abstractVars dest_lvl env bind_fvs
     dest_lvl   = destLevel env bind_fvs ty_fvs (isFunction rhs) is_bot_lam
 
     deann_rhs  = deAnnotate rhs
@@ -1332,9 +1346,8 @@ lvlBind env (AnnRec pairs)
                       -- function in a Rec, and we don't much care what
                       -- happens to it.  False is simple!
 
-    do_rhs env (_,rhs) = lvlFloatRhs abs_vars dest_lvl env Recursive
-                                     is_bot NotJoinPoint
-                                     rhs
+    do_rhs env (_,rhs) = lvlFloatRhs abs_vars eval_vars dest_lvl env Recursive
+                                     is_bot NotJoinPoint rhs
 
         -- Finding the free vars of the binding group is annoying
     bind_fvs = ((unionDVarSets [ freeVarsOf rhs | (_, rhs) <- pairs])
@@ -1346,7 +1359,8 @@ lvlBind env (AnnRec pairs)
 
     ty_fvs   = foldr (unionVarSet . tyCoVarsOfType . idType) emptyVarSet bndrs
     dest_lvl = destLevel env bind_fvs ty_fvs is_fun is_bot
-    abs_vars = abstractVars dest_lvl env bind_fvs
+    (abs_vars, eval_vars) = eliminateAbsCaseBndrs dest_lvl env $
+                            abstractVars dest_lvl env bind_fvs
 
     is_top_bindable = not (any (mightBeUnliftedType . idType) bndrs)
        -- This mightBeUnliftedType stuff is the same test as in the non-rec case
@@ -1396,21 +1410,28 @@ lvlRhs :: LevelEnv
        -> CoreExprWithFVs
        -> LvlM LevelledExpr
 lvlRhs env rec_flag is_bot mb_join_arity expr
-  = lvlFloatRhs [] (le_ctxt_lvl env) env
+  = lvlFloatRhs [] [] (le_ctxt_lvl env) env
                 rec_flag is_bot mb_join_arity expr
 
-lvlFloatRhs :: [OutVar] -> Level -> LevelEnv -> RecFlag
+wrapEval :: Level -> (OutId, OutId) -> Expr LevelledBndr -> Expr LevelledBndr
+-- A bit like GHC.Core.Utils.mkDefaultCase, but `Expr LevelledBndr`
+wrapEval dest_lvl (scrut_v, case_bndr) body
+  = Case (Var scrut_v) (TB case_bndr (StayPut dest_lvl))
+         (exprType $ deTagExpr body) [Alt DEFAULT [] body]
+
+lvlFloatRhs :: [OutVar] -> [(OutId,OutId)] -> Level -> LevelEnv -> RecFlag
             -> Bool   -- Binding is for a bottoming function
             -> JoinPointHood
             -> CoreExprWithFVs
             -> LvlM (Expr LevelledBndr)
 -- Ignores the le_ctxt_lvl in env; treats dest_lvl as the baseline
-lvlFloatRhs abs_vars dest_lvl env rec is_bot mb_join_arity rhs
-  = do { body' <- if not is_bot  -- See Note [Floating from a RHS]
+lvlFloatRhs abs_vars eval_vars dest_lvl env rec is_bot mb_join_arity rhs
+  = do { body1 <- if not is_bot  -- See Note [Floating from a RHS]
                      && any isId bndrs
                   then lvlMFE  body_env True body
                   else lvlExpr body_env      body
-       ; return (mkLams bndrs' body') }
+       ; let body2 = foldr (wrapEval dest_lvl) body1 eval_vars
+       ; return (mkLams bndrs' body2) }
   where
     (bndrs, body)     | JoinPoint join_arity <- mb_join_arity
                       = collectNAnnBndrs join_arity rhs
@@ -1633,14 +1654,20 @@ countFreeIds = nonDetStrictFoldUDFM add 0 . getUniqDSet
 
 data LevelEnv
   = LE { le_switches :: FloatOutSwitches
-       , le_ctxt_lvl :: Level           -- The current level
-       , le_lvl_env  :: VarEnv Level    -- Domain is *post-cloned* TyVars and Ids
+       , le_ctxt_lvl :: Level
+       -- ^ The current level
+       , le_lvl_env  :: VarEnv (Level, Maybe OutId)
+       -- ^ Domain is *post-cloned* TyVars and Ids.
+       -- If `Just scrut`, then the var mapping belongs to a case binder with
+       -- variable scrutinee `scrut`.
+       -- This is to support Note [Duplicate evals into float].
 
        -- See Note [le_subst and le_env]
-       , le_subst    :: Subst           -- Domain is pre-cloned TyVars and Ids
-                                        -- The Id -> CoreExpr in the Subst is ignored
-                                        -- (since we want to substitute a LevelledExpr for
-                                        -- an Id via le_env) but we do use the Co/TyVar substs
+       , le_subst    :: Subst
+       -- ^ Domain is pre-cloned TyVars and Ids.
+       -- The Id -> CoreExpr in the Subst is ignored
+       -- (since we want to substitute a LevelledExpr for
+       -- an Id via le_env) but we do use the Co/TyVar substs
        , le_env      :: IdEnv ([OutVar], LevelledExpr)  -- Domain is pre-cloned Ids
     }
 
@@ -1690,10 +1717,10 @@ initialEnv float_lams binds
       -- to a later one.  So here we put all the top-level binders in scope before
       -- we start, to satisfy the lookupIdSubst invariants (#20200 and #20294)
 
-addLvl :: Level -> VarEnv Level -> OutVar -> VarEnv Level
-addLvl dest_lvl env v' = extendVarEnv env v' dest_lvl
+addLvl :: Level -> VarEnv (Level, Maybe OutId) -> OutVar -> VarEnv (Level, Maybe OutId)
+addLvl dest_lvl env v' = extendVarEnv env v' (dest_lvl, Nothing)
 
-addLvls :: Level -> VarEnv Level -> [OutVar] -> VarEnv Level
+addLvls :: Level -> VarEnv (Level, Maybe OutId) -> [OutVar] -> VarEnv (Level, Maybe OutId)
 addLvls dest_lvl env vs = foldl' (addLvl dest_lvl) env vs
 
 floatLams :: LevelEnv -> Maybe Int
@@ -1711,20 +1738,19 @@ floatTopLvlOnly le = floatToTopLevelOnly (le_switches le)
 incMinorLvlFrom :: LevelEnv -> Level
 incMinorLvlFrom env = incMinorLvl (le_ctxt_lvl env)
 
--- extendCaseBndrEnv adds the mapping case-bndr->scrut-var if it can
--- See Note [Binder-swap during float-out]
-extendCaseBndrEnv :: LevelEnv
-                  -> Id                 -- Pre-cloned case binder
-                  -> Expr LevelledBndr  -- Post-cloned scrutinee
-                  -> LevelEnv
-extendCaseBndrEnv le@(LE { le_subst = subst, le_env = id_env })
+-- rememberEval adds the mapping case-bndr->scrut-var if it can
+-- See Note [Duplicate evals into float]
+rememberEval :: LevelEnv
+             -> OutId              -- Post-cloned case binder
+             -> Expr LevelledBndr  -- Post-cloned scrutinee
+             -> LevelEnv
+rememberEval le@(LE { le_lvl_env = lvl_env })
                   case_bndr (Var scrut_var)
   -- We could use OccurAnal. scrutOkForBinderSwap here, and perhaps
   -- get a bit more floating.  But we didn't in the past and it's
   -- an unforced change, so I'm leaving it.
-  = le { le_subst   = extendSubstWithVar subst case_bndr scrut_var
-       , le_env     = add_id id_env (case_bndr, scrut_var) }
-extendCaseBndrEnv env _ _ = env
+  = le { le_lvl_env = modifyVarEnv (\(lvl,_) -> (lvl, Just scrut_var)) lvl_env case_bndr }
+rememberEval env _ _ = env
 
 maxFvLevel :: (Var -> Bool) -> LevelEnv -> DVarSet -> Level
 maxFvLevel max_me env var_set
@@ -1745,8 +1771,8 @@ maxIn max_me (LE { le_lvl_env = lvl_env, le_env = id_env }) in_var lvl
   where
     max_out out_var lvl
         | max_me out_var = case lookupVarEnv lvl_env out_var of
-                                Just lvl' -> maxLvl lvl' lvl
-                                Nothing   -> lvl
+                                Just (lvl',_) -> maxLvl lvl' lvl
+                                Nothing       -> lvl
         | otherwise = lvl       -- Ignore some vars depending on max_me
 
 lookupVar :: LevelEnv -> Id -> LevelledExpr
@@ -1765,20 +1791,16 @@ abstractVars :: Level -> LevelEnv -> DVarSet -> [OutVar]
         -- variable computation and deterministic sort.
         -- See Note [Unique Determinism] in GHC.Types.Unique for explanation of why
         -- Uniques are not deterministic.
-abstractVars dest_lvl (LE { le_subst = subst, le_lvl_env = lvl_env }) in_fvs
+abstractVars dest_lvl le@(LE { le_subst = subst }) in_fvs
   =  -- NB: sortQuantVars might not put duplicates next to each other
-    map zap $ sortQuantVars $
-    filter abstract_me      $
-    dVarSetElems            $
-    closeOverKindsDSet      $
+    map zap $ sortQuantVars         $
+    filter (abstractMe dest_lvl le) $
+    dVarSetElems                    $
+    closeOverKindsDSet              $
     substDVarSet subst in_fvs
         -- NB: it's important to call abstract_me only on the OutIds the
         -- come from substDVarSet (not on fv, which is an InId)
   where
-    abstract_me v = case lookupVarEnv lvl_env v of
-                        Just lvl -> dest_lvl `ltLvl` lvl
-                        Nothing  -> False
-
         -- We are going to lambda-abstract, so nuke any IdInfo,
         -- and add the tyvars of the Id (if necessary)
     zap v | isId v = warnPprTrace (isStableUnfolding (idUnfolding v) ||
@@ -1787,6 +1809,31 @@ abstractVars dest_lvl (LE { le_subst = subst, le_lvl_env = lvl_env }) in_fvs
                      setIdInfo v vanillaIdInfo
           | otherwise = v
 
+abstractMe :: Level -> LevelEnv -> Var -> Bool
+abstractMe dest_lvl (LE { le_lvl_env = lvl_env }) v
+  | Just (lvl, _) <- lookupVarEnv lvl_env v
+  = dest_lvl `ltLvl` lvl
+  | otherwise
+  = False
+
+eliminateAbsCaseBndrs :: Level -> LevelEnv -> [OutVar] -> ([OutVar], [(OutId,OutId)])
+-- Try turning all runtime abs_vars into evals.
+-- See Note [Duplicate evals into float]
+eliminateAbsCaseBndrs dest_lvl le@(LE { le_lvl_env = lvl_env }) abs_vars
+  | enables_thunking -- otherwise we do not get to safe work anyway!
+  = res
+  | otherwise
+  = (abs_vars, [])
+  where
+    res@(abs_vars',_eval_vars') = partitionEithers (map try_elim abs_vars)
+    enables_thunking = not (any isRuntimeVar abs_vars')
+
+    try_elim v = case lookupVarEnv lvl_env v of
+      Just (_, Just scrut_id)
+        | not (abstractMe dest_lvl le scrut_id) -- would not abstract scrut_id
+        -> Right (scrut_id, v) -- turn abs_var v into an eval on scrut_id!
+      _ -> Left v              -- retain as an abs_var
+
 type LvlM result = UniqSM result
 
 initLvl :: UniqSupply -> UniqSM a -> a
@@ -1835,9 +1882,12 @@ newLvlVar :: LevelledExpr        -- The RHS of the new binding
           -> LvlM Id
 newLvlVar lvld_rhs join_arity_maybe is_mk_static
   = do { uniq <- getUniqueM
-       ; return (add_join_info (mk_id uniq rhs_ty))
+       ; return (add_evald $ add_join_info $ mk_id uniq rhs_ty)
        }
   where
+    add_evald var
+      | exprIsHNF de_tagged_rhs = var `setIdUnfolding` evaldUnfolding
+      | otherwise               = var
     add_join_info var = var `asJoinId_maybe` join_arity_maybe
     de_tagged_rhs = deTagExpr lvld_rhs
     rhs_ty        = exprType de_tagged_rhs


=====================================
compiler/GHC/Core/Opt/WorkWrap/Utils.hs
=====================================
@@ -1598,7 +1598,7 @@ move_transit_vars :: [Id] -> (CoreExpr -> CoreExpr -> CoreExpr, CoreExpr)
 move_transit_vars vars
   | [var] <- vars
   , let var_ty = idType var
-  , isUnliftedType var_ty || exprIsHNF (Var var)
+  , isUnliftedType var_ty || isEvaldUnfolding (idUnfolding var)
   -- See Note [No unboxed tuple for single, unlifted transit var]
   --   * Wrapper: `unbox scrut alt = (case <scrut> of a -> <alt>)`
   --   * Worker:  `tup = a`
@@ -1702,9 +1702,7 @@ return unboxed instead of in an unboxed singleton tuple:
     We want  `$wh :: Int# -> [Int]`.
     We'd get `$wh :: Int# -> (# [Int] #)`.
 
-By considering vars as unlifted that satisfy 'exprIsHNF', we catch (3).
-Why not check for 'exprOkForSpeculation'? Quite perplexingly, evaluated vars
-are not ok-for-spec, see Note [exprOkForSpeculation and evaluated variables].
+By considering vars as evaluated that have an evald unfolding, we catch (3).
 For (1) and (2) we would have to look at the term. WW only looks at the
 type and the CPR signature, so the only way to fix (1) and (2) would be to
 have a nested termination signature, like in MR !1866.


=====================================
compiler/GHC/Core/Subst.hs
=====================================
@@ -519,7 +519,9 @@ substUnfolding subst df@(DFunUnfolding { df_bndrs = bndrs, df_args = args })
 substUnfolding subst unf@(CoreUnfolding { uf_tmpl = tmpl, uf_src = src })
   -- Retain stable unfoldings
   | not (isStableSource src)  -- Zap an unstable unfolding, to save substitution work
-  = NoUnfolding
+  = if isEvaldUnfolding unf
+    then evaldUnfolding
+    else NoUnfolding
   | otherwise                 -- But keep a stable one!
   = seqExpr new_tmpl `seq`
     unf { uf_tmpl = new_tmpl }


=====================================
compiler/GHC/Core/TyCo/Subst.hs
=====================================
@@ -22,6 +22,7 @@ module GHC.Core.TyCo.Subst
         extendCvSubst, extendCvSubstWithClone,
         extendTvSubst, extendTvSubstWithClone,
         extendTvSubstList, extendTvSubstAndInScope,
+        extendCvSubstAndInScope,
         extendTCvSubstList,
         unionSubst, zipTyEnv, zipCoEnv,
         zipTvSubst, zipCvSubst,
@@ -409,6 +410,14 @@ extendTvSubstAndInScope (Subst in_scope ids tenv cenv) tv ty
              (extendVarEnv tenv tv ty)
              cenv
 
+extendCvSubstAndInScope :: Subst -> CoVar -> Coercion -> Subst
+-- Also extends the in-scope set
+extendCvSubstAndInScope (Subst in_scope ids tenv cenv) cv co
+  = Subst (in_scope `extendInScopeSetSet` tyCoVarsOfCo co)
+             ids
+             tenv
+             (extendVarEnv cenv cv co)
+
 -- | Adds multiple 'TyVar' substitutions to the 'Subst': see also 'extendTvSubst'
 extendTvSubstList :: Subst -> [(TyVar,Type)] -> Subst
 extendTvSubstList subst vrs


=====================================
compiler/GHC/Core/Type.hs
=====================================
@@ -55,7 +55,7 @@ module GHC.Core.Type (
         splitForAllForAllTyBinders, splitForAllForAllTyBinder_maybe,
         splitForAllTyCoVar_maybe, splitForAllTyCoVar,
         splitForAllTyVar_maybe, splitForAllCoVar_maybe,
-        splitPiTy_maybe, splitPiTy, splitPiTys, collectPiTyBinders,
+        splitPiTy_maybe, splitPiTy, splitPiTys,
         getRuntimeArgTys,
         mkTyConBindersPreferAnon,
         mkPiTy, mkPiTys,
@@ -290,7 +290,6 @@ import GHC.Utils.Panic
 import GHC.Data.FastString
 
 import GHC.Data.Maybe   ( orElse, isJust, firstJust )
-import GHC.List (build)
 
 -- $type_classification
 -- #type_classification#
@@ -2056,18 +2055,6 @@ splitPiTys ty = split ty ty []
     split orig_ty ty bs | Just ty' <- coreView ty = split orig_ty ty' bs
     split orig_ty _                bs = (reverse bs, orig_ty)
 
-collectPiTyBinders :: Type -> [PiTyBinder]
-collectPiTyBinders ty = build $ \c n ->
-  let
-    split (ForAllTy b res) = Named b `c` split res
-    split (FunTy { ft_af = af, ft_mult = w, ft_arg = arg, ft_res = res })
-                           = Anon (Scaled w arg) af `c` split res
-    split ty | Just ty' <- coreView ty = split ty'
-    split _                = n
-  in
-    split ty
-{-# INLINE collectPiTyBinders #-}
-
 -- | Extracts a list of run-time arguments from a function type,
 -- looking through newtypes to the right of arrows.
 --


=====================================
compiler/GHC/Core/Utils.hs
=====================================
@@ -74,9 +74,11 @@ import GHC.Core.Ppr
 import GHC.Core.FVs( bindFreeVars )
 import GHC.Core.DataCon
 import GHC.Core.Type as Type
+import GHC.Core.TyCo.Rep as Type
 import GHC.Core.Predicate( isCoVarType )
 import GHC.Core.FamInstEnv
 import GHC.Core.TyCo.Compare( eqType, eqTypeX )
+import GHC.Core.TyCo.Subst
 import GHC.Core.Coercion
 import GHC.Core.Reduction
 import GHC.Core.TyCon
@@ -1714,7 +1716,8 @@ it's applied only to dictionaries.
 
 -----------------------------
 -- | To a first approximation, 'exprOkForSpeculation' returns True of
--- an expression that is:
+-- an expression that is nearly a value (i.e., it implies 'exprIsHNF').
+-- That is, the expression is
 --
 --  * Safe to evaluate even if normal order eval might not
 --    evaluate the expression at all, and
@@ -1740,29 +1743,19 @@ it's applied only to dictionaries.
 -- But in fact both functions are a bit more conservative than the above,
 -- in at least the following ways:
 --
---  * W1: We do not take advantage of already-evaluated lifted variables.
---        As a result, 'exprIsHNF' DOES NOT imply 'exprOkForSpeculation';
---        if @y@ is a case-binder of lifted type, then @exprIsHNF y@ is
---        'True', while @exprOkForSpeculation y@ is 'False'.
---        See Note [exprOkForSpeculation and evaluated variables] for why.
---  * W2: Read-effects on mutable variables are currently also included.
+--  * W1: Read-effects on mutable variables are currently also included.
 --        See Note [Classifying primop effects] "GHC.Builtin.PrimOps".
---  * W3: Currently, 'exprOkForSpeculation' always returns 'False' for
---        let-expressions.  Lets can be stacked deeply, so we just give up.
---        In any case, the argument of 'exprOkForSpeculation' is usually in
---        a strict context, so any lets will have been floated away.
---
 --
 -- As an example of the considerations in this test, consider:
 --
 -- > let x = case y# +# 1# of { r# -> I# r# }
--- > in E
+-- > in e
 --
 -- being translated to:
 --
 -- > case y# +# 1# of { r# ->
 -- >    let x = I# r#
--- >    in E
+-- >    in e
 -- > }
 --
 -- We can only do this if the @y# +# 1#@ is ok for speculation: it has no
@@ -1811,14 +1804,12 @@ expr_ok fun_ok primop_ok (Tick tickish e)
    | tickishCounts tickish = False
    | otherwise             = expr_ok fun_ok primop_ok e
 
-expr_ok _ _ (Let {}) = False
--- See W3 in the Haddock comment for exprOkForSpeculation
+expr_ok fun_ok primop_ok (Let _ e) = expr_ok fun_ok primop_ok e
+  -- Ignore binds becaue of Note [Core let-can-float invariant] in GHC.Core
 
-expr_ok fun_ok primop_ok (Case scrut bndr _ alts)
+expr_ok fun_ok primop_ok (Case scrut _ _ alts)
   =  -- See Note [exprOkForSpeculation: case expressions]
      expr_ok fun_ok primop_ok scrut
-  && isUnliftedType (idType bndr)
-      -- OK to call isUnliftedType: binders always have a fixed RuntimeRep
   && all (\(Alt _ _ rhs) -> expr_ok fun_ok primop_ok rhs) alts
   && altsAreExhaustive alts
 
@@ -1847,7 +1838,7 @@ app_ok fun_ok primop_ok fun args
 
   | idArity fun > n_val_args
   -- Partial application: just check passing the arguments is OK
-  = args_ok
+  = args_ok notCBV
 
   | otherwise
   = case idDetails fun of
@@ -1857,9 +1848,10 @@ app_ok fun_ok primop_ok fun args
 
       DataConWorkId dc
         | isLazyDataConRep dc
-        -> args_ok
+        -> args_ok notCBV
         | otherwise
-        -> fields_ok (dataConRepStrictness dc)
+        -> args_ok (dataConRepStrictness dc)
+            -- See (SFC1) of Note [Strict fields in Core]
 
       ClassOpId _ is_terminating_result
         | is_terminating_result -- See Note [exprOkForSpeculation and type classes]
@@ -1881,7 +1873,7 @@ app_ok fun_ok primop_ok fun args
               -- Often there is a literal divisor, and this
               -- can get rid of a thunk in an inner loop
 
-        | otherwise -> primop_ok op && args_ok
+        | otherwise -> primop_ok op && args_ok notCBV
 
       _other  -- Unlifted and terminating types;
               -- Also c.f. the Var case of exprIsHNF
@@ -1898,37 +1890,54 @@ app_ok fun_ok primop_ok fun args
          -- See (U12) of Note [Implementing unsafeCoerce]
          | fun `hasKey` unsafeEqualityProofIdKey -> True
 
-         | otherwise -> False
-             -- NB: even in the nullary case, do /not/ check
-             --     for evaluated-ness of the fun;
-             --     see Note [exprOkForSpeculation and evaluated variables]
+         | 0 <- n_val_args
+         , isEvaldUnfolding (idUnfolding fun)
+         -> -- pprTrace "yes" (ppr fun)
+            True
+
+         | otherwise -> -- pprTrace "no" (ppr fun <+> ppr args)
+                        False
   where
     fun_ty       = idType fun
     n_val_args   = valArgCount args
-    (arg_tys, _) = splitPiTys fun_ty
 
     -- Even if a function call itself is OK, any unlifted
-    -- args are still evaluated eagerly and must be checked
-    args_ok = all2Prefix arg_ok arg_tys args
-    arg_ok :: PiTyVarBinder -> CoreExpr -> Bool
-    arg_ok (Named _) _ = True   -- A type argument
-    arg_ok (Anon ty _) arg      -- A term argument
-       | definitelyLiftedType (scaledThing ty)
-       = True -- lifted args are not evaluated eagerly
+    -- args are still evaluated eagerly and must be checked.
+    -- Furthermore, for saturated calls, we must check CBV args (strict fields
+    -- of DataCons!)
+    args_ok str_marks = relevantAppArgsSatisfy arg_ok fun_ty args str_marks
+    arg_ok :: Type -> CoreExpr -> StrictnessMark -> Bool
+    arg_ok ty arg str
+       | NotMarkedStrict <- str   -- iff it's not a CBV arg
+       , definitelyLiftedType ty  -- and its type is lifted
+       = True                     -- then the worker app does not eval
        | otherwise
        = expr_ok fun_ok primop_ok arg
 
-    -- Used for strict DataCon worker arguments
-    -- See (SFC1) of Note [Strict fields in Core]
-    fields_ok str_marks = all3Prefix field_ok arg_tys str_marks args
-    field_ok :: PiTyVarBinder -> StrictnessMark -> CoreExpr -> Bool
-    field_ok (Named _)   _   _ = True
-    field_ok (Anon ty _) str arg
-       | NotMarkedStrict <- str                 -- iff it's a lazy field
-       , definitelyLiftedType (scaledThing ty)  -- and its type is lifted
-       = True                                   -- then the worker app does not eval
-       | otherwise
-       = expr_ok fun_ok primop_ok arg
+notCBV :: [StrictnessMark]
+notCBV = repeat NotMarkedStrict
+
+relevantAppArgsSatisfy :: (Type -> CoreExpr -> StrictnessMark -> Bool) -> Type -> [CoreExpr] -> [StrictnessMark] -> Bool
+-- This calls the predicate on every non-Type CoreExpr arg.
+-- We could just do `exprType arg` for every such arg, but carrying around a
+-- substitution is much more efficient.
+-- The obvious definition regresses T16577 by 30% so we don't do it.
+relevantAppArgsSatisfy p ty = go (mkEmptySubst in_scope) ty
+  where
+    in_scope = mkInScopeSet (tyCoVarsOfType ty)
+    go subst (ForAllTy b res) (Type ty : args) strs
+      = go (extendTvSubstAndInScope subst (binderVar b) ty) res args strs
+    go subst (ForAllTy b res) (Coercion co : args) strs
+      = go (extendCvSubstAndInScope subst (binderVar b) co) res args strs
+    go subst (FunTy { ft_arg = arg, ft_res = res }) (e : args) (str : strs)
+      = p (substTy subst arg) e str && go subst res args strs
+    go subst ty args@(_:_) strs@(_:_)
+      | Just ty' <- coreView (substTy subst ty)
+      = go (mkEmptySubst (getSubstInScope subst)) ty' args strs
+      | otherwise
+      = pprPanic "Should see more argument tys" (ppr ty $$ ppr subst $$ ppr args $$ ppr (take 10 strs))
+    go _ _ _ _ = True
+{-# INLINE relevantAppArgsSatisfy #-}
 
 -----------------------------
 altsAreExhaustive :: [Alt b] -> Bool
@@ -1985,6 +1994,8 @@ GHC.Types.Id.Make.mkDictSelId for where this field is initialised.
 
 Note [exprOkForSpeculation: case expressions]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+SG: I don't get what is so "very special" about this case. I find it very reasonable.
+
 exprOkForSpeculation accepts very special case expressions.
 Reason: (a ==# b) is ok-for-speculation, but the litEq rules
 in GHC.Core.Opt.ConstantFold convert it (a ==# 3#) to
@@ -1994,39 +2005,22 @@ for excellent reasons described in
 So, annoyingly, we want that case expression to be
 ok-for-speculation too. Bother.
 
-But we restrict it sharply:
-
-* We restrict it to unlifted scrutinees. Consider this:
-     case x of y {
-       DEFAULT -> ... (let v::Int# = case y of { True  -> e1
-                                               ; False -> e2 }
-                       in ...) ...
-
-  Does the RHS of v satisfy the let-can-float invariant?  Previously we said
-  yes, on the grounds that y is evaluated.  But the binder-swap done
-  by GHC.Core.Opt.SetLevels would transform the inner alternative to
-     DEFAULT -> ... (let v::Int# = case x of { ... }
-                     in ...) ....
-  which does /not/ satisfy the let-can-float invariant, because x is
-  not evaluated. See Note [Binder-swap during float-out]
-  in GHC.Core.Opt.SetLevels.  To avoid this awkwardness it seems simpler
-  to stick to unlifted scrutinees where the issue does not
-  arise.
-
-* We restrict it to exhaustive alternatives. A non-exhaustive
-  case manifestly isn't ok-for-speculation. for example,
-  this is a valid program (albeit a slightly dodgy one)
-    let v = case x of { B -> ...; C -> ... }
-    in case x of
-         A -> ...
-         _ ->  ...v...v....
-  Should v be considered ok-for-speculation?  Its scrutinee may be
-  evaluated, but the alternatives are incomplete so we should not
-  evaluate it strictly.
-
-  Now, all this is for lifted types, but it'd be the same for any
-  finite unlifted type. We don't have many of them, but we might
-  add unlifted algebraic types in due course.
+SG: Again, I don't see why we need to list a specific example.
+Clearly, we want to detect as many expressions as ok-for-spec as possible!
+
+But we restrict it to exhaustive alternatives. A non-exhaustive
+case manifestly isn't ok-for-speculation. For example,
+this is a valid program (albeit a slightly dodgy one)
+  let v = case x of { B -> ...; C -> ... }
+  in case x of
+       A -> ...
+       _ ->  ...v...v....
+Should v be considered ok-for-speculation?  Its scrutinee may be
+evaluated, but the alternatives are incomplete so we should not
+evaluate it strictly.
+
+Now, all this is for lifted types, but it'd be the same for any
+finite unlifted type.
 
 
 ----- Historical note: #15696: --------
@@ -2073,37 +2067,6 @@ points do the job nicely.
 ------- End of historical note ------------
 
 
-Note [exprOkForSpeculation and evaluated variables]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Consider these examples:
- * case x of y { DEFAULT -> ....y.... }
-   Should 'y' (alone) be considered ok-for-speculation?
-
- * case x of y { DEFAULT -> ....let z = dataToTagLarge# y... }
-   Should (dataToTagLarge# y) be considered ok-for-spec? Recall that
-     dataToTagLarge# :: forall a. a -> Int#
-   must always evaluate its argument. (See also Note [DataToTag overview].)
-
-You could argue 'yes', because in the case alternative we know that
-'y' is evaluated.  But the binder-swap transformation, which is
-extremely useful for float-out, changes these expressions to
-   case x of y { DEFAULT -> ....x.... }
-   case x of y { DEFAULT -> ....let z = dataToTagLarge# x... }
-
-And now the expression does not obey the let-can-float invariant!  Yikes!
-Moreover we really might float (dataToTagLarge# x) outside the case,
-and then it really, really doesn't obey the let-can-float invariant.
-
-The solution is simple: exprOkForSpeculation does not try to take
-advantage of the evaluated-ness of (lifted) variables.  And it returns
-False (always) for primops that perform evaluation.  We achieve the latter
-by marking the relevant primops as "ThrowsException" or
-"ReadWriteEffect"; see also Note [Classifying primop effects] in
-GHC.Builtin.PrimOps.
-
-Note that exprIsHNF /can/ and does take advantage of evaluated-ness;
-it doesn't have the trickiness of the let-can-float invariant to worry about.
-
 ************************************************************************
 *                                                                      *
              exprIsHNF, exprIsConLike
@@ -2166,7 +2129,6 @@ exprIsHNFlike is_con is_con_unf e
       || is_con_unf (idUnfolding v)
         -- Check the thing's unfolding; it might be bound to a value
         --   or to a guaranteed-evaluated variable (isEvaldUnfolding)
-        --   Contrast with Note [exprOkForSpeculation and evaluated variables]
         -- We don't look through loop breakers here, which is a bit conservative
         -- but otherwise I worry that if an Id's unfolding is just itself,
         -- we could get an infinite loop
@@ -2185,66 +2147,59 @@ exprIsHNFlike is_con is_con_unf e
                                    && is_hnf_like e
                                       -- See Note [exprIsHNF Tick]
     is_hnf_like (Cast e _)       = is_hnf_like e
-    is_hnf_like (App e a)
-      | isValArg a               = app_is_value e [a]
-      | otherwise                = is_hnf_like e
     is_hnf_like (Let _ e)        = is_hnf_like e  -- Lazy let(rec)s don't affect us
     is_hnf_like (Case e b _ as)
       | Just rhs <- isUnsafeEqualityCase e b as
       = is_hnf_like rhs
-    is_hnf_like _                = False
-
-    -- Collect arguments through Casts and Ticks and call id_app_is_value
-    app_is_value :: CoreExpr -> [CoreArg] -> Bool
-    app_is_value (Var f)    as = id_app_is_value f as
-    app_is_value (Tick _ f) as = app_is_value f as
-    app_is_value (Cast f _) as = app_is_value f as
-    app_is_value (App f a)  as | isValArg a = app_is_value f (a:as)
-                               | otherwise  = app_is_value f as
-    app_is_value _          _  = False
-
-    id_app_is_value id val_args =
+    is_hnf_like e
+      | (fun, args) <- collectArgs e
+      = case stripTicksTopE (not . tickishCounts) fun of
+            Var f -> id_app_is_value f args
+
+            -- 'LitRubbish' is the only literal that can occur in the head of an
+            -- application and will not be matched by the above case (Var /= Lit).
+            -- See Note [How a rubbish literal can be the head of an application]
+            -- in GHC.Types.Literal
+            Lit lit | debugIsOn, not (isLitRubbish lit)
+                     -> pprPanic "Non-rubbish lit in app head" (ppr lit)
+                     | otherwise
+                     -> True
+
+            _ -> False
+
+    id_app_is_value id args =
       -- See Note [exprIsHNF for function applications]
       --   for the specification and examples
-      case compare (idArity id) (length val_args) of
+      case compare (idArity id) n_val_args of
         EQ | is_con id ->      -- Saturated app of a DataCon/CONLIKE Id
           case mb_str_marks id of
             Just str_marks ->  -- with strict fields; see (SFC1) of Note [Strict fields in Core]
-              assert (val_args `equalLength` str_marks) $
-              fields_hnf str_marks
+              args_hnf str_marks
             Nothing ->         -- without strict fields: like PAP
-              args_hnf         -- NB: CONLIKEs are lazy!
+              args_hnf notCBV  -- NB: CONLIKEs are lazy!
 
-        GT ->                  -- PAP: Check unlifted val_args
-          args_hnf
+        GT ->                  -- PAP: Check unlifted args
+          args_hnf notCBV
 
         _  -> False
 
       where
-        -- Saturated, Strict DataCon: Check unlifted val_args and strict fields
-        fields_hnf str_marks = all3Prefix check_field val_arg_tys str_marks val_args
-
-        -- PAP: Check unlifted val_args
-        args_hnf             = all2Prefix check_arg   val_arg_tys           val_args
-
         fun_ty = idType id
-        val_arg_tys = mapMaybe anonPiTyBinderType_maybe (collectPiTyBinders fun_ty)
-          -- val_arg_tys = map exprType val_args, but much less costly.
-          -- The obvious definition regresses T16577 by 30% so we don't do it.
-
-        check_arg a_ty a
+        n_val_args   = valArgCount args
+        -- Check the args for HNFness.
+        args_hnf str_marks = relevantAppArgsSatisfy check_arg fun_ty args str_marks
+        --   * Unlifted args must always be HNF
+        --   * CBV args (strict fields!) must be HNF for saturated calls,
+        --     indicated by str_marks
+        check_arg a_ty a str
           | mightBeUnliftedType a_ty = is_hnf_like a
+          | isMarkedStrict str       = is_hnf_like a
           | otherwise                = True
          -- Check unliftedness; for example f (x /# 12#) where f has arity two,
          -- and the first argument is unboxed. This is not a value!
          -- But  f 34#  is a value, so check args for HNFs.
          -- NB: We check arity (and CONLIKEness) first because it's cheaper
          --     and we reject quickly on saturated apps.
-        check_field a_ty str a
-          | mightBeUnliftedType a_ty = is_hnf_like a
-          | isMarkedStrict str       = is_hnf_like a
-          | otherwise                = True
-          -- isMarkedStrict: Respect Note [Strict fields in Core]
 
         mb_str_marks id
           | Just dc <- isDataConWorkId_maybe id


=====================================
compiler/GHC/CoreToStg/Prep.hs
=====================================
@@ -2241,6 +2241,20 @@ decideFloatInfo FIA{fia_levity=lev, fia_demand=dmd, fia_is_hnf=is_hnf,
   | Lifted   <- lev       = (LetBound, TopLvlFloatable)
       -- And these float freely but can't be speculated, hence LetBound
 
+instance Outputable FloatInfoArgs where
+  ppr FIA{ fia_levity=lev, fia_demand=dmd, fia_is_hnf=hnf, fia_is_triv=triv
+         , fia_is_string=string, fia_is_dc_worker=dcw, fia_ok_for_spec=ok_spec }
+    = brackets (pprWithCommas id traits)
+    where
+      traits = concat $
+        [ [ text "string" | string ]
+        , [ text "DataCon worker binding" | dcw ]
+        , [ text "trivial" | triv ]
+        , [ text "unlifted" | Unlifted <- pure lev ]
+        , [ text "strict" | isStrUsedDmd dmd ]
+        , [ if hnf then text "hnf" else if ok_spec then text "ok-for-spec" else empty ]
+        ]
+
 mkCaseFloat :: Id -> CpeRhs -> FloatingBind
 mkCaseFloat bndr scrut
   = -- pprTrace "mkCaseFloat" (ppr bndr <+> ppr (bound,info)
@@ -2262,13 +2276,12 @@ mkCaseFloat bndr scrut
 mkNonRecFloat :: CorePrepEnv -> Levity -> Id -> CpeRhs -> (FloatingBind, Id)
 mkNonRecFloat env lev bndr rhs
   = -- pprTrace "mkNonRecFloat" (ppr bndr <+> ppr (bound,info)
-    --                             <+> if is_strict then text "strict" else if is_lifted then text "lazy" else text "unlifted"
-    --                             <+> if ok_for_spec then text "ok-for-spec" else empty
-    --                             <+> if evald then text "evald" else empty
+    --                           $$ ppr float_info_args
     --                           $$ ppr rhs) $
     (Float (NonRec bndr' rhs) bound info, bndr')
   where
-    !(bound, info) = decideFloatInfo $ (defFloatInfoArgs bndr rhs)
+    !(bound, info) = decideFloatInfo float_info_args
+    float_info_args = (defFloatInfoArgs bndr rhs)
       { fia_levity = lev
       , fia_is_hnf = is_hnf
       , fia_ok_for_spec = ok_for_spec


=====================================
compiler/GHC/Utils/Misc.hs
=====================================
@@ -23,7 +23,7 @@ module GHC.Utils.Misc (
 
         dropWhileEndLE, spanEnd, last2, lastMaybe, onJust,
 
-        List.foldl1', foldl2, count, countWhile, all2, any2, all2Prefix, all3Prefix,
+        List.foldl1', foldl2, count, countWhile, all2, any2,
 
         lengthExceeds, lengthIs, lengthIsNot,
         lengthAtLeast, lengthAtMost, lengthLessThan,
@@ -658,36 +658,6 @@ any2 :: (a -> b -> Bool) -> [a] -> [b] -> Bool
 any2 p (x:xs) (y:ys) = p x y || any2 p xs ys
 any2 _ _      _      = False
 
-all2Prefix :: forall a b. (a -> b -> Bool) -> [a] -> [b] -> Bool
--- ^ `all2Prefix p xs ys` is a fused version of `and $ zipWith2 p xs ys`.
--- (It generates good code nonetheless.)
--- So if one list is shorter than the other, `p` is assumed to be `True` for the
--- suffix.
-all2Prefix p = foldr k z
-  where
-    k :: a -> ([b] -> Bool) -> [b] -> Bool
-    k x go ys' = case ys' of
-      (y:ys'') -> p x y && go ys''
-      _ -> True
-    z :: [b] -> Bool
-    z _ = True
-{-# INLINE all2Prefix #-}
-
-all3Prefix :: forall a b c. (a -> b -> c -> Bool) -> [a] -> [b] -> [c] -> Bool
--- ^ `all3Prefix p xs ys zs` is a fused version of `and $ zipWith3 p xs ys zs`.
--- (It generates good code nonetheless.)
--- So if one list is shorter than the others, `p` is assumed to be `True` for
--- the suffix.
-all3Prefix p = foldr k z
-  where
-    k :: a -> ([b] -> [c] -> Bool) -> [b] -> [c] -> Bool
-    k x go ys' zs' = case (ys',zs') of
-      (y:ys'',z:zs'') -> p x y z && go ys'' zs''
-      _ -> False
-    z :: [b] -> [c] -> Bool
-    z _ _ = True
-{-# INLINE all3Prefix #-}
-
 -- Count the number of times a predicate is true
 
 count :: (a -> Bool) -> [a] -> Int


=====================================
testsuite/tests/simplCore/should_compile/T25423.hs
=====================================
@@ -0,0 +1,8 @@
+{-# LANGUAGE BangPatterns #-}
+{-# LANGUAGE MagicHash #-}
+{-# LANGUAGE UnboxedTuples #-}
+
+module T25423 where
+
+f :: Int -> Int -> (# (# Int, Int #), (# Int, Int #) #)
+f x y = let !p = (# x, y #) in (# p, p #)


=====================================
testsuite/tests/simplCore/should_compile/T25423.stderr
=====================================
@@ -0,0 +1,41 @@
+
+==================== Tidy Core ====================
+Result size of Tidy Core = {terms: 24, types: 75, coercions: 0, joins: 0/1}
+
+-- RHS size: {terms: 9, types: 39, coercions: 0, joins: 0/1}
+f :: Int -> Int -> (# (# Int, Int #), (# Int, Int #) #)
+[GblId, Arity=2, Str=<L><L>, Cpr=1(1, 1), Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=ALWAYS_IF(arity=2,unsat_ok=True,boring_ok=True)}]
+f = \ (x :: Int) (y :: Int) ->
+      let {
+        p :: (# Int, Int #)
+        [LclId, Unf=OtherCon []]
+        p = (# x, y #) } in
+      (# p, p #)
+
+-- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
+T25423.$trModule4 :: GHC.Prim.Addr#
+[GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 20 0}]
+T25423.$trModule4 = "main"#
+
+-- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
+T25423.$trModule3 :: GHC.Types.TrName
+[GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
+T25423.$trModule3 = GHC.Types.TrNameS T25423.$trModule4
+
+-- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
+T25423.$trModule2 :: GHC.Prim.Addr#
+[GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 30 0}]
+T25423.$trModule2 = "T25423"#
+
+-- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
+T25423.$trModule1 :: GHC.Types.TrName
+[GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
+T25423.$trModule1 = GHC.Types.TrNameS T25423.$trModule2
+
+-- RHS size: {terms: 3, types: 0, coercions: 0, joins: 0/0}
+T25423.$trModule :: GHC.Types.Module
+[GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
+T25423.$trModule = GHC.Types.Module T25423.$trModule3 T25423.$trModule1
+
+
+


=====================================
testsuite/tests/simplCore/should_compile/all.T
=====================================
@@ -532,3 +532,4 @@ test('T24725a', [ grep_errmsg(r'testedRule')], compile, ['-O -ddump-rule-firings
 test('T25033', normal, compile, ['-O'])
 test('T25160', normal, compile, ['-O -ddump-rules'])
 test('T25197', [req_th, extra_files(["T25197_TH.hs"]), only_ways(['optasm'])], multimod_compile, ['T25197', '-O2 -v0'])
+test('T25423', [ grep_errmsg(r'let') ], compile, ['-O -ddump-simpl -dsuppress-uniques'])



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/742e886d8cad589a336ae010010210977619e6f4...05287d4f18caf8f4f1069978f6e315a5e891b0c5

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/742e886d8cad589a336ae010010210977619e6f4...05287d4f18caf8f4f1069978f6e315a5e891b0c5
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20241203/05a41ad7/attachment-0001.html>