[Git][ghc/ghc][wip/T20749] 5 commits: CorePrep: Refactor FloatingBind (#23442)

Sebastian Graf (@sgraf812) gitlab at gitlab.haskell.org
Sun Oct 1 15:17:55 UTC 2023



Sebastian Graf pushed to branch wip/T20749 at Glasgow Haskell Compiler / GHC


Commits:
81bd91e3 by Sebastian Graf at 2023-10-01T15:12:24+02:00
CorePrep: Refactor FloatingBind (#23442)

A drastically improved architecture for local floating in CorePrep
that decouples the decision of whether a float is going to be let- or case-bound
from how far it can float (out of strict contexts, out of lazy contexts, to
top-level).

There are a couple of new Notes describing the effort:

  * `Note [Floating in CorePrep]` for the overview
  * `Note [BindInfo and FloatInfo]` for the new classification of floats
  * `Note [Floats and FloatDecision]` for how FloatInfo is used to inform
    floating decisions

This is necessary ground work for proper treatment of Strict fields and
unlifted values at top-level.

Fixes #23442.

- - - - -
9cd0892a by Sebastian Graf at 2023-10-01T15:12:24+02:00
Fix restarts in .ghcid

Using the whole of `hadrian/` restarted in a loop for me.

- - - - -
cb48d34f by Sebastian Graf at 2023-10-01T15:52:54+02:00
Make DataCon workers strict in strict fields (#20749)

This patch tweaks `exprIsConApp_maybe`, `exprIsHNF` and friends, and Demand
Analysis so that they exploit and maintain strictness of DataCon workers. See
`Note [Strict fields in Core]` for details.

Very little needed to change, and it puts field seq insertion done by Tag
Inference into a new perspective: That of *implementing* strict field semantics.
Before Tag Inference, DataCon workers are strict. Afterwards they are
effectively lazy and field seqs happen around use sites. History has shown
that there is no other way to guarantee taggedness and thus the STG Strict Field
Invariant.

Knock-on changes:

  * `exprIsHNF` previously used `exprOkForSpeculation` on unlifted arguments
    instead of recursing into `exprIsHNF`. That regressed the termination
    analysis in CPR analysis (which simply calls out to `exprIsHNF`), so I made
    it call `exprOkForSpeculation`, too.

  * There's a small regression in Demand Analysis, visible in the changed test
    output of T16859: Previously, a field seq on a variable would give that
    variable a "used exactly once" demand, now it's "used at least once",
    because `dmdTransformDataConSig` accounts for future uses of the field
    that actually all go through the case binder (and hence won't re-enter the
    potential thunk). The difference should hardly be observable.

  * The Simplifier's fast path for data constructors only applies to lazy
    data constructors now. I observed regressions involving Data.Binary.Put's
    `Pair` data type.

  * Unfortunately, T21392 does no longer reproduce after this patch, so I marked
    it as "not broken" in order to track whether we regress again in the future.

Fixes #20749, the satisfying conclusion of an annoying saga (cf. the ideas
in #21497 and #22475).

- - - - -
dffbd39a by Jaro Reinders at 2023-10-01T15:52:54+02:00
Try fixing allocation regressions

- - - - -
3f334d12 by Sebastian Graf at 2023-10-01T17:17:06+02:00
CorePrep: Attach evaldUnfolding to floats to detect more values

- - - - -


26 changed files:

- .ghcid
- compiler/GHC/Builtin/Types.hs
- compiler/GHC/Core.hs
- compiler/GHC/Core/DataCon.hs
- compiler/GHC/Core/Opt/Arity.hs
- compiler/GHC/Core/Opt/ConstantFold.hs
- compiler/GHC/Core/Opt/CprAnal.hs
- compiler/GHC/Core/Opt/DmdAnal.hs
- compiler/GHC/Core/Opt/Simplify/Env.hs
- compiler/GHC/Core/Opt/Simplify/Iteration.hs
- compiler/GHC/Core/SimpleOpt.hs
- compiler/GHC/Core/Type.hs
- compiler/GHC/Core/Utils.hs
- compiler/GHC/CoreToStg/Prep.hs
- compiler/GHC/Data/OrdList.hs
- compiler/GHC/Stg/InferTags.hs
- compiler/GHC/Stg/InferTags/Rewrite.hs
- compiler/GHC/Tc/TyCl/Build.hs
- compiler/GHC/Types/Demand.hs
- compiler/GHC/Types/Id/Info.hs
- compiler/GHC/Types/Id/Make.hs
- compiler/GHC/Utils/Misc.hs
- testsuite/tests/simplCore/should_compile/T18013.stderr
- testsuite/tests/simplCore/should_compile/all.T
- testsuite/tests/simplStg/should_compile/inferTags002.stderr
- testsuite/tests/stranal/sigs/T16859.stderr


Changes:

=====================================
.ghcid
=====================================
@@ -2,4 +2,4 @@
 --reload compiler
 --reload ghc
 --reload includes
---restart hadrian/
+--restart hadrian/ghci


=====================================
compiler/GHC/Builtin/Types.hs
=====================================
@@ -636,6 +636,8 @@ pcDataConWithFixity' declared_infix dc_name wrk_key rri
     -- See Note [Constructor tag allocation] and #14657
     data_con = mkDataCon dc_name declared_infix prom_info
                 (map (const no_bang) arg_tys)
+                (map (const HsLazy) arg_tys)
+                (map (const NotMarkedStrict) arg_tys)
                 []      -- No labelled fields
                 tyvars ex_tyvars
                 conc_tyvars


=====================================
compiler/GHC/Core.hs
=====================================
@@ -42,7 +42,7 @@ module GHC.Core (
         foldBindersOfBindStrict, foldBindersOfBindsStrict,
         collectBinders, collectTyBinders, collectTyAndValBinders,
         collectNBinders, collectNValBinders_maybe,
-        collectArgs, stripNArgs, collectArgsTicks, flattenBinds,
+        collectArgs, collectValArgs, stripNArgs, collectArgsTicks, flattenBinds,
         collectFunSimple,
 
         exprToType,
@@ -1005,6 +1005,60 @@ tail position: A cast changes the type, but the type must be the same. But
 operationally, casts are vacuous, so this is a bit unfortunate! See #14610 for
 ideas how to fix this.
 
+Note [Strict fields in Core]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Evaluating a data constructor worker evaluates its strict fields.
+
+In other words, if `MkT` is strict in its first field and `xs` reduces to
+`error "boom"`, then `MkT xs b` will throw that error.
+Conversely, it is sound to seq the field before the call to the constructor,
+e.g., with `case xs of xs' { __DEFAULT -> MkT xs' b }`.
+Let's call this transformation "field seq insertion".
+
+Note in particular that the data constructor application `MkT xs b` above is
+*not* a value, unless `xs` is!
+
+This has pervasive effect on the Core pipeline:
+
+  * `exprIsHNF`/`exprIsConLike`/`exprOkForSpeculation` need to assert that the
+    strict arguments of a DataCon worker are values/ok-for-spec themselves.
+
+  * `exprIsConApp_maybe` inserts field seqs in the `FloatBind`s it returns, so
+    that the Simplifier, Constant-folding, the pattern-match checker, etc.
+    all see the insert field seqs when they match on strict workers. Often this
+    is just to emphasise strict semantics, but for case-of-known constructor
+    and case-to-let field insertion is *vital*, otherwise these transformations
+    would lose field seqs.
+
+  * The demand signature of a data constructor is strict in strict field
+    position, whereas is it's normally lazy. Likewise the demand *transformer*
+    of a DataCon worker can add stricten up demands on strict field args.
+    See Note [Demand transformer for data constructors].
+
+  * In the absence of `-fpedantic-bottoms`, it is still possible that some seqs
+    are ultimately dropped or delayed due to eta-expansion.
+    See Note [Dealing with bottom].
+
+Strict field semantics is exploited in STG by Note [Tag Inference]:
+It performs field seq insertion to statically guarantee *taggedness* of strict
+fields, establishing the Note [STG Strict Field Invariant]. (Happily, most
+of those seqs are immediately detected as redundant by tag inference and are
+omitted.) From then on, DataCon worker semantics are actually lazy, hence it is
+important that STG passes maintain the Strict Field Invariant.
+
+
+Historical Note:
+The delightfully simple description of strict field semantics is the result of
+a long saga (#20749, the bits about strict data constructors in #21497, #22475),
+where we tried a more lenient (but actually not) semantics first that would
+allow both strict and lazy implementations of DataCon workers. This was favoured
+because the "pervasive effect" throughout the compiler was deemed too large
+(when it really turned out to be very modest).
+Alas, this semantics would require us to implement `exprIsHNF` in *exactly* the
+same way as above, otherwise the analysis would not be conservative wrt. the
+lenient semantics (which includes the strict one). It is also much harder to
+explain and maintain, as it turned out.
+
 ************************************************************************
 *                                                                      *
             In/Out type synonyms
@@ -2091,6 +2145,17 @@ collectArgs expr
     go (App f a) as = go f (a:as)
     go e         as = (e, as)
 
+-- | Takes a nested application expression and returns the function
+-- being applied and the arguments to which it is applied
+collectValArgs :: Expr b -> (Expr b, [Arg b])
+collectValArgs expr
+  = go expr []
+  where
+    go (App f a) as
+      | isValArg a  = go f (a:as)
+      | otherwise   = go f as
+    go e         as = (e, as)
+
 -- | Takes a nested application expression and returns the function
 -- being applied. Looking through casts and ticks to find it.
 collectFunSimple :: Expr b -> Expr b


=====================================
compiler/GHC/Core/DataCon.hs
=====================================
@@ -49,7 +49,8 @@ module GHC.Core.DataCon (
         dataConIsInfix,
         dataConWorkId, dataConWrapId, dataConWrapId_maybe,
         dataConImplicitTyThings,
-        dataConRepStrictness, dataConImplBangs, dataConBoxer,
+        dataConRepStrictness, dataConRepStrictness_maybe,
+        dataConImplBangs, dataConBoxer,
 
         splitDataProductType_maybe,
 
@@ -60,7 +61,7 @@ module GHC.Core.DataCon (
         isVanillaDataCon, isNewDataCon, isTypeDataCon,
         classDataCon, dataConCannotMatch,
         dataConUserTyVarsNeedWrapper, checkDataConTyVars,
-        isBanged, isMarkedStrict, cbvFromStrictMark, eqHsBang, isSrcStrict, isSrcUnpacked,
+        isBanged, isUnpacked, isMarkedStrict, cbvFromStrictMark, eqHsBang, isSrcStrict, isSrcUnpacked,
         specialPromotedDc,
 
         -- ** Promotion related functions
@@ -97,6 +98,7 @@ import GHC.Types.Unique.FM ( UniqFM )
 import GHC.Types.Unique.Set
 import GHC.Builtin.Uniques( mkAlphaTyVarUnique )
 import GHC.Data.Graph.UnVar  -- UnVarSet and operations
+import GHC.Data.Maybe (orElse)
 
 import {-# SOURCE #-} GHC.Tc.Utils.TcType ( ConcreteTyVars )
 
@@ -524,6 +526,18 @@ data DataCon
                 -- Matches 1-1 with dcOrigArgTys
                 -- Hence length = dataConSourceArity dataCon
 
+        dcImplBangs :: [HsImplBang],
+                -- The actual decisions made (including failures)
+                -- about the original arguments; 1-1 with orig_arg_tys
+                -- See Note [Bangs on data constructor arguments]
+
+        dcStricts :: [StrictnessMark],
+                -- One mark for every field of the DataCon worker;
+                -- if it's empty, then all fields are lazy,
+                -- otherwise it has the same length as dataConRepArgTys.
+                -- See also Note [Strict fields in Core] in GHC.Core
+                -- for the effect on the strictness signature
+
         dcFields  :: [FieldLabel],
                 -- Field labels for this constructor, in the
                 -- same order as the dcOrigArgTys;
@@ -826,13 +840,6 @@ data DataConRep
                                           -- after unboxing and flattening,
                                           -- and *including* all evidence args
 
-        , dcr_stricts :: [StrictnessMark]  -- 1-1 with dcr_arg_tys
-                -- See also Note [Data-con worker strictness]
-
-        , dcr_bangs :: [HsImplBang]  -- The actual decisions made (including failures)
-                                     -- about the original arguments; 1-1 with orig_arg_tys
-                                     -- See Note [Bangs on data constructor arguments]
-
     }
 
 type DataConEnv a = UniqFM DataCon a     -- Keyed by DataCon
@@ -901,43 +908,8 @@ eqSpecPreds spec = [ mkPrimEqPred (mkTyVarTy tv) ty
 instance Outputable EqSpec where
   ppr (EqSpec tv ty) = ppr (tv, ty)
 
-{- Note [Data-con worker strictness]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Notice that we do *not* say the worker Id is strict even if the data
-constructor is declared strict
-     e.g.    data T = MkT ![Int] Bool
-Even though most often the evals are done by the *wrapper* $WMkT, there are
-situations in which tag inference will re-insert evals around the worker.
-So for all intents and purposes the *worker* MkT is strict, too!
-
-Unfortunately, if we exposed accurate strictness of DataCon workers, we'd
-see the following transformation:
-
-  f xs = case xs of xs' { __DEFAULT -> ... case MkT xs b of x { __DEFAULT -> [x] } } -- DmdAnal: Strict in xs
-  ==> { drop-seq, binder swap on xs' }
-  f xs = case MkT xs b of x { __DEFAULT -> [x] } -- DmdAnal: Still strict in xs
-  ==> { case-to-let }
-  f xs = let x = MkT xs' b in [x] -- DmdAnal: No longer strict in xs!
-
-I.e., we are ironically losing strictness in `xs` by dropping the eval on `xs`
-and then doing case-to-let. The issue is that `exprIsHNF` currently says that
-every DataCon worker app is a value. The implicit assumption is that surrounding
-evals will have evaluated strict fields like `xs` before! But now that we had
-just dropped the eval on `xs`, that assumption is no longer valid.
-
-Long story short: By keeping the demand signature lazy, the Simplifier will not
-drop the eval on `xs` and using `exprIsHNF` to decide case-to-let and others
-remains sound.
-
-Similarly, during demand analysis in dmdTransformDataConSig, we bump up the
-field demand with `C_01`, *not* `C_11`, because the latter exposes too much
-strictness that will drop the eval on `xs` above.
-
-This issue is discussed at length in
-"Failed idea: no wrappers for strict data constructors" in #21497 and #22475.
-
-Note [Bangs on data constructor arguments]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+{- Note [Bangs on data constructor arguments]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Consider
   data T = MkT !Int {-# UNPACK #-} !Int Bool
 
@@ -963,8 +935,8 @@ Terminology:
   the flag settings in the importing module.
   Also see Note [Bangs on imported data constructors] in GHC.Types.Id.Make
 
-* The dcr_bangs field of the dcRep field records the [HsImplBang]
-  If T was defined in this module, Without -O the dcr_bangs might be
+* The dcImplBangs field records the [HsImplBang]
+  If T was defined in this module, Without -O the dcImplBangs might be
     [HsStrict _, HsStrict _, HsLazy]
   With -O it might be
     [HsStrict _, HsUnpack _, HsLazy]
@@ -973,6 +945,17 @@ Terminology:
   With -XStrictData it might be
     [HsStrict _, HsUnpack _, HsStrict _]
 
+* Core passes will often need to know whether the DataCon worker or wrapper in
+  an application is strict in some (lifted) field or not. This is tracked in the
+  demand signature attached to a DataCon's worker resp. wrapper Id.
+
+  So if you've got a DataCon dc, you can get the demand signature by
+  `idDmdSig (dataConWorkId dc)` and make out strict args by testing with
+  `isStrictDmd`. Similarly, `idDmdSig <$> dataConWrapId_maybe dc` gives
+  you the demand signature of the wrapper, if it exists.
+
+  These demand signatures are set in GHC.Types.Id.Make.
+
 Note [Detecting useless UNPACK pragmas]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 We want to issue a warning when there's an UNPACK pragma in the source code,
@@ -1008,7 +991,6 @@ we consult HsImplBang:
 The boolean flag is used only for this warning.
 See #11270 for motivation.
 
-
 ************************************************************************
 *                                                                      *
 \subsection{Instances}
@@ -1110,6 +1092,11 @@ isBanged (HsUnpack {}) = True
 isBanged (HsStrict {}) = True
 isBanged HsLazy        = False
 
+isUnpacked :: HsImplBang -> Bool
+isUnpacked (HsUnpack {}) = True
+isUnpacked (HsStrict {}) = False
+isUnpacked HsLazy        = False
+
 isSrcStrict :: SrcStrictness -> Bool
 isSrcStrict SrcStrict = True
 isSrcStrict _ = False
@@ -1135,13 +1122,15 @@ cbvFromStrictMark MarkedStrict = MarkedCbv
 
 -- | Build a new data constructor
 mkDataCon :: Name
-          -> Bool           -- ^ Is the constructor declared infix?
-          -> TyConRepName   -- ^  TyConRepName for the promoted TyCon
-          -> [HsSrcBang]    -- ^ Strictness/unpack annotations, from user
-          -> [FieldLabel]   -- ^ Field labels for the constructor,
-                            -- if it is a record, otherwise empty
-          -> [TyVar]        -- ^ Universals.
-          -> [TyCoVar]      -- ^ Existentials.
+          -> Bool               -- ^ Is the constructor declared infix?
+          -> TyConRepName       -- ^  TyConRepName for the promoted TyCon
+          -> [HsSrcBang]        -- ^ Strictness/unpack annotations, from user
+          -> [HsImplBang]       -- ^ Strictness/unpack annotations, as inferred by the compiler
+          -> [StrictnessMark]   -- ^ Strictness marks for the DataCon worker's fields in Core
+          -> [FieldLabel]       -- ^ Field labels for the constructor,
+                                -- if it is a record, otherwise empty
+          -> [TyVar]            -- ^ Universals.
+          -> [TyCoVar]          -- ^ Existentials.
           -> ConcreteTyVars
                                 -- ^ TyVars which must be instantiated with
                                 -- concrete types
@@ -1163,7 +1152,9 @@ mkDataCon :: Name
   -- Can get the tag from the TyCon
 
 mkDataCon name declared_infix prom_info
-          arg_stricts   -- Must match orig_arg_tys 1-1
+          arg_stricts  -- Must match orig_arg_tys 1-1
+          impl_bangs   -- Must match orig_arg_tys 1-1
+          str_marks    -- Must be empty or match dataConRepArgTys 1-1
           fields
           univ_tvs ex_tvs conc_tvs user_tvbs
           eq_spec theta
@@ -1180,6 +1171,8 @@ mkDataCon name declared_infix prom_info
   = con
   where
     is_vanilla = null ex_tvs && null eq_spec && null theta
+    str_marks' | not $ any isMarkedStrict str_marks = []
+               | otherwise                          = str_marks
 
     con = MkData {dcName = name, dcUnique = nameUnique name,
                   dcVanilla = is_vanilla, dcInfix = declared_infix,
@@ -1192,7 +1185,8 @@ mkDataCon name declared_infix prom_info
                   dcStupidTheta = stupid_theta,
                   dcOrigArgTys = orig_arg_tys, dcOrigResTy = orig_res_ty,
                   dcRepTyCon = rep_tycon,
-                  dcSrcBangs = arg_stricts,
+                  dcSrcBangs = arg_stricts, dcImplBangs = impl_bangs,
+                  dcStricts = str_marks',
                   dcFields = fields, dcTag = tag, dcRepType = rep_ty,
                   dcWorkId = work_id,
                   dcRep = rep,
@@ -1436,19 +1430,27 @@ isNullaryRepDataCon :: DataCon -> Bool
 isNullaryRepDataCon dc = dataConRepArity dc == 0
 
 dataConRepStrictness :: DataCon -> [StrictnessMark]
--- ^ Give the demands on the arguments of a
--- Core constructor application (Con dc args)
-dataConRepStrictness dc = case dcRep dc of
-                            NoDataConRep -> [NotMarkedStrict | _ <- dataConRepArgTys dc]
-                            DCR { dcr_stricts = strs } -> strs
+-- ^ Give the demands on the runtime arguments of a Core DataCon worker
+-- application.
+-- The length of the list matches `dataConRepArgTys` (e.g., the number
+-- of runtime arguments).
+dataConRepStrictness dc
+  = dataConRepStrictness_maybe dc
+    `orElse` map (const NotMarkedStrict) (dataConRepArgTys dc)
+
+dataConRepStrictness_maybe :: DataCon -> Maybe [StrictnessMark]
+-- ^ Give the demands on the runtime arguments of a Core DataCon worker
+-- application or `Nothing` if all of them are lazy.
+-- The length of the list matches `dataConRepArgTys` (e.g., the number
+-- of runtime arguments).
+dataConRepStrictness_maybe dc
+  | null (dcStricts dc) = Nothing
+  | otherwise           = Just (dcStricts dc)
 
 dataConImplBangs :: DataCon -> [HsImplBang]
 -- The implementation decisions about the strictness/unpack of each
 -- source program argument to the data constructor
-dataConImplBangs dc
-  = case dcRep dc of
-      NoDataConRep              -> replicate (dcSourceArity dc) HsLazy
-      DCR { dcr_bangs = bangs } -> bangs
+dataConImplBangs dc = dcImplBangs dc
 
 dataConBoxer :: DataCon -> Maybe DataConBoxer
 dataConBoxer (MkData { dcRep = DCR { dcr_boxer = boxer } }) = Just boxer


=====================================
compiler/GHC/Core/Opt/Arity.hs
=====================================
@@ -1463,7 +1463,7 @@ myExprIsCheap (AE { am_opts = opts, am_sigs = sigs }) e mb_ty
         -- See Note [Eta expanding through dictionaries]
         -- See Note [Eta expanding through CallStacks]
 
-    cheap_fun e = exprIsCheapX (myIsCheapApp sigs) e
+    cheap_fun e = exprIsCheapX (myIsCheapApp sigs) False e
 
 -- | A version of 'isCheapApp' that considers results from arity analysis.
 -- See Note [Arity analysis] for what's in the signature environment and why


=====================================
compiler/GHC/Core/Opt/ConstantFold.hs
=====================================
@@ -47,7 +47,7 @@ import GHC.Core
 import GHC.Core.Make
 import GHC.Core.SimpleOpt (  exprIsConApp_maybe, exprIsLiteral_maybe )
 import GHC.Core.DataCon ( DataCon,dataConTagZ, dataConTyCon, dataConWrapId, dataConWorkId )
-import GHC.Core.Utils  ( cheapEqExpr, exprIsHNF
+import GHC.Core.Utils  ( cheapEqExpr, exprOkForSpeculation
                        , stripTicksTop, stripTicksTopT, mkTicks )
 import GHC.Core.Multiplicity
 import GHC.Core.Rules.Config
@@ -2101,7 +2101,7 @@ Things to note
 
 Implementing seq#.  The compiler has magic for SeqOp in
 
-- GHC.Core.Opt.ConstantFold.seqRule: eliminate (seq# <whnf> s)
+- GHC.Core.Opt.ConstantFold.seqRule: eliminate (seq# <ok-for-spec> s)
 
 - GHC.StgToCmm.Expr.cgExpr, and cgCase: special case for seq#
 
@@ -2113,7 +2113,7 @@ Implementing seq#.  The compiler has magic for SeqOp in
 seqRule :: RuleM CoreExpr
 seqRule = do
   [Type _ty_a, Type _ty_s, a, s] <- getArgs
-  guard $ exprIsHNF a
+  guard $ exprOkForSpeculation a
   return $ mkCoreUnboxedTuple [s, a]
 
 -- spark# :: forall a s . a -> State# s -> (# State# s, a #)


=====================================
compiler/GHC/Core/Opt/CprAnal.hs
=====================================
@@ -296,9 +296,16 @@ data TermFlag -- Better than using a Bool
 
 -- See Note [Nested CPR]
 exprTerminates :: CoreExpr -> TermFlag
+-- ^ A /very/ simple termination analysis.
 exprTerminates e
-  | exprIsHNF e = Terminates -- A /very/ simple termination analysis.
-  | otherwise   = MightDiverge
+  | exprIsHNF e            = Terminates
+  | exprOkForSpeculation e = Terminates
+  | otherwise              = MightDiverge
+  -- Annyingly, we have to check both for HNF and ok-for-spec.
+  --   * `I# (x# *# 2#)` is ok-for-spec, but not in HNF. Still worth CPR'ing!
+  --   * `lvl` is an HNF if its unfolding is evaluated
+  --     (perhaps `lvl = I# 0#` at top-level). But, tiresomely, it is never
+  --     ok-for-spec due to Note [exprOkForSpeculation and evaluated variables].
 
 cprAnalApp :: AnalEnv -> CoreExpr -> [(CprType, CoreArg)] -> (CprType, CoreExpr)
 -- Main function that takes care of /nested/ CPR. See Note [Nested CPR]


=====================================
compiler/GHC/Core/Opt/DmdAnal.hs
=====================================
@@ -824,6 +824,10 @@ to the Divergence lattice, but in practice it turned out to be hard to untaint
 from 'topDiv' to 'conDiv', leading to bugs, performance regressions and
 complexity that didn't justify the single fixed testcase T13380c.
 
+You might think that we should check for side-effects rather than just for
+precise exceptions. Right you are! See Note [Side-effects and strictness]
+for why we unfortunately do not.
+
 Note [Demand analysis for recursive data constructors]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 T11545 features a single-product, recursive data type


=====================================
compiler/GHC/Core/Opt/Simplify/Env.hs
=====================================
@@ -8,14 +8,13 @@
 
 module GHC.Core.Opt.Simplify.Env (
         -- * The simplifier mode
-        SimplMode(..), updMode,
-        smPedanticBottoms, smPlatform,
+        SimplMode(..), updMode, smPlatform,
 
         -- * Environments
         SimplEnv(..), pprSimplEnv,   -- Temp not abstract
         seArityOpts, seCaseCase, seCaseFolding, seCaseMerge, seCastSwizzle,
         seDoEtaReduction, seEtaExpand, seFloatEnable, seInline, seNames,
-        seOptCoercionOpts, sePedanticBottoms, sePhase, sePlatform, sePreInline,
+        seOptCoercionOpts, sePhase, sePlatform, sePreInline,
         seRuleOpts, seRules, seUnfoldingOpts,
         mkSimplEnv, extendIdSubst,
         extendTvSubst, extendCvSubst,
@@ -219,9 +218,6 @@ seNames env = sm_names (seMode env)
 seOptCoercionOpts :: SimplEnv -> OptCoercionOpts
 seOptCoercionOpts env = sm_co_opt_opts (seMode env)
 
-sePedanticBottoms :: SimplEnv -> Bool
-sePedanticBottoms env = smPedanticBottoms (seMode env)
-
 sePhase :: SimplEnv -> CompilerPhase
 sePhase env = sm_phase (seMode env)
 
@@ -276,9 +272,6 @@ instance Outputable SimplMode where
          where
            pp_flag f s = ppUnless f (text "no") <+> s
 
-smPedanticBottoms :: SimplMode -> Bool
-smPedanticBottoms opts = ao_ped_bot (sm_arity_opts opts)
-
 smPlatform :: SimplMode -> Platform
 smPlatform opts = roPlatform (sm_rule_opts opts)
 


=====================================
compiler/GHC/Core/Opt/Simplify/Iteration.hs
=====================================
@@ -33,7 +33,7 @@ import GHC.Core.Reduction
 import GHC.Core.Coercion.Opt    ( optCoercion )
 import GHC.Core.FamInstEnv      ( FamInstEnv, topNormaliseType_maybe )
 import GHC.Core.DataCon
-   ( DataCon, dataConWorkId, dataConRepStrictness
+   ( DataCon, dataConWorkId, dataConRepStrictness, dataConRepStrictness_maybe
    , dataConRepArgTys, isUnboxedTupleDataCon
    , StrictnessMark (..) )
 import GHC.Core.Opt.Stats ( Tick(..) )
@@ -2102,14 +2102,14 @@ zap the SubstEnv.  This is VITAL.  Consider
 We'll clone the inner \x, adding x->x' in the id_subst Then when we
 inline y, we must *not* replace x by x' in the inlined copy!!
 
-Note [Fast path for data constructors]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Note [Fast path for lazy data constructors]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 For applications of a data constructor worker, the full glory of
 rebuildCall is a waste of effort;
 * They never inline, obviously
 * They have no rewrite rules
-* They are not strict (see Note [Data-con worker strictness]
-  in GHC.Core.DataCon)
+* Though they might be strict (see Note [Strict fields in Core] in GHC.Core),
+  we will exploit that strictness through their demand signature
 So it's fine to zoom straight to `rebuild` which just rebuilds the
 call in a very straightforward way.
 
@@ -2133,7 +2133,8 @@ simplVar env var
 
 simplIdF :: SimplEnv -> InId -> SimplCont -> SimplM (SimplFloats, OutExpr)
 simplIdF env var cont
-  | isDataConWorkId var         -- See Note [Fast path for data constructors]
+  | Just dc <- isDataConWorkId_maybe var      -- See Note [Fast path for lazy data constructors]
+  , Nothing <- dataConRepStrictness_maybe dc
   = rebuild env (Var var) cont
   | otherwise
   = case substId env var of
@@ -3318,7 +3319,7 @@ a case pattern.  This is *important*.  Consider
 
 We really must record that b is already evaluated so that we don't
 go and re-evaluate it when constructing the result.
-See Note [Data-con worker strictness] in GHC.Core.DataCon
+See Note [Strict fields in Core] in GHC.Core.
 
 NB: simplLamBndrs preserves this eval info
 


=====================================
compiler/GHC/Core/SimpleOpt.hs
=====================================
@@ -1263,11 +1263,8 @@ exprIsConApp_maybe ise@(ISE in_scope id_unf) expr
        --       simplifier produces rhs[exp/a], changing semantics if exp is not ok-for-spec
        -- Good: returning (Mk#, [x]) with a float of  case exp of x { DEFAULT -> [] }
        --       simplifier produces case exp of a { DEFAULT -> exp[x/a] }
-       = let arg' = subst_expr subst arg
-             bndr = uniqAway (subst_in_scope subst) (mkWildValBinder ManyTy arg_type)
-             float = FloatCase arg' bndr DEFAULT []
-             subst' = subst_extend_in_scope subst bndr
-         in go subst' (float:floats) fun (CC (Var bndr : args) co)
+       , (subst', float, bndr) <- case_bind subst arg arg_type
+       = go subst' (float:floats) fun (CC (Var bndr : args) co)
        | otherwise
        = go subst floats fun (CC (subst_expr subst arg : args) co)
 
@@ -1306,8 +1303,9 @@ exprIsConApp_maybe ise@(ISE in_scope id_unf) expr
 
         | Just con <- isDataConWorkId_maybe fun
         , count isValArg args == idArity fun
-        = succeedWith in_scope floats $
-          pushCoDataCon con args co
+        , (in_scope', seq_floats, args') <- mkFieldSeqFloats in_scope con args
+        = succeedWith in_scope' (seq_floats ++ floats) $
+          pushCoDataCon con args' co
 
         -- Look through data constructor wrappers: they inline late (See Note
         -- [Activation for data constructor wrappers]) but we want to do
@@ -1393,6 +1391,36 @@ exprIsConApp_maybe ise@(ISE in_scope id_unf) expr
     extend (Left in_scope) v e = Right (extendSubst (mkEmptySubst in_scope) v e)
     extend (Right s)       v e = Right (extendSubst s v e)
 
+    case_bind :: Either InScopeSet Subst -> CoreExpr -> Type -> (Either InScopeSet Subst, FloatBind, Id)
+    case_bind subst expr expr_ty = (subst', float, bndr)
+      where
+        bndr   = setCaseBndrEvald MarkedStrict $
+                 uniqAway (subst_in_scope subst) $
+                 mkWildValBinder ManyTy expr_ty
+        subst' = subst_extend_in_scope subst bndr
+        expr'  = subst_expr subst expr
+        float  = FloatCase expr' bndr DEFAULT []
+
+    mkFieldSeqFloats :: InScopeSet -> DataCon -> [CoreExpr] -> (InScopeSet, [FloatBind], [CoreExpr])
+    mkFieldSeqFloats in_scope dc args
+      | Nothing <- dataConRepStrictness_maybe dc
+      = (in_scope, [], args)
+      | otherwise
+      = (in_scope', floats', ty_args ++ val_args')
+      where
+        (ty_args, val_args) = splitAtList (dataConUnivAndExTyCoVars dc) args
+        (in_scope', floats', val_args') = foldr do_one (in_scope, [], []) $ zipEqual "mkFieldSeqFloats" str_marks val_args
+        str_marks = dataConRepStrictness dc
+        do_one (str, arg) (in_scope,floats,args)
+          | NotMarkedStrict <- str   = (in_scope, floats, arg:args)
+          | Var v <- arg, is_evald v = (in_scope, floats, arg:args)
+          | otherwise                = (in_scope', float:floats, Var bndr:args)
+          where
+            is_evald v = isId v && isEvaldUnfolding (idUnfolding v)
+            (in_scope', float, bndr) =
+               case case_bind (Left in_scope) arg (exprType arg) of
+                 (Left in_scope', float, bndr) -> (in_scope', float, bndr)
+                 (right, _, _) -> pprPanic "case_bind did not preserve Left" (ppr in_scope $$ ppr arg $$ ppr right)
 
 -- See Note [exprIsConApp_maybe on literal strings]
 dealWithStringLiteral :: Var -> BS.ByteString -> Coercion


=====================================
compiler/GHC/Core/Type.hs
=====================================
@@ -55,7 +55,7 @@ module GHC.Core.Type (
         splitForAllForAllTyBinders, splitForAllForAllTyBinder_maybe,
         splitForAllTyCoVar_maybe, splitForAllTyCoVar,
         splitForAllTyVar_maybe, splitForAllCoVar_maybe,
-        splitPiTy_maybe, splitPiTy, splitPiTys,
+        splitPiTy_maybe, splitPiTy, splitPiTys, collectPiTyBinders,
         getRuntimeArgTys,
         mkTyConBindersPreferAnon,
         mkPiTy, mkPiTys,
@@ -292,6 +292,7 @@ import GHC.Data.FastString
 
 import Control.Monad    ( guard )
 import GHC.Data.Maybe   ( orElse, isJust )
+import GHC.List (build)
 
 -- $type_classification
 -- #type_classification#
@@ -2004,6 +2005,18 @@ splitPiTys ty = split ty ty []
     split orig_ty ty bs | Just ty' <- coreView ty = split orig_ty ty' bs
     split orig_ty _                bs = (reverse bs, orig_ty)
 
+collectPiTyBinders :: Type -> [PiTyBinder]
+collectPiTyBinders ty = build $ \c n ->
+  let
+    split (ForAllTy b res) = Named b `c` split res
+    split (FunTy { ft_af = af, ft_mult = w, ft_arg = arg, ft_res = res })
+                           = Anon (Scaled w arg) af `c` split res
+    split ty | Just ty' <- coreView ty = split ty'
+    split _                = n
+  in
+    split ty
+{-# INLINE collectPiTyBinders #-}
+
 -- | Extracts a list of run-time arguments from a function type,
 -- looking through newtypes to the right of arrows.
 --


=====================================
compiler/GHC/Core/Utils.hs
=====================================
@@ -1269,18 +1269,23 @@ in this (which it previously was):
             in \w. v True
 -}
 
---------------------
-exprIsWorkFree :: CoreExpr -> Bool   -- See Note [exprIsWorkFree]
-exprIsWorkFree e = exprIsCheapX isWorkFreeApp e
-
-exprIsCheap :: CoreExpr -> Bool
-exprIsCheap e = exprIsCheapX isCheapApp e
+-------------------------------------
+type CheapAppFun = Id -> Arity -> Bool
+  -- Is an application of this function to n *value* args
+  -- always cheap, assuming the arguments are cheap?
+  -- True mainly of data constructors, partial applications;
+  -- but with minor variations:
+  --    isWorkFreeApp
+  --    isCheapApp
+  --    isExpandableApp
 
-exprIsCheapX :: CheapAppFun -> CoreExpr -> Bool
+exprIsCheapX :: CheapAppFun -> Bool -> CoreExpr -> Bool
 {-# INLINE exprIsCheapX #-}
--- allow specialization of exprIsCheap and exprIsWorkFree
+-- allow specialization of exprIsCheap, exprIsWorkFree and exprIsExpandable
 -- instead of having an unknown call to ok_app
-exprIsCheapX ok_app e
+-- expandable: Only True for exprIsExpandable, where Case and Let are never
+--             expandable.
+exprIsCheapX ok_app expandable e
   = ok e
   where
     ok e = go 0 e
@@ -1291,7 +1296,7 @@ exprIsCheapX ok_app e
     go _ (Type {})                    = True
     go _ (Coercion {})                = True
     go n (Cast e _)                   = go n e
-    go n (Case scrut _ _ alts)        = ok scrut &&
+    go n (Case scrut _ _ alts)        = not expandable && ok scrut &&
                                         and [ go n rhs | Alt _ _ rhs <- alts ]
     go n (Tick t e) | tickishCounts t = False
                     | otherwise       = go n e
@@ -1299,90 +1304,26 @@ exprIsCheapX ok_app e
                     | otherwise       = go n e
     go n (App f e)  | isRuntimeArg e  = go (n+1) f && ok e
                     | otherwise       = go n f
-    go n (Let (NonRec _ r) e)         = go n e && ok r
-    go n (Let (Rec prs) e)            = go n e && all (ok . snd) prs
+    go n (Let (NonRec _ r) e)         = not expandable && go n e && ok r
+    go n (Let (Rec prs) e)            = not expandable && go n e && all (ok . snd) prs
 
       -- Case: see Note [Case expressions are work-free]
       -- App, Let: see Note [Arguments and let-bindings exprIsCheapX]
 
+--------------------
+exprIsWorkFree :: CoreExpr -> Bool
+-- See Note [exprIsWorkFree]
+exprIsWorkFree e = exprIsCheapX isWorkFreeApp False e
 
-{- Note [exprIsExpandable]
-~~~~~~~~~~~~~~~~~~~~~~~~~~
-An expression is "expandable" if we are willing to duplicate it, if doing
-so might make a RULE or case-of-constructor fire.  Consider
-   let x = (a,b)
-       y = build g
-   in ....(case x of (p,q) -> rhs)....(foldr k z y)....
-
-We don't inline 'x' or 'y' (see Note [Lone variables] in GHC.Core.Unfold),
-but we do want
-
- * the case-expression to simplify
-   (via exprIsConApp_maybe, exprIsLiteral_maybe)
-
- * the foldr/build RULE to fire
-   (by expanding the unfolding during rule matching)
-
-So we classify the unfolding of a let-binding as "expandable" (via the
-uf_expandable field) if we want to do this kind of on-the-fly
-expansion.  Specifically:
-
-* True of constructor applications (K a b)
-
-* True of applications of a "CONLIKE" Id; see Note [CONLIKE pragma] in GHC.Types.Basic.
-  (NB: exprIsCheap might not be true of this)
-
-* False of case-expressions.  If we have
-    let x = case ... in ...(case x of ...)...
-  we won't simplify.  We have to inline x.  See #14688.
-
-* False of let-expressions (same reason); and in any case we
-  float lets out of an RHS if doing so will reveal an expandable
-  application (see SimplEnv.doFloatFromRhs).
-
-* Take care: exprIsExpandable should /not/ be true of primops.  I
-  found this in test T5623a:
-    let q = /\a. Ptr a (a +# b)
-    in case q @ Float of Ptr v -> ...q...
-
-  q's inlining should not be expandable, else exprIsConApp_maybe will
-  say that (q @ Float) expands to (Ptr a (a +# b)), and that will
-  duplicate the (a +# b) primop, which we should not do lightly.
-  (It's quite hard to trigger this bug, but T13155 does so for GHC 8.0.)
--}
+--------------------
+exprIsCheap :: CoreExpr -> Bool
+-- See Note [exprIsCheap]
+exprIsCheap e = exprIsCheapX isCheapApp False e
 
--------------------------------------
+--------------------
 exprIsExpandable :: CoreExpr -> Bool
 -- See Note [exprIsExpandable]
-exprIsExpandable e
-  = ok e
-  where
-    ok e = go 0 e
-
-    -- n is the number of value arguments
-    go n (Var v)                      = isExpandableApp v n
-    go _ (Lit {})                     = True
-    go _ (Type {})                    = True
-    go _ (Coercion {})                = True
-    go n (Cast e _)                   = go n e
-    go n (Tick t e) | tickishCounts t = False
-                    | otherwise       = go n e
-    go n (Lam x e)  | isRuntimeVar x  = n==0 || go (n-1) e
-                    | otherwise       = go n e
-    go n (App f e)  | isRuntimeArg e  = go (n+1) f && ok e
-                    | otherwise       = go n f
-    go _ (Case {})                    = False
-    go _ (Let {})                     = False
-
-
--------------------------------------
-type CheapAppFun = Id -> Arity -> Bool
-  -- Is an application of this function to n *value* args
-  -- always cheap, assuming the arguments are cheap?
-  -- True mainly of data constructors, partial applications;
-  -- but with minor variations:
-  --    isWorkFreeApp
-  --    isCheapApp
+exprIsExpandable e = exprIsCheapX isExpandableApp True e
 
 isWorkFreeApp :: CheapAppFun
 isWorkFreeApp fn n_val_args
@@ -1402,7 +1343,7 @@ isCheapApp fn n_val_args
   | isDeadEndId fn              = True  -- See Note [isCheapApp: bottoming functions]
   | otherwise
   = case idDetails fn of
-      DataConWorkId {} -> True  -- Actually handled by isWorkFreeApp
+      -- DataConWorkId {} -> _  -- Handled by isWorkFreeApp
       RecSelId {}      -> n_val_args == 1  -- See Note [Record selection]
       ClassOpId {}     -> n_val_args == 1
       PrimOpId op _    -> primOpIsCheap op
@@ -1417,6 +1358,7 @@ isExpandableApp fn n_val_args
   | isWorkFreeApp fn n_val_args = True
   | otherwise
   = case idDetails fn of
+      -- DataConWorkId {} -> _  -- Handled by isWorkFreeApp
       RecSelId {}  -> n_val_args == 1  -- See Note [Record selection]
       ClassOpId {} -> n_val_args == 1
       PrimOpId {}  -> False
@@ -1448,6 +1390,50 @@ isExpandableApp fn n_val_args
 I'm not sure why we have a special case for bottoming
 functions in isCheapApp.  Maybe we don't need it.
 
+Note [exprIsExpandable]
+~~~~~~~~~~~~~~~~~~~~~~~
+An expression is "expandable" if we are willing to duplicate it, if doing
+so might make a RULE or case-of-constructor fire.  Consider
+   let x = (a,b)
+       y = build g
+   in ....(case x of (p,q) -> rhs)....(foldr k z y)....
+
+We don't inline 'x' or 'y' (see Note [Lone variables] in GHC.Core.Unfold),
+but we do want
+
+ * the case-expression to simplify
+   (via exprIsConApp_maybe, exprIsLiteral_maybe)
+
+ * the foldr/build RULE to fire
+   (by expanding the unfolding during rule matching)
+
+So we classify the unfolding of a let-binding as "expandable" (via the
+uf_expandable field) if we want to do this kind of on-the-fly
+expansion.  Specifically:
+
+* True of constructor applications (K a b)
+
+* True of applications of a "CONLIKE" Id; see Note [CONLIKE pragma] in GHC.Types.Basic.
+  (NB: exprIsCheap might not be true of this)
+
+* False of case-expressions.  If we have
+    let x = case ... in ...(case x of ...)...
+  we won't simplify.  We have to inline x.  See #14688.
+
+* False of let-expressions (same reason); and in any case we
+  float lets out of an RHS if doing so will reveal an expandable
+  application (see SimplEnv.doFloatFromRhs).
+
+* Take care: exprIsExpandable should /not/ be true of primops.  I
+  found this in test T5623a:
+    let q = /\a. Ptr a (a +# b)
+    in case q @ Float of Ptr v -> ...q...
+
+  q's inlining should not be expandable, else exprIsConApp_maybe will
+  say that (q @ Float) expands to (Ptr a (a +# b)), and that will
+  duplicate the (a +# b) primop, which we should not do lightly.
+  (It's quite hard to trigger this bug, but T13155 does so for GHC 8.0.)
+
 Note [isExpandableApp: bottoming functions]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 It's important that isExpandableApp does not respond True to bottoming
@@ -1628,7 +1614,7 @@ expr_ok fun_ok primop_ok other_expr
         _ -> False
 
 -----------------------------
-app_ok :: (Id -> Bool) -> (PrimOp -> Bool) -> Id -> [CoreExpr] -> Bool
+app_ok :: (Id -> Bool) -> (PrimOp -> Bool) -> Id -> [CoreArg] -> Bool
 app_ok fun_ok primop_ok fun args
   | not (fun_ok fun)
   = False -- This code path is only taken for Note [Speculative evaluation]
@@ -1643,13 +1629,11 @@ app_ok fun_ok primop_ok fun args
          -- DFuns terminate, unless the dict is implemented
          -- with a newtype in which case they may not
 
-      DataConWorkId {} -> args_ok
-                -- The strictness of the constructor has already
-                -- been expressed by its "wrapper", so we don't need
-                -- to take the arguments into account
-                   -- Well, we thought so.  But it's definitely wrong!
-                   -- See #20749 and Note [How untagged pointers can
-                   -- end up in strict fields] in GHC.Stg.InferTags
+      DataConWorkId dc
+        | Just str_marks <- dataConRepStrictness_maybe dc
+        -> fields_ok str_marks
+        | otherwise
+        -> args_ok
 
       ClassOpId _ is_terminating_result
         | is_terminating_result -- See Note [exprOkForSpeculation and type classes]
@@ -1699,7 +1683,7 @@ app_ok fun_ok primop_ok fun args
 
     -- Even if a function call itself is OK, any unlifted
     -- args are still evaluated eagerly and must be checked
-    args_ok = and (zipWith arg_ok arg_tys args)
+    args_ok = all2Prefix arg_ok arg_tys args
     arg_ok :: PiTyVarBinder -> CoreExpr -> Bool
     arg_ok (Named _) _ = True   -- A type argument
     arg_ok (Anon ty _) arg      -- A term argument
@@ -1708,6 +1692,17 @@ app_ok fun_ok primop_ok fun args
        | otherwise
        = expr_ok fun_ok primop_ok arg
 
+    -- Used for DataCon worker arguments
+    fields_ok str_marks = all3Prefix field_ok str_marks arg_tys args
+    field_ok :: StrictnessMark -> PiTyVarBinder -> CoreExpr -> Bool
+    field_ok _   (Named _)   _ = True
+    field_ok str (Anon ty _) arg
+       | NotMarkedStrict <- str                 -- iff it's a lazy field
+       , definitelyLiftedType (scaledThing ty)  -- and its type is lifted
+       = True                                   -- then the worker app does not eval
+       | otherwise
+       = expr_ok fun_ok primop_ok arg
+
 -----------------------------
 altsAreExhaustive :: [Alt b] -> Bool
 -- True  <=> the case alternatives are definitely exhaustive
@@ -1933,12 +1928,14 @@ exprIsConLike = exprIsHNFlike isConLikeId isConLikeUnfolding
 -- or PAPs.
 --
 exprIsHNFlike :: HasDebugCallStack => (Var -> Bool) -> (Unfolding -> Bool) -> CoreExpr -> Bool
-exprIsHNFlike is_con is_con_unf = is_hnf_like
+exprIsHNFlike is_con is_con_unf e
+  = -- pprTraceWith "hnf" (\r -> ppr r <+> ppr e) $
+    is_hnf_like e
   where
     is_hnf_like (Var v) -- NB: There are no value args at this point
-      =  id_app_is_value v 0 -- Catches nullary constructors,
-                             --      so that [] and () are values, for example
-                             -- and (e.g.) primops that don't have unfoldings
+      =  id_app_is_value v [] -- Catches nullary constructors,
+                              --      so that [] and () are values, for example
+                              -- and (e.g.) primops that don't have unfoldings
       || is_con_unf (idUnfolding v)
         -- Check the thing's unfolding; it might be bound to a value
         --   or to a guaranteed-evaluated variable (isEvaldUnfolding)
@@ -1962,7 +1959,7 @@ exprIsHNFlike is_con is_con_unf = is_hnf_like
                                       -- See Note [exprIsHNF Tick]
     is_hnf_like (Cast e _)       = is_hnf_like e
     is_hnf_like (App e a)
-      | isValArg a               = app_is_value e 1
+      | isValArg a               = app_is_value e [a]
       | otherwise                = is_hnf_like e
     is_hnf_like (Let _ e)        = is_hnf_like e  -- Lazy let(rec)s don't affect us
     is_hnf_like (Case e b _ as)
@@ -1970,26 +1967,53 @@ exprIsHNFlike is_con is_con_unf = is_hnf_like
       = is_hnf_like rhs
     is_hnf_like _                = False
 
-    -- 'n' is the number of value args to which the expression is applied
-    -- And n>0: there is at least one value argument
-    app_is_value :: CoreExpr -> Int -> Bool
-    app_is_value (Var f)    nva = id_app_is_value f nva
-    app_is_value (Tick _ f) nva = app_is_value f nva
-    app_is_value (Cast f _) nva = app_is_value f nva
-    app_is_value (App f a)  nva
-      | isValArg a              =
-        app_is_value f (nva + 1) &&
-        not (needsCaseBinding (exprType a) a)
-          -- For example  f (x /# y)  where f has arity two, and the first
-          -- argument is unboxed. This is not a value!
-          -- But  f 34#  is a value.
-          -- NB: Check app_is_value first, the arity check is cheaper
-      | otherwise               = app_is_value f nva
-    app_is_value _          _   = False
-
-    id_app_is_value id n_val_args
-       = is_con id
-       || idArity id > n_val_args
+    -- Collect arguments through Casts and Ticks and call id_app_is_value
+    app_is_value :: CoreExpr -> [CoreArg] -> Bool
+    app_is_value (Var f)    as = id_app_is_value f as
+    app_is_value (Tick _ f) as = app_is_value f as
+    app_is_value (Cast f _) as = app_is_value f as
+    app_is_value (App f a)  as | isValArg a = app_is_value f (a:as)
+                               | otherwise  = app_is_value f as
+    app_is_value _          _  = False
+
+    id_app_is_value id val_args
+       -- First handle saturated applications of DataCons with strict fields
+       | Just dc <- isDataConWorkId_maybe id              -- DataCon
+       , Just str_marks <- dataConRepStrictness_maybe dc  -- with strict fields
+       , assert (val_args `leLength` str_marks) True
+       , val_args `equalLength` str_marks                 -- in a saturated app
+       = all3Prefix check_field str_marks (mapMaybe anonPiTyBinderType_maybe  (collectPiTyBinders (idType id))) val_args
+
+       -- Now all applications except saturated DataCon apps with strict fields
+       |  idArity id > length val_args
+            -- PAP: Check unlifted val_args
+       || is_con id && isNothing (isDataConWorkId_maybe id >>= dataConRepStrictness_maybe)
+            -- Either a lazy DataCon or a CONLIKE.
+            -- Hence we only need to check unlifted val_args here.
+            -- NB: We assume that CONLIKEs are lazy, which is their entire
+            --     point.
+       = all2Prefix check_arg (mapMaybe anonPiTyBinderType_maybe  (collectPiTyBinders (idType id))) val_args
+
+       | otherwise
+       = False
+       where
+         -- fun_ty  = idType id
+         -- arg_tys = collectPiTyBinders fun_ty
+         -- val_arg_tys = mapMaybe anonPiTyBinderType_maybe  arg_tys
+         -- val_arg_tys = map exprType val_args, but much less costly.
+         -- The obvious definition regresses T16577 by 30% so we don't do it.
+
+         check_arg a_ty a = mightBeUnliftedType a_ty ==> is_hnf_like a
+          -- Check unliftedness; for example f (x /# 12#) where f has arity two,
+          -- and the first argument is unboxed. This is not a value!
+          -- But  f 34#  is a value, so check args for HNFs.
+          -- NB: We check arity (and CONLIKEness) first because it's cheaper
+          --     and we reject quickly on saturated apps.
+         check_field str a_ty a
+           = isMarkedStrict str || mightBeUnliftedType a_ty ==> is_hnf_like a
+         a ==> b = not a || b
+         infixr 1 ==>
+{-# INLINE exprIsHNFlike #-}
 
 {-
 Note [exprIsHNF Tick]
@@ -2551,7 +2575,7 @@ This means the seqs on x and y both become no-ops and compared to the first vers
 
 The downside is that the caller of $wfoo potentially has to evaluate `y` once if we can't prove it isn't already evaluated.
 But y coming out of a strict field is in WHNF so safe to evaluated. And most of the time it will be properly tagged+evaluated
-already at the call site because of the Strict Field Invariant! See Note [Strict Field Invariant] for more in this.
+already at the call site because of the Strict Field Invariant! See Note [STG Strict Field Invariant] for more in this.
 This makes GHC itself around 1% faster despite doing slightly more work! So this is generally quite good.
 
 We only apply this when we think there is a benefit in doing so however. There are a number of cases in which


=====================================
compiler/GHC/CoreToStg/Prep.hs
=====================================
@@ -1,4 +1,5 @@
 {-# LANGUAGE BangPatterns #-}
+{-# LANGUAGE ViewPatterns #-}
 
 {-# OPTIONS_GHC -Wno-incomplete-uni-patterns #-}
 
@@ -97,7 +98,8 @@ The goal of this pass is to prepare for code generation.
     (The code generator can't deal with anything else.)
     Type lambdas are ok, however, because the code gen discards them.
 
-5.  [Not any more; nuked Jun 2002] Do the seq/par munging.
+5.  Flatten nested lets as much as possible.
+    See Note [Floating in CorePrep].
 
 6.  Clone all local Ids.
     This means that all such Ids are unique, rather than the
@@ -217,7 +219,7 @@ corePrepPgm logger cp_cfg pgm_cfg
         binds_out = initUs_ us $ do
                       floats1 <- corePrepTopBinds initialCorePrepEnv binds
                       floats2 <- corePrepTopBinds initialCorePrepEnv implicit_binds
-                      return (deFloatTop (floats1 `appendFloats` floats2))
+                      return (deFloatTop (floats1 `zipFloats` floats2))
 
     endPassIO logger (cpPgm_endPassConfig pgm_cfg)
               binds_out []
@@ -244,7 +246,7 @@ corePrepTopBinds initialCorePrepEnv binds
                                  -- Only join points get returned this way by
                                  -- cpeBind, and no join point may float to top
                                floatss <- go env' binds
-                               return (floats `appendFloats` floatss)
+                               return (floats `zipFloats` floatss)
 
 mkDataConWorkers :: Bool -> ModLocation -> [TyCon] -> [CoreBind]
 -- See Note [Data constructor workers]
@@ -268,7 +270,48 @@ mkDataConWorkers generate_debug_info mod_loc data_tycons
              LexicalFastString $ mkFastString $ renderWithContext defaultSDocContext $ ppr name
            span1 file = realSrcLocSpan $ mkRealSrcLoc (mkFastString file) 1 1
 
-{-
+{- Note [Floating in CorePrep]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ANFisation produces a lot of nested lets that obscures values:
+  let v = (:) (f 14) [] in e
+  ==> { ANF in CorePrep }
+  let v = let sat = f 14 in (:) sat [] in e
+Here, `v` is not a value anymore, and we'd allocate a thunk closure for `v` that
+allocates a thunk for `sat` and then allocates the cons cell.
+Very often (caveat in Wrinkle (FCP1) below), `v` is actually used and the
+allocation of the thunk closure was in vain.
+Hence we carry around a bunch of floated bindings with us so that we can leave
+behind more values:
+  let v = let sat = f 14 in (:) sat [] in e
+  ==> { Float sat }
+  let sat = f 14 in
+  let v = (:) sat [] in e
+Floating to top-level can make an asymptotic difference, because `sat` becomes
+an SAT; See Note [Floating out of top level bindings].
+For nested let bindings, we have to keep in mind Note [Core letrec invariant]
+and may exploit strict contexts; See Note [wantFloatLocal].
+
+There are 3 main categories of floats, encoded in the `FloatingBind` type:
+
+  * `Float`: A floated binding, as `sat` above.
+    These come in different flavours as described by their `FloatInfo` and
+    `BindInfo`, which captures how far the binding can be floated and whether or
+    not we want to case-bind. See Note [BindInfo and FloatInfo].
+  * `UnsafeEqualityCase`: Used for floating around unsafeEqualityProof bindings;
+    see (U3) of Note [Implementing unsafeCoerce].
+    It's simply like any other ok-for-spec-eval Float (see `mkFloat`) that has
+    a non-DEFAULT Case alternative to bind the unsafe coercion field of the Refl
+    constructor.
+  * `FloatTick`: A floated `Tick`. See Note [Floating Ticks in CorePrep].
+
+Wrinkles:
+ (FCP1)
+    Local floating as above is not necessarily an optimisation if `v` doesn't
+    up being evaluated. In that case, we allocate a 4 word closure for `v` but
+    won't need to allocate separate closures for `sat` (2 words) and the `(:)`
+    conapp (3 words), thus saving at least one word, depending on how much the
+    set of FVs between `sat` and `(:)` overlap.
+
 Note [Floating out of top level bindings]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 NB: we do need to float out of top-level bindings
@@ -557,7 +600,7 @@ cpeBind top_lvl env (NonRec bndr rhs)
              floats1 | triv_rhs, isInternalName (idName bndr)
                      = floats
                      | otherwise
-                     = addFloat floats new_float
+                     = snocFloat floats new_float
 
              new_float = mkFloat env dmd is_unlifted bndr1 rhs1
 
@@ -578,15 +621,18 @@ cpeBind top_lvl env (Rec pairs)
        ; stuff <- zipWithM (cpePair top_lvl Recursive topDmd False env')
                            bndrs1 rhss
 
-       ; let (floats_s, rhss1) = unzip stuff
-             -- Glom all floats into the Rec, *except* FloatStrings which can
-             -- (and must, because unlifted!) float further.
-             (string_floats, all_pairs) =
-               foldrOL add_float (emptyFloats, bndrs1 `zip` rhss1)
-                                 (concatFloats floats_s)
+       ; let (zipManyFloats -> floats, rhss1) = unzip stuff
+             -- Glom all floats into the Rec, *except* FloatStrings; see
+             -- see Note [ANF-ising literal string arguments], Wrinkle (FS1)
+             is_lit (Float (NonRec _ rhs) CaseBound TopLvlFloatable) = exprIsTickedString rhs
+             is_lit _                                                = False
+             (string_floats, top) = partitionOL is_lit (top_floats floats)
+             floats'   = floats { top_floats = top }
+             all_pairs = foldrOL add_float (bndrs1 `zip` rhss1) (getFloats floats')
        -- use env below, so that we reset cpe_rec_ids
        ; return (extendCorePrepEnvList env (bndrs `zip` bndrs1),
-                 string_floats `addFloat` FloatLet (Rec all_pairs),
+                 snocFloat (emptyFloats { top_floats = string_floats })
+                           (Float (Rec all_pairs) LetBound TopLvlFloatable),
                  Nothing) }
 
   | otherwise -- See Note [Join points and floating]
@@ -604,10 +650,11 @@ cpeBind top_lvl env (Rec pairs)
 
         -- Flatten all the floats, and the current
         -- group into a single giant Rec
-    add_float (FloatLet (NonRec b r)) (ss, prs2) = (ss, (b,r)    : prs2)
-    add_float (FloatLet (Rec prs1))   (ss, prs2) = (ss, prs1    ++ prs2)
-    add_float s at FloatString{}         (ss, prs2) = (addFloat ss s, prs2)
-    add_float b                       _          = pprPanic "cpeBind" (ppr b)
+    add_float (Float bind bound _) prs2
+      | bound /= CaseBound = case bind of
+          NonRec x e -> (x,e) : prs2
+          Rec prs1 -> prs1 ++ prs2
+    add_float f                       _    = pprPanic "cpeBind" (ppr f)
 
 ---------------
 cpePair :: TopLevelFlag -> RecFlag -> Demand -> Bool
@@ -620,7 +667,8 @@ cpePair top_lvl is_rec dmd is_unlifted env bndr rhs
     do { (floats1, rhs1) <- cpeRhsE env rhs
 
        -- See if we are allowed to float this stuff out of the RHS
-       ; (floats2, rhs2) <- float_from_rhs floats1 rhs1
+       ; let dec = want_float_from_rhs floats1 rhs1
+       ; (floats2, rhs2) <- executeFloatDecision dec floats1 rhs1
 
        -- Make the arity match up
        ; (floats3, rhs3)
@@ -629,9 +677,11 @@ cpePair top_lvl is_rec dmd is_unlifted env bndr rhs
                else warnPprTrace True "CorePrep: silly extra arguments:" (ppr bndr) $
                                -- Note [Silly extra arguments]
                     (do { v <- newVar (idType bndr)
-                        ; let float = mkFloat env topDmd False v rhs2
-                        ; return ( addFloat floats2 float
-                                 , cpeEtaExpand arity (Var v)) })
+                        ; let float@(Float (NonRec v' _) _ _) =
+                                mkFloat env topDmd False v rhs2
+                        -- v' has demand info and possibly evaldUnfolding
+                        ; return ( snocFloat floats2 float
+                                 , cpeEtaExpand arity (Var v')) })
 
         -- Wrap floating ticks
        ; let (floats4, rhs4) = wrapTicks floats3 rhs3
@@ -640,35 +690,30 @@ cpePair top_lvl is_rec dmd is_unlifted env bndr rhs
   where
     arity = idArity bndr        -- We must match this arity
 
-    ---------------------
-    float_from_rhs floats rhs
-      | isEmptyFloats floats = return (emptyFloats, rhs)
-      | isTopLevel top_lvl   = float_top    floats rhs
-      | otherwise            = float_nested floats rhs
-
-    ---------------------
-    float_nested floats rhs
-      | wantFloatNested is_rec dmd is_unlifted floats rhs
-                  = return (floats, rhs)
-      | otherwise = dontFloat floats rhs
-
-    ---------------------
-    float_top floats rhs
-      | allLazyTop floats
-      = return (floats, rhs)
-
-      | otherwise
-      = dontFloat floats rhs
-
-dontFloat :: Floats -> CpeRhs -> UniqSM (Floats, CpeBody)
--- Non-empty floats, but do not want to float from rhs
--- So wrap the rhs in the floats
--- But: rhs1 might have lambdas, and we can't
---      put them inside a wrapBinds
-dontFloat floats1 rhs
-  = do { (floats2, body) <- rhsToBody rhs
-        ; return (emptyFloats, wrapBinds floats1 $
-                               wrapBinds floats2 body) }
+    want_float_from_rhs floats rhs
+      | isTopLevel top_lvl = FloatSome TopLvlFloatable
+      | otherwise          = wantFloatLocal is_rec dmd is_unlifted floats rhs
+
+data FloatDecision
+  = FloatNone
+  | FloatAll
+  | FloatSome !FloatInfo -- ^ Float all bindings with <= this info
+
+executeFloatDecision :: FloatDecision -> Floats -> CpeRhs -> UniqSM (Floats, CpeRhs)
+executeFloatDecision dec floats rhs = do
+  let (float,stay) = case dec of
+        _ | isEmptyFloats floats -> (emptyFloats,emptyFloats)
+        FloatNone                -> (emptyFloats, floats)
+        FloatAll                 -> (floats, emptyFloats)
+        FloatSome info           -> partitionFloats info floats
+  -- Wrap `stay` around `rhs`.
+  -- NB: `rhs` might have lambdas, and we can't
+  --     put them inside a wrapBinds, which expects a `CpeBody`.
+  if isEmptyFloats stay -- Fast path where we don't need to call `rhsToBody`
+    then return (float, rhs)
+    else do
+      (floats', body) <- rhsToBody rhs
+      return (float, wrapBinds stay $ wrapBinds floats' body)
 
 {- Note [Silly extra arguments]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -754,14 +799,14 @@ cpeRhsE env (Let bind body)
        ; (body_floats, body') <- cpeRhsE env' body
        ; let expr' = case maybe_bind' of Just bind' -> Let bind' body'
                                          Nothing    -> body'
-       ; return (bind_floats `appendFloats` body_floats, expr') }
+       ; return (bind_floats `appFloats` body_floats, expr') }
 
 cpeRhsE env (Tick tickish expr)
   -- Pull out ticks if they are allowed to be floated.
   | tickishFloatable tickish
   = do { (floats, body) <- cpeRhsE env expr
          -- See [Floating Ticks in CorePrep]
-       ; return (unitFloat (FloatTick tickish) `appendFloats` floats, body) }
+       ; return (FloatTick tickish `consFloat` floats, body) }
   | otherwise
   = do { body <- cpeBodyNF env expr
        ; return (emptyFloats, mkTick tickish' body) }
@@ -805,12 +850,12 @@ cpeRhsE env (Case scrut bndr _ alts@[Alt con bs _])
        ; (floats_rhs, rhs) <- cpeBody env rhs
          -- ... but we want to float `floats_rhs` as in (U3) so that rhs' might
          -- become a value
-       ; let case_float = FloatCase scrut bndr con bs True
-         -- NB: True <=> ok-for-spec; it is OK to "evaluate" the proof eagerly.
+       ; let case_float = UnsafeEqualityCase scrut bndr con bs
+         -- NB: It is OK to "evaluate" the proof eagerly.
          --     Usually there's the danger that we float the unsafeCoerce out of
          --     a branching Case alt. Not so here, because the regular code path
          --     for `cpeRhsE Case{}` will not float out of alts.
-             floats = addFloat floats_scrut case_float `appendFloats` floats_rhs
+             floats = snocFloat floats_scrut case_float `appFloats` floats_rhs
        ; return (floats, rhs) }
 
 cpeRhsE env (Case scrut bndr ty alts)
@@ -859,7 +904,7 @@ cpeBody :: CorePrepEnv -> CoreExpr -> UniqSM (Floats, CpeBody)
 cpeBody env expr
   = do { (floats1, rhs) <- cpeRhsE env expr
        ; (floats2, body) <- rhsToBody rhs
-       ; return (floats1 `appendFloats` floats2, body) }
+       ; return (floats1 `appFloats` floats2, body) }
 
 --------
 rhsToBody :: CpeRhs -> UniqSM (Floats, CpeBody)
@@ -882,7 +927,7 @@ rhsToBody expr@(Lam {})   -- See Note [No eta reduction needed in rhsToBody]
   | otherwise                   -- Some value lambdas
   = do { let rhs = cpeEtaExpand (exprArity expr) expr
        ; fn <- newVar (exprType rhs)
-       ; let float = FloatLet (NonRec fn rhs)
+       ; let float = Float (NonRec fn rhs) LetBound TopLvlFloatable
        ; return (unitFloat float, Var fn) }
   where
     (bndrs,_) = collectBinders expr
@@ -1125,7 +1170,8 @@ cpeApp top_env expr
         :: CorePrepEnv
         -> [ArgInfo]                  -- The arguments (inner to outer)
         -> CpeApp                     -- The function
-        -> Floats
+        -> Floats                     -- INVARIANT: These floats don't bind anything that is in the CpeApp!
+                                      -- Just stuff floated out from the head of the application.
         -> [Demand]
         -> Maybe Arity
         -> UniqSM (CpeApp
@@ -1170,7 +1216,7 @@ cpeApp top_env expr
                    (ss1 : ss_rest, False) -> (ss1,    ss_rest)
                    ([],            _)     -> (topDmd, [])
         (fs, arg') <- cpeArg top_env ss1 arg
-        rebuild_app' env as (App fun' arg') (fs `appendFloats` floats) ss_rest rt_ticks (req_depth-1)
+        rebuild_app' env as (App fun' arg') (fs `zipFloats` floats) ss_rest rt_ticks (req_depth-1)
 
       CpeCast co
         -> rebuild_app' env as (Cast fun' co) floats ss rt_ticks req_depth
@@ -1182,7 +1228,7 @@ cpeApp top_env expr
            rebuild_app' env as fun' floats ss (tickish:rt_ticks) req_depth
         | otherwise
         -- See [Floating Ticks in CorePrep]
-        -> rebuild_app' env as fun' (addFloat floats (FloatTick tickish)) ss rt_ticks req_depth
+        -> rebuild_app' env as fun' (snocFloat floats (FloatTick tickish)) ss rt_ticks req_depth
 
 isLazyExpr :: CoreExpr -> Bool
 -- See Note [lazyId magic] in GHC.Types.Id.Make
@@ -1261,8 +1307,7 @@ Other relevant Notes:
  * Note [runRW arg] below, describing a non-obvious case where the
    late-inlining could go wrong.
 
-
- Note [runRW arg]
+Note [runRW arg]
 ~~~~~~~~~~~~~~~~~~~
 Consider the Core program (from #11291),
 
@@ -1294,7 +1339,6 @@ the function and the arguments) will forgo binding it to a variable. By
 contrast, in the non-bottoming case of `hello` above  the function will be
 deemed non-trivial and consequently will be case-bound.
 
-
 Note [Simplification of runRW#]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Consider the program,
@@ -1408,8 +1452,7 @@ But with -O0, there is no FloatOut, so CorePrep must do the ANFisation to
     foo = Foo s
 
 (String literals are the only kind of binding allowed at top-level and hence
-their floats are `OkToSpec` like lifted bindings, whereas all other unlifted
-floats are `IfUnboxedOk` so that they don't float to top-level.)
+their `FloatInfo` is `TopLvlFloatable`.)
 
 This appears to lead to bad code if the arg is under a lambda, because CorePrep
 doesn't float out of RHSs, e.g., (T23270)
@@ -1432,24 +1475,13 @@ But actually, it doesn't, because "turtle"# is already an HNF. Here is the Cmm:
 
 Wrinkles:
 
-(FS1) It is crucial that we float out String literals out of RHSs that could
-      become values, e.g.,
-
-        let t = case "turtle"# of s { __DEFAULT -> MkT s }
-        in f t
-
-      where `MkT :: Addr# -> T`. We want
-
-        let s = "turtle"#; t = MkT s
-        in f t
-
-      because the former allocates an extra thunk for `t`.
-      Normally, the `case turtle# of s ...` becomes a `FloatCase` and
-      we don't float `FloatCase` outside of (recursive) RHSs, so we get the
-      former program (this is the 'allLazyNested' test in 'wantFloatNested').
-      That is what we use `FloatString` for: It is essentially a `FloatCase`
-      which is always ok-to-spec/can be regarded as a non-allocating value and
-      thus be floated aggressively to expose more value bindings.
+(FS1) We detect string literals in `cpeBind Rec{}` and float them out anyway;
+      otherwise we'd try to bind a string literal in a letrec, violating
+      Note [Core letrec invariant]. Since we know that literals don't have
+      free variables, we float further.
+      Arguably, we could just as well relax the letrec invariant for
+      string literals, or anthing that is a value (lifted or not).
+      This is tracked in #24036.
 -}
 
 -- This is where we arrange that a non-trivial argument is let-bound
@@ -1459,10 +1491,9 @@ cpeArg env dmd arg
   = do { (floats1, arg1) <- cpeRhsE env arg     -- arg1 can be a lambda
        ; let arg_ty      = exprType arg1
              is_unlifted = isUnliftedType arg_ty
-             want_float  = wantFloatNested NonRecursive dmd is_unlifted
-       ; (floats2, arg2) <- if want_float floats1 arg1
-                            then return (floats1, arg1)
-                            else dontFloat floats1 arg1
+             dec         = wantFloatLocal NonRecursive dmd is_unlifted
+                                                  floats1 arg1
+       ; (floats2, arg2) <- executeFloatDecision dec floats1 arg1
                 -- Else case: arg1 might have lambdas, and we can't
                 --            put them inside a wrapBinds
 
@@ -1474,8 +1505,10 @@ cpeArg env dmd arg
          else do { v <- newVar arg_ty
                  -- See Note [Eta expansion of arguments in CorePrep]
                  ; let arg3 = cpeEtaExpandArg env arg2
-                       arg_float = mkFloat env dmd is_unlifted v arg3
-                 ; return (addFloat floats2 arg_float, varToCoreExpr v) }
+                       arg_float@(Float (NonRec v' _) _ _) =
+                         mkFloat env dmd is_unlifted v arg3
+                       -- v' has demand info and possibly evaldUnfolding
+                 ; return (snocFloat floats2 arg_float, varToCoreExpr v') }
        }
 
 cpeEtaExpandArg :: CorePrepEnv -> CoreArg -> CoreArg
@@ -1508,20 +1541,6 @@ See Note [Eta expansion for join points] in GHC.Core.Opt.Arity
 Eta expanding the join point would introduce crap that we can't
 generate code for
 
-Note [Floating unlifted arguments]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Consider    C (let v* = expensive in v)
-
-where the "*" indicates "will be demanded".  Usually v will have been
-inlined by now, but let's suppose it hasn't (see #2756).  Then we
-do *not* want to get
-
-     let v* = expensive in C v
-
-because that has different strictness.  Hence the use of 'allLazy'.
-(NB: the let v* turns into a FloatCase, in mkLocalNonRec.)
-
-
 ------------------------------------------------------------------------------
 -- Building the saturated syntax
 -- ---------------------------------------------------------------------------
@@ -1791,159 +1810,299 @@ of the very function whose termination properties we are exploiting.
 It is also similar to Note [Do not strictify a DFun's parameter dictionaries],
 where marking recursive DFuns (of undecidable *instances*) strict in dictionary
 *parameters* leads to quite the same change in termination as above.
+
+Note [BindInfo and FloatInfo]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The `BindInfo` of a `Float` describes whether it is a (let-bound) value
+(`BoundVal`), or otherwise whether it will be `CaseBound` or `LetBound`.
+We want to case-bind iff the binding is either ok-for-spec-eval, unlifted,
+strictly used (and perhaps lifted), or a literal string (Wrinkle (FI1) below).
+
+The `FloatInfo` of a `Float` describes how far it will float.
+
+  * Any binding is at least `StrictContextFloatable`, meaning we may float it
+    out of a strict context such as `f <>` where `f` is strict.
+
+  * A binding is `LazyContextFloatable` if we may float it out of a lazy context
+    such as `let x = <> in Just x`.
+    Counterexample: A strict or unlifted binding that isn't ok-for-spec-eval
+                    such as `case divInt# x y of r -> ...`.
+
+  * A binding is `TopLvlFloatable` if it is `LazyContextFloatable` and also can
+    be bound at the top level.
+    Counterexample: A strict or unlifted binding (ok-for-spec-eval or not)
+                    such as `case x +# y of r -> ...`.
+
+All lazy (and hence lifted) bindings are `TopLvlFloatable`.
+See also Note [Floats and FloatDecision] for how we maintain whole groups of
+floats and how far they go.
+
+Wrinkles:
+ (FI1)
+    String literals are a somewhat weird outlier; they qualify as
+    'TopLvlFloatable' and will be floated quite aggressively, but the code
+    generator expects unboxed values to be case-bound (in contrast to boxed
+    values). This requires a special case in 'mkFloat'.
+    See also Wrinkle (FS1) of Note [ANF-ising literal string arguments].
+
+Note [Floats and FloatDecision]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+We have a special datatype `Floats` for modelling a telescope of `FloatingBind`.
+It has one `OrdList` per class of `FloatInfo` with the following meaning:
+
+  * Any binding in `top_floats` is at least `TopLvlFloatable`;
+    any binding in `lzy_floats` is at least `LazyContextFloatable`;
+    any binding in `str_floats` is at least `StrictContextFloatable`.
+
+  * `str_floats` is nested inside `lzy_floats`, is nested inside `top_floats`.
+    No binding in a "higher" `FloatInfo` class may scope over bindings in
+    a "lower" class; for example, no binding in `top_floats` may scope over a
+    binding in `str_floats`.
+
+(It is always safe to put the whole telescope in `str_floats`.)
+There are several operations for creating and combining `Floats` that maintain
+these properties.
+
+When deciding whether we want to float out a `Floats` out of a binding context
+such as `let x = <> in e` (let), `f <>` (app), or `x = <>; ...` (top-level),
+we consult the `FloatInfo` of the `Floats` (e.g., the highest class of non-empty
+`OrdList`):
+
+  * If we want to float to the top-level (`x = <>; ...`), we may take all
+    bindings that are `TopLvlFloatable` and leave the rest inside the binding.
+    This is encoded in a `FloatDecision` of `FloatSome TopLvlFloatable`.
+  * If we want to float locally (let or app), then the floating decision is
+    `FloatAll` or `FloatNone`; see Note [wantFloatLocal].
+
+`executeFloatDecision` is then used to act on the particular `FloatDecision`.
 -}
 
+-- See Note [BindInfo and FloatInfo]
+data BindInfo
+  = CaseBound -- ^ A strict binding
+  | LetBound  -- ^ A lazy or value binding
+  | BoundVal  -- ^ A value binding (E.g., a 'LetBound' that is an HNF)
+  deriving Eq
+
+-- See Note [BindInfo and FloatInfo]
+data FloatInfo
+  = TopLvlFloatable
+  -- ^ Anything that can be bound at top-level, such as arbitrary lifted
+  -- bindings or anything that responds True to `exprIsHNF`, such as literals or
+  -- saturated DataCon apps where unlifted or strict args are values.
+
+  | LazyContextFloatable
+  -- ^ Anything that can be floated out of a lazy context.
+  -- In addition to any 'TopLvlFloatable' things, this includes (unlifted)
+  -- bindings that are ok-for-spec that we intend to case-bind.
+
+  | StrictContextFloatable
+  -- ^ Anything that can be floated out of a strict evaluation context.
+  -- That is possible for all bindings; this is the Top element of 'FloatInfo'.
+
+  deriving (Eq, Ord)
+
+instance Outputable BindInfo where
+  ppr CaseBound = text "Case"
+  ppr LetBound  = text "Let"
+  ppr BoundVal  = text "Letv"
+
+instance Outputable FloatInfo where
+  ppr TopLvlFloatable = text "top-lvl"
+  ppr LazyContextFloatable = text "lzy-ctx"
+  ppr StrictContextFloatable = text "str-ctx"
+
+-- See Note [Floating in CorePrep]
+-- and Note [BindInfo and FloatInfo]
 data FloatingBind
-  -- | Rhs of bindings are CpeRhss
-  -- They are always of lifted type;
-  -- unlifted ones are done with FloatCase
-  = FloatLet CoreBind
-
-  -- | Float a literal string binding.
-  -- INVARIANT: The `CoreExpr` matches `Lit (LitString bs)`.
-  --            It's just more convenient to keep around the expr rather than
-  --            the wrapped `bs` and reallocate the expr.
-  -- This is a special case of `FloatCase` that is unconditionally ok-for-spec.
-  -- We want to float out strings quite aggressively out of RHSs if doing so
-  -- saves allocation of a thunk ('wantFloatNested'); see Wrinkle (FS1)
-  -- in Note [ANF-ising literal string arguments].
-  | FloatString !CoreExpr !Id
-
-  | FloatCase
-       CpeBody         -- ^ Scrutinee
-       Id              -- ^ Case binder
-       AltCon [Var]    -- ^ Single alternative
-       Bool            -- ^ Ok-for-speculation; False of a strict,
-                       --   but lifted binding that is not OK for
-                       --   Note [Speculative evaluation].
-
-  -- | See Note [Floating Ticks in CorePrep]
+  = Float !CoreBind !BindInfo !FloatInfo
+  | UnsafeEqualityCase !CoreExpr !CoreBndr !AltCon ![CoreBndr]
   | FloatTick CoreTickish
 
-data Floats = Floats OkToSpec (OrdList FloatingBind)
+-- See Note [Floats and FloatDecision]
+data Floats
+  = Floats
+  { top_floats :: !(OrdList FloatingBind)
+  , lzy_floats :: !(OrdList FloatingBind)
+  , str_floats :: !(OrdList FloatingBind)
+  }
 
 instance Outputable FloatingBind where
-  ppr (FloatLet b) = ppr b
-  ppr (FloatString e b) = text "string" <> braces (ppr b <> char '=' <> ppr e)
-  ppr (FloatCase r b k bs ok) = text "case" <> braces (ppr ok) <+> ppr r
+  ppr (Float b bi fi) = ppr bi <+> ppr fi <+> ppr b
+  ppr (FloatTick t) = ppr t
+  ppr (UnsafeEqualityCase scrut b k bs) = text "case" <+> ppr scrut
                                 <+> text "of"<+> ppr b <> text "@"
                                 <> case bs of
                                    [] -> ppr k
                                    _  -> parens (ppr k <+> ppr bs)
-  ppr (FloatTick t) = ppr t
 
 instance Outputable Floats where
-  ppr (Floats flag fs) = text "Floats" <> brackets (ppr flag) <+>
-                         braces (vcat (map ppr (fromOL fs)))
-
-instance Outputable OkToSpec where
-  ppr OkToSpec    = text "OkToSpec"
-  ppr IfUnliftedOk = text "IfUnliftedOk"
-  ppr NotOkToSpec = text "NotOkToSpec"
-
--- Can we float these binds out of the rhs of a let?  We cache this decision
--- to avoid having to recompute it in a non-linear way when there are
--- deeply nested lets.
-data OkToSpec
-   = OkToSpec           -- ^ Lazy bindings of lifted type. Float as you please
-   | IfUnliftedOk       -- ^ A mixture of lazy lifted bindings and n
-                        -- ok-to-speculate unlifted bindings.
-                        -- Float out of lets, but not to top-level!
-   | NotOkToSpec        -- ^ Some not-ok-to-speculate unlifted bindings
+  ppr (Floats top lzy str) = text "Floats" <> braces (vcat (map ppr [top, lzy, str]))
 
 mkFloat :: CorePrepEnv -> Demand -> Bool -> Id -> CpeRhs -> FloatingBind
-mkFloat env dmd is_unlifted bndr rhs
-  | Lit LitString{} <- rhs = FloatString rhs bndr
-
-  | is_strict || ok_for_spec
-  , not is_hnf             = FloatCase rhs bndr DEFAULT [] ok_for_spec
-      -- See Note [Speculative evaluation]
-      -- Don't make a case for a HNF binding, even if it's strict
-      -- Otherwise we get  case (\x -> e) of ...!
-
-  | is_unlifted            = FloatCase rhs bndr DEFAULT [] True
-      -- we used to assertPpr ok_for_spec (ppr rhs) here, but it is now disabled
-      -- because exprOkForSpeculation isn't stable under ANF-ing. See for
-      -- example #19489 where the following unlifted expression:
-      --
-      --    GHC.Prim.(#|_#) @LiftedRep @LiftedRep @[a_ax0] @[a_ax0]
-      --                    (GHC.Types.: @a_ax0 a2_agq a3_agl)
-      --
-      -- is ok-for-spec but is ANF-ised into:
-      --
-      --    let sat = GHC.Types.: @a_ax0 a2_agq a3_agl
-      --    in GHC.Prim.(#|_#) @LiftedRep @LiftedRep @[a_ax0] @[a_ax0] sat
-      --
-      -- which isn't ok-for-spec because of the let-expression.
-
-  | is_hnf                 = FloatLet (NonRec bndr                       rhs)
-  | otherwise              = FloatLet (NonRec (setIdDemandInfo bndr dmd) rhs)
-                   -- See Note [Pin demand info on floats]
+mkFloat env dmd is_unlifted bndr rhs = pprTraceWith "mkFloat" ppr $
+  Float (NonRec bndr2 rhs) bound info
   where
+    (bound,info)
+      | is_hnf, is_boxed         = (BoundVal, TopLvlFloatable)
+          -- is_lifted: We currently don't allow unboxed values at the
+          --            top-level or inside letrecs
+          --            (but SG thinks that in principle, we should)
+      | exprIsTickedString rhs   = (CaseBound, TopLvlFloatable)
+          -- See Note [BindInfo and FloatInfo], Wrinkle (FI1)
+          -- for why it's not BoundVal
+      | is_lifted,   ok_for_spec = (CaseBound, TopLvlFloatable)
+      | is_unlifted, ok_for_spec = (CaseBound, LazyContextFloatable)
+          -- See Note [Speculative evaluation]
+          -- Ok-for-spec-eval things will be case-bound, lifted or not.
+          -- But when it's lifted we are ok with floating it to top-level
+          -- (where it is actually bound lazily).
+      | is_unlifted || is_strict = (CaseBound, StrictContextFloatable)
+          -- These will never be floated out of a lazy RHS context
+      | otherwise                = assertPpr is_lifted (ppr rhs) $
+                                   (LetBound, TopLvlFloatable)
+          -- And these float freely but can't be speculated, hence LetBound
+
+    is_lifted   = not is_unlifted
+    is_boxed    = isBoxedType (idType bndr)
     is_hnf      = exprIsHNF rhs
     is_strict   = isStrUsedDmd dmd
     ok_for_spec = exprOkForSpecEval (not . is_rec_call) rhs
     is_rec_call = (`elemUnVarSet` cpe_rec_ids env)
 
+    bndr1 = bndr `setIdDemandInfo` dmd -- See Note [Pin demand info on floats]
+    bndr2
+      | is_hnf
+      -- Otherwise, exprIsHNF must be conservative when bndr occurs as a strict
+      -- field arg. Result: More allocation in $walexGetByte.
+      = bndr1 `setIdUnfolding` evaldUnfolding
+      | otherwise
+      = bndr1
+
 emptyFloats :: Floats
-emptyFloats = Floats OkToSpec nilOL
+emptyFloats = Floats nilOL nilOL nilOL
 
 isEmptyFloats :: Floats -> Bool
-isEmptyFloats (Floats _ bs) = isNilOL bs
+isEmptyFloats (Floats b1 b2 b3) = isNilOL b1 && isNilOL b2 && isNilOL b3
 
-wrapBinds :: Floats -> CpeBody -> CpeBody
-wrapBinds (Floats _ binds) body
-  = foldrOL mk_bind body binds
-  where
-    mk_bind (FloatCase rhs bndr con bs _) body = Case rhs bndr (exprType body) [Alt con bs body]
-    mk_bind (FloatString rhs bndr)        body = Case rhs bndr (exprType body) [Alt DEFAULT [] body]
-    mk_bind (FloatLet bind)               body = Let bind body
-    mk_bind (FloatTick tickish)           body = mkTick tickish body
-
-addFloat :: Floats -> FloatingBind -> Floats
-addFloat (Floats ok_to_spec floats) new_float
-  = Floats (combine ok_to_spec (check new_float)) (floats `snocOL` new_float)
-  where
-    check FloatLet {}   = OkToSpec
-    check FloatTick{}   = OkToSpec
-    check FloatString{} = OkToSpec
-    check (FloatCase _ _ _ _ ok_for_spec)
-      | ok_for_spec     = IfUnliftedOk
-      | otherwise       = NotOkToSpec
-        -- The ok-for-speculation flag says that it's safe to
-        -- float this Case out of a let, and thereby do it more eagerly
-        -- We need the IfUnliftedOk flag because it's never ok to float
-        -- an unlifted binding to the top level.
-        -- There is one exception: String literals! But those will become
-        -- FloatString and thus OkToSpec.
-        -- See Note [ANF-ising literal string arguments]
+getFloats :: Floats -> OrdList FloatingBind
+getFloats (Floats top lzy str) = top `appOL` lzy `appOL` str
 
 unitFloat :: FloatingBind -> Floats
-unitFloat = addFloat emptyFloats
+unitFloat = snocFloat emptyFloats
 
-appendFloats :: Floats -> Floats -> Floats
-appendFloats (Floats spec1 floats1) (Floats spec2 floats2)
-  = Floats (combine spec1 spec2) (floats1 `appOL` floats2)
-
-concatFloats :: [Floats] -> OrdList FloatingBind
-concatFloats = foldr (\ (Floats _ bs1) bs2 -> appOL bs1 bs2) nilOL
+wrapBinds :: Floats -> CpeBody -> CpeBody
+wrapBinds floats body
+  = -- pprTraceWith "wrapBinds" (\res -> ppr floats $$ ppr body $$ ppr res) $
+    foldrOL mk_bind body (getFloats floats)
+  where
+    mk_bind f@(Float bind CaseBound _) body
+      | NonRec bndr rhs <- bind
+      = mkDefaultCase rhs bndr body
+      | otherwise
+      = pprPanic "wrapBinds" (ppr f)
+    mk_bind (Float bind _ _) body
+      = Let bind body
+    mk_bind (UnsafeEqualityCase scrut b con bs) body
+      = mkSingleAltCase scrut b con bs body
+    mk_bind (FloatTick tickish) body
+      = mkTick tickish body
+
+-- | Computes the \"worst\" 'FloatInfo' for a 'Floats', e.g., the 'FloatInfo'
+-- of the innermost 'OrdList' which is non-empty.
+floatsInfo :: Floats -> FloatInfo
+floatsInfo floats
+  | not (isNilOL (str_floats floats))  = StrictContextFloatable
+  | not (isNilOL (lzy_floats floats))  = LazyContextFloatable
+  | otherwise                          = TopLvlFloatable
+
+-- | Append a 'FloatingBind' `b` to a 'Floats' telescope `bs` that may reference any
+-- binding of the 'Floats'.
+-- This function picks the appropriate 'OrdList' in `bs` that `b` is appended to,
+-- respecting its 'FloatInfo' and scoping.
+snocFloat :: Floats -> FloatingBind -> Floats
+snocFloat floats fb@(Float bind bound info)
+  | CaseBound <- bound
+  , TopLvlFloatable <- info
+  , NonRec _ rhs <- bind
+  , exprIsTickedString rhs
+  -- Always insert Literal (strings) at the front.
+  -- They won't scope over any existing binding in `floats`.
+  = floats { top_floats = fb `consOL` top_floats floats }
+  | otherwise
+  = snoc_at floats fb (max (floatsInfo floats) info)
+      -- max: Respect scoping, hence `floatsInfo`, and respect info of `fb`
+snocFloat floats fb at UnsafeEqualityCase{} -- treat it like an ok-for-spec Float
+  = snoc_at floats fb (max (floatsInfo floats) LazyContextFloatable)
+snocFloat floats fb at FloatTick{}
+  = snoc_at floats fb (floatsInfo floats) -- Ticks are simply snoced
+
+snoc_at :: Floats -> FloatingBind -> FloatInfo -> Floats
+snoc_at floats fb info = case info of
+  TopLvlFloatable        -> floats { top_floats   = top_floats floats `snocOL` fb }
+  LazyContextFloatable   -> floats { lzy_floats   = lzy_floats floats `snocOL` fb }
+  StrictContextFloatable -> floats { str_floats = str_floats floats `snocOL` fb }
+
+-- | Put all floats with 'FloatInfo' less than or equal to the given one in the
+-- left partition, the rest in the right. Morally,
+--
+-- > partitionFloats info fs = partition (<= info) fs
+--
+partitionFloats :: FloatInfo -> Floats -> (Floats, Floats)
+partitionFloats info (Floats top lzy str) = case info of
+  TopLvlFloatable        -> (Floats top nilOL nilOL, Floats nilOL lzy   str)
+  LazyContextFloatable   -> (Floats top lzy   nilOL, Floats nilOL nilOL str)
+  StrictContextFloatable -> (Floats top lzy   str,   Floats nilOL nilOL nilOL)
+
+push_down_floats :: FloatInfo -> Floats -> Floats
+push_down_floats info fs@(Floats top lzy str) = case info of
+  TopLvlFloatable        -> fs
+  LazyContextFloatable   -> Floats nilOL (top `appOL` lzy) str
+  StrictContextFloatable -> Floats nilOL nilOL             (getFloats fs)
+
+-- | Cons a 'FloatingBind' `b` to a 'Floats' telescope `bs` which scopes over
+-- `b`. This function appropriately pushes around bindings in the 'OrdList's of
+-- `bs` so that `b` is the very first binding in the resulting telescope.
+consFloat :: FloatingBind -> Floats -> Floats
+consFloat fb at FloatTick{} floats =
+  floats { top_floats = fb `consOL` top_floats floats }
+consFloat fb@(Float _ _ info) floats = case info of
+  TopLvlFloatable        -> floats' { top_floats = fb `consOL` top_floats floats' }
+  LazyContextFloatable   -> floats' { lzy_floats = fb `consOL` lzy_floats floats' }
+  StrictContextFloatable -> floats' { str_floats = fb `consOL` str_floats floats' }
+  where
+    floats' = push_down_floats info floats
+consFloat fb at UnsafeEqualityCase{} floats = -- like the LazyContextFloatable Float case
+  floats' { lzy_floats = fb `consOL` lzy_floats      floats' }
+  where
+    floats' = push_down_floats LazyContextFloatable floats
+
+-- | Append two telescopes, nesting the right inside the left.
+appFloats :: Floats -> Floats -> Floats
+appFloats outer inner =
+  -- After having pushed down floats in inner, we can simply zip with up.
+  zipFloats outer (push_down_floats (floatsInfo outer) inner)
+
+-- | Zip up two telescopes which don't scope over each other.
+zipFloats :: Floats -> Floats -> Floats
+zipFloats floats1 floats2
+  = Floats
+  { top_floats = top_floats floats1 `appOL` top_floats floats2
+  , lzy_floats = lzy_floats floats1 `appOL` lzy_floats floats2
+  , str_floats = str_floats floats1 `appOL` str_floats floats2
+  }
 
-combine :: OkToSpec -> OkToSpec -> OkToSpec
-combine NotOkToSpec _  = NotOkToSpec
-combine _ NotOkToSpec  = NotOkToSpec
-combine IfUnliftedOk _ = IfUnliftedOk
-combine _ IfUnliftedOk = IfUnliftedOk
-combine _ _            = OkToSpec
+zipManyFloats :: [Floats] -> Floats
+zipManyFloats = foldr zipFloats emptyFloats
 
 deFloatTop :: Floats -> [CoreBind]
--- For top level only; we don't expect any FloatCases
-deFloatTop (Floats _ floats)
-  = foldrOL get [] floats
+-- For top level only; we don't expect any Strict or LazyContextFloatable 'FloatInfo'
+deFloatTop floats
+  = foldrOL get [] (getFloats floats)
   where
-    get (FloatLet b)               bs = get_bind b                 : bs
-    get (FloatString body var)     bs = get_bind (NonRec var body) : bs
-    get (FloatCase body var _ _ _) bs = get_bind (NonRec var body) : bs
-    get b _ = pprPanic "corePrepPgm" (ppr b)
+    get (Float b _ TopLvlFloatable) bs
+      = get_bind b : bs
+    get b _  = pprPanic "corePrepPgm" (ppr b)
 
     -- See Note [Dead code in CorePrep]
     get_bind (NonRec x e) = NonRec x (occurAnalyseExpr e)
@@ -1951,25 +2110,80 @@ deFloatTop (Floats _ floats)
 
 ---------------------------------------------------------------------------
 
-wantFloatNested :: RecFlag -> Demand -> Bool -> Floats -> CpeRhs -> Bool
-wantFloatNested is_rec dmd rhs_is_unlifted floats rhs
-  =  isEmptyFloats floats
-  || isStrUsedDmd dmd
-  || rhs_is_unlifted
-  || (allLazyNested is_rec floats && exprIsHNF rhs)
-        -- Why the test for allLazyNested?
-        --      v = f (x `divInt#` y)
-        -- we don't want to float the case, even if f has arity 2,
-        -- because floating the case would make it evaluated too early
-
-allLazyTop :: Floats -> Bool
-allLazyTop (Floats OkToSpec _) = True
-allLazyTop _                   = False
-
-allLazyNested :: RecFlag -> Floats -> Bool
-allLazyNested _      (Floats OkToSpec    _)  = True
-allLazyNested _      (Floats NotOkToSpec _)  = False
-allLazyNested is_rec (Floats IfUnliftedOk _) = isNonRec is_rec
+{- Note [wantFloatLocal]
+~~~~~~~~~~~~~~~~~~~~~~~~
+Consider
+  let x = let y = e1 in e2
+  in e
+Do we want to float out `y` out of `x`?
+(Similarly for `(\x. e) (let y = e1 in e2)`.)
+`wantFloatLocal` is concerned with answering this question.
+It considers the Demand on `x`, whether or not `e2` is unlifted and the
+`FloatInfo` of the `y` binding (e.g., it might itself be unlifted, a value,
+strict, or ok-for-spec)
+
+We float out if ...
+  1. ... the binding context is strict anyway, so either `x` is used strictly
+     or has unlifted type.
+     Doing so is trivially sound and won`t increase allocations, so we
+     return `FloatAll`.
+  2. ... `e2` becomes a value in doing so, in which case we won`t need to
+     allocate a thunk for `x`/the arg that closes over the FVs of `e1`.
+     In general, this is only sound if `y=e1` is `LazyContextFloatable`.
+     (See Note [BindInfo and FloatInfo].)
+     Nothing is won if `x` doesn't become a value, so we return `FloatNone`
+     if there are any float is `StrictContextFloatable`, and return `FloatAll`
+     otherwise.
+
+To elaborate on (2), consider the case when the floated binding is
+`e1 = divInt# a b`, e.g., not `LazyContextFloatable`:
+  let x = f (a `divInt#` b)
+  in e
+this ANFises to
+  let x = case a `divInt#` b of r { __DEFAULT -> f r }
+  in e
+If `x` is used lazily, we may not float `r` further out.
+A float binding `x +# y` is OK, though, and in so every ok-for-spec-eval
+binding is `LazyContextFloatable`.
+
+Wrinkles:
+
+ (W1) When the outer binding is a letrec, i.e.,
+        letrec x = case a +# b of r { __DEFAULT -> f y r }
+               y = [x]
+        in e
+      we don't want to float `LazyContextFloatable` bindings such as `r` either
+      and require `TopLvlFloatable` instead.
+      The reason is that we don't track FV of FloatBindings, so we would need
+      to park them in the letrec,
+        letrec r = a +# b -- NB: r`s RHS might scope over x and y
+               x = f y r
+               y = [x]
+        in e
+      and now we have violated Note [Core letrec invariant].
+      So we preempt this case in `wantFloatLocal`, responding `FloatNone` unless
+      all floats are `TopLvlFloatable`.
+-}
+
+
+
+wantFloatLocal :: RecFlag -> Demand -> Bool -> Floats -> CpeRhs -> FloatDecision
+-- See Note [wantFloatLocal]
+wantFloatLocal is_rec rhs_dmd rhs_is_unlifted floats rhs
+  |  isEmptyFloats floats -- Well yeah...
+  || isStrUsedDmd rhs_dmd -- Case (1) of Note [wantFloatLocal]
+  || rhs_is_unlifted      -- dito
+  || (floatsInfo floats <= max_float_info && exprIsHNF rhs)
+                          -- Case (2) of Note [wantFloatLocal]
+  = FloatAll
+
+  | otherwise
+  = FloatNone
+  where
+    max_float_info | isRec is_rec = TopLvlFloatable
+                   | otherwise    = LazyContextFloatable
+                    -- See Note [wantFloatLocal], Wrinkle (W1)
+                    -- for 'is_rec'
 
 {-
 ************************************************************************
@@ -2224,26 +2438,33 @@ newVar ty
 
 -- | Like wrapFloats, but only wraps tick floats
 wrapTicks :: Floats -> CoreExpr -> (Floats, CoreExpr)
-wrapTicks (Floats flag floats0) expr =
-    (Floats flag (toOL $ reverse floats1), foldr mkTick expr (reverse ticks1))
-  where (floats1, ticks1) = foldlOL go ([], []) $ floats0
+wrapTicks floats expr
+  | (floats1, ticks1) <- fold_fun go floats
+  = (floats1, foldrOL mkTick expr ticks1)
+  where fold_fun f floats =
+           let (top, ticks1) = foldlOL f (nilOL,nilOL) (top_floats floats)
+               (lzy, ticks2) = foldlOL f (nilOL,ticks1) (lzy_floats floats)
+               (str, ticks3) = foldlOL f (nilOL,ticks2) (str_floats floats)
+           in (Floats top lzy str, ticks3)
         -- Deeply nested constructors will produce long lists of
         -- redundant source note floats here. We need to eliminate
         -- those early, as relying on mkTick to spot it after the fact
         -- can yield O(n^3) complexity [#11095]
-        go (floats, ticks) (FloatTick t)
+        go (flt_binds, ticks) (FloatTick t)
           = assert (tickishPlace t == PlaceNonLam)
-            (floats, if any (flip tickishContains t) ticks
-                     then ticks else t:ticks)
-        go (floats, ticks) f at FloatString{}
-          = (f:floats, ticks) -- don't need to wrap the tick around the string; nothing to execute.
-        go (floats, ticks) f
-          = (foldr wrap f (reverse ticks):floats, ticks)
-
-        wrap t (FloatLet bind)           = FloatLet (wrapBind t bind)
-        wrap t (FloatCase r b con bs ok) = FloatCase (mkTick t r) b con bs ok
-        wrap _ other                     = pprPanic "wrapTicks: unexpected float!"
-                                             (ppr other)
+            (flt_binds, if any (flip tickishContains t) ticks
+                        then ticks else ticks `snocOL` t)
+        go (flt_binds, ticks) f at UnsafeEqualityCase{}
+          -- unsafe equality case will be erased; don't wrap anything!
+          = (flt_binds `snocOL` f, ticks)
+        go (flt_binds, ticks) f@(Float _ BoundVal _)
+          -- don't need to wrap the tick around a value; nothing to execute.
+          = (flt_binds `snocOL` f, ticks)
+        go (flt_binds, ticks) f at Float{}
+          = (flt_binds `snocOL` foldrOL wrap f ticks, ticks)
+
+        wrap t (Float bind bound info) = Float (wrapBind t bind) bound info
+        wrap _ f                 = pprPanic "Unexpected FloatingBind" (ppr f)
         wrapBind t (NonRec binder rhs) = NonRec binder (mkTick t rhs)
         wrapBind t (Rec pairs)         = Rec (mapSnd (mkTick t) pairs)
 


=====================================
compiler/GHC/Data/OrdList.hs
=====================================
@@ -16,8 +16,8 @@ module GHC.Data.OrdList (
         OrdList, pattern NilOL, pattern ConsOL, pattern SnocOL,
         nilOL, isNilOL, unitOL, appOL, consOL, snocOL, concatOL, lastOL,
         headOL,
-        mapOL, mapOL', fromOL, toOL, foldrOL, foldlOL, reverseOL, fromOLReverse,
-        strictlyEqOL, strictlyOrdOL
+        mapOL, mapOL', fromOL, toOL, foldrOL, foldlOL,
+        partitionOL, reverseOL, fromOLReverse, strictlyEqOL, strictlyOrdOL
 ) where
 
 import GHC.Prelude
@@ -220,6 +220,25 @@ foldlOL k z (Snoc xs x) = let !z' = (foldlOL k z xs) in k z' x
 foldlOL k z (Two b1 b2) = let !z' = (foldlOL k z b1) in foldlOL k z' b2
 foldlOL k z (Many xs)   = foldl' k z xs
 
+partitionOL :: (a -> Bool) -> OrdList a -> (OrdList a, OrdList a)
+partitionOL _ None = (None,None)
+partitionOL f (One x)
+  | f x       = (One x, None)
+  | otherwise = (None, One x)
+partitionOL f (Two xs ys) = (Two ls1 ls2, Two rs1 rs2)
+  where !(!ls1,!rs1) = partitionOL f xs
+        !(!ls2,!rs2) = partitionOL f ys
+partitionOL f (Cons x xs)
+  | f x       = (Cons x ls, rs)
+  | otherwise = (ls, Cons x rs)
+  where !(!ls,!rs) = partitionOL f xs
+partitionOL f (Snoc xs x)
+  | f x       = (Snoc ls x, rs)
+  | otherwise = (ls, Snoc rs x)
+  where !(!ls,!rs) = partitionOL f xs
+partitionOL f (Many xs) = (toOL ls, toOL rs)
+  where !(!ls,!rs) = NE.partition f xs
+
 toOL :: [a] -> OrdList a
 toOL [] = None
 toOL [x] = One x


=====================================
compiler/GHC/Stg/InferTags.hs
=====================================
@@ -64,8 +64,8 @@ With nofib being ~0.3% faster as well.
 
 See Note [Tag inference passes] for how we proceed to generate and use this information.
 
-Note [Strict Field Invariant]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Note [STG Strict Field Invariant]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 As part of tag inference we introduce the Strict Field Invariant.
 Which consists of us saying that:
 
@@ -124,15 +124,33 @@ Note that there are similar constraints around Note [CBV Function Ids].
 
 Note [How untagged pointers can end up in strict fields]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Since the resolution of #20749 where Core passes assume that DataCon workers
+evaluate their strict fields, it is pretty simple to see how the Simplifier
+might exploit that knowledge to drop evals. Example:
+
+  data MkT a = MkT !a
+  f :: [Int] -> T [Int]
+  f xs = xs `seq` MkT xs
+
+in Core we will have
+
+  f = \xs -> MkT @[Int] xs
+
+No eval left there.
+
 Consider
   data Set a = Tip | Bin !a (Set a) (Set a)
 
 We make a wrapper for Bin that evaluates its arguments
   $WBin x a b = case x of xv -> Bin xv a b
 Here `xv` will always be evaluated and properly tagged, just as the
-Strict Field Invariant requires.
+Note [STG Strict Field Invariant] requires.
+
+But alas, the Simplifier can destroy the invariant: see #15696.
+Indeed, as Note [Strict fields in Core] explains, Core passes
+assume that Data constructor workers evaluate their strict fields,
+so the Simplifier will drop seqs freely.
 
-But alas the Simplifier can destroy the invariant: see #15696.
 We start with
   thk = f ()
   g x = ...(case thk of xv -> Bin xv Tip Tip)...
@@ -153,7 +171,7 @@ Now you can see that the argument of Bin, namely thk, points to the
 thunk, not to the value as it did before.
 
 In short, although it may be rare, the output of optimisation passes
-cannot guarantee to obey the Strict Field Invariant. For this reason
+cannot guarantee to obey the Note [STG Strict Field Invariant]. For this reason
 we run tag inference. See Note [Tag inference passes].
 
 Note [Tag inference passes]
@@ -163,7 +181,7 @@ Tag inference proceeds in two passes:
   The result is then attached to /binders/.
   This is implemented by `inferTagsAnal` in GHC.Stg.InferTags
 * The second pass walks over the AST checking if the Strict Field Invariant is upheld.
-  See Note [Strict Field Invariant].
+  See Note [STG Strict Field Invariant].
   If required this pass modifies the program to uphold this invariant.
   Tag information is also moved from /binders/ to /occurrences/ during this pass.
   This is done by `GHC.Stg.InferTags.Rewrite (rewriteTopBinds)`.


=====================================
compiler/GHC/Stg/InferTags/Rewrite.hs
=====================================
@@ -65,7 +65,7 @@ The work of this pass is simple:
 * For any strict field we check if the argument is known to be properly tagged.
 * If it's not known to be properly tagged, we wrap the whole thing in a case,
   which will force the argument before allocation.
-This is described in detail in Note [Strict Field Invariant].
+This is described in detail in Note [STG Strict Field Invariant].
 
 The only slight complication is that we have to make sure not to invalidate free
 variable analysis in the process.
@@ -218,7 +218,7 @@ When compiling bytecode we call myCoreToStg to get STG code first.
 myCoreToStg in turn calls out to stg2stg which runs the STG to STG
 passes followed by free variables analysis and the tag inference pass including
 it's rewriting phase at the end.
-Running tag inference is important as it upholds Note [Strict Field Invariant].
+Running tag inference is important as it upholds Note [STG Strict Field Invariant].
 While code executed by GHCi doesn't take advantage of the SFI it can call into
 compiled code which does. So it must still make sure that the SFI is upheld.
 See also #21083 and #22042.


=====================================
compiler/GHC/Tc/TyCl/Build.hs
=====================================
@@ -183,14 +183,15 @@ buildDataCon fam_envs dc_bang_opts src_name declared_infix prom_info src_bangs
               tag = lookupNameEnv_NF tag_map src_name
               -- See Note [Constructor tag allocation], fixes #14657
               data_con = mkDataCon src_name declared_infix prom_info
-                                   src_bangs field_lbls
+                                   src_bangs impl_bangs str_marks field_lbls
                                    univ_tvs ex_tvs
                                    noConcreteTyVars
                                    user_tvbs eq_spec ctxt
                                    arg_tys res_ty NoPromInfo rep_tycon tag
                                    stupid_ctxt dc_wrk dc_rep
               dc_wrk = mkDataConWorkId work_name data_con
-              dc_rep = initUs_ us (mkDataConRep dc_bang_opts fam_envs wrap_name data_con)
+              (dc_rep, impl_bangs, str_marks) =
+                 initUs_ us (mkDataConRep dc_bang_opts fam_envs wrap_name data_con)
 
         ; traceIf (text "buildDataCon 2" <+> ppr src_name)
         ; return data_con }


=====================================
compiler/GHC/Types/Demand.hs
=====================================
@@ -1385,33 +1385,8 @@ arguments.  That is the job of dmdTransformDataConSig.  More precisely,
  * it returns the demands on the arguments;
    in the above example that is [SL, A]
 
-Nasty wrinkle. Consider this code (#22475 has more realistic examples but
-assume this is what the demand analyser sees)
-
-   data T = MkT !Int Bool
-   get :: T -> Bool
-   get (MkT _ b) = b
-
-   foo = let v::Int = I# 7
-             t::T   = MkT v True
-         in get t
-
-Now `v` is unused by `get`, /but/ we can't give `v` an Absent demand,
-else we'll drop the binding and replace it with an error thunk.
-Then the code generator (more specifically GHC.Stg.InferTags.Rewrite)
-will add an extra eval of MkT's argument to give
-   foo = let v::Int = error "absent"
-             t::T   = case v of v' -> MkT v' True
-         in get t
-
-Boo!  Because of this extra eval (added in STG-land), the truth is that `MkT`
-may (or may not) evaluate its arguments (as established in #21497). Hence the
-use of `bump` in dmdTransformDataConSig, which adds in a `C_01` eval. The
-`C_01` says "may or may not evaluate" which is absolutely faithful to what
-InferTags.Rewrite does.
-
-In particular it is very important /not/ to make that a `C_11` eval,
-see Note [Data-con worker strictness].
+When the data constructor worker has strict fields, they act as additional
+seqs; hence we add an additional `C_11` eval.
 -}
 
 {- *********************************************************************
@@ -1611,6 +1586,29 @@ a bad fit because
    expression may not throw a precise exception (increasing precision of the
    analysis), but that's just a favourable guess.
 
+Note [Side-effects and strictness]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Due to historic reasons and the continued effort not to cause performance
+regressions downstream, Strictness Analysis is currently prone to discarding
+observable side-effects (other than precise exceptions, see
+Note [Precise exceptions and strictness analysis]) in some cases. For example,
+  f :: MVar () -> Int -> IO Int
+  f mv x = putMVar mv () >> (x `seq` return x)
+The call to `putMVar` is an observable side-effect. Yet, Strictness Analysis
+currently concludes that `f` is strict in `x` and uses call-by-value.
+That means `f mv (error "boom")` will error out with the imprecise exception
+rather performing the side-effect.
+
+This is a conscious violation of the semantics described in the paper
+"a semantics for imprecise exceptions"; so it would be great if we could
+identify the offending primops and extend the idea in
+Note [Which scrutinees may throw precise exceptions] to general side-effects.
+
+Unfortunately, the existing has-side-effects classification for primops is
+too conservative, listing `writeMutVar#` and even `readMutVar#` as
+side-effecting. That is due to #3207. A possible way forward is described in
+#17900, but no effort has been so far towards a resolution.
+
 Note [Exceptions and strictness]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 We used to smart about catching exceptions, but we aren't anymore.
@@ -2327,7 +2325,7 @@ dmdTransformDataConSig str_marks sd = case viewProd arity body_sd of
     mk_body_ty n dmds = DmdType nopDmdEnv (zipWith (bump n) str_marks dmds)
     bump n str dmd | isMarkedStrict str = multDmd n (plusDmd str_field_dmd dmd)
                    | otherwise          = multDmd n dmd
-    str_field_dmd = C_01 :* seqSubDmd -- Why not C_11? See Note [Data-con worker strictness]
+    str_field_dmd = C_11 :* seqSubDmd -- See Note [Strict fields in Core]
 
 -- | A special 'DmdTransformer' for dictionary selectors that feeds the demand
 -- on the result into the indicated dictionary component (if saturated).


=====================================
compiler/GHC/Types/Id/Info.hs
=====================================
@@ -260,7 +260,7 @@ The invariants around the arguments of call by value function like Ids are then:
   * Any `WorkerLikeId`
   * Some `JoinId` bindings.
 
-This works analogous to the Strict Field Invariant. See also Note [Strict Field Invariant].
+This works analogous to the Strict Field Invariant. See also Note [STG Strict Field Invariant].
 
 To make this work what we do is:
 * During W/W and SpecConstr any worker/specialized binding we introduce


=====================================
compiler/GHC/Types/Id/Make.hs
=====================================
@@ -58,7 +58,7 @@ import GHC.Core.Coercion
 import GHC.Core.Reduction
 import GHC.Core.Make
 import GHC.Core.FVs     ( mkRuleInfo )
-import GHC.Core.Utils   ( exprType, mkCast, mkDefaultCase, coreAltsType )
+import GHC.Core.Utils   ( exprType, mkCast, coreAltsType )
 import GHC.Core.Unfold.Make
 import GHC.Core.SimpleOpt
 import GHC.Core.TyCon
@@ -594,8 +594,12 @@ mkDataConWorkId wkr_name data_con
   = mkGlobalId (DataConWorkId data_con) wkr_name wkr_ty alg_wkr_info
 
   where
-    tycon  = dataConTyCon data_con  -- The representation TyCon
-    wkr_ty = dataConRepType data_con
+    tycon     = dataConTyCon data_con  -- The representation TyCon
+    wkr_ty    = dataConRepType data_con
+    univ_tvs  = dataConUnivTyVars data_con
+    ex_tcvs   = dataConExTyCoVars data_con
+    arg_tys   = dataConRepArgTys  data_con  -- Should be same as dataConOrigArgTys
+    str_marks = dataConRepStrictness data_con
 
     ----------- Workers for data types --------------
     alg_wkr_info = noCafIdInfo
@@ -603,12 +607,19 @@ mkDataConWorkId wkr_name data_con
                    `setInlinePragInfo`     wkr_inline_prag
                    `setUnfoldingInfo`      evaldUnfolding  -- Record that it's evaluated,
                                                            -- even if arity = 0
+                   `setDmdSigInfo`         wkr_sig
+                      -- Workers eval their strict fields
+                      -- See Note [Strict fields in Core]
                    `setLFInfo`             wkr_lf_info
-          -- No strictness: see Note [Data-con worker strictness] in GHC.Core.DataCon
 
     wkr_inline_prag = defaultInlinePragma { inl_rule = ConLike }
     wkr_arity = dataConRepArity data_con
 
+    wkr_sig = mkClosedDmdSig wkr_dmds topDiv
+    wkr_dmds = map mk_dmd str_marks
+    mk_dmd MarkedStrict    = evalDmd
+    mk_dmd NotMarkedStrict = topDmd
+
     -- See Note [LFInfo of DataCon workers and wrappers]
     wkr_lf_info
       | wkr_arity == 0 = LFCon data_con
@@ -616,9 +627,6 @@ mkDataConWorkId wkr_name data_con
                                             -- LFInfo stores post-unarisation arity
 
     ----------- Workers for newtypes --------------
-    univ_tvs = dataConUnivTyVars data_con
-    ex_tcvs  = dataConExTyCoVars data_con
-    arg_tys  = dataConRepArgTys  data_con  -- Should be same as dataConOrigArgTys
     nt_work_info = noCafIdInfo          -- The NoCaf-ness is set by noCafIdInfo
                   `setArityInfo` 1      -- Arity 1
                   `setInlinePragInfo`     dataConWrapperInlinePragma
@@ -786,10 +794,10 @@ mkDataConRep :: DataConBangOpts
              -> FamInstEnvs
              -> Name
              -> DataCon
-             -> UniqSM DataConRep
+             -> UniqSM (DataConRep, [HsImplBang], [StrictnessMark])
 mkDataConRep dc_bang_opts fam_envs wrap_name data_con
   | not wrapper_reqd
-  = return NoDataConRep
+  = return (NoDataConRep, arg_ibangs, rep_strs)
 
   | otherwise
   = do { wrap_args <- mapM (newLocal (fsLit "conrep")) wrap_arg_tys
@@ -853,11 +861,8 @@ mkDataConRep dc_bang_opts fam_envs wrap_name data_con
 
        ; return (DCR { dcr_wrap_id = wrap_id
                      , dcr_boxer   = mk_boxer boxers
-                     , dcr_arg_tys = rep_tys
-                     , dcr_stricts = rep_strs
-                       -- For newtypes, dcr_bangs is always [HsLazy].
-                       -- See Note [HsImplBangs for newtypes].
-                     , dcr_bangs   = arg_ibangs }) }
+                     , dcr_arg_tys = rep_tys }
+                , arg_ibangs, rep_strs) }
 
   where
     (univ_tvs, ex_tvs, eq_spec, theta, orig_arg_tys, _orig_res_ty)
@@ -907,8 +912,8 @@ mkDataConRep dc_bang_opts fam_envs wrap_name data_con
                      -- (Most) newtypes have only a worker, with the exception
                      -- of some newtypes written with GADT syntax.
                      -- See dataConUserTyVarsNeedWrapper below.
-         && (any isBanged (ev_ibangs ++ arg_ibangs)))
-                     -- Some forcing/unboxing (includes eq_spec)
+         && (any isUnpacked (ev_ibangs ++ arg_ibangs)))
+                     -- Some unboxing (includes eq_spec)
       || isFamInstTyCon tycon -- Cast result
       || (dataConUserTyVarsNeedWrapper data_con
                      -- If the data type was written with GADT syntax and
@@ -1185,7 +1190,7 @@ dataConArgRep arg_ty HsLazy
   = ([(arg_ty, NotMarkedStrict)], (unitUnboxer, unitBoxer))
 
 dataConArgRep arg_ty (HsStrict _)
-  = ([(arg_ty, MarkedStrict)], (seqUnboxer, unitBoxer))
+  = ([(arg_ty, MarkedStrict)], (unitUnboxer, unitBoxer)) -- Seqs are inserted in STG
 
 dataConArgRep arg_ty (HsUnpack Nothing)
   = dataConArgUnpack arg_ty
@@ -1215,9 +1220,6 @@ wrapCo co rep_ty (unbox_rep, box_rep)  -- co :: arg_ty ~ rep_ty
                ; return (rep_ids, rep_expr `Cast` mkSymCo sco) }
 
 ------------------------
-seqUnboxer :: Unboxer
-seqUnboxer v = return ([v], mkDefaultCase (Var v) v)
-
 unitUnboxer :: Unboxer
 unitUnboxer v = return ([v], \e -> e)
 


=====================================
compiler/GHC/Utils/Misc.hs
=====================================
@@ -27,7 +27,7 @@ module GHC.Utils.Misc (
 
         dropWhileEndLE, spanEnd, last2, lastMaybe, onJust,
 
-        List.foldl1', foldl2, count, countWhile, all2,
+        List.foldl1', foldl2, count, countWhile, all2, all2Prefix, all3Prefix,
 
         lengthExceeds, lengthIs, lengthIsNot,
         lengthAtLeast, lengthAtMost, lengthLessThan,
@@ -663,6 +663,29 @@ all2 _ []     []     = True
 all2 p (x:xs) (y:ys) = p x y && all2 p xs ys
 all2 _ _      _      = False
 
+all2Prefix :: (a -> b -> Bool) -> [a] -> [b] -> Bool
+-- ^ `all2Prefix p xs ys` is a fused version of `and $ zipWith2 p xs ys`.
+-- So if one list is shorter than the other, `p` is assumed to be `True` for the
+-- suffix.
+all2Prefix p = foldr (\x go ys' -> case ys' of (y:ys'') -> p x y && go ys''; _ -> True) (\_ -> True)
+{-# INLINE all2Prefix #-}
+-- all2Prefix p xs ys = go xs ys
+--   where go (x:xs) (y:ys) = p x y && go xs ys
+--         go _      _      = True
+-- {-# INLINABLE all2Prefix #-}
+
+all3Prefix :: (a -> b -> c -> Bool) -> [a] -> [b] -> [c] -> Bool
+-- ^ `all3Prefix p xs ys zs` is a fused version of `and $ zipWith3 p xs ys zs`.
+-- So if one list is shorter than the others, `p` is assumed to be `True` for
+-- the suffix.
+all3Prefix p xs ys zs = foldr (\y go xs' zs' -> case (xs',zs') of (x:xs'',z:zs'') -> p x y z && go xs'' zs''; _ -> False) (\_ _ -> True) ys xs zs
+{-# INLINE all3Prefix #-}
+-- all3Prefix p xs ys zs = go xs ys zs
+--   where
+--     go (x:xs) (y:ys) (z:zs) = p x y z && go xs ys zs
+--     go _      _      _      = True
+-- {-# INLINABLE all3Prefix #-}
+
 -- Count the number of times a predicate is true
 
 count :: (a -> Bool) -> [a] -> Int


=====================================
testsuite/tests/simplCore/should_compile/T18013.stderr
=====================================
@@ -17,6 +17,8 @@ Rule fired: Class op $p1Applicative (BUILTIN)
 Rule fired: Class op >>= (BUILTIN)
 Rule fired: Class op >>= (BUILTIN)
 Rule fired: Class op pure (BUILTIN)
+Rule fired: mkRule @((), _) (T18013a)
+Rule fired: Class op fmap (BUILTIN)
 Rule fired: Class op $p1Monad (BUILTIN)
 Rule fired: Class op pure (BUILTIN)
 Rule fired: Class op . (BUILTIN)
@@ -25,6 +27,8 @@ Rule fired: Class op $p1Applicative (BUILTIN)
 Rule fired: Class op >>= (BUILTIN)
 Rule fired: Class op >>= (BUILTIN)
 Rule fired: Class op pure (BUILTIN)
+Rule fired: mkRule @((), _) (T18013a)
+Rule fired: Class op fmap (BUILTIN)
 Rule fired: Class op $p1Arrow (BUILTIN)
 Rule fired: Class op $p1Arrow (BUILTIN)
 Rule fired: Class op $p1Monad (BUILTIN)
@@ -38,6 +42,8 @@ Rule fired: Class op $p1Applicative (BUILTIN)
 Rule fired: Class op >>= (BUILTIN)
 Rule fired: Class op >>= (BUILTIN)
 Rule fired: Class op pure (BUILTIN)
+Rule fired: mkRule @((), _) (T18013a)
+Rule fired: Class op fmap (BUILTIN)
 Rule fired: Class op first (BUILTIN)
 Rule fired: Class op $p1Monad (BUILTIN)
 Rule fired: Class op >>= (BUILTIN)
@@ -48,6 +54,8 @@ Rule fired: Class op $p1Applicative (BUILTIN)
 Rule fired: Class op >>= (BUILTIN)
 Rule fired: Class op >>= (BUILTIN)
 Rule fired: Class op pure (BUILTIN)
+Rule fired: mkRule @(_, ()) (T18013a)
+Rule fired: Class op fmap (BUILTIN)
 Rule fired: Class op $p1Monad (BUILTIN)
 Rule fired: Class op pure (BUILTIN)
 Rule fired: Class op . (BUILTIN)
@@ -56,6 +64,8 @@ Rule fired: Class op $p1Applicative (BUILTIN)
 Rule fired: Class op >>= (BUILTIN)
 Rule fired: Class op >>= (BUILTIN)
 Rule fired: Class op pure (BUILTIN)
+Rule fired: mkRule @((), _) (T18013a)
+Rule fired: Class op fmap (BUILTIN)
 Rule fired: Class op . (BUILTIN)
 Rule fired: Class op $p1Monad (BUILTIN)
 Rule fired: Class op $p1Applicative (BUILTIN)
@@ -70,6 +80,8 @@ Rule fired: Class op $p1Applicative (BUILTIN)
 Rule fired: Class op >>= (BUILTIN)
 Rule fired: Class op >>= (BUILTIN)
 Rule fired: Class op pure (BUILTIN)
+Rule fired: mkRule @((), _) (T18013a)
+Rule fired: Class op fmap (BUILTIN)
 Rule fired: Class op $p1Arrow (BUILTIN)
 Rule fired: Class op $p1Arrow (BUILTIN)
 Rule fired: Class op id (BUILTIN)
@@ -83,6 +95,8 @@ Rule fired: Class op $p1Applicative (BUILTIN)
 Rule fired: Class op >>= (BUILTIN)
 Rule fired: Class op >>= (BUILTIN)
 Rule fired: Class op pure (BUILTIN)
+Rule fired: mkRule @((), _) (T18013a)
+Rule fired: Class op fmap (BUILTIN)
 Rule fired: Class op ||| (BUILTIN)
 Rule fired: Class op $p1Monad (BUILTIN)
 Rule fired: Class op $p1Applicative (BUILTIN)
@@ -98,6 +112,8 @@ Rule fired: Class op $p1Applicative (BUILTIN)
 Rule fired: Class op >>= (BUILTIN)
 Rule fired: Class op >>= (BUILTIN)
 Rule fired: Class op pure (BUILTIN)
+Rule fired: mkRule @((), _) (T18013a)
+Rule fired: Class op fmap (BUILTIN)
 Rule fired: Class op $p1Monad (BUILTIN)
 Rule fired: Class op pure (BUILTIN)
 Rule fired: Class op . (BUILTIN)
@@ -108,22 +124,6 @@ Rule fired: Class op >>= (BUILTIN)
 Rule fired: Class op pure (BUILTIN)
 Rule fired: mkRule @((), _) (T18013a)
 Rule fired: Class op fmap (BUILTIN)
-Rule fired: mkRule @((), _) (T18013a)
-Rule fired: Class op fmap (BUILTIN)
-Rule fired: mkRule @((), _) (T18013a)
-Rule fired: Class op fmap (BUILTIN)
-Rule fired: mkRule @(_, ()) (T18013a)
-Rule fired: Class op fmap (BUILTIN)
-Rule fired: mkRule @((), _) (T18013a)
-Rule fired: Class op fmap (BUILTIN)
-Rule fired: mkRule @((), _) (T18013a)
-Rule fired: Class op fmap (BUILTIN)
-Rule fired: mkRule @((), _) (T18013a)
-Rule fired: Class op fmap (BUILTIN)
-Rule fired: mkRule @((), _) (T18013a)
-Rule fired: Class op fmap (BUILTIN)
-Rule fired: mkRule @((), _) (T18013a)
-Rule fired: Class op fmap (BUILTIN)
 Rule fired: mkRule @(_, ()) (T18013a)
 Rule fired: Class op fmap (BUILTIN)
 Rule fired: mkRule @(_, ()) (T18013a)
@@ -138,9 +138,9 @@ mapMaybeRule [InlPrag=[2]]
   :: forall a b. Rule IO a b -> Rule IO (Maybe a) (Maybe b)
 [GblId,
  Arity=1,
- Str=<1!P(L,LC(S,C(1,C(1,P(L,1L)))))>,
- Unf=Unf{Src=StableSystem, TopLvl=True, Value=True, ConLike=True,
-         WorkFree=True, Expandable=True,
+ Str=<1!P(SL,LC(S,C(1,C(1,P(L,1L)))))>,
+ Unf=Unf{Src=StableSystem, TopLvl=True,
+         Value=True, ConLike=True, WorkFree=True, Expandable=True,
          Guidance=ALWAYS_IF(arity=1,unsat_ok=True,boring_ok=False)
          Tmpl= \ (@a) (@b) (f [Occ=Once1!] :: Rule IO a b) ->
                  case f of { Rule @s ww ww1 [Occ=OnceL1!] ->
@@ -219,36 +219,41 @@ mapMaybeRule
 -- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
 T18013.$trModule4 :: GHC.Prim.Addr#
 [GblId,
- Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
-         WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 20 0}]
+ Unf=Unf{Src=<vanilla>, TopLvl=True,
+         Value=True, ConLike=True, WorkFree=True, Expandable=True,
+         Guidance=IF_ARGS [] 20 0}]
 T18013.$trModule4 = "main"#
 
 -- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
 T18013.$trModule3 :: GHC.Types.TrName
 [GblId,
- Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
-         WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
+ Unf=Unf{Src=<vanilla>, TopLvl=True,
+         Value=True, ConLike=True, WorkFree=True, Expandable=True,
+         Guidance=IF_ARGS [] 10 10}]
 T18013.$trModule3 = GHC.Types.TrNameS T18013.$trModule4
 
 -- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
 T18013.$trModule2 :: GHC.Prim.Addr#
 [GblId,
- Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
-         WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 30 0}]
+ Unf=Unf{Src=<vanilla>, TopLvl=True,
+         Value=True, ConLike=True, WorkFree=True, Expandable=True,
+         Guidance=IF_ARGS [] 30 0}]
 T18013.$trModule2 = "T18013"#
 
 -- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
 T18013.$trModule1 :: GHC.Types.TrName
 [GblId,
- Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
-         WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
+ Unf=Unf{Src=<vanilla>, TopLvl=True,
+         Value=True, ConLike=True, WorkFree=True, Expandable=True,
+         Guidance=IF_ARGS [] 10 10}]
 T18013.$trModule1 = GHC.Types.TrNameS T18013.$trModule2
 
 -- RHS size: {terms: 3, types: 0, coercions: 0, joins: 0/0}
 T18013.$trModule :: GHC.Types.Module
 [GblId,
- Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
-         WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
+ Unf=Unf{Src=<vanilla>, TopLvl=True,
+         Value=True, ConLike=True, WorkFree=True, Expandable=True,
+         Guidance=IF_ARGS [] 10 10}]
 T18013.$trModule
   = GHC.Types.Module T18013.$trModule3 T18013.$trModule1
 


=====================================
testsuite/tests/simplCore/should_compile/all.T
=====================================
@@ -418,7 +418,10 @@ test('T21391', normal, compile, ['-O -dcore-lint'])
 test('T22112', [ grep_errmsg('never matches') ], compile, ['-O -dsuppress-uniques -dno-typeable-binds -fexpose-all-unfoldings -ddump-simpl'])
 test('T21391a', normal, compile, ['-O -dcore-lint'])
 # We don't want to see a thunk allocation for the insertBy expression after CorePrep.
-test('T21392', [ grep_errmsg(r'sat.* :: \[\(.*Unique, .*Int\)\]'), expect_broken(21392) ], compile, ['-O -ddump-prep -dno-typeable-binds -dsuppress-uniques'])
+# Unfortunately, this test is no longer broken after we made workers strict in strict fields,
+# so it is no longer a reproducer for T21392. Still, it doesn't hurt if we test that we don't
+# regress again.
+test('T21392', [ grep_errmsg(r'sat.* :: \[\(.*Unique, .*Int\)\]') ], compile, ['-O -ddump-prep -dno-typeable-binds -dsuppress-uniques'])
 test('T21689', [extra_files(['T21689a.hs'])], multimod_compile, ['T21689', '-v0 -O'])
 test('T21801', normal, compile, ['-O -dcore-lint'])
 test('T21848', [grep_errmsg(r'SPEC wombat') ], compile, ['-O -ddump-spec'])


=====================================
testsuite/tests/simplStg/should_compile/inferTags002.stderr
=====================================
@@ -1,88 +1,30 @@
 
-==================== Output Cmm ====================
-[M.$WMkT_entry() { //  [R3, R2]
-         { info_tbls: [(cym,
-                        label: block_cym_info
-                        rep: StackRep [False]
-                        srt: Nothing),
-                       (cyp,
-                        label: M.$WMkT_info
-                        rep: HeapRep static { Fun {arity: 2 fun_type: ArgSpec 15} }
-                        srt: Nothing),
-                       (cys,
-                        label: block_cys_info
-                        rep: StackRep [False]
-                        srt: Nothing)]
-           stack_info: arg_space: 8
-         }
-     {offset
-       cyp: // global
-           if ((Sp + -16) < SpLim) (likely: False) goto cyv; else goto cyw;
-       cyv: // global
-           R1 = M.$WMkT_closure;
-           call (stg_gc_fun)(R3, R2, R1) args: 8, res: 0, upd: 8;
-       cyw: // global
-           I64[Sp - 16] = cym;
-           R1 = R2;
-           P64[Sp - 8] = R3;
-           Sp = Sp - 16;
-           if (R1 & 7 != 0) goto cym; else goto cyn;
-       cyn: // global
-           call (I64[R1])(R1) returns to cym, args: 8, res: 8, upd: 8;
-       cym: // global
-           I64[Sp] = cys;
-           _sy8::P64 = R1;
-           R1 = P64[Sp + 8];
-           P64[Sp + 8] = _sy8::P64;
-           call stg_ap_0_fast(R1) returns to cys, args: 8, res: 8, upd: 8;
-       cys: // global
-           Hp = Hp + 24;
-           if (Hp > HpLim) (likely: False) goto cyA; else goto cyz;
-       cyA: // global
-           HpAlloc = 24;
-           call stg_gc_unpt_r1(R1) returns to cys, args: 8, res: 8, upd: 8;
-       cyz: // global
-           I64[Hp - 16] = M.MkT_con_info;
-           P64[Hp - 8] = P64[Sp + 8];
-           P64[Hp] = R1;
-           R1 = Hp - 15;
-           Sp = Sp + 16;
-           call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
-     }
- },
- section ""data" . M.$WMkT_closure" {
-     M.$WMkT_closure:
-         const M.$WMkT_info;
- }]
-
-
-
 ==================== Output Cmm ====================
 [M.f_entry() { //  [R2]
-         { info_tbls: [(cyK,
-                        label: block_cyK_info
+         { info_tbls: [(cAs,
+                        label: block_info
                         rep: StackRep []
                         srt: Nothing),
-                       (cyN,
+                       (cAv,
                         label: M.f_info
                         rep: HeapRep static { Fun {arity: 1 fun_type: ArgSpec 5} }
                         srt: Nothing)]
            stack_info: arg_space: 8
          }
      {offset
-       cyN: // global
-           if ((Sp + -8) < SpLim) (likely: False) goto cyO; else goto cyP;
-       cyO: // global
+       _lbl_: // global
+           if ((Sp + -8) < SpLim) (likely: False) goto cAw; else goto cAx;
+       _lbl_: // global
            R1 = M.f_closure;
            call (stg_gc_fun)(R2, R1) args: 8, res: 0, upd: 8;
-       cyP: // global
-           I64[Sp - 8] = cyK;
+       _lbl_: // global
+           I64[Sp - 8] = cAs;
            R1 = R2;
            Sp = Sp - 8;
-           if (R1 & 7 != 0) goto cyK; else goto cyL;
-       cyL: // global
-           call (I64[R1])(R1) returns to cyK, args: 8, res: 8, upd: 8;
-       cyK: // global
+           if (R1 & 7 != 0) goto cAs; else goto cAt;
+       _lbl_: // global
+           call (I64[R1])(R1) returns to cAs, args: 8, res: 8, upd: 8;
+       _lbl_: // global
            R1 = P64[R1 + 15];
            Sp = Sp + 8;
            call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
@@ -97,47 +39,47 @@
 
 ==================== Output Cmm ====================
 [M.MkT_entry() { //  [R3, R2]
-         { info_tbls: [(cz1,
-                        label: block_cz1_info
+         { info_tbls: [(cAJ,
+                        label: block_info
                         rep: StackRep [False]
                         srt: Nothing),
-                       (cz4,
+                       (cAM,
                         label: M.MkT_info
                         rep: HeapRep static { Fun {arity: 2 fun_type: ArgSpec 15} }
                         srt: Nothing),
-                       (cz7,
-                        label: block_cz7_info
+                       (cAP,
+                        label: block_info
                         rep: StackRep [False]
                         srt: Nothing)]
            stack_info: arg_space: 8
          }
      {offset
-       cz4: // global
-           if ((Sp + -16) < SpLim) (likely: False) goto cza; else goto czb;
-       cza: // global
+       _lbl_: // global
+           if ((Sp + -16) < SpLim) (likely: False) goto cAS; else goto cAT;
+       _lbl_: // global
            R1 = M.MkT_closure;
            call (stg_gc_fun)(R3, R2, R1) args: 8, res: 0, upd: 8;
-       czb: // global
-           I64[Sp - 16] = cz1;
+       _lbl_: // global
+           I64[Sp - 16] = cAJ;
            R1 = R2;
            P64[Sp - 8] = R3;
            Sp = Sp - 16;
-           if (R1 & 7 != 0) goto cz1; else goto cz2;
-       cz2: // global
-           call (I64[R1])(R1) returns to cz1, args: 8, res: 8, upd: 8;
-       cz1: // global
-           I64[Sp] = cz7;
-           _tyf::P64 = R1;
+           if (R1 & 7 != 0) goto cAJ; else goto cAK;
+       _lbl_: // global
+           call (I64[R1])(R1) returns to cAJ, args: 8, res: 8, upd: 8;
+       _lbl_: // global
+           I64[Sp] = cAP;
+           __locVar_::P64 = R1;
            R1 = P64[Sp + 8];
-           P64[Sp + 8] = _tyf::P64;
-           call stg_ap_0_fast(R1) returns to cz7, args: 8, res: 8, upd: 8;
-       cz7: // global
+           P64[Sp + 8] = __locVar_::P64;
+           call stg_ap_0_fast(R1) returns to cAP, args: 8, res: 8, upd: 8;
+       _lbl_: // global
            Hp = Hp + 24;
-           if (Hp > HpLim) (likely: False) goto czf; else goto cze;
-       czf: // global
+           if (Hp > HpLim) (likely: False) goto cAX; else goto cAW;
+       _lbl_: // global
            HpAlloc = 24;
-           call stg_gc_unpt_r1(R1) returns to cz7, args: 8, res: 8, upd: 8;
-       cze: // global
+           call stg_gc_unpt_r1(R1) returns to cAP, args: 8, res: 8, upd: 8;
+       _lbl_: // global
            I64[Hp - 16] = M.MkT_con_info;
            P64[Hp - 8] = P64[Sp + 8];
            P64[Hp] = R1;
@@ -155,14 +97,14 @@
 
 ==================== Output Cmm ====================
 [M.MkT_con_entry() { //  []
-         { info_tbls: [(czl,
+         { info_tbls: [(cB3,
                         label: M.MkT_con_info
                         rep: HeapRep 2 ptrs { Con {tag: 0 descr:"main:M.MkT"} }
                         srt: Nothing)]
            stack_info: arg_space: 8
          }
      {offset
-       czl: // global
+       _lbl_: // global
            R1 = R1 + 1;
            call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
      }


=====================================
testsuite/tests/stranal/sigs/T16859.stderr
=====================================
@@ -4,7 +4,7 @@ T16859.bar: <1!A><L>
 T16859.baz: <1L><1!P(L)><1C(1,L)>
 T16859.buz: <1!P(L,L)>
 T16859.foo: <1L><L>
-T16859.mkInternalName: <1!P(L)><1L><1L>
+T16859.mkInternalName: <1!P(L)><SL><SL>
 T16859.n_loc: <1!P(A,A,A,1L)>
 T16859.n_occ: <1!P(A,1!P(L,L),A,A)>
 T16859.n_sort: <1!P(1L,A,A,A)>



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/69c44754f3136fb0cba4ba01b66c349c6ec470ee...3f334d12b35c33794fd38a91574a1091b2d9e992

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/69c44754f3136fb0cba4ba01b66c349c6ec470ee...3f334d12b35c33794fd38a91574a1091b2d9e992
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20231001/8e68e938/attachment-0001.html>


More information about the ghc-commits mailing list