[Git][ghc/ghc][master] Clone CoVars in CorePrep
Marge Bot (@marge-bot)
gitlab at gitlab.haskell.org
Wed Apr 17 00:09:06 UTC 2024
Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC
Commits:
9d38bfa0 by Simon Peyton Jones at 2024-04-16T20:07:51-04:00
Clone CoVars in CorePrep
This MR addresses #24463. It's all explained in the new
Note [Cloning CoVars and TyVars]
- - - - -
5 changed files:
- compiler/GHC/Core/TyCo/Subst.hs
- compiler/GHC/Core/Type.hs
- compiler/GHC/CoreToStg/Prep.hs
- + testsuite/tests/core-to-stg/T24463.hs
- testsuite/tests/core-to-stg/all.T
Changes:
=====================================
compiler/GHC/Core/TyCo/Subst.hs
=====================================
@@ -20,7 +20,7 @@ module GHC.Core.TyCo.Subst
extendSubstInScope, extendSubstInScopeList, extendSubstInScopeSet,
extendTCvSubst, extendTCvSubstWithClone,
extendCvSubst, extendCvSubstWithClone,
- extendTvSubst, extendTvSubstBinderAndInScope, extendTvSubstWithClone,
+ extendTvSubst, extendTvSubstWithClone,
extendTvSubstList, extendTvSubstAndInScope,
extendTCvSubstList,
unionSubst, zipTyEnv, zipCoEnv,
@@ -372,13 +372,6 @@ extendTvSubst (Subst in_scope ids tvs cvs) tv ty
= assert (isTyVar tv) $
Subst in_scope ids (extendVarEnv tvs tv ty) cvs
-extendTvSubstBinderAndInScope :: Subst -> PiTyBinder -> Type -> Subst
-extendTvSubstBinderAndInScope subst (Named (Bndr v _)) ty
- = assert (isTyVar v )
- extendTvSubstAndInScope subst v ty
-extendTvSubstBinderAndInScope subst (Anon {}) _
- = subst
-
extendTvSubstWithClone :: Subst -> TyVar -> TyVar -> Subst
-- Adds a new tv -> tv mapping, /and/ extends the in-scope set with the clone
-- Does not look in the kind of the new variable;
=====================================
compiler/GHC/Core/Type.hs
=====================================
@@ -205,8 +205,7 @@ module GHC.Core.Type (
zapSubst, getSubstInScope, setInScope, getSubstRangeTyCoFVs,
extendSubstInScope, extendSubstInScopeList, extendSubstInScopeSet,
extendTCvSubst, extendCvSubst,
- extendTvSubst, extendTvSubstBinderAndInScope,
- extendTvSubstList, extendTvSubstAndInScope,
+ extendTvSubst, extendTvSubstList, extendTvSubstAndInScope,
extendTCvSubstList,
extendTvSubstWithClone,
extendTCvSubstWithClone,
=====================================
compiler/GHC/CoreToStg/Prep.hs
=====================================
@@ -35,6 +35,7 @@ import GHC.Core.Utils
import GHC.Core.Opt.Arity
import GHC.Core.Lint ( EndPassConfig(..), endPassIO )
import GHC.Core
+import GHC.Core.Subst
import GHC.Core.Make hiding( FloatBind(..) ) -- We use our own FloatBind here
import GHC.Core.Type
import GHC.Core.Coercion
@@ -56,7 +57,6 @@ import GHC.Utils.Logger
import GHC.Types.Demand
import GHC.Types.Var
-import GHC.Types.Var.Env
import GHC.Types.Id
import GHC.Types.Id.Info
import GHC.Types.Id.Make ( realWorldPrimId )
@@ -103,25 +103,7 @@ The goal of this pass is to prepare for code generation.
5. ANF-isation results in additional bindings that can obscure values.
We float these out; see Note [Floating in CorePrep].
-6. Clone all local Ids.
- This means that all such Ids are unique, rather than the
- weaker guarantee of no clashes which the simplifier provides.
- And that is what the code generator needs.
-
- We don't clone TyVars or CoVars. The code gen doesn't need that,
- and doing so would be tiresome because then we'd need
- to substitute in types and coercions.
-
- We need to clone ids for two reasons:
- + Things associated with labels in the final code must be truly unique in
- order to avoid labels being shadowed in the final output.
- + Even binders without info tables like function arguments or alternative
- bound binders must be unique at least in their type/unique combination.
- We only emit a single declaration for each binder when compiling to C
- so if binders are not unique we would either get duplicate declarations
- or misstyped variables. The later happend in #22402.
- + We heavily use unique-keyed maps in the backend which can go wrong when
- ids with the same unique are meant to represent the same variable.
+6. Clone all local Ids. See Note [Cloning in CorePrep]
7. Give each dynamic CCall occurrence a fresh unique; this is
rather like the cloning step above.
@@ -178,6 +160,65 @@ Here is the syntax of the Core produced by CorePrep:
We define a synonym for each of these non-terminals. Functions
with the corresponding name produce a result in that syntax.
+
+Note [Cloning in CorePrep]
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+In CorePrep we
+* Always clone non-CoVar Ids, so each has a unique Unique
+* Sometimes clone CoVars and TyVars
+
+We always clone non-CoVarIds, for three reasons
+
+1. Things associated with labels in the final code must be truly unique in
+ order to avoid labels being shadowed in the final output.
+
+2. Even binders without info tables like function arguments or alternative
+ bound binders must be unique at least in their type/unique combination.
+ We only emit a single declaration for each binder when compiling to C
+ so if binders are not unique we would either get duplicate declarations
+ or misstyped variables. The later happend in #22402.
+
+3. We heavily use unique-keyed maps in the backend which can go wrong when
+ ids with the same unique are meant to represent the same variable.
+
+Generally speaking we don't clone TyVars or CoVars. The code gen doesn't need
+that (they are erased), and doing so would be tiresome because then we'd need
+to substitute in types and coercions. But sometimes need to: see
+Note [Cloning CoVars and TyVars]
+
+Note [Cloning CoVars and TyVars]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Normally we don't need to clone TyVars and CoVars, but there is one occasion
+when we do (see #24463). When we have
+ case unsafeEqualityProof ... of UnsafeRefl g -> ...
+we try to float it, using UnsafeEqualityCase.
+Why? See (U3) in Note [Implementing unsafeCoerce]
+
+Alas, floating it widens the scope of `g`, and that led to catastrophe in
+#24463, when two identically-named g's shadowed.
+
+Solution: clone `g`; see `cpCloneCoVarBndr`.
+
+BUT once we clone `g` we must apply the cloning substitution to all types
+and coercions. But that in turn means that, given a binder like
+ /\ (a :: kind |> g). blah
+we must substitute in a's kind, and hence need to substitute for `a`
+itself in `blah`.
+
+So our plan is:
+ * Maintain a full Subst in `cpe_subst`
+
+ * Clone a CoVar when we we meet an `isUnsafeEqualityCase`;
+ otherwise TyVar/CoVar binders are never cloned.
+
+ * So generally the TCvSubst is empty
+
+ * Apply the substitution to type and coercion arguments in Core; but
+ happily `substTy` has a no-op short-cut for an empty TCvSubst, so this
+ is usually very cheap.
+
+ * In `cpCloneBndr`, for a tyvar/covar binder, check for an empty substitution;
+ in that case just do nothing
-}
type CpeArg = CoreExpr -- Non-terminal 'arg'
@@ -763,10 +804,10 @@ cpeRhsE :: CorePrepEnv -> CoreExpr -> UniqSM (Floats, CpeRhs)
-- For example
-- f (g x) ===> ([v = g x], f v)
-cpeRhsE _ (Type ty)
- = return (emptyFloats, Type ty)
-cpeRhsE _ (Coercion co)
- = return (emptyFloats, Coercion co)
+cpeRhsE env (Type ty)
+ = return (emptyFloats, Type (cpSubstTy env ty))
+cpeRhsE env (Coercion co)
+ = return (emptyFloats, Coercion (cpSubstCo env co))
cpeRhsE env expr@(Lit lit)
| LitNumber LitNumBigNat i <- lit
= cpeBigNatLit env i
@@ -799,7 +840,7 @@ cpeRhsE env (Tick tickish expr)
cpeRhsE env (Cast expr co)
= do { (floats, expr') <- cpeRhsE env expr
- ; return (floats, Cast expr' co) }
+ ; return (floats, Cast expr' (cpSubstCo env co)) }
cpeRhsE env expr@(Lam {})
= do { let (bndrs,body) = collectBinders expr
@@ -807,7 +848,7 @@ cpeRhsE env expr@(Lam {})
; body' <- cpeBodyNF env' body
; return (emptyFloats, mkLams bndrs' body') }
-cpeRhsE env (Case scrut bndr _ alts@[Alt con bs _])
+cpeRhsE env (Case scrut bndr _ alts@[Alt con [covar] _])
-- See (U3) in Note [Implementing unsafeCoerce]
-- We need make the Case float, otherwise we get
-- let x = case ... of UnsafeRefl co ->
@@ -823,14 +864,18 @@ cpeRhsE env (Case scrut bndr _ alts@[Alt con bs _])
-- (such as `print003`).
| Just rhs <- isUnsafeEqualityCase scrut bndr alts
= do { (floats_scrut, scrut) <- cpeBody env scrut
- ; (env, bndr) <- cpCloneBndr env bndr
- ; (env, bs) <- cpCloneBndrs env bs
+
+ ; (env, bndr') <- cpCloneBndr env bndr
+ ; (env, covar') <- cpCloneCoVarBndr env covar
+ -- Important: here we clone the CoVar
+ -- See Note [Cloning CoVars and TyVars]
+
-- Up until here this should do exactly the same as the regular code
-- path of `cpeRhsE Case{}`.
; (floats_rhs, rhs) <- cpeBody env rhs
-- ... but we want to float `floats_rhs` as in (U3) so that rhs' might
-- become a value
- ; let case_float = UnsafeEqualityCase scrut bndr con bs
+ ; let case_float = UnsafeEqualityCase scrut bndr' con [covar']
-- NB: It is OK to "evaluate" the proof eagerly.
-- Usually there's the danger that we float the unsafeCoerce out of
-- a branching Case alt. Not so here, because the regular code path
@@ -849,7 +894,7 @@ cpeRhsE env (Case scrut bndr ty alts)
where err = mkImpossibleExpr ty "cpeRhsE: missing case alternative"
; alts'' <- mapM (sat_alt env') alts'
- ; return (floats, Case scrut' bndr2 ty alts'') }
+ ; return (floats, Case scrut' bndr2 (cpSubstTy env ty) alts'') }
where
sat_alt env (Alt con bs rhs)
= do { (env2, bs') <- cpCloneBndrs env bs
@@ -1184,10 +1229,14 @@ cpeApp top_env expr
in rebuild_app' env (a : as) tick_fun floats ss rt_ticks req_depth
CpeApp (Type arg_ty)
- -> rebuild_app' env as (App fun' (Type arg_ty)) floats ss rt_ticks req_depth
+ -> rebuild_app' env as (App fun' (Type arg_ty')) floats ss rt_ticks req_depth
+ where
+ arg_ty' = cpSubstTy env arg_ty
CpeApp (Coercion co)
- -> rebuild_app' env as (App fun' (Coercion co)) floats (drop 1 ss) rt_ticks req_depth
+ -> rebuild_app' env as (App fun' (Coercion co')) floats (drop 1 ss) rt_ticks req_depth
+ where
+ co' = cpSubstCo env co
CpeApp arg -> do
let (ss1, ss_rest) -- See Note [lazyId magic] in GHC.Types.Id.Make
@@ -1199,7 +1248,10 @@ cpeApp top_env expr
rebuild_app' env as (App fun' arg') (fs `zipFloats` floats) ss_rest rt_ticks (req_depth-1)
CpeCast co
- -> rebuild_app' env as (Cast fun' co) floats ss rt_ticks req_depth
+ -> rebuild_app' env as (Cast fun' co') floats ss rt_ticks req_depth
+ where
+ co' = cpSubstCo env co
+
-- See Note [Ticks and mandatory eta expansion]
CpeTick tickish
| tickishPlace tickish == PlaceRuntime
@@ -2220,6 +2272,7 @@ binding for data constructors; see Note [Data constructor workers].
Note [CorePrep inlines trivial CoreExpr not Id]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+TODO
Why does cpe_env need to be an IdEnv CoreExpr, as opposed to an
IdEnv Id? Naively, we might conjecture that trivial updatable thunks
as per Note [Inlining in CorePrep] always have the form
@@ -2259,11 +2312,11 @@ data CorePrepEnv
-- the case where a function we think should bottom
-- unexpectedly returns.
- , cpe_env :: IdEnv CoreExpr -- Clone local Ids
- -- ^ This environment is used for three operations:
+ , cpe_subst :: Subst
+ -- ^ The IdEnv part of the substitution is used for three operations:
--
-- 1. To support cloning of local Ids so that they are
- -- all unique (see item (6) of CorePrep overview).
+ -- all unique (see Note [Cloning in CorePrep])
--
-- 2. To support beta-reduction of runRW, see
-- Note [runRW magic] and Note [runRW arg].
@@ -2271,6 +2324,9 @@ data CorePrepEnv
-- 3. To let us inline trivial RHSs of non top-level let-bindings,
-- see Note [lazyId magic], Note [Inlining in CorePrep]
-- and Note [CorePrep inlines trivial CoreExpr not Id] (#12076)
+ --
+ -- The TyCoVar part of the substitution is used only for
+ -- Note [Cloning CoVars and TyVars]
, cpe_rec_ids :: UnVarSet -- Faster OutIdSet; See Note [Speculative evaluation]
}
@@ -2278,33 +2334,48 @@ data CorePrepEnv
mkInitialCorePrepEnv :: CorePrepConfig -> CorePrepEnv
mkInitialCorePrepEnv cfg = CPE
{ cpe_config = cfg
- , cpe_env = emptyVarEnv
+ , cpe_subst = emptySubst
, cpe_rec_ids = emptyUnVarSet
}
extendCorePrepEnv :: CorePrepEnv -> Id -> Id -> CorePrepEnv
-extendCorePrepEnv cpe id id'
- = cpe { cpe_env = extendVarEnv (cpe_env cpe) id (Var id') }
+extendCorePrepEnv cpe@(CPE { cpe_subst = subst }) id id'
+ = cpe { cpe_subst = subst2 }
+ where
+ subst1 = extendSubstInScope subst id'
+ subst2 = extendIdSubst subst1 id (Var id')
+
+extendCorePrepEnvList :: CorePrepEnv -> [(Id,Id)] -> CorePrepEnv
+extendCorePrepEnvList cpe@(CPE { cpe_subst = subst }) prs
+ = cpe { cpe_subst = subst2 }
+ where
+ subst1 = extendSubstInScopeList subst (map snd prs)
+ subst2 = extendIdSubstList subst1 [(id, Var id') | (id,id') <- prs]
extendCorePrepEnvExpr :: CorePrepEnv -> Id -> CoreExpr -> CorePrepEnv
extendCorePrepEnvExpr cpe id expr
- = cpe { cpe_env = extendVarEnv (cpe_env cpe) id expr }
-
-extendCorePrepEnvList :: CorePrepEnv -> [(Id,Id)] -> CorePrepEnv
-extendCorePrepEnvList cpe prs
- = cpe { cpe_env = extendVarEnvList (cpe_env cpe)
- (map (\(id, id') -> (id, Var id')) prs) }
+ = cpe { cpe_subst = extendIdSubst (cpe_subst cpe) id expr }
lookupCorePrepEnv :: CorePrepEnv -> Id -> CoreExpr
lookupCorePrepEnv cpe id
- = case lookupVarEnv (cpe_env cpe) id of
- Nothing -> Var id
- Just exp -> exp
+ = case lookupIdSubst_maybe (cpe_subst cpe) id of
+ Just e -> e
+ Nothing -> Var id
+ -- Do not use GHC.Core.Subs.lookupIdSubst because that is a no-op on GblIds;
+ -- and Tidy has made top-level externally-visible Ids into GblIds
enterRecGroupRHSs :: CorePrepEnv -> [OutId] -> CorePrepEnv
enterRecGroupRHSs env grp
= env { cpe_rec_ids = extendUnVarSetList grp (cpe_rec_ids env) }
+cpSubstTy :: CorePrepEnv -> Type -> Type
+cpSubstTy (CPE { cpe_subst = subst }) ty = substTy subst ty
+ -- substTy has a short-cut if the TCvSubst is empty
+
+cpSubstCo :: CorePrepEnv -> Coercion -> Coercion
+cpSubstCo (CPE { cpe_subst = subst }) co = substCo subst co
+ -- substCo has a short-cut if the TCvSubst is empty
+
------------------------------------------------------------------------------
-- Cloning binders
-- ---------------------------------------------------------------------------
@@ -2312,12 +2383,31 @@ enterRecGroupRHSs env grp
cpCloneBndrs :: CorePrepEnv -> [InVar] -> UniqSM (CorePrepEnv, [OutVar])
cpCloneBndrs env bs = mapAccumLM cpCloneBndr env bs
+cpCloneCoVarBndr :: CorePrepEnv -> InVar -> UniqSM (CorePrepEnv, OutVar)
+-- Clone the CoVar
+-- See Note [Cloning CoVars and TyVars]
+cpCloneCoVarBndr env@(CPE { cpe_subst = subst }) covar
+ = assertPpr (isCoVar covar) (ppr covar) $
+ do { uniq <- getUniqueM
+ ; let covar1 = setVarUnique covar uniq
+ covar2 = updateVarType (substTy subst) covar1
+ subst1 = extendTCvSubstWithClone subst covar covar2
+ ; return (env { cpe_subst = subst1 }, covar2) }
+
cpCloneBndr :: CorePrepEnv -> InVar -> UniqSM (CorePrepEnv, OutVar)
-cpCloneBndr env bndr
- | isTyCoVar bndr
- = return (env, bndr)
- | otherwise
- = do { bndr' <- clone_it bndr
+-- See Note [Cloning in CorePrep]
+cpCloneBndr env@(CPE { cpe_subst = subst }) bndr
+ | isTyCoVar bndr -- See Note [Cloning CoVars and TyVars]
+ = if isEmptyTCvSubst subst -- The common case
+ then return (env { cpe_subst = extendSubstInScope subst bndr }, bndr)
+ else -- No need to clone the Unique; but we must apply the substitution
+ let bndr1 = updateVarType (substTy subst) bndr
+ subst1 = extendTCvSubstWithClone subst bndr bndr1
+ in return (env { cpe_subst = subst1 }, bndr1)
+
+ | otherwise -- A non-CoVar Id
+ = do { bndr1 <- clone_it bndr
+ ; let bndr2 = updateIdTypeAndMult (substTy subst) bndr1
-- Drop (now-useless) rules/unfoldings
-- See Note [Drop unfoldings and rules]
@@ -2327,10 +2417,10 @@ cpCloneBndr env bndr
; let !unfolding' = trimUnfolding (realIdUnfolding bndr)
-- Simplifier will set the Id's unfolding
- bndr'' = bndr' `setIdUnfolding` unfolding'
- `setIdSpecialisation` emptyRuleInfo
+ bndr3 = bndr2 `setIdUnfolding` unfolding'
+ `setIdSpecialisation` emptyRuleInfo
- ; return (extendCorePrepEnv env bndr bndr'', bndr'') }
+ ; return (extendCorePrepEnv env bndr bndr3, bndr3) }
where
clone_it bndr
| isLocalId bndr
=====================================
testsuite/tests/core-to-stg/T24463.hs
=====================================
@@ -0,0 +1,16 @@
+{-# LANGUAGE GADTs #-}
+{-# LANGUAGE PatternSynonyms #-}
+{-# LANGUAGE ViewPatterns #-}
+
+module T24463 where
+
+import Unsafe.Coerce (unsafeCoerce)
+
+data Term where
+ BinaryTerm :: !arg1 -> !arg2 -> Term
+
+f :: Term -> (b, c)
+f (BinaryTerm t1 t2) = (unsafeCoerce t1, unsafeCoerce t2)
+
+pattern P :: b -> c -> Term
+pattern P t1 t2 <- (f -> (t1, t2))
=====================================
testsuite/tests/core-to-stg/all.T
=====================================
@@ -4,3 +4,4 @@ test('T19700', normal, compile, ['-O'])
test('T23270', [grep_errmsg(r'patError')], compile, ['-O0 -dsuppress-uniques -ddump-prep'])
test('T23914', normal, compile, ['-O'])
test('T14895', normal, compile, ['-O -ddump-stg-final -dno-typeable-binds -dsuppress-uniques'])
+test('T24463', normal, compile, ['-O'])
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/9d38bfa0c0f910208822579acaa999f87c2f8c65
--
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/9d38bfa0c0f910208822579acaa999f87c2f8c65
You're receiving this email because of your account on gitlab.haskell.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240416/95a65a4b/attachment-0001.html>
More information about the ghc-commits
mailing list