[Git][ghc/ghc][master] 2 commits: Improve duplicate elimination in SpecConstr

Marge Bot (@marge-bot) gitlab at gitlab.haskell.org
Fri Dec 8 10:48:01 UTC 2023



Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC


Commits:
28811f88 by Simon Peyton Jones at 2023-12-08T05:47:18-05:00
Improve duplicate elimination in SpecConstr

This partially fixes #24229.

See the new Note [Pattern duplicate elimination] in SpecConstr

- - - - -
fec7894f by Simon Peyton Jones at 2023-12-08T05:47:18-05:00
Make SpecConstr deal with casts better

This patch does two things, to fix #23209:

* It improves SpecConstr so that it no longer quantifies over
  coercion variables.  See Note [SpecConstr and casts]

* It improves the rule matcher to deal nicely with the case where
  the rule does not quantify over coercion variables, but the the
  template has a cast in it.  See Note [Casts in the template]

- - - - -


16 changed files:

- compiler/GHC/Core/Opt/Simplify/Env.hs
- compiler/GHC/Core/Opt/SpecConstr.hs
- compiler/GHC/Core/Rules.hs
- compiler/GHC/Core/SimpleOpt.hs
- compiler/GHC/Core/Subst.hs
- compiler/GHC/Core/TyCo/Subst.hs
- compiler/GHC/Core/Type.hs
- compiler/GHC/Core/Unify.hs
- compiler/GHC/Tc/Utils/TcType.hs
- + testsuite/tests/simplCore/should_compile/T23209.hs
- + testsuite/tests/simplCore/should_compile/T23209_Aux.hs
- + testsuite/tests/simplCore/should_compile/T24229a.hs
- + testsuite/tests/simplCore/should_compile/T24229a.stderr
- + testsuite/tests/simplCore/should_compile/T24229b.hs
- + testsuite/tests/simplCore/should_compile/T24229b.stderr
- testsuite/tests/simplCore/should_compile/all.T


Changes:

=====================================
compiler/GHC/Core/Opt/Simplify/Env.hs
=====================================
@@ -1237,9 +1237,8 @@ See also Note [Return type for join points] and Note [Join points and case-of-ca
 -}
 
 getSubst :: SimplEnv -> Subst
-getSubst (SimplEnv { seInScope = in_scope, seTvSubst = tv_env
-                      , seCvSubst = cv_env })
-  = mkSubst in_scope tv_env cv_env emptyIdSubstEnv
+getSubst (SimplEnv { seInScope = in_scope, seTvSubst = tv_env, seCvSubst = cv_env })
+  = mkTCvSubst in_scope tv_env cv_env
 
 substTy :: HasDebugCallStack => SimplEnv -> Type -> Type
 substTy env ty = Type.substTy (getSubst env) ty


=====================================
compiler/GHC/Core/Opt/SpecConstr.hs
=====================================
@@ -67,7 +67,6 @@ import GHC.Types.Unique.FM
 import GHC.Types.Unique( hasKey )
 
 import GHC.Data.Maybe     ( orElse, catMaybes, isJust, isNothing )
-import GHC.Data.Pair
 import GHC.Data.FastString
 
 import GHC.Utils.Misc
@@ -81,8 +80,8 @@ import GHC.Builtin.Names ( specTyConKey )
 import GHC.Exts( SpecConstrAnnotation(..) )
 import GHC.Serialized   ( deserializeWithData )
 
-import Control.Monad    ( zipWithM )
-import Data.List (nubBy, sortBy, partition, dropWhileEnd, mapAccumL )
+import Control.Monad
+import Data.List ( sortBy, partition, dropWhileEnd, mapAccumL )
 import Data.Maybe( mapMaybe )
 import Data.Ord( comparing )
 import Data.Tuple
@@ -2246,7 +2245,7 @@ Wrinkles:
 
 * The list of argument patterns, cp_args, is no longer than the
   visible lambdas of the binding, ri_arg_occs.  This is done via
-  the zipWithM in callToPats.
+  the zipWithM in callToPat.
 
 * The list of argument patterns can certainly be shorter than the
   lambdas in the function definition (under-saturated).  For example
@@ -2256,7 +2255,7 @@ Wrinkles:
 
 * In fact we deliberately shrink the list of argument patterns,
   cp_args, by trimming off all the boring ones at the end (see
-  `dropWhileEnd is_boring` in callToPats).  Since the RULE only
+  `dropWhileEnd is_boring` in callToPat).  Since the RULE only
   applies when it is saturated, this shrinking makes the RULE more
   applicable.  But it does mean that the argument patterns do not
   necessarily saturate the lambdas of the function.
@@ -2299,63 +2298,48 @@ Note [SpecConstr and casts]
 Consider (#14270) a call like
 
     let f = e
-    in ... f (K @(a |> co)) ...
+    in ... f (K @(a |> cv)) ...
 
-where 'co' is a coercion variable not in scope at f's definition site.
+where 'cv' is a coercion variable not in scope at f's definition site.
 If we aren't careful we'll get
 
-    let $sf a co = e (K @(a |> co))
-        RULE "SC:f" forall a co.  f (K @(a |> co)) = $sf a co
+    let $sf a cv = e (K @(a |> cv))
+        RULE "SC:f" forall a cv.  f (K @(a |> cv)) = $sf a co
         f = e
     in ...
 
-But alas, when we match the call we won't bind 'co', because type-matching
-(for good reasons) discards casts).
-
-I don't know how to solve this, so for now I'm just discarding any
-call patterns that
-  * Mentions a coercion variable in a type argument
-  * That is not in scope at the binding of the function
-
-I think this is very rare.
-
-It is important (e.g. #14936) that this /only/ applies to
-coercions mentioned in casts.  We don't want to be discombobulated
-by casts in terms!  For example, consider
-   f ((e1,e2) |> sym co)
-where, say,
-   f  :: Foo -> blah
-   co :: Foo ~R (Int,Int)
-
-Here we definitely do want to specialise for that pair!  We do not
-match on the structure of the coercion; instead we just match on a
-coercion variable, so the RULE looks like
-
-   forall (x::Int, y::Int, co :: (Int,Int) ~R Foo)
-     f ((x,y) |> co) = $sf x y co
-
-Often the body of f looks like
-   f arg = ...(case arg |> co' of
-                (x,y) -> blah)...
-
-so that the specialised f will turn into
-   $sf x y co = let arg = (x,y) |> co
-                in ...(case arg>| co' of
-                         (x,y) -> blah)....
-
-which will simplify to not use 'co' at all.  But we can't guarantee
-that co will end up unused, so we still pass it.  Absence analysis
-may remove it later.
-
-Note that this /also/ discards the call pattern if we have a cast in a
-/term/, although in fact Rules.match does make a very flaky and
-fragile attempt to match coercions.  e.g. a call like
-    f (Maybe Age) (Nothing |> co) blah
-    where co :: Maybe Int ~ Maybe Age
-will be discarded.  It's extremely fragile to match on the form of a
-coercion, so I think it's better just not to try.  A more complicated
-alternative would be to discard calls that mention coercion variables
-only in kind-casts, but I'm doing the simple thing for now.
+But alas, when we match the call we may fail to bind 'co', because the rule
+matcher in GHC.Core.Rules cannot reliably bind coercion variables that appear
+in casts (see Note [Casts in the template] in GHC.Core.Rules).
+
+This seems intractable (see #23209). So:
+
+* Key point: we /never/ quantify over coercion variables in a SpecConstr rule.
+  If we would need to quantify over a coercion variable, we just discard the
+  call pattern. See the test for `bad_covars` in callToPat.
+
+* However (#14936) we /do/ still allow casts in call patterns. For example
+     f ((e1,e2) |> sym co)
+  where, say,
+     f  :: Foo -> blah   -- Foo is a newtype
+     f = f_rhs
+     co :: Foo ~R (Int,Int)
+  We want to specialise on that pair!
+
+So for our function f, we might generate
+  RULE forall x y.  f ((x,y) |> co) = $sf x y
+  $sf x y = f_rhs ((x,y) |> co)
+
+This works provided the free vars of `co` are either in-scope at the
+definition of `f`, or quantified. For the latter, suppose `f` was polymorphic:
+
+     f2  :: Foo2 a -> blah   -- Foo is a newtype
+     f2 = f2_rhs
+     co2 :: Foo a ~R (a,a)
+
+Then it's fine for `co2` to mention `a`.  We'll get
+  RULE forall a (x::a) (y::a).  f2 @a ((x,y) |> co2) = $sf2 a x y
+  $sf2 @a x y = f2_rhs ((x,y) |> co2)
 -}
 
 data CallPat = CP { cp_qvars :: [Var]           -- Quantified variables
@@ -2381,19 +2365,23 @@ callsToNewPats :: ScEnv -> Id
 -- The "New" in the name means "patterns that are not already covered
 -- by an existing specialisation"
 callsToNewPats env fn spec_info@(SI { si_specs = done_specs }) bndr_occs calls
-  = do  { mb_pats <- mapM (callToPats env bndr_occs) calls
+  = do  { mb_pats <- mapM (callToPat env bndr_occs) calls
 
         ; let have_boring_call = any isNothing mb_pats
 
               good_pats :: [CallPat]
               good_pats = catMaybes mb_pats
 
+              in_scope = getSubstInScope (sc_subst env)
+
               -- Remove patterns we have already done
               new_pats = filterOut is_done good_pats
-              is_done p = any (samePat p . os_pat) done_specs
+              is_done p = any is_better done_specs
+                 where
+                   is_better done = betterPat in_scope (os_pat done) p
 
               -- Remove duplicates
-              non_dups = nubBy samePat new_pats
+              non_dups = subsumePats in_scope new_pats
 
               -- Remove ones that have too many worker variables
               small_pats = filterOut too_many_worker_args non_dups
@@ -2410,6 +2398,10 @@ callsToNewPats env fn spec_info@(SI { si_specs = done_specs }) bndr_occs calls
               (pats_were_discarded, trimmed_pats) = trim_pats env fn spec_info small_pats
 
 --        ; pprTraceM "callsToPats" (vcat [ text "calls to" <+> ppr fn <> colon <+> ppr calls
+--                                        , text "good_pats:" <+> ppr good_pats
+--                                        , text "new_pats:" <+> ppr new_pats
+--                                        , text "non_dups:" <+> ppr non_dups
+--                                        , text "small_pats:" <+> ppr small_pats
 --                                        , text "done_specs:" <+> ppr (map os_pat done_specs)
 --                                        , text "trimmed_pats:" <+> ppr trimmed_pats ])
 
@@ -2477,12 +2469,12 @@ trim_pats env fn (SI { si_n_specs = done_spec_count }) pats
                , text "Discarding:" <+> ppr (drop n_remaining sorted_pats) ]
 
 
-callToPats :: ScEnv -> [ArgOcc] -> Call -> UniqSM (Maybe CallPat)
+callToPat :: ScEnv -> [ArgOcc] -> Call -> UniqSM (Maybe CallPat)
         -- The [Var] is the variables to quantify over in the rule
         --      Type variables come first, since they may scope
         --      over the following term variables
         -- The [CoreExpr] are the argument patterns for the rule
-callToPats env bndr_occs call@(Call fn args con_env)
+callToPat env bndr_occs call@(Call fn args con_env)
   = do  { let in_scope = getSubstInScope (sc_subst env)
 
         ; arg_tripples <- zipWith3M (argToPat env in_scope con_env) args bndr_occs (map (const NotMarkedStrict) args)
@@ -2513,32 +2505,25 @@ callToPats env bndr_occs call@(Call fn args con_env)
                 -- See Note [Free type variables of the qvar types]
                 -- See Note [Shadowing] at the top
 
-              (ktvs, ids)   = partition isTyVar qvars
-              qvars'        = scopedSort ktvs ++ map sanitise ids
+              (qktvs, qids) = partition isTyVar qvars
+              qvars'        = scopedSort qktvs ++ map sanitise qids
                 -- Order into kind variables, type variables, term variables
                 -- The kind of a type variable may mention a kind variable
                 -- and the type of a term variable may mention a type variable
 
-              sanitise id   = updateIdTypeAndMult expandTypeSynonyms id
+              sanitise id = updateIdTypeAndMult expandTypeSynonyms id
                 -- See Note [Free type variables of the qvar types]
 
-
         -- Check for bad coercion variables: see Note [SpecConstr and casts]
-        ; let bad_covars :: CoVarSet
-              bad_covars = mapUnionVarSet get_bad_covars pats
-              get_bad_covars :: CoreArg -> CoVarSet
-              get_bad_covars (Type ty) = filterVarSet bad_covar (tyCoVarsOfType ty)
-              get_bad_covars _         = emptyVarSet
-              bad_covar v = isId v && not (is_in_scope v)
-
-        ; warnPprTrace (not (isEmptyVarSet bad_covars))
+        ; let bad_covars = filter isCoVar qids
+        ; warnPprTrace (not (null bad_covars))
               "SpecConstr: bad covars"
               (ppr bad_covars $$ ppr call) $
 
-          if interesting && isEmptyVarSet bad_covars
+          if interesting && null bad_covars
           then do { let cp_res = CP { cp_qvars = qvars', cp_args = pats
                                     , cp_strict_args = concat cbv_ids }
---                  ; pprTraceM "callToPatsOut" $
+--                  ; pprTraceM "callToPatOut" $
 --                    vcat [ text "fn:" <+> ppr fn
 --                         , text "args:" <+> ppr args
 --                         , text "bndr_occs:" <+> ppr bndr_occs
@@ -2606,39 +2591,16 @@ argToPat1 env in_scope val_env (Let _ arg) arg_occ arg_str
         -- Here we can specialise for f (v,w)
         -- because the rule-matcher will look through the let.
 
-{- Disabled; see Note [Matching cases] in "GHC.Core.Rules"
-argToPat env in_scope val_env (Case scrut _ _ [(_, _, rhs)]) arg_occ
-  | exprOkForSpeculation scrut  -- See Note [Matching cases] in "GHC.Core.Rules"
-  = argToPat env in_scope val_env rhs arg_occ
--}
-
+   -- Casts: see Note [SpecConstr and casts]
 argToPat1 env in_scope val_env (Cast arg co) arg_occ arg_str
   | not (ignoreType env ty2)
   = do  { (interesting, arg', strict_args) <- argToPat env in_scope val_env arg arg_occ arg_str
         ; if not interesting then
                 wildCardPat ty2 arg_str
-          else do
-        { -- Make a wild-card pattern for the coercion
-          uniq <- getUniqueM
-        ; let co_name = mkSysTvName uniq (fsLit "sg")
-              co_var  = mkCoVar co_name (mkCoercionType Representational ty1 ty2)
-        ; return (interesting, Cast arg' (mkCoVarCo co_var), strict_args) } }
+          else
+                return (interesting, Cast arg' co, strict_args) }
   where
-    Pair ty1 ty2 = coercionKind co
-
-
-
-{-      Disabling lambda specialisation for now
-        It's fragile, and the spec_loop can be infinite
-argToPat in_scope val_env arg arg_occ
-  | is_value_lam arg
-  = return (True, arg)
-  where
-    is_value_lam (Lam v e)         -- Spot a value lambda, even if
-        | isId v       = True      -- it is inside a type lambda
-        | otherwise    = is_value_lam e
-    is_value_lam other = False
--}
+    ty2 = coercionRKind co
 
   -- Check for a constructor application
   -- NB: this *precedes* the Var case, so that we catch nullary constrs
@@ -2727,6 +2689,25 @@ argToPat1 env in_scope val_env (Var v) arg_occ arg_str
         --       f x y = letrec g z = ... in g (x,y)
         -- We don't want to specialise for that *particular* x,y
 
+
+{- Disabled; see Note [Matching cases] in "GHC.Core.Rules"
+argToPat env in_scope val_env (Case scrut _ _ [(_, _, rhs)]) arg_occ
+  | exprOkForSpeculation scrut  -- See Note [Matching cases] in "GHC.Core.Rules"
+  = argToPat env in_scope val_env rhs arg_occ
+-}
+
+{-      Disabling lambda specialisation for now
+        It's fragile, and the spec_loop can be infinite
+argToPat in_scope val_env arg arg_occ
+  | is_value_lam arg
+  = return (True, arg)
+  where
+    is_value_lam (Lam v e)         -- Spot a value lambda, even if
+        | isId v       = True      -- it is inside a type lambda
+        | otherwise    = is_value_lam e
+    is_value_lam other = False
+-}
+
   -- The default case: make a wild-card
   -- We use this for coercions too
 argToPat1 _env _in_scope _val_env arg _arg_occ arg_str
@@ -2790,40 +2771,69 @@ valueIsWorkFree :: Value -> Bool
 valueIsWorkFree LambdaVal       = True
 valueIsWorkFree (ConVal _ args) = all exprIsWorkFree args
 
-samePat :: CallPat -> CallPat -> Bool
-samePat (CP { cp_qvars = vs1, cp_args = as1 })
-        (CP { cp_qvars = vs2, cp_args = as2 })
-  = all2 same as1 as2
+betterPat :: InScopeSet -> CallPat -> CallPat -> Bool
+-- pat1    f @a   (Just @a   (x::a))
+--      is better than
+-- pat2    f @Int (Just @Int (x::Int))
+-- That is, we can instantiate pat1 to get pat2
+-- See Note [Pattern duplicate elimination]
+betterPat is (CP { cp_qvars = vs1, cp_args = as1 })
+             (CP { cp_qvars = vs2, cp_args = as2 })
+  = case matchExprs ise vs1 as1 as2 of
+      Just (_, ms) -> all exprIsTrivial ms
+      Nothing      -> False
+  where
+    ise = ISE (is `extendInScopeSetList` vs2) (const noUnfolding)
+
+subsumePats :: InScopeSet -> [CallPat] -> [CallPat]
+-- Remove any patterns subsumed by others
+-- See Note [Pattern duplicate elimination]
+subsumePats is pats = foldr add [] pats
   where
-    -- If the args are the same, their strictness marks will be too so we don't compare those.
-    same (Var v1) (Var v2)
-        | v1 `elem` vs1 = v2 `elem` vs2
-        | v2 `elem` vs2 = False
-        | otherwise     = v1 == v2
-
-    same (Lit l1)    (Lit l2)    = l1==l2
-    same (App f1 a1) (App f2 a2) = same f1 f2 && same a1 a2
-
-    same (Type {}) (Type {}) = True     -- Note [Ignore type differences]
-    same (Coercion {}) (Coercion {}) = True
-    same (Tick _ e1) e2 = same e1 e2  -- Ignore casts and notes
-    same (Cast e1 _) e2 = same e1 e2
-    same e1 (Tick _ e2) = same e1 e2
-    same e1 (Cast e2 _) = same e1 e2
-
-    same e1 e2 = warnPprTrace (bad e1 || bad e2) "samePat" (ppr e1 $$ ppr e2) $
-                 False  -- Let, lambda, case should not occur
-    bad (Case {}) = True
-    bad (Let {})  = True
-    bad (Lam {})  = True
-    bad _other    = False
+    add :: CallPat -> [CallPat] -> [CallPat]
+    add ci [] = [ci]
+    add ci1 (ci2:cis) | betterPat is ci2 ci1 = ci2:cis
+                      | betterPat is ci1 ci2 = ci1:cis
+                      | otherwise             = ci2 : add ci1 cis
 
 {-
-Note [Ignore type differences]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-We do not want to generate specialisations where the call patterns
-differ only in their type arguments!  Not only is it utterly useless,
-but it also means that (with polymorphic recursion) we can generate
-an infinite number of specialisations. Example is Data.Sequence.adjustTree,
-I think.
+Note [Pattern duplicate elimination]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Consider f :: (a,a) -> blah, and two calls
+   f @Int  (x,y)
+   f @Bool (p,q)
+
+The danger is that we'll generate two *essentially identical* specialisations,
+both for pairs, but with different types instantiating `a` (see #24229).
+
+But we'll only make a `CallPat` for an argument (a,b) if `foo` scrutinises
+that argument.  So SpecConstr should never need to specialise f's polymorphic
+type arguments.  Even with only one of these calls we should be able to
+generalise to the `CallPat`
+
+  cp_qvars = [a, r::a, s::a], cp_args = [@a (r,s)]
+
+Doing so isn't trivial, though.
+
+For now we content ourselves with a simpler plan: eliminate a call pattern
+if another pattern subsumes it; this is done by `subsumePats`.
+For example here are two patterns
+
+  cp_qvars = [a, r::a, s::a],  cp_args = [@a (r,s)]
+  cp_qvars = [x::Int, y::Int], cp_args = [@Int (x,y)]
+
+The first can be instantiated to the second, /by instantiating types only/.
+This subsumption relationship is checked by `betterPat`.  Note that if
+we have
+
+  cp_qvars = [a, r::a, s::a], cp_args = [@a (r,s)]
+  cp_qvars = [],              cp_args = [@Bool (True,False)]
+
+the first does *not* subsume the second; the second is more specific.
+
+In our initial example with `f @Int` and `f @Bool` neither subsumes the other,
+so we will get two essentially-identical specialisations. Boo.  We rely on our
+crude throttling mechanisms to stop this getting out of control -- with
+polymorphic recursion we can generate an infinite number of specialisations.
+Example is Data.Sequence.adjustTree, I think.
 -}


=====================================
compiler/GHC/Core/Rules.hs
=====================================
@@ -9,7 +9,7 @@
 -- The 'CoreRule' datatype itself is declared elsewhere.
 module GHC.Core.Rules (
         -- ** Looking up rules
-        lookupRule,
+        lookupRule, matchExprs,
 
         -- ** RuleBase, RuleEnv
         RuleBase, RuleEnv(..), mkRuleEnv, emptyRuleEnv,
@@ -86,6 +86,7 @@ import GHC.Data.Maybe
 import GHC.Data.Bag
 import GHC.Data.List.SetOps( hasNoDups )
 
+import GHC.Utils.FV( filterFV, fvVarSet )
 import GHC.Utils.Misc as Utils
 import GHC.Utils.Outputable
 import GHC.Utils.Panic
@@ -720,15 +721,23 @@ matchN  :: InScopeEnv
 -- trailing ones, returning the result of applying the rule to a prefix
 -- of the actual arguments.
 
-matchN (ISE in_scope id_unf) rule_name tmpl_vars tmpl_es target_es rhs
+matchN ise _rule_name tmpl_vars tmpl_es target_es rhs
+  = do { (bind_wrapper, matched_es) <- matchExprs ise tmpl_vars tmpl_es target_es
+       ; return (bind_wrapper $
+                 mkLams tmpl_vars rhs `mkApps` matched_es) }
+
+matchExprs :: InScopeEnv -> [Var] -> [CoreExpr] -> [CoreExpr]
+           -> Maybe (BindWrapper, [CoreExpr])  -- 1-1 with the [Var]
+matchExprs (ISE in_scope id_unf) tmpl_vars tmpl_es target_es
   = do  { rule_subst <- match_exprs init_menv emptyRuleSubst tmpl_es target_es
         ; let (_, matched_es) = mapAccumL (lookup_tmpl rule_subst)
                                           (mkEmptySubst in_scope) $
                                 tmpl_vars `zip` tmpl_vars1
-              bind_wrapper = rs_binds rule_subst
+
+        ; let bind_wrapper = rs_binds rule_subst
                              -- Floated bindings; see Note [Matching lets]
-       ; return (bind_wrapper $
-                 mkLams tmpl_vars rhs `mkApps` matched_es) }
+
+        ; return (bind_wrapper, matched_es) }
   where
     (init_rn_env, tmpl_vars1) = mapAccumL rnBndrL (mkRnEnv2 in_scope) tmpl_vars
                   -- See Note [Cloning the template binders]
@@ -739,7 +748,7 @@ matchN (ISE in_scope id_unf) rule_name tmpl_vars tmpl_es target_es rhs
                    , rv_unf   = id_unf }
 
     lookup_tmpl :: RuleSubst -> Subst -> (InVar,OutVar) -> (Subst, CoreExpr)
-                   -- Need to return a RuleSubst solely for the benefit of mk_fake_ty
+                   -- Need to return a RuleSubst solely for the benefit of fake_ty
     lookup_tmpl (RS { rs_tv_subst = tv_subst, rs_id_subst = id_subst })
                 tcv_subst (tmpl_var, tmpl_var1)
         | isId tmpl_var1
@@ -768,7 +777,6 @@ matchN (ISE in_scope id_unf) rule_name tmpl_vars tmpl_es target_es rhs
     unbound tmpl_var
        = pprPanic "Template variable unbound in rewrite rule" $
          vcat [ text "Variable:" <+> ppr tmpl_var <+> dcolon <+> ppr (varType tmpl_var)
-              , text "Rule" <+> pprRuleName rule_name
               , text "Rule bndrs:" <+> ppr tmpl_vars
               , text "LHS args:" <+> ppr tmpl_es
               , text "Actual args:" <+> ppr target_es ]
@@ -960,45 +968,78 @@ where 'co' is non-reflexive, we simply fail.  You might wonder about
 but the Simplifer pushes the casts in an application to to the
 right, if it can, so this doesn't really arise.
 
-Note [Coercion arguments]
-~~~~~~~~~~~~~~~~~~~~~~~~~
-What if we have (f co) in the template, where the 'co' is a coercion
-argument to f?  Right now we have nothing in place to ensure that a
-coercion /argument/ in the template is a variable.  We really should,
-perhaps by abstracting over that variable.
-
-C.f. the treatment of dictionaries in GHC.HsToCore.Binds.decompseRuleLhs.
-
-For now, though, we simply behave badly, by failing in match_co.
-We really should never rely on matching the structure of a coercion
-(which is just a proof).
-
 Note [Casts in the template]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Consider the definition
+This Note concerns `matchTemplateCast`.  Consider the definition
   f x = e,
 and SpecConstr on call pattern
   f ((e1,e2) |> co)
 
-We'll make a RULE
+The danger is that We'll make a RULE
    RULE forall a,b,g.  f ((a,b)|> g) = $sf a b g
    $sf a b g = e[ ((a,b)|> g) / x ]
 
-So here is the invariant:
+This requires the rule-matcher to bind the coercion variable `g`.
+That is Very Deeply Suspicious:
+
+* It would be unreasonable to match on a structured coercion in a pattern,
+  such as    RULE   forall g.  f (x |> Sym g) = ...
+  because the strucure of a coercion is arbitrary and may change -- it's their
+  /type/ that matters.
+
+* We considered insisting that in a template, in a cast (e |> co), the the cast
+  `co` is always a /variable/ cv.  That looks a bit more plausible, but #23209
+  (and related tickets) shows that it's very fragile.  For example suppose `e`
+  is a variable `f`, and the simplifier has an unconditional substitution
+     [f :-> g |> co2]
+  Now the rule LHS becomes (f |> (co2 ; cv)); not a coercion variable any more!
+
+In short, it is Very Deeply Suspicious for a rule to quantify over a coercion
+variable.  And SpecConstr no longer does so: see Note [SpecConstr and casts] in
+SpecConstr.
 
-  In the template, in a cast (e |> co),
-  the cast `co` is always a /variable/.
+It is, however, OK for a cast to appear in a template.  For example
+    newtype N a = MkN (a,a)    -- Axiom ax:N a :: (a,a) ~R N a
+    f :: N a -> bah
+    RULE forall b x:b y:b. f @b ((x,y) |> (axN @b)) = ...
 
-Matching should bind that variable to an actual coercion, so that we
-can use it in $sf.  So a Cast on the LHS (the template) calls
-match_co, which succeeds when the template cast is a variable -- which
-it always is.  That is why match_co has so few cases.
+When matching we can just move these casts to the other side:
+    match (tmpl |> co) tgt  -->   match tmpl (tgt |> sym co)
+See matchTemplateCast.
+
+Wrinkles:
+
+(CT1) We need to be careful about scoping, and to match left-to-right, so that we
+  know the substitution [a :-> b] before we meet (co :: (a,a) ~R N a), and so we
+  can apply that substitition
+
+(CT2) Annoyingly, we still want support one case in which the RULE quantifies
+  over a coercion variable: the dreaded map/coerce RULE.
+  See Note [Getting the map/coerce RULE to work] in GHC.Core.SimpleOpt.
+
+  Since that can happen, matchTemplateCast laboriously checks whether the
+  coercion mentions a template coercion variable; and if so does the Very Deeply
+  Suspicious `match_co` instead.  It works fine for map/coerce, where the
+  coercion is always a variable and will (robustly) remain so.
 
 See also
 * Note [Coercion arguments]
 * Note [Matching coercion variables] in GHC.Core.Unify.
 * Note [Cast swizzling on rule LHSs] in GHC.Core.Opt.Simplify.Utils:
   sm_cast_swizzle is switched off in the template of a RULE
+
+Note [Coercion arguments]
+~~~~~~~~~~~~~~~~~~~~~~~~~
+What if we have (f (Coercion co)) in the template, where the 'co' is a coercion
+argument to f?  Right now we have nothing in place to ensure that a
+coercion /argument/ in the template is a variable.  We really should,
+perhaps by abstracting over that variable.
+
+C.f. the treatment of dictionaries in GHC.HsToCore.Binds.decompseRuleLhs.
+
+For now, though, we simply behave badly, by failing in match_co.
+We really should never rely on matching the structure of a coercion
+(which is just a proof).
 -}
 
 ----------------------
@@ -1060,14 +1101,7 @@ match renv subst e1 (Cast e2 co2) mco
     -- This is important: see Note [Cancel reflexive casts]
 
 match renv subst (Cast e1 co1) e2 mco
-  = -- See Note [Casts in the template]
-    do { let co2 = case mco of
-                     MRefl   -> mkRepReflCo (exprType e2)
-                     MCo co2 -> co2
-       ; subst1 <- match_co renv subst co1 co2
-         -- If match_co succeeds, then (exprType e1) = (exprType e2)
-         -- Hence the MRefl in the next line
-       ; match renv subst1 e1 e2 MRefl }
+  = matchTemplateCast renv subst e1 co1 e2 mco
 
 ------------------------ Literals ---------------------
 match _ subst (Lit lit1) (Lit lit2) mco
@@ -1290,7 +1324,7 @@ match renv subst (Lam x1 e1) e2 mco
         in_scope_env = ISE in_scope (rv_unf renv)
         -- extendInScopeSetSet: The InScopeSet of rn_env is not necessarily
         -- a superset of the free vars of e2; it is only guaranteed a superset of
-        -- applyng the (rnEnvR rn_env) substitution to e2. But exprIsLambda_maybe
+        -- applying the (rnEnvR rn_env) substitution to e2. But exprIsLambda_maybe
         -- wants an in-scope set that includes all the free vars of its argument.
         -- Hence adding adding (exprFreeVars casted_e2) to the in-scope set (#23630)
   , Just (x2, e2', ts) <- exprIsLambda_maybe in_scope_env casted_e2
@@ -1449,6 +1483,40 @@ Hence
 -}
 
 -------------
+matchTemplateCast
+    :: RuleMatchEnv -> RuleSubst
+    -> CoreExpr -> Coercion
+    -> CoreExpr -> MCoercion
+    -> Maybe RuleSubst
+matchTemplateCast renv subst e1 co1 e2 mco
+  | isEmptyVarSet $ fvVarSet $
+    filterFV (`elemVarSet` rv_tmpls renv) $    -- Check that the coercion does not
+    tyCoFVsOfCo substed_co                     -- mention any of the template variables
+  = -- This is the good path
+    -- See Note [Casts in the template]
+    match renv subst e1 e2 (checkReflexiveMCo (mkTransMCoL mco (mkSymCo substed_co)))
+
+  | otherwise
+  = -- This is the Deeply Suspicious Path
+    do { let co2 = case mco of
+                     MRefl   -> mkRepReflCo (exprType e2)
+                     MCo co2 -> co2
+       ; subst1 <- match_co renv subst co1 co2
+         -- If match_co succeeds, then (exprType e1) = (exprType e2)
+         -- Hence the MRefl in the next line
+       ; match renv subst1 e1 e2 MRefl }
+  where
+    substed_co = substCo current_subst co1
+
+    current_subst :: Subst
+    current_subst = mkTCvSubst (rnInScopeSet (rv_lcl renv))
+                               (rs_tv_subst subst)
+                               emptyCvSubstEnv
+       -- emptyCvSubstEnv: ugh!
+       -- If there were any CoVar substitutions they would be in
+       -- rs_id_subst; but we don't expect there to be any; see
+       -- Note [Casts in the template]
+
 match_co :: RuleMatchEnv
          -> RuleSubst
          -> Coercion


=====================================
compiler/GHC/Core/SimpleOpt.hs
=====================================
@@ -818,35 +818,40 @@ The naive core produced for this is
 
 This matches literal uses of `map coerce` in code, but that's not what we
 want. We want it to match, say, `map MkAge` (where newtype Age = MkAge Int)
-too. Some of this is addressed by compulsorily unfolding coerce on the LHS,
-yielding
+too.  Achieving all this is surprisingly tricky:
 
-  forall a b (dict :: Coercible * a b).
-    map @a @b (\(x :: a) -> case dict of
-      MkCoercible (co :: a ~R# b) -> x |> co) = ...
+(MC1) We must compulsorily unfold MkAge to a cast.
+      See Note [Compulsory newtype unfolding] in GHC.Types.Id.Make
 
-Getting better. But this isn't exactly what gets produced. This is because
-Coercible essentially has ~R# as a superclass, and superclasses get eagerly
-extracted during solving. So we get this:
+(MC2) We must compulsorily unfolding coerce on the rule LHS, yielding
+        forall a b (dict :: Coercible * a b).
+          map @a @b (\(x :: a) -> case dict of
+            MkCoercible (co :: a ~R# b) -> x |> co) = ...
 
-  forall a b (dict :: Coercible * a b).
-    case Coercible_SCSel @* @a @b dict of
-      _ [Dead] -> map @a @b (\(x :: a) -> case dict of
-                               MkCoercible (co :: a ~R# b) -> x |> co) = ...
-
-Unfortunately, this still abstracts over a Coercible dictionary. We really
-want it to abstract over the ~R# evidence. So, we have Desugar.unfold_coerce,
-which transforms the above to (see also Note [Desugaring coerce as cast] in
-Desugar)
-
-  forall a b (co :: a ~R# b).
-    let dict = MkCoercible @* @a @b co in
-    case Coercible_SCSel @* @a @b dict of
-      _ [Dead] -> map @a @b (\(x :: a) -> case dict of
-         MkCoercible (co :: a ~R# b) -> x |> co) = let dict = ... in ...
-
-Now, we need simpleOptExpr to fix this up. It does so by taking three
-separate actions:
+  Getting better. But this isn't exactly what gets produced. This is because
+  Coercible essentially has ~R# as a superclass, and superclasses get eagerly
+  extracted during solving. So we get this:
+
+    forall a b (dict :: Coercible * a b).
+      case Coercible_SCSel @* @a @b dict of
+        _ [Dead] -> map @a @b (\(x :: a) -> case dict of
+                                 MkCoercible (co :: a ~R# b) -> x |> co) = ...
+
+  Unfortunately, this still abstracts over a Coercible dictionary. We really
+  want it to abstract over the ~R# evidence. So, we have Desugar.unfold_coerce,
+  which transforms the above to
+  Desugar)
+
+    forall a b (co :: a ~R# b).
+      let dict = MkCoercible @* @a @b co in
+      case Coercible_SCSel @* @a @b dict of
+        _ [Dead] -> map @a @b (\(x :: a) -> case dict of
+           MkCoercible (co :: a ~R# b) -> x |> co) = let dict = ... in ...
+
+  See Note [Desugaring coerce as cast] in GHC.HsToCore
+
+(MC3) Now, we need simpleOptExpr to fix this up. It does so by taking three
+  separate actions:
   1. Inline certain non-recursive bindings. The choice whether to inline
      is made in simple_bind_pair. Note the rather specific check for
      MkCoercible in there.
@@ -858,6 +863,10 @@ separate actions:
      just packed and inline them. This is also done in simple_opt_expr's
      `go` function.
 
+(MC4) The map/coerce rule is the only compelling reason for having a RULE that
+  quantifies over a coercion variable, something that is otherwise Very Deeply
+  Suspicous.  See Note [Casts in the template] in GHC.Core.Rules. Ugh!
+
 This is all a fair amount of special-purpose hackery, but it's for
 a good cause. And it won't hurt other RULES and such that it comes across.
 


=====================================
compiler/GHC/Core/Subst.hs
=====================================
@@ -19,7 +19,7 @@ module GHC.Core.Subst (
         substTickish, substDVarSet, substIdInfo,
 
         -- ** Operations on substitutions
-        emptySubst, mkEmptySubst, mkSubst, mkOpenSubst, isEmptySubst,
+        emptySubst, mkEmptySubst, mkTCvSubst, mkOpenSubst, isEmptySubst,
         extendIdSubst, extendIdSubstList, extendTCvSubst, extendTvSubstList,
         extendIdSubstWithClone,
         extendSubst, extendSubstList, extendSubstWithVar,


=====================================
compiler/GHC/Core/TyCo/Subst.hs
=====================================
@@ -14,7 +14,7 @@ module GHC.Core.TyCo.Subst
         Subst(..), TvSubstEnv, CvSubstEnv, IdSubstEnv,
         emptyIdSubstEnv, emptyTvSubstEnv, emptyCvSubstEnv, composeTCvSubst,
         emptySubst, mkEmptySubst, isEmptyTCvSubst, isEmptySubst,
-        mkSubst, mkTvSubst, mkCvSubst, mkIdSubst,
+        mkTCvSubst, mkTvSubst, mkCvSubst, mkIdSubst,
         getTvSubstEnv, getIdSubstEnv,
         getCvSubstEnv, getSubstInScope, setInScope, getSubstRangeTyCoFVs,
         isInScope, elemSubst, notElemSubst, zapSubst,
@@ -271,8 +271,8 @@ isEmptyTCvSubst :: Subst -> Bool
 isEmptyTCvSubst (Subst _ _ tv_env cv_env)
   = isEmptyVarEnv tv_env && isEmptyVarEnv cv_env
 
-mkSubst :: InScopeSet -> TvSubstEnv -> CvSubstEnv -> IdSubstEnv -> Subst
-mkSubst in_scope tvs cvs ids = Subst in_scope ids tvs cvs
+mkTCvSubst :: InScopeSet -> TvSubstEnv -> CvSubstEnv -> Subst
+mkTCvSubst in_scope tvs cvs = Subst in_scope emptyIdSubstEnv tvs cvs
 
 mkIdSubst :: InScopeSet -> IdSubstEnv -> Subst
 mkIdSubst in_scope ids = Subst in_scope ids emptyTvSubstEnv emptyCvSubstEnv


=====================================
compiler/GHC/Core/Type.hs
=====================================
@@ -198,7 +198,7 @@ module GHC.Core.Type (
         -- ** Manipulating type substitutions
         emptyTvSubstEnv, emptySubst, mkEmptySubst,
 
-        mkSubst, zipTvSubst, mkTvSubstPrs,
+        mkTCvSubst, zipTvSubst, mkTvSubstPrs,
         zipTCvSubst,
         notElemSubst,
         getTvSubstEnv,


=====================================
compiler/GHC/Core/Unify.hs
=====================================
@@ -1481,7 +1481,7 @@ getSubst :: UMEnv -> UM Subst
 getSubst env = do { tv_env <- getTvSubstEnv
                   ; cv_env <- getCvSubstEnv
                   ; let in_scope = rnInScopeSet (um_rn_env env)
-                  ; return (mkSubst in_scope tv_env cv_env emptyIdSubstEnv) }
+                  ; return (mkTCvSubst in_scope tv_env cv_env) }
 
 extendTvEnv :: TyVar -> Type -> UM ()
 extendTvEnv tv ty = UM $ \state ->


=====================================
compiler/GHC/Tc/Utils/TcType.hs
=====================================
@@ -166,7 +166,7 @@ module GHC.Tc.Utils.TcType (
   extendSubstInScopeList, extendSubstInScopeSet, extendTvSubstAndInScope,
   Type.lookupTyVar, Type.extendTCvSubst, Type.substTyVarBndr,
   Type.extendTvSubst,
-  isInScope, mkSubst, mkTvSubst, zipTyEnv, zipCoEnv,
+  isInScope, mkTCvSubst, mkTvSubst, zipTyEnv, zipCoEnv,
   Type.substTy, substTys, substScaledTys, substTyWith, substTyWithCoVars,
   substTyAddInScope,
   substTyUnchecked, substTysUnchecked, substScaledTyUnchecked,


=====================================
testsuite/tests/simplCore/should_compile/T23209.hs
=====================================
@@ -0,0 +1,12 @@
+{-# LANGUAGE UnboxedTuples #-}
+{-# OPTIONS_GHC -O2 #-}
+
+-- This gave a Lint crash
+
+module T23209 where
+
+import T23209_Aux
+
+f a = let w = if a then Allocator (ArrayWriter s)
+                   else Allocator (ArrayWriter e)
+      in case combine w w of


=====================================
testsuite/tests/simplCore/should_compile/T23209_Aux.hs
=====================================
@@ -0,0 +1,19 @@
+{-# LANGUAGE UnboxedTuples #-}
+{-# OPTIONS_GHC -O #-}
+module T23209_Aux where
+
+newtype I = MkI { uI :: () -> () }
+newtype ArrayWriter = ArrayWriter (() -> I)
+data Allocator = Allocator !ArrayWriter
+
+combine :: Allocator -> Allocator -> (# () -> () #)
+combine (Allocator (ArrayWriter w1)) (Allocator (ArrayWriter w2)) =
+  (# \s -> id' (uI (w1 ()) (uI (w2 ()) s)) #)
+
+e, s :: () -> I
+e x = MkI id
+s x = MkI id
+{-# NOINLINE s #-}
+
+id' :: () -> ()
+id' x = x


=====================================
testsuite/tests/simplCore/should_compile/T24229a.hs
=====================================
@@ -0,0 +1,14 @@
+module T24229a where
+
+newtype N a = MkN a
+
+foo :: Int -> N (a,a) -> Maybe (a,a)
+foo 0 (MkN p)     = Just p
+foo n (MkN (x,y)) = foo (n-1) (MkN (y,x))
+
+-- We should generate ONE specialisation for $wfoo,
+-- and it should fire TWICE, regardless of the order
+-- of the following two definitions.
+
+wombat1 = foo 20 (MkN ("yes", "no"))
+wombat2 xs ys = foo 3 (MkN (xs, ys))


=====================================
testsuite/tests/simplCore/should_compile/T24229a.stderr
=====================================
@@ -0,0 +1,38 @@
+
+==================== Tidy Core ====================
+Result size of Tidy Core = {terms: 79, types: 106, coercions: 8, joins: 0/0}
+
+Rec {
+foo_$s$wfoo
+  = \ @a sc sc1 sc2 ->
+      case sc2 of ds {
+        __DEFAULT -> foo_$s$wfoo sc1 sc (-# ds 1#);
+        0# -> (# (sc, sc1) #)
+      }
+end Rec }
+
+foo
+  = \ @a ds ds1 ->
+      case ds of { I# ww ->
+      case ww of ds2 {
+        __DEFAULT -> case ds1 `cast` <Co:4> :: ... of { (x, y) -> case foo_$s$wfoo y x (-# ds2 1#) of { (# ww1 #) -> Just ww1 } };
+        0# -> Just (ds1 `cast` <Co:4> :: ...)
+      }
+      }
+
+wombat7 = "yes"#
+
+wombat6 = unpackCString# wombat7
+
+wombat5 = "no"#
+
+wombat4 = unpackCString# wombat5
+
+wombat1 = case foo_$s$wfoo wombat6 wombat4 20# of { (# ww #) -> Just ww }
+
+wombat8 = I# 3#
+
+wombat2 = \ @a xs ys -> case foo_$s$wfoo xs ys 3# of { (# ww #) -> Just ww }
+
+
+


=====================================
testsuite/tests/simplCore/should_compile/T24229b.hs
=====================================
@@ -0,0 +1,13 @@
+module T24229b where
+
+newtype N a = MkN a
+
+foo :: Int -> N (a,a) -> Maybe (a,a)
+foo 0 (MkN p)     = Just p
+foo n (MkN (x,y)) = foo (n-1) (MkN (y,x))
+
+-- We should generate ONE specialisation for $wfoo,
+-- and it should fire TWICE, regardless of the order
+-- of the following two definitions.
+
+wombat2 xs ys = foo 3 (MkN (xs, ys))


=====================================
testsuite/tests/simplCore/should_compile/T24229b.stderr
=====================================
@@ -0,0 +1,28 @@
+
+==================== Tidy Core ====================
+Result size of Tidy Core = {terms: 60, types: 83, coercions: 8, joins: 0/0}
+
+Rec {
+foo_$s$wfoo
+  = \ @a sc sc1 sc2 ->
+      case sc2 of ds {
+        __DEFAULT -> foo_$s$wfoo sc1 sc (-# ds 1#);
+        0# -> (# (sc, sc1) #)
+      }
+end Rec }
+
+foo
+  = \ @a ds ds1 ->
+      case ds of { I# ww ->
+      case ww of ds2 {
+        __DEFAULT -> case ds1 `cast` <Co:4> :: ... of { (x, y) -> case foo_$s$wfoo y x (-# ds2 1#) of { (# ww1 #) -> Just ww1 } };
+        0# -> Just (ds1 `cast` <Co:4> :: ...)
+      }
+      }
+
+wombat1 = I# 3#
+
+wombat2 = \ @a xs ys -> case foo_$s$wfoo xs ys 3# of { (# ww #) -> Just ww }
+
+
+


=====================================
testsuite/tests/simplCore/should_compile/all.T
=====================================
@@ -508,4 +508,6 @@ test('T24014', normal, compile, ['-dcore-lint'])
 test('T24029', normal, compile, [''])
 test('T21348', normal, compile, ['-O'])
 test('T21917', normal, compile, ['-O -fkeep-auto-rules -ddump-rules'])
-
+test('T23209', [extra_files(['T23209_Aux.hs'])], multimod_compile, ['T23209', '-v0 -O'])
+test('T24229a', [ grep_errmsg(r'wfoo') ], compile, ['-O2 -ddump-simpl -dno-typeable-binds -dsuppress-all -dsuppress-uniques -dppr-cols=99999'])
+test('T24229b', [ grep_errmsg(r'wfoo') ], compile, ['-O2 -ddump-simpl -dno-typeable-binds -dsuppress-all -dsuppress-uniques -dppr-cols=99999'])



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/6329d308eb00891674754c1f58f1ee2880305a36...fec7894f74a2f1c6f12f52dab82f8765c037b937

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/6329d308eb00891674754c1f58f1ee2880305a36...fec7894f74a2f1c6f12f52dab82f8765c037b937
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20231208/2cfb76ae/attachment-0001.html>


More information about the ghc-commits mailing list