[Git][ghc/ghc][wip/T19044] Fix #19044 by tweaking unification in inst lookup

Richard Eisenberg gitlab at gitlab.haskell.org
Wed Dec 16 19:48:17 UTC 2020



Richard Eisenberg pushed to branch wip/T19044 at Glasgow Haskell Compiler / GHC


Commits:
a65abc31 by Richard Eisenberg at 2020-12-16T14:47:39-05:00
Fix #19044 by tweaking unification in inst lookup

See Note [Infinitary substitution in lookup] in GHC.Core.InstEnv
and Note [Unification result] in GHC.Core.Unify.

Test case: typecheck/should_compile/T190{44,52}

Close #19044
Close #19052

- - - - -


5 changed files:

- compiler/GHC/Core/InstEnv.hs
- compiler/GHC/Core/Unify.hs
- + testsuite/tests/typecheck/should_compile/T19044.hs
- + testsuite/tests/typecheck/should_compile/T19052.hs
- testsuite/tests/typecheck/should_compile/all.T


Changes:

=====================================
compiler/GHC/Core/InstEnv.hs
=====================================
@@ -760,6 +760,49 @@ When we match this against D [ty], we return the instantiating types
 where the 'Nothing' indicates that 'b' can be freely instantiated.
 (The caller instantiates it to a flexi type variable, which will
  presumably later become fixed via functional dependencies.)
+
+Note [Infinitary substitution in lookup]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Consider
+
+  class C a b
+  instance C c c
+  instance C d (Maybe d)
+  [W] C e (Maybe e)
+
+You would think we could just use the second instance, because the first doesn't
+unify. But that's just ever so slightly wrong. The reason we check for unifiers
+along with matchers is that we don't want the possibility that a type variable
+instantiation could cause an instance choice to change. Yet if we have
+  type family M = Maybe M
+and choose (e |-> M), then both instances match. This is absurd, but we cannot
+rule it out. Yet, worrying about this case is awfully inconvenient to users,
+and so we pretend the problem doesn't exist, by considering a lookup that runs into
+this occurs-check issue to indicate that an instance surely does not apply (i.e.
+is like the SurelyApart case). In the brief time that we didn't treat infinitary
+substitutions specially, two tickets were filed: #19044 and #19052, both trying
+to do Real Work.
+
+Why don't we just exclude any instances that are MaybeApart? Because we might
+have a [W] C e (F e), where F is a type family. The second instance above does
+not match, but it should be included as a future possibility. Unification will
+return MaybeApart MARTypeFamily in this case.
+
+What can go wrong with this design choice? We might get incoherence -- but not
+loss of type safety. In particular, if we have [W] C M M (for the M type family
+above), then GHC might arbitrarily choose either instance, depending on how
+M reduces (or doesn't).
+
+For type families, we can't just ignore the problem (as we essentially do here),
+because doing so would give us a hole in the type safety proof (as explored in
+Section 6 of "Closed Type Families with Overlapping Equations", POPL'14). This
+possibility of an infinitary substitution manifests as closed type families that
+look like they should reduce, but don't. Users complain: #9082 and #17311. For
+open type families, we actually can have unsoundness if we don't take infinitary
+substitutions into account: #8162. But, luckily, for class instances, we just
+risk coherence -- not great, but it seems better to give users what they likely
+want. (Also, note that this problem existed for the entire decade of 201x without
+anyone noticing, so it's manifestly not ruining anyone's day.)
 -}
 
 -- |Look up an instance in the given instance environment. The given class application must match exactly
@@ -839,8 +882,10 @@ lookupInstEnv' ie vis_mods cls tys
           -- We consider MaybeApart to be a case where the instance might
           -- apply in the future. This covers an instance like C Int and
           -- a target like [W] C (F a), where F is a type family.
-            SurelyApart -> find ms us        rest
-            _           -> find ms (item:us) rest
+            SurelyApart              -> find ms us        rest
+              -- Note [Infinitary substitution in lookup]
+            MaybeApart MARInfinite _ -> find ms us        rest
+            _                        -> find ms (item:us) rest
       where
         tpl_tv_set = mkVarSet tpl_tvs
         tys_tv_set = tyCoVarsOfTypes tys


=====================================
compiler/GHC/Core/Unify.hs
=====================================
@@ -18,7 +18,7 @@ module GHC.Core.Unify (
         tcUnifyTy, tcUnifyTyKi, tcUnifyTys, tcUnifyTyKis,
         tcUnifyTysFG, tcUnifyTyWithTFs,
         BindFlag(..),
-        UnifyResult, UnifyResultM(..),
+        UnifyResult, UnifyResultM(..), MaybeApartReason(..),
 
         -- Matching a type against a lifted type (coercion)
         liftCoMatch,
@@ -55,8 +55,7 @@ import GHC.Data.FastString
 
 import Data.List ( mapAccumL )
 import Control.Monad
-import Control.Applicative hiding ( empty )
-import qualified Control.Applicative
+import qualified Data.Semigroup as S
 
 {-
 
@@ -347,6 +346,46 @@ complete. This means that, sometimes, a closed type family does not reduce
 when it should. See test case indexed-types/should_fail/Overlap15 for an
 example.
 
+Note [Unificiation result]
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+When unifying t1 ~ t2, we return
+* Unifiable s, if s is a substitution such that s(t1) is syntactically the
+  same as s(t2), modulo type-synonym expansion.
+* SurelyApart, if there is no substitution s such that s(t1) = s(t2),
+  where "=" includes type-family reductions.
+* MaybeApart mar s, when we aren't sure. `mar` is a MaybeApartReason.
+
+Examples
+* [a] ~ Maybe b: SurelyApart, because [] and Maybe can't unify
+* [(a,Int)] ~ [(Bool,b)]:  Unifiable
+* [F Int] ~ [Bool]: MaybeApart MARTypeFamily, because F Int might reduce to Bool (the unifier
+                    does not try this)
+* a ~ Maybe a: MaybeApart MARInfinite. Not Unifiable clearly, but not SurelyApart either; consider
+       a := Loop
+       where  type family Loop where Loop = Maybe Loop
+
+There is the possibility that two types are MaybeApart for *both* reasons:
+
+* (a, F Int) ~ (Maybe a, Bool)
+
+What reason should we use? The *only* consumer of the reason is described
+in Note [Infinitary substitution in lookup] in GHC.Core.InstEnv. The goal
+there is identify which instances might match a target later (but don't
+match now) -- except that we want to ignore the possibility of infinitary
+substitutions. So let's examine a concrete scenario:
+
+  class C a b c
+  instance C a (Maybe a) Bool
+  -- other instances, including one that will actually match
+  [W] C b b (F Int)
+
+Do we want the instance as a future possibility? No. The only way that
+instance can match is in the presence of an infinite type (infinitely
+nested Maybes). We thus say that MARInfinite takes precedence, so that
+InstEnv treats this case as an infinitary substitution case; the fact
+that a type family is involved is only incidental. We thus define
+the Semigroup instance for MaybeApartReason to prefer MARInfinite.
+
 Note [The substitution in MaybeApart]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 The constructor MaybeApart carries data with it, typically a TvSubstEnv. Why?
@@ -391,8 +430,8 @@ tcUnifyTyWithTFs twoWay t1 t2
   = case tc_unify_tys (const BindMe) twoWay True False
                        rn_env emptyTvSubstEnv emptyCvSubstEnv
                        [t1] [t2] of
-      Unifiable  (subst, _) -> Just $ maybe_fix subst
-      MaybeApart (subst, _) -> Just $ maybe_fix subst
+      Unifiable          (subst, _) -> Just $ maybe_fix subst
+      MaybeApart _reason (subst, _) -> Just $ maybe_fix subst
       -- we want to *succeed* in questionable cases. This is a
       -- pre-unification algorithm.
       SurelyApart      -> Nothing
@@ -431,36 +470,42 @@ tcUnifyTyKis bind_fn tys1 tys2
 -- This type does double-duty. It is used in the UM (unifier monad) and to
 -- return the final result. See Note [Fine-grained unification]
 type UnifyResult = UnifyResultM TCvSubst
+
+-- | See Note [Unificiation result]
 data UnifyResultM a = Unifiable a        -- the subst that unifies the types
-                    | MaybeApart a       -- the subst has as much as we know
+                    | MaybeApart MaybeApartReason
+                                 a       -- the subst has as much as we know
                                          -- it must be part of a most general unifier
                                          -- See Note [The substitution in MaybeApart]
                     | SurelyApart
                     deriving Functor
 
+-- | Why are two types 'MaybeApart'? 'MARTypeFamily' takes precedence:
+-- This is used (only) in Note [Infinitary substitution in lookup] in GHC.Core.InstEnv
+data MaybeApartReason = MARTypeFamily   -- ^ matching e.g. F Int ~? Bool
+                      | MARInfinite     -- ^ matching e.g. a ~? Maybe a
+
+instance Outputable MaybeApartReason where
+  ppr MARTypeFamily = text "MARTypeFamily"
+  ppr MARInfinite   = text "MARInfinite"
+
+instance Semigroup MaybeApartReason where
+  -- see end of Note [Unification result] for why
+  MARTypeFamily <> r = r
+  MARInfinite   <> _ = MARInfinite
+
 instance Applicative UnifyResultM where
   pure  = Unifiable
   (<*>) = ap
 
 instance Monad UnifyResultM where
-
   SurelyApart  >>= _ = SurelyApart
-  MaybeApart x >>= f = case f x of
-                         Unifiable y -> MaybeApart y
-                         other       -> other
+  MaybeApart r1 x >>= f = case f x of
+                            Unifiable y     -> MaybeApart r1 y
+                            MaybeApart r2 y -> MaybeApart (r1 S.<> r2) y
+                            SurelyApart     -> SurelyApart
   Unifiable x  >>= f = f x
 
-instance Alternative UnifyResultM where
-  empty = SurelyApart
-
-  a@(Unifiable {})  <|> _                 = a
-  _                 <|> b@(Unifiable {})  = b
-  a@(MaybeApart {}) <|> _                 = a
-  _                 <|> b@(MaybeApart {}) = b
-  SurelyApart       <|> SurelyApart       = SurelyApart
-
-instance MonadPlus UnifyResultM
-
 -- | @tcUnifyTysFG bind_tv tys1 tys2@ attepts to find a substitution @s@ (whose
 -- domain elements all respond 'BindMe' to @bind_tv@) such that
 -- @s(tys1)@ and that of @s(tys2)@ are equal, as witnessed by the returned
@@ -530,9 +575,9 @@ tc_unify_tys bind_fn unif inj_check match_kis rn_env tv_env cv_env tys1 tys2
     kis2 = map typeKind tys2
 
 instance Outputable a => Outputable (UnifyResultM a) where
-  ppr SurelyApart    = text "SurelyApart"
-  ppr (Unifiable x)  = text "Unifiable" <+> ppr x
-  ppr (MaybeApart x) = text "MaybeApart" <+> ppr x
+  ppr SurelyApart      = text "SurelyApart"
+  ppr (Unifiable x)    = text "Unifiable" <+> ppr x
+  ppr (MaybeApart r x) = text "MaybeApart" <+> ppr r <+> ppr x
 
 {-
 ************************************************************************
@@ -773,7 +818,7 @@ this, but we mustn't map a to anything else!)
 We thus must parameterize the algorithm over whether it's being used
 for an injectivity check (refrain from looking at non-injective arguments
 to type families) or not (do indeed look at those arguments).  This is
-implemented  by the uf_inj_tf field of UmEnv.
+implemented  by the um_inj_tf field of UMEnv.
 
 (It's all a question of whether or not to include equation (7) from Fig. 2
 of [ITF].)
@@ -994,7 +1039,7 @@ unify_ty env ty1 ty2 _kco
 
             ; unify_tys env inj_tys1 inj_tys2
             ; unless (um_inj_tf env) $ -- See (end of) Note [Specification of unification]
-              don'tBeSoSure $ unify_tys env noninj_tys1 noninj_tys2 }
+              don'tBeSoSure MARTypeFamily $ unify_tys env noninj_tys1 noninj_tys2 }
 
   | Just (tc1, _) <- mb_tc_app1
   , not (isGenerativeTyCon tc1 Nominal)
@@ -1002,7 +1047,7 @@ unify_ty env ty1 ty2 _kco
     --        because the (F ty1) behaves like a variable
     --        NB: if unifying, we have already dealt
     --            with the 'ty2 = variable' case
-  = maybeApart
+  = maybeApart MARTypeFamily
 
   | Just (tc2, _) <- mb_tc_app2
   , not (isGenerativeTyCon tc2 Nominal)
@@ -1010,7 +1055,7 @@ unify_ty env ty1 ty2 _kco
     -- E.g.   unify_ty [a] (F ty2) =  MaybeApart, when unifying (only)
     --        because the (F ty2) behaves like a variable
     --        NB: we have already dealt with the 'ty1 = variable' case
-  = maybeApart
+  = maybeApart MARTypeFamily
 
   where
     mb_tc_app1 = tcSplitTyConApp_maybe ty1
@@ -1120,7 +1165,8 @@ uVar env tv1 ty kco
                       -- this is because the range of the subst is the target
                       -- type, not the template type. So, just check for
                       -- normal type equality.
-                      guard ((ty' `mkCastTy` kco) `eqType` ty)
+                      unless ((ty' `mkCastTy` kco) `eqType` ty) $
+                      surelyApart
           Nothing  -> uUnrefined env tv1' ty ty kco } -- No, continue
 
 uUnrefined :: UMEnv
@@ -1190,7 +1236,7 @@ bindTv env tv1 ty2
         -- Make sure you include 'kco' (which ty2 does) #14846
         ; occurs <- occursCheck env tv1 free_tvs2
 
-        ; if occurs then maybeApart
+        ; if occurs then maybeApart MARInfinite
                     else extendTvEnv tv1 ty2 }
 
 occursCheck :: UMEnv -> TyVar -> VarSet -> UM Bool
@@ -1274,15 +1320,6 @@ instance Monad UM where
                   do { (state', v) <- unUM m state
                      ; unUM (k v) state' })
 
--- need this instance because of a use of 'guard' above
-instance Alternative UM where
-  empty     = UM (\_ -> Control.Applicative.empty)
-  m1 <|> m2 = UM (\state ->
-                  unUM m1 state <|>
-                  unUM m2 state)
-
-instance MonadPlus UM
-
 instance MonadFail UM where
     fail _   = UM (\_ -> SurelyApart) -- failed pattern match
 
@@ -1291,9 +1328,9 @@ initUM :: TvSubstEnv  -- subst to extend
        -> UM a -> UnifyResultM a
 initUM subst_env cv_subst_env um
   = case unUM um state of
-      Unifiable (_, subst)  -> Unifiable subst
-      MaybeApart (_, subst) -> MaybeApart subst
-      SurelyApart           -> SurelyApart
+      Unifiable (_, subst)    -> Unifiable subst
+      MaybeApart r (_, subst) -> MaybeApart r subst
+      SurelyApart             -> SurelyApart
   where
     state = UMState { um_tv_env = subst_env
                     , um_cv_env = cv_subst_env }
@@ -1333,9 +1370,7 @@ checkRnEnv :: UMEnv -> VarSet -> UM ()
 checkRnEnv env varset
   | isEmptyVarSet skol_vars           = return ()
   | varset `disjointVarSet` skol_vars = return ()
-  | otherwise                         = maybeApart
-               -- ToDo: why MaybeApart?
-               -- I think SurelyApart would be right
+  | otherwise                         = surelyApart
   where
     skol_vars = um_skols env
     -- NB: That isEmptyVarSet guard is a critical optimization;
@@ -1343,10 +1378,10 @@ checkRnEnv env varset
     -- the type, often saving quite a bit of allocation.
 
 -- | Converts any SurelyApart to a MaybeApart
-don'tBeSoSure :: UM () -> UM ()
-don'tBeSoSure um = UM $ \ state ->
+don'tBeSoSure :: MaybeApartReason -> UM () -> UM ()
+don'tBeSoSure r um = UM $ \ state ->
   case unUM um state of
-    SurelyApart -> MaybeApart (state, ())
+    SurelyApart -> MaybeApart r (state, ())
     other       -> other
 
 umRnOccL :: UMEnv -> TyVar -> TyVar
@@ -1358,8 +1393,8 @@ umRnOccR env v = rnOccR (um_rn_env env) v
 umSwapRn :: UMEnv -> UMEnv
 umSwapRn env = env { um_rn_env = rnSwap (um_rn_env env) }
 
-maybeApart :: UM ()
-maybeApart = UM (\state -> MaybeApart (state, ()))
+maybeApart :: MaybeApartReason -> UM ()
+maybeApart r = UM (\state -> MaybeApart r (state, ()))
 
 surelyApart :: UM a
 surelyApart = UM (\_ -> SurelyApart)


=====================================
testsuite/tests/typecheck/should_compile/T19044.hs
=====================================
@@ -0,0 +1,20 @@
+{-# LANGUAGE FlexibleInstances #-}
+{-# LANGUAGE MultiParamTypeClasses #-}
+module T19044 where
+
+class C a b where
+  m :: a -> b
+
+instance C a a where
+  m = id
+
+instance C a (Maybe a) where
+  m _ = Nothing
+
+f :: a -> Maybe a
+f = g
+  where
+    g x = h (m x) x
+
+h :: Maybe a -> a -> Maybe a
+h = const


=====================================
testsuite/tests/typecheck/should_compile/T19052.hs
=====================================
@@ -0,0 +1,14 @@
+{-# LANGUAGE AllowAmbiguousTypes, DataKinds, FlexibleInstances, KindSignatures, MultiParamTypeClasses, ScopedTypeVariables, TypeApplications, TypeFamilies, TypeOperators, UndecidableInstances #-}
+module Overlap where
+
+import Data.Kind (Type)
+
+class Sub (xs :: [Type]) (ys :: [Type]) where
+  subIndex :: Int
+instance {-# OVERLAPPING #-} Sub xs xs where
+  subIndex = 0
+instance (ys ~ (y ': ys'), Sub xs ys') => Sub xs ys where
+  subIndex = subIndex @xs @ys' + 1
+
+subIndex1 :: forall (x :: Type) (xs :: [Type]). Int
+subIndex1 = subIndex @xs @(x ': xs)


=====================================
testsuite/tests/typecheck/should_compile/all.T
=====================================
@@ -734,6 +734,8 @@ test('T17186', normal, compile, [''])
 test('CbvOverlap', normal, compile, [''])
 test('InstanceGivenOverlap', normal, compile, [''])
 test('InstanceGivenOverlap2', normal, compile, [''])
+test('T19044', normal, compile, [''])
+test('T19052', normal, compile, [''])
 test('LocalGivenEqs', normal, compile, [''])
 test('LocalGivenEqs2', normal, compile, [''])
 test('T18891', normal, compile, [''])



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/a65abc31007f57cff07a942eead7684c9eb6b857

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/a65abc31007f57cff07a942eead7684c9eb6b857
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20201216/eeca7274/attachment-0001.html>


More information about the ghc-commits mailing list