Overlapping instances in existentials
Simon Peyton-Jones
simonpj@microsoft.com
Thu, 19 Jun 2003 09:55:36 +0100
[I'm widening this to the Haskell list, because overlapping instances
are of general interest.]
| To determine (SubType y Value) which is just:
| (SubType y (Either Double BaseType))
|=20
| it seems to me that GHC should (has to?) use
| instance (SubType a b) =3D> SubType a (Either x b)
|=20
| I do not see how the other alternative is applicable:
| instance SubType a (Either a x)
Well, the "target" (Subtype y (Either Double BaseType))
can match against *both* of these instances.
It can match
instance (SubType a b) =3D> SubType a (Either x b)
by taking a=3Dy, x=3DDouble, b=3DBaseType
*If y were instantiated to Double* it could match the second instance
declaration too. Now, you say, y is an existential type variable, so it
can't be instantiated to Double --- but that's tricky to pin down.
Consider:
class Show a where show :: a -> String
instance Show a =3D> Show [a] where...
instance Show [Char] where...
f :: Show a =3D> [a] -> String
f xs =3D show xs
main =3D print (f "hello")
When doing type inference for 'f' we don't know what 'a' will be
instantiated to. Picking the first is arguably wrong if f is called at
type String (as it is here). So GHC and Hugs both bleat.
Now your situation is an *existential* type variable, not a universal
one. But still, if y *was* Double in a particular situation, then the
other instance decl "should" apply, perhaps. Or maybe not.
GHC simply does not have a satisfactory solution to the
overlapping-instance question. Nor, I think, does Hugs. A good start
would be for someone to write a "guide to the problems/design-space of
overlapping instances". (There is a ton of back email on the Haskell
list about this as raw material.) People regularly ask for extensions
and/or changes to the way GHC deals with overlap, which I mostly resist,
not purely out of laziness -- it's just that I don't want to meddle
until we can move to something that is plainly better.=20
I enclose a long comment from GHC's source code that Jeff Lewis wrote on
this topic too.
Sorry not to be more helpful. It's a bit of a swamp.
Simon=20
> | >> test.hs:26:
> | >> Could not unambiguously deduce (SubType x Value)
> | >> from the context (SubType x BaseType)
> | >> The choice of (overlapping) instance declaration
> | >> depends on the instantiation of `x'
> | >> Probable fix:
> | >> Add (SubType x Value)
> | >> to the existential context of a data constructor
> | >> Or add an instance declaration for (SubType x Value)
> | >> arising from use of `inj' at test.hs:26
> | >> In the definition of `test': inj x
> | >> Failed, modules loaded: none.
> | >> Prelude>
> |
> |
> | -- -----------------------------------------------------------
> | -- test.hs
> | -- -----------------------------------------------------------
> | module Test where
> |
> | import Control.Monad.Error
> |
> | class SubType a b where
> | inj :: a -> b
> | prj :: b -> Maybe a
> |
> | instance SubType a (Either a x) where
> | inj =3D Left
> | prj =3D either Just (const Nothing)
> |
> | instance (SubType a b) =3D> SubType a (Either x b) where
> | inj =3D Right . inj
> | prj =3D either (const Nothing) prj
> |
> | -- -----------------------------------------------------------
> |
> | type BaseType =3D Either Integer ( Either Bool () )
> |
> | type Value =3D (Either Double BaseType)
> |
> | data Foo =3D forall x. (SubType x BaseType) =3D> MkFoo x
> |
> | test :: Foo -> Value
> | test (MkFoo x) =3D inj x
\subsection{Avoiding a problem with overlapping}
Consider this little program:
\begin{pseudocode}
class C a where c :: a
class C a =3D> D a where d :: a
instance C Int where c =3D 17
instance D Int where d =3D 13
instance C a =3D> C [a] where c =3D [c]
instance ({- C [a], -} D a) =3D> D [a] where d =3D c
instance C [Int] where c =3D [37]
main =3D print (d :: [Int])
\end{pseudocode}
What do you think `main' prints (assuming we have overlapping
instances, and
all that turned on)? Well, the instance for `D' at type `[a]' is
defined to
be `c' at the same type, and we've got an instance of `C' at `[Int]', so
the
answer is `[37]', right? (the generic `C [a]' instance shouldn't apply
because
the `C [Int]' instance is more specific).
Ghc-4.04 gives `[37]', while ghc-4.06 gives `[17]', so 4.06 is wrong.
That
was easy ;-) Let's just consult hugs for good measure. Wait - if I use
old
hugs (pre-September99), I get `[17]', and stranger yet, if I use hugs98,
it
doesn't even compile! What's going on!?
What hugs complains about is the `D [a]' instance decl.
\begin{pseudocode}
ERROR "mj.hs" (line 10): Cannot build superclass instance
*** Instance : D [a]
*** Context supplied : D a
*** Required superclass : C [a]
\end{pseudocode}
You might wonder what hugs is complaining about. It's saying that you
need to add `C [a]' to the context of the `D [a]' instance (as appears
in comments). But there's that `C [a]' instance decl one line above
that says that I can reduce the need for a `C [a]' instance to the
need for a `C a' instance, and in this case, I already have the
necessary `C a' instance (since we have `D a' explicitly in the
context, and `C' is a superclass of `D').
Unfortunately, the above reasoning indicates a premature commitment to
the
generic `C [a]' instance. I.e., it prematurely rules out the more
specific
instance `C [Int]'. This is the mistake that ghc-4.06 makes. The fix
is to
add the context that hugs suggests (uncomment the `C [a]'), effectively
deferring the decision about which instance to use.
So, what's the fix? I think hugs has it right. Here's why. Let's try
something else out with ghc-4.04. Let's add the following line:
d' :: D a =3D> [a]
d' =3D c
Everyone raise their hand who thinks that `d :: [Int]' should give a
different answer from `d' :: [Int]'. Well, in ghc-4.04, it does. The
`optimization' only applies to instance decls, not to regular
bindings, giving inconsistent behavior.
Old hugs had this same bug. Here's how we fixed it: like GHC, the
list of instances for a given class is ordered, so that more specific
instances come before more generic ones. For example, the instance
list for C might contain:
..., C Int, ..., C a, ... =20
When we go to look for a `C Int' instance we'll get that one first.
But what if we go looking for a `C b' (`b' is unconstrained)? We'll
pass the `C Int' instance, and keep going. But if `b' is
unconstrained, then we don't know yet if the more specific instance
will eventually apply. GHC keeps going, and matches on the generic `C
a'. The fix is to, at each step, check to see if there's a reverse
match, and if so, abort the search. This prevents hugs from
prematurely chosing a generic instance when a more specific one
exists.
--Jeff
BUT NOTE [Nov 2001]: we must actually *unify* not reverse-match in
this test. Suppose the instance envt had
..., forall a b. C a a b, ..., forall a b c. C a b c, ...
(still most specific first)
Now suppose we are looking for (C x y Int), where x and y are
unconstrained.
C x y Int doesn't match the template {a,b} C a a b
but neither does=20
C a a b match the template {x,y} C x y Int
But still x and y might subsequently be unified so they *do* match.
Simple story: unify, don't match.