Overlapping instances in existentials

Simon Peyton-Jones simonpj@microsoft.com
Thu, 19 Jun 2003 09:55:36 +0100


[I'm widening this to the Haskell list, because overlapping instances
are of general interest.]

| To determine (SubType y Value) which is just:
|               (SubType y (Either Double BaseType))
|=20
| it seems to me that GHC should (has to?) use
|    instance (SubType a b) =3D> SubType a (Either x b)
|=20
| I do not see how the other alternative is applicable:
|       instance SubType a (Either a x)

Well, the "target" (Subtype y (Either Double BaseType))
can match against *both* of these instances.

It can match
    instance (SubType a b) =3D> SubType a (Either x b)

by taking a=3Dy, x=3DDouble, b=3DBaseType

*If y were instantiated to Double* it could match the second instance
declaration too.  Now, you say, y is an existential type variable, so it
can't be instantiated to Double --- but that's tricky to pin down.

Consider:

	class Show a where show :: a -> String
	instance Show a =3D> Show [a] where...
	instance Show [Char] where...

	f :: Show a =3D> [a] -> String
	f xs =3D show xs

	main =3D print (f "hello")

When doing type inference for 'f' we don't know what 'a' will be
instantiated to.   Picking the first is arguably wrong if f is called at
type String (as it is here).    So GHC and Hugs both bleat.

Now your situation is an *existential* type variable, not a universal
one.  But still, if y *was* Double in a particular situation, then the
other instance decl "should" apply, perhaps.  Or maybe not.


GHC simply does not have a satisfactory solution to the
overlapping-instance question.  Nor, I think, does Hugs.  A good start
would be for someone to write a "guide to the problems/design-space of
overlapping instances".  (There is a ton of back email on the Haskell
list about this as raw material.)  People regularly ask for extensions
and/or changes to the way GHC deals with overlap, which I mostly resist,
not purely out of laziness -- it's just that I don't want to meddle
until we can move to something that is plainly better.=20

I enclose a long comment from GHC's source code that Jeff Lewis wrote on
this topic too.

Sorry not to be more helpful.  It's a bit of a swamp.

Simon=20


> |  >> test.hs:26:
 > |  >>     Could not unambiguously deduce (SubType x Value)
 > |  >> 	from the context (SubType x BaseType)
 > |  >>     The choice of (overlapping) instance declaration
 > |  >> 	depends on the instantiation of `x'
 > |  >>     Probable fix:
 > |  >> 	Add (SubType x Value)
 > |  >> 	to the existential context of a data constructor
 > |  >> 	Or add an instance declaration for (SubType x Value)
 > |  >>     arising from use of `inj' at test.hs:26
 > |  >>     In the definition of `test': inj x
 > |  >> Failed, modules loaded: none.
 > |  >> Prelude>
 > |
 > |
 > | -- -----------------------------------------------------------
 > | -- test.hs
 > | -- -----------------------------------------------------------
 > | module Test where
 > |
 > | import Control.Monad.Error
 > |
 > | class SubType a b where
 > |   inj :: a -> b
 > |   prj :: b -> Maybe a
 > |
 > | instance SubType a (Either a x) where
 > |   inj       =3D Left
 > |   prj       =3D either Just (const Nothing)
 > |
 > | instance (SubType a b) =3D> SubType a (Either x b) where
 > |     inj       =3D Right . inj
 > |     prj       =3D either (const Nothing) prj
 > |
 > | -- -----------------------------------------------------------
 > |
 > | type BaseType =3D Either Integer ( Either Bool () )
 > |
 > | type Value =3D (Either Double BaseType)
 > |
 > | data Foo =3D  forall x. (SubType x BaseType)  =3D> MkFoo x
 > |
 > | test :: Foo -> Value
 > | test (MkFoo x) =3D inj x


\subsection{Avoiding a problem with overlapping}

Consider this little program:

\begin{pseudocode}
     class C a        where c :: a
     class C a =3D> D a where d :: a

     instance C Int where c =3D 17
     instance D Int where d =3D 13

     instance C a =3D> C [a] where c =3D [c]
     instance ({- C [a], -} D a) =3D> D [a] where d =3D c

     instance C [Int] where c =3D [37]

     main =3D print (d :: [Int])
\end{pseudocode}

What do you think `main' prints  (assuming we have overlapping
instances, and
all that turned on)?  Well, the instance for `D' at type `[a]' is
defined to
be `c' at the same type, and we've got an instance of `C' at `[Int]', so
the
answer is `[37]', right? (the generic `C [a]' instance shouldn't apply
because
the `C [Int]' instance is more specific).

Ghc-4.04 gives `[37]', while ghc-4.06 gives `[17]', so 4.06 is wrong.
That
was easy ;-)  Let's just consult hugs for good measure.  Wait - if I use
old
hugs (pre-September99), I get `[17]', and stranger yet, if I use hugs98,
it
doesn't even compile!  What's going on!?

What hugs complains about is the `D [a]' instance decl.

\begin{pseudocode}
     ERROR "mj.hs" (line 10): Cannot build superclass instance
     *** Instance            : D [a]
     *** Context supplied    : D a
     *** Required superclass : C [a]
\end{pseudocode}

You might wonder what hugs is complaining about.  It's saying that you
need to add `C [a]' to the context of the `D [a]' instance (as appears
in comments).  But there's that `C [a]' instance decl one line above
that says that I can reduce the need for a `C [a]' instance to the
need for a `C a' instance, and in this case, I already have the
necessary `C a' instance (since we have `D a' explicitly in the
context, and `C' is a superclass of `D').

Unfortunately, the above reasoning indicates a premature commitment to
the
generic `C [a]' instance.  I.e., it prematurely rules out the more
specific
instance `C [Int]'.  This is the mistake that ghc-4.06 makes.  The fix
is to
add the context that hugs suggests (uncomment the `C [a]'), effectively
deferring the decision about which instance to use.

So, what's the fix?  I think hugs has it right.  Here's why.  Let's try
something else out with ghc-4.04.  Let's add the following line:

    d' :: D a =3D> [a]
    d' =3D c

Everyone raise their hand who thinks that `d :: [Int]' should give a
different answer from `d' :: [Int]'.  Well, in ghc-4.04, it does.  The
`optimization' only applies to instance decls, not to regular
bindings, giving inconsistent behavior.

Old hugs had this same bug.  Here's how we fixed it: like GHC, the
list of instances for a given class is ordered, so that more specific
instances come before more generic ones.  For example, the instance
list for C might contain:
    ..., C Int, ..., C a, ... =20
When we go to look for a `C Int' instance we'll get that one first.
But what if we go looking for a `C b' (`b' is unconstrained)?  We'll
pass the `C Int' instance, and keep going.  But if `b' is
unconstrained, then we don't know yet if the more specific instance
will eventually apply.  GHC keeps going, and matches on the generic `C
a'.  The fix is to, at each step, check to see if there's a reverse
match, and if so, abort the search.  This prevents hugs from
prematurely chosing a generic instance when a more specific one
exists.

--Jeff

BUT NOTE [Nov 2001]: we must actually *unify* not reverse-match in
this test.  Suppose the instance envt had
    ..., forall a b. C a a b, ..., forall a b c. C a b c, ...
(still most specific first)
Now suppose we are looking for (C x y Int), where x and y are
unconstrained.
	C x y Int  doesn't match the template {a,b} C a a b
but neither does=20
	C a a b  match the template {x,y} C x y Int
But still x and y might subsequently be unified so they *do* match.

Simple story: unify, don't match.