[Haskell-cafe] Fundeps and overlapping instances

Wed Jul 7 17:14:31 EDT 2010

Oleg points out, and Martin also mentions, that functional dependencies appear to interact OK with overlapping instances, but type families do not. I this impression is mistaken, and I'll try to explain why in this message, in the hope of exposing any flaws in my reasoning.

We can't permit overlap for type families because it is unsound to do so (ie you can break "well typed programs don't go wrong"). But if it's unsound for type families, it would not be surprising if it was unsound for fundeps too.  (I don't think anyone has done a soundness proof for fundeps + local constraints + overlapping instances, have they?)  And indeed I think it is.

So the short summary of this message is: if it works for fundeps it works for type families, and vice versa.  (NB this equivalence is not true about GHC's current implementation, however.   GHC doesn't support the combination of fundeps and local constraints at all.)

Such an equivalence doesn't argue against fundeps; I'm only suggesting the that the two really are very closely equivalent.   I much prefer type families from a programming-style point of view, but that's a subjective opinion.

Simon

Imagine a system "FDL" that has functional dependencies and local type constraints.  The big deal about this is that you get to exploit type equalities in *given* constraints.  Consider Oleg's example, cut down a bit:

class C a b | a -> b
instance C Int Bool
newtype N2 a = N2 (forall b. C a b => b)

t2 :: N2 Int
t2 = N2 True

We end up type-checking (True :: forall b. C Int b => b).   From the functional dependency we know that (b~Bool), so the function should typecheck.  GHC rejects this program; FDL would not.

But making use of these extra equalities in "given" constraints is quite tricky.  To see why look first at Example 1:

module X where
   class C a b | a -> b

   data T a where
     MkT :: C a b => b -> T a

module M1 where
  import X
  instance C Int Char where ...
  f :: Char -> T Int
  f c = MkT c

module M2 where
  import X
  instance C Int Bool
  g :: T Int -> Bool
  g (MkT x) = x

module Bad where
  import M1
  import M2
  bad :: Char -> Bool
  bad = g . f

This program is unsound: it lets you cast an Int to a Bool; result is a seg-fault.

You may say that the problem is the inconsistent functional dependencies in M1 and M2.  But GHC won't spot that.  For type families, to avoid this we "eagerly" check for conflicts in type-family instances.  In this case the conflict would be reported when compiling module Bad, because that is the first time when both instances are visible together.

So any FDL system should also make this eager check for conflicts.

What about overlap?  Here's Example 2:

{-# LANGUAGE IncoherentInstances #-}
module Bad where
  import X
  -- Overlapping instances
  instance C Int Bool         -- Instance 1
  instance C a [a]               -- Instance 2

  f :: Char -> T Int
  f c = MkT c                       -- Uses Instance 1

  g :: T a -> a
  g (MkT x) = x                    -- Uses Instance 2

  bad :: Char -> Int
  bad = g . f

Again, a seg fault if it typechecks.  But will it?  When typechecking 'g', we get a constraint (C a ?), where 'a' is a skolem constant.  Without IncoherentInstances GHC would reject the program on the grounds that it does not know what instance to choose.  But *with* IncoherentInstances it would probably go through, which is unsound.  So IncoherentInstances has moved from causing varying dynamic behaviour to causing seg faults.

Very well, so FDL must get rid of IncoherentInstances altogether, at least for classes that have functional dependencies (or that have superclasses that do).

But at the moment GHC makes an exception for *existentials*.  Consider Example 3:

  class C a b | a -> b

  -- Overlapping instances
  instance C Int Bool         -- Instance 1
  instance C a [a]               -- Instance 2

  data T where
    MkT :: C a b => a -> b -> T

  f :: Bool -> T
  f x = MkT (3::Int) x          -- Uses Instance 1

  g :: T -> T
  g (MkT n x) = MkT n (reverse x)   -- Uses Instance 2

  bad :: Bool -> T
  bad = g . f

In the pattern match for MkT in g we have the constraint (C a b), where 'a' is existentially bound.   So under GHC's current rules it'll choose the (C a [a]) instance, and conclude that (b ~ [a]).  So it's ok to reverse x.  But it isn't; see function bad!

So to avoid unsoundness we must not choose a particular instance from an overlapping set unless we know, absolutely positively, that the other cases cannot match.

(GHC's exception for existentials was introduced in response to user demand. Usually, overlapping instances are somehow semantically coherent, and with an existential we are *never* going to learn more about the instantiating type, so choosing the best available seems like a good thing to do.)

But even nuking IncoherentInstances altogether is not enough.  Consider this variant of Example 3, call it Example 4:
 module M where
  class C a b | a -> b

  instance C a [a]               -- Instance 2

  data T where
    MkT :: C a b => a -> b -> T

  g :: T -> T
  g (MkT n x) = MkT n (reverse x)   -- Uses Instance 2

module Bad where
  import M
  instance C Int Bool         -- Instance 1

  f :: Bool -> T
  f x = MkT (3::Int) x          -- Uses Instance 1

  bad :: Bool -> T
  bad = g . f

This is nasty.  In module M we have only one instance declaration for C, so there's no overlap, and function g typechecks fine.

Now module Bad adds an overlapping instance declaration and 'f' uses it; moreover, it's clear that the new instance declaration is the right one to use.

It's not clear how to fix this.  The only way I can think of is to insist that all the instances appear in a single module, so that you cannot add more "later". But that loses part of the point of overlap in the first place, namely to provide a generic instance and then later override it.

So FDL must

*        embody eager checking for inconsistent instances, across modules

*        never resolve overlap until only a unique instance can possibly match

*        put all overlapping instances in a single module

Type families already implement the first of these.    I believe that if we added the second and third, then overlap of type families would be fine.  (I may live to eat my words here.)

I'd be interested to know what Chameleon does on these examples.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/haskell-cafe/attachments/20100707/a94e17b0/attachment-0001.html