TypeFamilies vs. etc - restrained instance overlap

Sun Jun 26 13:45:12 CEST 2011

Hi all

THE TOPIC: [from the original post]
>TypeFamilies vs. FunctionalDependencies & type-level
recursion
>David Mazieres dm-list-haskell-prime at scs.stanford.edu 
>Sun May 29 20:59:44 CEST 2011 
>...  GHC is on its way to accepting "a ~ b" as a constraint
that
>types a and b are equal.  If there were a corresponding "a
/~ b", ...

>Of course, I have no idea how the compiler could know such
constraints
>are sufficient to avoid overlapping types. ...

I think I do have an idea - see below. More important, how
does type inference proceed without the dreaded *search* and
potentially premature commit to an instance? I've also some
ideas on that. (But this post will already be long enough,
see separately for type inference.)

THE WHAT:
[SPJ responded to the OP]
>1. As things stand in GHC you can do some things with
functional dependencies that you can't do with type
families. The archetypical example is type equality.
>               .... (example that needed overlapping
instances, so rejected for type families)
>  But you can with fundeps
>	class Eq a b c | a b -> c
>	instance Eq k k True
>	instance Eq j k False

As other posters pointed out, this form isn't quite right,
and the form that works is more cumbersome and obscure. The
thread then veered away into how fundeps could be improved,
mostly resulting in the rules becoming even harder to
understand (at least for me).

SPJ mused:
>2.  In a hypothetical extension to type families, namely
*closed* type families, you could support overlap:
>	closed type family Eq a b :: * where
>	  type instance Eq k k = True
>	  type instance Eq j k = False
> The idea is that you give all the equations together;
matching is top-to-bottom; and the type inference only picks
an equation when all the earlier equations are guaranteed
not to match.  So there is no danger of making different
choices in different parts of the program (which is what
gives rise to unsoundness).

THE HOW:
I'd like to tackle this from another approach: what would it
need to extend type families with a restrained form of
instances? (Because, like SPJ "A big reason I like type
families better is simply that I can understand them.")

I prefer the following style:
	type family Eq a b :: *		-- not closed
	type instance Eq k k	= TTrue
	type instance Eq j k	| j /~ k	= TFalse	-- effectively this
makes the type family closed

(Variations of this idea have appeared and reappeared at
various times [at least since  "A Theory of Overloading"
Stuckey & Sulzmann, ICFP 2002, section 8.2, using
disequalities in CHR's]. I'm keenly aware that the
Haskell-prime process is only supposed to consider
already-implemented extensions, so I'm trying to draw in as
much pre-existing work as possible.)

(The pattern-guard style of syntax seems natural, but I'll
leave others to debate the lexicals ;-)

Other examples:
	instance (Monad m) => MonadState s (StateT s m) where ...

	instance (Monad (t m), MonadTrans t, MonadState s m) =>
            MonadState s (t m) | (t m) /~ (StateT s m)	where
..

	type instance IsFunction (a -> b)	= TTrue
	type instance IsFunction a	| a /~ (_ -> _)	= TFalse
                             	-- using _ as 'don't care'
type var

THE WHY:
Advantages:
A1: Each instance 'stands alone' (so they don't have to be
defined in the same place).
      This approach avoids one difficulty SPJ points out
about sequence of equations:
>3.  Closed type families are not implemented yet.  The only
tricky point I see would be how to express in System FC the
proof that "earlier equations don't match".   Moreover, to
support abstraction one might well want David's /~
predicate, so that you could say
>	f :: forall a b. (a /~ b) => ..blah..
> This f can be applied to any types a,b provided they don't
match.   I'm frankly unsure about all the consequences here.

A2: The instances don't overlap, although the instance heads
do. So ...

A3: Introducing a new instance (perhaps imported from an
ostensibly unrelated module)
       does not 'silently' change behaviour.
      (One of the long-standing complaints about overlaps.)
      That is: if the new instance overlaps, we get a
compile fail.

A4: And we can validate instances for overlap 'there and
then' as the compiler encounters each instance.
      That is, how Hugs behaves, whereas GHC currently
defers validating overlap
        until it encounters a use-site where it can't figure
out which instance to choose.
      GHC's behaviour was a source of confusion even in the
thread that followed
        SPJ's post, because instances appeared to compile
clean,
        but in fact couldn't be used.

THE HOW (part 1 - validating instances):
Instance validation uses mechanisms already in GHC for the
limited forms of overlap it supports in Type Families:
V0:	The terms each side of the /~ must be of the same Kind
           (same rule as for type equalities).
V1:	Upon encountering each instance, compare it one by one
            with all other instances in scope.
V2:	If the heads don't overlap, we're OK.
V3:	If they do overlap, form the mgu for the overlap.
V4:	Substitute the mgu into the RHS of the instances, if
they're confluent we're OK.
	(Up to this point is existing GHC behaviour.)
V5:	If they do overlap but aren't confluent, and neither has
a disequality restraint,
          reject.
V6:	If <none of the above>, substitute the mgu into the
disequality(s)
          for the instance(s).
	At least one must lead to a contradiction.

Example from GHC's documentation for type families:
>	type instance G (a, Int)	= [a]
>	type instance G (Char, b)	= [b]	-- ILLEGAL overlap on G
(Char, Int)
>	type instance G (Bool, b)	| b /~ Int	= [b]	
>         -- legal, because substituting the mgu for the
overlap yields
>         -- G (Bool, Int) | Int /~ Int -- at validation
step V6

THE HOW (part 2 - type inference): see the next episode.

... Ends

AntC