the MPTC Dilemma (please solve)

Fri Feb 17 12:55:34 EST 2006

I'm forwarding an email that Martin Sulzmann asked me to post on his
behalf.

------------------------------------------------------------
From: Martin Sulzmann <sulzmann at comp.nus.edu.sg>
Subject: MPTC/FD dilemma 

Here's my take on the MPTC/FD dilemma.

There has been lots of discussion about MPTC/FDs. They are very useful
no doubt. But they also show inconsistent behavior across different
Haskell platforms (e.g. see Claus Reinke's last email).

Claus Reinke's example is "interesting" because he mixes some advanced form
of FDs with overlapping instances. Something which has never been
studied before (besides a few email exchanges between Oleg and myself
on Haskell a few months ago). So, I propose to ignore overlapping
instances for the moment.

What follows is rather lengthy. The main points of this email are:

- There's a class of MPTC/FD programs which enjoy sound, complete
  and decidable type inference. See Result 1 below. I believe that
  hugs and ghc faithfully implement this class.
  Unfortunately, for advanced type class acrobats this class of
  programs is too restrictive.

- There's a more expressive class of MPTC/FD programs which enjoy
  sound and complete type inference if we can guarantee termination
  by for example imposing a dynamic check.  See Result 2 below. Again,
  I believe that  hugs and ghc faithfully implement this class if they
  additionally implement a dynamic termination check.
  Most type class acrobats should be happy with this class I believe.

- ATs (associated types) will pose the same challenges.
  That is, type inference relies on dynamic termination checks.

Here are the details.

Let's take a look at the combination of FDs and "well-behaved" instances.
By well-behaved instances I mean instances which are non-overlapping and
terminating. From now on I will assume that instances must be well-behaved.
The (maybe) surprising observation is that the combination
of FDs and well-behaved instances easily breaks completeness and
decidability of type inference. Well, all this is well-studied.
Check out
[February 2006] Understanding Functional Dependencies via Constraint
Handling Rules by Martin Sulzmann, Gregory J. Duck, Simon Peyton-Jones
and Peter J. Stuckey
which is available via my home-page.

Here's a short summary of the results in the paper:

Result 1:
To obtain sound (we always have that), complete and decidable type inference
we need to impose
    - the Basic Conditions (see Sect4)
      (we can replace the Basic Conditions by any conditions which guarantees
       that instances are well-behaved. I'm ignoring here
       super classes for simplicity)
    - Jones's FD restrictions and the Bound Variable Condition (see Sect4.1)

The trouble is that these restrictions are quite "severe". In fact,
there's not much hope to improve on these results. See the discussion
in Sect5.1-5.3.

Let's take a look at a simple example to understand the gist of the problem.

Ex: Consider

class F a b | a->b
instance F a b => F [a] [b] -- (F)

Assume some program text generates the constraint F [a] a.
Type inference will improve a to [b] (because of the 
functional dependency in connection with the instance).
Though, then we'll find the constraint F [[b]] [b]
which can be resolved by the instance to F [b] b.
But this is a renamed copy of the original constraint
hence, type inference will not terminate here.

If we translate the instance and improvement conditions of
the above program to CHRs the problem becomes obvious.

rule F a b, F a c  ==> b=c    -- the FD rule
rule F [a] [a]    <==> F a b  -- the instance rule
rule F [a] c       ==> c=[b]  -- the improvement rule

The rules should be read from left to right where
"==>" stands for propagation and "<==>" for simplification.

My point: The above improvement rule is (potentially) dangerous
cause we introduce a fresh variable b. And that's why type inference
may not terminate. Using the terminology from the paper.
The instance (F) violates one of Jones' FD restriction (the Coverage
Condition).

So, is all lost? Not quite. A second major result in the paper says
that if we can guarantee termination then we can guarantee completeness.
Of course, we need to find some appropriate replacement for 
Jones' FD restrictions.

Result2:
Assuming we can guarantee termination, then type inference
is complete if we can satisfy
   - the Bound Variable Condition,
   - the Weak Coverage Condition, 
   - the Consistency Condition, and
   - and FDs are full.
Effectively, the above says that type inference is sound,
complete but semi-decidable. That is, we're complete
if each each inference goal terminates.

Here's a short explanation of the conditions.
The Bound Variable Condition says that all variables which appear
in the instance head must appear in the instance body.
Weak Coverage says that any of the "unbound" variables (see b
in the above example) must be captured by a FD in the instance context.
This sounds complicated but is fairly simple. Please see the paper.
Consistency says that for any FD and any two instances the improvement
rules generated must be consistent. In fact, because we assume
that FDs are full and instances are well-behaved (i.e. instances are
non-overlapping) the Consistency Condition is always satisfied here. 
So, why did I mention Consistency then? Well, Theorem 2 in the paper 
from which Result2 is derived is stated in very general terms. 
We only assume that instances are terminating and confluent. Hence,
we could (eventually) extend Result2 to overlapping instances
and *then* Consistency becomes important (But PLEASE let's ignore
overlapping instances for the moment).
Okay, let's consider the last condition.
We say a FD is full
if all class parameters either appear in the domain or range of a FD.
For example, 
   class F a b c | a -> b is not full but
   class F a b c | a b -> c is full

Again, if we leave out any of the above conditions, we loose completeness.

Result2 covers in my opinion a rich class of FD programs.
In my experience, the fullness condition is not a problem.
(In some situations, we can transform a non-full into a full
program but I'd rather reject non-full FDs).
We could even drop the Bound Variable Condition and check
for this condition "locally", similar to the way we check
for termination now locally.

Somebody might ask, will ATs (associated types) behave any better?
Unfortunately not, they share the same problem as FDs. AT type inference
has the potential to become undecidable and incomplete.

Here's the above example re-casted in AT terms.

class F a where
  type T a
instance F a => F [a]
  type T [a] = [T a]

Assume some program text generates the constraint T [b] = b.
Naively, we could apply the above type definition which results
in [T [b]] = b etc 
(T [b] = [T b] and T [b] = b leads to [T b] = b which leads
to [T [T b]] = b)
Though, the current AT description rejects the constraint T [b] = b
right away. AT applies the occurs check rule. Clearly, b in fv(T [b]).

Important point to note: The occurs check is a "dynamic" check. 
We could impose a similar check on FDs. E.g. the type class
constraint F [a] a contains a cycle. Hence, we should immediately
reject this constraint. Of course, cycles may span across
several constraints/type equations. For example, consider
F [a] b, F [b] a  and T [a]=b, T [b]=a.

Probably, more could be said but I'll stop here.

Martin