the MPTC Dilemma (please solve)

Tue Feb 28 09:21:11 EST 2006

> It's very hard to follow you here. 

is it really? perhaps you're not trying hard enough:

> > - for each FD a given constraint is subject to, the range types 
> >     should be ignored during instance selection: the domain types
> >     of the FD constitute the inputs and are sufficient to compute 
> >     unique FD range types as outputs of the type relation. the
> >     computed range types may then be compared with the range 
> >     types given in the original constraint to determine whether the
> >     constraint can be fulfilled or not

is there a simpler way to put the idea?

whatever way you formalize (n+1)-parameter classes "C t1..tn t(n+1)"
with an FD "t1..tn -> t(n+1)", my suggestion is to formalize it in almost 
the same way, but treating parameter (n+1) differently. instead of 
checking checking for validity of instances wrt constraints of the form
"C r1..rn r(n+1)" (where ri are the concrete types arising), check that 
any constraint "C r1..rn x" (where x is a free, but instantiable variable) 
will yield zero or one instantiations for x. then, when you have a 
constraint "C r1..rn r(n+1)", use something like this instead: 
"exists x. (C r1..rn x && x==r(n+1)"

there, that's just as long, and not much clearer than saying that t1..
tn should be treated as inputs to and t(n+1) as an output of the 
inference process.

> Can you formalize your proposal below such that we can verify 
> some formal results (e.g. tractable inference, coherence etc).

I'd like to, but you're not giving me anything to work with. for 
instance, your "the TC to CHR translation" carefully omits all the
features currently under discussion here.

> Why not be "macho" and use a formal framework in which this 
> all can be expressed?

because I'm the humble programmer and you're the mighty 
formalist?-) I don't find this style of discussion constructive.

I'm interested in language design, so actually I do want to look
at both theory and practice, but I'm most interested in using a
language I can think in and thinking in a language I can use. for
the present context, that is Haskell. if you prefer to think in 
CHRs, that's fine, but why do you expect me to translate for 
you?

> > ...make type classes a syntactic
> > representation of their semantic domain.
> 
> What do you mean? Explaining type classes in terms of
> type classes? This can't work.

that is neither what I meant, not what I said. 

semantics maps from some existing syntactic domain (here 
Haskell type classes) to some semantic domain (eg, QTs or 
CHRs), in order to help understand the former in terms of 
the latter and its nice theory. semantics-based language design 
maps from some semantic domain to some to be constructed
syntactic domain, in order to help expressing the former in 
terms of the latter, preferably inheriting the elegance of the
formers nice theory.

QTs explain type classes, but QTs are more elegant, simple 
and expressive than current type class practice. most of Mark's
extentions to type classes were based on this difference, as
were most of his advanced programming techniques.

I try to read current Haskell type classes as an immature, to
be improved, representation of type predicate entailment, and 
where there are discrepancies, I prefer not to complicate the 
semantics, but to simplify the syntax.

so, when I talk about "=>" in haskell, I think about predicate 
entailment in QT. and usually, I talk about cases where I can't
map current practice to what predicate entailment would lead
me to expect.

> In your previous email, you mentioned the Theory of Qualified
> Types (QT) and CHRs as formal type class frameworks but you
> seem to be reluctant to use either of these frameworks. Why?

am I? give me a version of CHRs that accurately covers current
type class practice, and I'll try to express my modifications. but
you won't be able to do that until we have pinned down what
current practice is. which is what I'm trying to do here..

> In case, people don't know CHRs. Here's
> the type class to CHR translation.

there is no "the" translation. and this one omits all the interesting
features under discussion here (MPTCs with flexible instances,
FDs, overlap resolution), and worse, includes optimizations
based on those omissions.

> 1) For each  class C => TC t
> we generate a propagation CHR
> rule TC t ==> C
> Ex: class Eq a => Ord a translates to
>    rule Ord a ==> Eq a

note that this is a case of encoding current syntax-level practice
(of early, eager checking, then using) in the semantics. as I've 
argued in previous email discussing this, it seems quite possible 
to interpret implication as implication, if you are willing to accept 
more programs as valid. I think that would simplify the language.

> 2) For each instance C => TC t
> we generate a simplification CHR
> rule TC t <==> C
> Ex: instance Eq a => Eq [a] translates to
>    rule Eq [a] <==> Eq a

this should be: 

    for each instance C => TC t
    rule TC t <= C

you can combine those to

    for all known instances Ci => TC ti
    rule TC x <= .. || (x==ti && Ci) || ... 

but to get the reverse implication, you need a closed-world
assumption that you don't have for current Haskell (unless you
abandon separate compilation, perhaps, or restrict instance
syntax to something simple that doesn't have much to do with 
the issues we're talking about here).

> Roughly, the CHR semantics is as follows.
> Propagation rules add (propagate) the right-hand side
> if we find a match to the left-hand side.
> Simplification rules reduce (simplifify) the left-hand side
> by the right-hand side.

so you're only actually using simplification rules one way.

> This shows that CHRs are *very* close in terms of syntax and semantics
> of type classes. 

no, it shows that (a simple form of) type classes can be encoded naturally
(without complex rewriting) in a subset of CHR. the same can be said 
about QT. but CHR/=QT, so if we tried to devise Haskell variants that 
would allow us to represent all of QT, or all of CHR, those two variants
would probably look different, wouldn't they?

> BTW, 1) shows that the superclass arrow should indeed be the other way
> around and 2) shows that instances do NOT correspond to Prolog-style
> Horn clauses.

nonsense. 1) shows that current Haskell does not interpret the superclass
arrow as the implication it should be. turning the implication around would
still leave you with two interpretations of the same symbol in the same 
language (implication in classes being checked eagerly, implication in
instances being checked lazily). 2) shows nothing of the sort. encoding
a closed world assumption by collecting the rhss is inappropriate here,
but standard practice in logic programming interpretation.

> In fact, I don't really care if you're using CHRs, QT or whatever.
> As long as there's a (at least semi-) formal description. Otherwise,
> we can't start a discussion because we don't know what we're talking
> about.

being formal alone is neither necessary nor sufficient to know what we
are talking about. the above translation is useless for the present problem. 
and if I look at "Theory of Overloading", it claims that FDs "just extend 
the proof requirements for an instance". 

which seems to miss the point that we want to draw conclusions from 
FDs: they are not just something to check, but something to use, and 
the present problem is all about when and how to use them. but let's 
look at the rules there:

    for each 
        class SC scs => C cs | ds -> rs
    add rules
        rule C cs, C cs', ds==ds' => rs==rs'

    where scs: superclass constraint variables
               cs: class variables
               ds: variables in domain of FD
               rs: variables in range of FD

the paper then goes on to introduce per-instance rules for FDs, which to
me look like consequences of (a) the simplification CHR, which introduces
a fact "C ts" for each "instance C ts", combined with (b) the class FD-CHR,
which should match those facts with any constraints "C ts'" arising, to
add equality constraints implied by the FD. so I don't see why you need 
those extra per-instance rules? what am I missing?

and how would you like to reflect the best-fit overlap resolution in CHRs,
without simply restating it?

once we can agree on some form of TC2CHR translation that covers the
current practice in unconstrained type class programming, with FDs and 
with best-fit overlap resolution, we can start to think about the modifications.

cheers,
claus