[Haskell] Lexically scoped type variables: new proposal

Thu Oct 26 20:54:12 EDT 2006

Hi, everyone. Long time no see. I have a specific proposal for scoped type
variables, which is totally different from plans A and B, and which I'd like
people to pick to pieces for me. (Especially you, Simon.)

Type variable scoping in System F is simple: the type bindings are explicit
and follow the same rules as the value bindings. Type variable scoping in
Haskell is complicated only because the actual binding sites of the
underlying System F type variables aren't visible. In particular, GHC's type
signatures are never located at type variable binding sites; at best they're
nearby. I'm convinced that the only way that type variable scoping will ever
make sense is if scoped type variable bindings are written at their actual
System F binding sites. That's what this proposal is about.

We introduce "anonymous type aliases", with the syntax

     decl  ->  ... | 'type' tycon

or perhaps more conveniently

     decl  ->  ... | 'type' tycon1 ... tyconN

These are like ordinary type aliases, except that instead of referring to a
type specified by the programmer, they refer to a type to be inferred by the
compiler. They can appear in local bindings; for example,

     let { type N ; x :: [N] ; x = "hello" } in ...

results in N being an alias for (i.e. bound to) Char in the body of the let,
just as x is bound to "hello".

And that's all. Well, that's not all, but I think the system above, as an
extension of Haskell 98, has the power of traditional scoped type variables.

Note that the problem of distinguishing between local type variables and
nonlocal (scoped) type variables goes away in this proposal. The rule is the
same as Haskell 98's: lowercase identifiers are local to a type expression,
and uppercase identifiers scope like value bindings. Explicit foralls at the
top level of a type are never necessary.

Also note that in this proposal, the binding site of the type variable is
decoupled from the specification of the type; the type signature that fixes
N (by forcing unification with a rigid type) can be located anywhere in N's
lexical scope.

But that's not all! (I wish it were.) The problem with putting the anonymous
type aliases in let/where decls is that System F type variables aren't bound
there. (They could be, but I don't think they ever are in GHC.) Where they
are bound is in lambda expressions, in function bindings (when the function
is called), and in pattern matching on existential dcons. The let/where
syntax does suffice for those cases, but it's awkward -- you end up having
to write something like

     f x' y = let type N
                  x :: [N]
                  x = x'
              in ...

It should be possible to bind type aliases everywhere that System F type
variables can be bound, for which I propose the following syntax:

     f (type N) (x::[N]) y = ...

and analogously

     \(type N) (x::[N]) y -> ...
     case e of D (type N) (x::[N]) y -> ...

The parentheses around (type N) are mandatory. This syntax is sensible
inasmuch as functions really do take type arguments and dcons really do have
type fields, and we're simply writing them explicitly in the place where
they've always been. This syntax also harmonizes fairly well with the
existing type alias syntax, which can kinda sorta be seen as a pattern
binding in this picture. The type variables thus introduced have the same
scope as pattern variables, except that they also scope over the pattern
type signatures in subsequent arguments or dcon fields.

And that's all, for old-style scoped type variables. I realize that
capitalized type variables might take some getting used to, but I really
feel like this approach, or something like it, is much easier to understand
and a significantly better candidate for standardization than what we've got
now. It's always clear which variables are scoped and which aren't, and the
scoping rules are exactly what you'd naively expect them to be. There are no
difficult choices to make in the design (except for syntax; beware of
Wadler's Law).

Things that worry me:

   * Do constraint (dictionary) parameters threaten this proposal? They
     aren't explicit either, and so can't be given type signatures for
     the purpose of fixing a scoped tyvar. Fortunately in this proposal
     you needn't identify the meaning of a tyvar at its binding spot,
     and my intuition is that, if the scoped tyvar is useful at all,
     there will always be somewhere in its scope where its meaning can
     be disambiguated by a type signature. But I'm not sure.

   * Something else that's slipped my mind, but I want to post this
     before going to bed.

   * What about the GHC 6.6 changes? See below.

I don't know how to make this proposal compatible with GHC 6.6 because I
don't understand yet how type inference works with GADTs and
impredicativity. My best guess is that the correct change is to require that
an anonymous type variable only alias a rigid System F type variable
[introduced in the same scope], where I'm not sure whether the part in
brackets is necessary. This change has the same disadvantages in my proposal
as in the existing GHC implementation: you can no longer write things like
"type N ; f (x::N) (y::N) = ..." to enforce monomorphic type equality, or
"type N ; x :: [N]" to get partial type signatures. I have an idea for how
we might get the best of both worlds: anonymous types bound in let/while
declarations could be flexible (subject to unification), while anonymous
types bound in argument or field lists could refer to rigid type variables
actually bound there. I'm not sure how much sense this idea makes because,
as I said, I don't grok the reason for the GHC 6.6 changes yet.

If people are confused by any aspect of the proposal, give me GHC 6.4 or 6.6
examples and I will either translate them into my proposed syntax, or fail
to do so and thereby learn something interesting.

And that's really all. Comments solicited.

-- Ben