Lexically scoped type variables
Simon Peyton-Jones
simonpj at microsoft.com
Tue Jan 17 09:14:11 EST 2006
Dear GHC users
As part of a revision of GHC to make type inference for GADTs simpler
and more uniform, I propose to change the way in which lexically-
scoped type variables work in GHC. (Indeed, I have done so, and will
commit it shortly.) This message explains the new system, highlighting
the differences.
I'm very interested to know whether you like it or hate it.
In the latter case, I'd also like to know whether you also
have programs that will be broken by the change.
Simon
Scoped type variables in GHC
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
January 2006
0) Terminology.
A *pattern binding* is of the form
pat = rhs
A *function binding* is of the form
f pat1 .. patn = rhs
A binding of the formm
var = rhs
is treated as a (degenerate) *function binding*.
A *declaration type signature* is a separate type signature for a
let-bound or where-bound variable:
f :: Int -> Int
A *pattern type signature* is a signature in a pattern:
\(x::a) -> x
f (x::a) = x
A *result type signature* is a signature on the result of a
function definition:
f :: forall a. [a] -> a
head (x:xs) :: a = x
The form
x :: a = rhs
is treated as a (degnerate) function binding with a result
type signature, not as a pattern binding.
1) The main invariants:
A) A lexically-scoped type variable always names a rigid
type variable (not a wobbly one, and not a non-type-variable
type). THIS IS A CHANGE. Previously, a scoped type variable
named an arbitrary *type*.
B) A type signature always describes a rigid type (since
its free (scoped) type variables name rigid type variables).
This is also a change, a consequence of (A).
C) Distinct lexically-scoped type variables name distinct
rigid type variables. This choice is open;
This means that you cannot say
\(x:: [a]) -> <expr>
(where 'a' is not yet in scope) to enforce that x is a list without
saying anything about 'a'. (Well, not unless the type of this lambda
is known from the "outside".)
1a) Possible extension. We might consider allowing
\(x :: [ _ ]) -> <expr>
where "_" is a wild card, to mean "x has type list of something",
without
naming the something.
2) Scoping
2(a) If a declaration type signature has an explicit forall, those type
variables are brought into scope in the right hand side of the
corresponding binding (plus, for function bindings, the patterns on
the LHS).
f :: forall a. a -> [a]
f (x::a) = [x :: a, x]
Both occurences of 'a' in the second line are bound by
the 'forall a' in the first line
A declaration type signature *without* an explicit top-level forall
is implicitly quantified over all the type variables that are
mentioned in the type but not already in scope. GHC's current
rule is that this implicit quantification does *not* bring into scope
any new scoped type variables.
f :: a -> a
f x = ...('a' is not in scope here)...
This gives compatibility with Haskell 98
2(b) A pattern type signature implicitly brings into scope any type
variables mentioned in the type that are not already into scope.
These are called *pattern-bound type variables*.
g :: a -> a -> [a]
g (x::a) (y::a) = [y :: a, x]
The pattern type signature (x::a) brings 'a' into scope.
The 'a' in the pattern (y::a) is bound, as is the occurrence on
the RHS.
A pattern type siganture is the only way you can bring existentials
into scope.
data T where
MkT :: forall a. a -> (a->Int) -> T
f x = case x of
MkT (x::a) f -> f (x::a)
2a) QUESTION
class C a where
op :: forall b. b->a->a
instance C (T p q) where
op = <rhs>
Clearly p,q are in scope in <rhs>, but is 'b'? Not at the moment.
Nor can you add a type signature for op in the instance decl.
You'd have to say this:
instance C (T p q) where
op = let op' :: forall b. ...
op' = <rhs>
in op'
3) A pattern-bound type variable is allowed only if the pattern's
expected type is rigid. Otherwise we don't know exactly *which*
skolem the scoped type variable should be bound to, and that means
we can't do GADT refinement. This is invariant (A), and it is a
change
from the current situation.
f (x::a) = x -- NO
g1 :: b -> b
g1 (x::b) = x -- YES, because the pattern type is rigid
g2 :: b -> b
g2 (x::c) = x -- YES, same reason
h :: forall b. b -> b
h (x::b) = x -- YES, but the inner b is bound
k :: forall b. b -> b
k (x::c) = x -- NO, it can't be both b and c
3a) You *can* give a different name to the same type variable in
different
disjoint scopes, just as you can (if you want) give diferent names
to
the same value parameter
f :: a -> Bool -> Maybe a
f (x::p) True = Just x :: Maybe p
f (y::q) False = Nothing :: Maybe q
3b) Scoped type variables respect alpha renaming. For example,
function g from 2(b) above could also be written:
g2 :: a -> a -> [a]
g2 (x::b) (y::b) = [y :: b, x]
where the scoped type variable is called 'b' instead of 'a'.
However, you cannot write
f :: a -> a -> [a]
f (x::b) (y::c) = [y :: b, x]
because then two scoped type variables ('b' and 'c') would be bound
to the same underlying type variable. (Invariant (C) above.)
4) Result type signatures obey the same rules as pattern types
signatures.
In particular, they can bind a type variable only if the result type
is rigid
f x :: a = x -- NO
g :: b -> b
g x :: b = x -- YES; binds b in rhs
5) A *pattern type signature* in a *pattern binding* cannot bind a
scoped type variable
(x::a, y) = ... -- Legal only if 'a' is already in scope
Reason: in type checking, the "expected type" of the LHS pattern is
always wobbly, so we can't bind a rigid type variable. (The
exception
would be for an existential type variable, but existentials are not
allowed in pattern bindings either.)
Even this is illegal
f :: forall a. a -> a
f x = let ((y::b)::a, z) = ...
in
Here it looks as if 'b' might get a rigid binding; but you can't bind
it to the same skolem as a.
6) Explicitly-forall'd type variables in the *declaration type
signature(s)*
for a *pattern binding* do not scope AT ALL.
x :: forall a. a->a -- NO; the forall a does
Just (x::a->a) = Just id -- not scope at all
y :: forall a. a->a
Just y = Just (id :: a->a) -- NO; same reason
THIS IS A CHANGE, but one I bet that very few people will notice.
Here's why:
strange :: forall b. (b->b,b->b)
strange = (id,id)
x1 :: forall a. a->a
y1 :: forall b. b->b
(x1,y1) = strange
This is legal Haskell 98 (modulo the forall). If both 'a' and 'b'
both scoped over the RHS, they'd get unified and so cannot stand
for distinct type variables. One could *imagine* allowing this:
x2 :: forall a. a->a
y2 :: forall a. a->a
(x2,y2) = strange
using the very same type variable 'a' in both signatures, so that
a single 'a' scopes over the RHS. That seems defensible, but odd,
because though there are two type signatures, they introduce just
*one* scoped type variable, a.
Implementation notes
~~~~~~~~~~~~~~~~~~~~
1) This means that dealing with pattern/result type signatures is
simple:
- if the signature binds one or more variables, and the
pattern type is rigid, *match* the signature against the
pattern type to bind the variables
- if the signature binds no type variables, *unify* the
pattern type against the (necessarily rigid) type signature
2) Skolem constants get introduced by
a) Declaration type signatures with explicit foralls
b) *Function* declaration type signatures on bindings where
there is no explicit forall
c) Existential pattern matches
d) SKOL rule in subsumption checking
A *declaration type signature* for a *pattern-bound* variable
does not introduce a skolem, and is never the basis for refinement.
Instead we use an ordinary meta type variable, and check after the
event that everything is still distinct. That is how the x4/y4
example type-checks.
More information about the Glasgow-haskell-users
mailing list