[Haskell] Lexically scoped type variables

Thu Nov 25 05:44:05 EST 2004

| This is a great example, thanks for posting it. However, I feel like
| the real problem in this example is the lexically-scoped type
| variables declared with your function f.

Paul's message gives me an excuse to consult the Haskell list about the
design of lexically scoped type variables.   I have a two particular
questions I'd like your advice about.  (This is nothing to do with
monomorphism restriction, or Lennart's original puzzle.)

I'm sending this message to the main Haskell list, to maximise
readership, but I suggest that

	you reply to haskell-cafe at haskell.org

Thereby to avoid spamming the main Haskell list, which is meant to be
low-bandwidth.  I believe I've set the reply-to field of this message to
achieve this goal.  [Perhaps others can do the same?!] 

Simon

---- Background ----

First, for background you might want to glance at "Lexically scoped type
variables", an unpublished paper available at
	http://research.microsoft.com/%7Esimonpj/papers/scoped-tyvars
I don't want to take time in this message to discuss why we want scoped
type variables; I'll just take it for granted here.

One design choice discussed in the paper, and implemented in GHC, is
that we can bring new lexically-scoped type variables into scope by
mentioning them in a pattern type signature:
	f (x::a) = ....
Here 'a' is a name for the type of x, and 'a' scopes over f's body, and
can be mentioned in type signatures in f's body.  Sometimes one wants to
name types in the result* of f, rather than it its *arguments*.  For
example
	head (xs::[a]) :: a = ...
Here, the ":: a" before the "=" gives the type of head's result.  If a
result type signature binds a type variable, that type variable scopes
over the right hand side of the definition.

In GHC, it's not necessary for the type that 'a' names to be a type
variable; it could be Int, for example:
	f (x::a) = x + (1::Int)
Already this embodies several design choices.  For example, we know that
'a' is a *binding* for 'a' because there is no 'a' already in scope,
otherwise it'd be an *occurrence* of 'a'.  (An alternative would be to
mark binding sites somehow.)  Also once could require that 'a' names a
type variable rather than an arbitrary type.  But for this message I
want to take GHC's design as given.

---- QUESTION 1 -----

In GHC at present, a separate type signature introduces no scoping.  For
example:
	f :: forall a. a -> a
	f x = (x::a)
would be rejected, because the type signature for 'f' does not make
anything scope over the right-hand side, so (x::a) means (x::forall
a.a), which is ill typed.

An obvious alternative, taken up by (among others) Mondrian and
Chamaeleon, is to say that

	the top-level quantified variables of a separate type signature
	scope over the right hand side of the binding

In the above example, then, the 'a' bound by the 'forall' would scope
over the right hand side of 'f'.  

I've argued against this in the past, because the construct (forall a.
<type>) patently scopes 'a' only over <type>.  But my resolve is
weakening.  It's just too convenient!  Furthermore, it binds a *rigid*
type variable (i.e. 'a' really must name a type variable) and that's
important for GADTs... which strengthens the case.

In short, I'm considering adopting the Mondrian/Chameleon rule for GHC.
There are two variations

  1a) In the example, 'a' is only brought into scope in the
	     right hand side if there's an explicit 'forall' written by
the programmer
  1b) It's brought into scope even if the forall is implicit; e.g.
		f :: a -> a
		f x = (x::a)

I'm inclined to (1a). Coments?

----- QUESTION 2 ------

Question 2 concerns degenerate bindings. First, consider pattern
bindings.  For example:
	let 
	    (x::[a], y) = <rhs>
	in <body>
The LHS of the let-binding is a pair, so it's called a pattern binding.
What should the scope of 'a' be?  GHC's current story is that "it is the
same as the scope of term variables bound in the same pattern", so just
as 'x' scopes over both <rhs> and <body>, so does 'a'.  And that in turn
means that we can't generalise over 'a', so 'x' is monomorphic.

An alternative would be to say that type variables bound in a pattern
binding scope only over the right-hand side.

---- The degenerate case: variable bindings ----

Now the nasty case

	let 
	    f :: a -> a = \x -> x
	in (f True, f 'c')

Is the binding for 'f' 
	(A) a (degenerate) function binding no arguments, but with a
result signature?
or	(B) a (degenerate) pattern binding with pattern (var::type)?

If we consider it to be (A) then arguably 'a' scopes only over the
right-hand side, and f gets the polymorphic type (forall a. a-> a).

If we consider it to be (B) then currently 'a' scopes over the right
hand side and the body of the let, so 'f' has to be monomorphic, with
type ty->ty for some ty; and no ty will do, so the program is rejected.

The trouble is that the binding is patently *both* (A) and (B).  This
gives no difficulty in Haskell 98, because there's no difference in
treatment; but GHC's current rules for scoped type variables do
distinguish the two.  

The alternatives I can see are

2a) Make an arbitrary choice of (A) or (B); GHC currently chooses (B)
2b) Decide that the scoped type variables arising from pattern
	bindings scope only over the right hand side, not over
	the body of the let
2b) Get rid of result type signatures altogether; instead,
      use choice (1a) or (1b), and use a separate type signature
instead.

Opinions?