[Haskell-cafe] Re: Aren't type system extensions fun? [Further
analysis]
Andrew Coppin
andrewcoppin at btinternet.com
Tue May 27 13:55:02 EDT 2008
Gleb Alexeyev wrote:
> foo :: (forall a . a -> a) -> (Bool, String)
> foo g = (g True, g "bzzt")
So, after an entire day of boggling my mind over this, I have brought it
down to one simple example:
(id 'J', id True) -- Works perfectly.
\f -> (f 'J', f True) -- Fails miserably.
Both expressions are obviously perfectly type-safe, and yet the type
checker stubbornly rejects the second example. Clearly this is a flaw in
the type checker.
So what is the type of that second expression? You would think that
(x -> x) -> (Char, Bool)
as a valid type for it. But alas, this is not correct. The CALLER is
allowed to choose ANY type for x - and most of possible choices wouldn't
be type-safe. So that's no good at all!
In a nutshell, the function being passed MAY accept any type - but we
want a function that MUST accept any type. This excusiatingly subtle
distinction appears to be the source of all the trouble.
Interestingly, if you throw in the undocumented "forall" keyword,
everything magically works:
(forall x. x -> x) -> (Char, Bool)
Obviously, the syntax is fairly untidy. And the program is now not valid
Haskell according to the written standard. And best of all, the compiler
seems unable to deduce this type automatically either.
At this point, I still haven't worked out exactly why this hack works.
Indeed, I've spent all day puzzling over this, to the point that my head
hurts! I have gone through several theories, all of which turned out to
be self-contradictory. So far, the only really plausible theory I have
been able to come up with is this:
- A function starts out with a polymorphic type, such as 'x -> x'.
- Each time the function is called, all the type variables have to be
locked down to real types such as 'Int' or 'Either (Maybe (IO Int))
String' or something.
- By the above, any function passed to a high-order function has to have
all its type variables locked down, yielding a completely monomorphic
function.
- In the exotic case above, we specifically WANT a polymorphic function,
hence the problem.
- The "forall" hack somehow convinces the type checker to not lock down
some of the type variables. In this way, the type variables relating to
a function can remain unlocked until we reach each of the call sites.
This allows the variables to be locked to different types at each site
[exactly as they would be if the function were top-level rather than an
argument].
This is a plausible hypothesis for what this language feature actually
does. [It is of course frustrating that I have to hypothesise rather
than read a definition.] However, this still leaves a large number of
questions unanswered...
- Why are top-level variables and function arguments treated differently
by the type system?
- Why do I have to manually tell the type checker something that should
be self-evident?
- Why do all type variables need to be locked down at each call site in
the first place?
- Why is this virtually never a problem in real Haskell programs?
- Is there ever a situation where the unhacked type system behaviour is
actually desirably?
- Why can't the type system automatically detect where polymorphism is
required?
- Does the existence of these anomolies indicate that Haskell's entire
type system is fundamentally flawed and needs to be radically altered -
or completely redesigned?
Unfortunately, the GHC manual doesn't have a lot to say on the matter.
Reading section 8.7.4 ("Arbitrary-rank polymorphism"), we find the
following:
- "The type 'x -> x' is equivilent to 'forall x. x -> x'." [Um... OK,
that's nice?]
- "GHC also allows you to do things like 'forall x. x -> (forall y. y ->
y)'." [Fascinating, but what does that MEAN?]
- "The forall makes explicit the universal quantification that is
implicitly added by Haskell." [Wuh??]
- "You can control this feature using the following language options..."
[Useful to know, but I still don't get what this feature IS.]
The examples don't help much either, because (for reasons I haven't
figured out yet) you apparently only need this feature in fairly obscure
cases.
The key problem seems to be that the GHC manual assumes that anybody
reading it will know what "universal quantification" actually means.
Unfortunately, neither I nor any other human being I'm aware of has the
slightest clue what that means. A quick search on that infallable
reference [sic] Wikipedia yields several articles full of dense logic
theory. Trying to learn brand new concepts from an encyclopedia is more
or less doomed from the start. As far as I can comprehend it, we have
existential quantification = "this is true for SOMETHING"
universal quantification = "this is true for EVERYTHING"
Quite how that is relevant to the current discussion eludes me.
I have complete confidence that whoever wrote the GHC manual knew
exactly what they meant. I am also fairly confident that this was the
same person who implemented and even designed this particular feature.
And that they probably have an advanced degree in type system theory. I,
however, do not. So if in future the GHC manual could explain this in
less opaque language, that would be most appreciated. :-}
For example, the above statements indicate that '(x -> x) -> (Char,
Bool)' is equivilent to 'forall x. (x -> x) -> (Char, Bool)', and we
already know that this is NOT the same as '(forall x. x -> x) -> (Char,
Bool)'. So clearly my theory above is incomplete, because it fails to
explain why this difference exists. My head hurts...
(Of course, I could just sit back and pretend that rank-N types don't
exist. But I'd prefer to comprehend them if possible...)
More information about the Haskell-Cafe
mailing list