<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">I think this behavior is obscured by the way Haskell uses the term “type variable” to refer to three different things:<div class=""><br class=""></div><div class=""><ol class="MailOutline"><li class="">Variables that refer to types and are bound within the type namespace.</li><div class=""><br class=""></div><li class="">Skolem type constants introduced by a `forall`.<br class=""><br class=""></li><li class="">Solver/unification variables, aka “metavariables”, which aren’t really types at all but holes to be filled in with types.<br class=""></li></ol><div class=""><br class=""></div><div class="">These are all subtly different, but they are so frequently treated as if they were the same that I suspect listing them out is not actually enough for many people to understand the distinctions! So let me elaborate.</div><div class=""><br class=""></div><div class="">Suppose we view `forall` like a type-level lambda, so `forall a. a -> a` is analogous to this term:</div><div class=""><br class=""></div></div><blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;" class=""><div class=""><div class="">\a -> a :-> a</div></div></blockquote><div class=""><div class=""><br class=""></div><div class="">Clearly, there is a difference between the variable `a` in the above expression and the value we eventually substitute for it when we apply the lambda. When we instantiate the function by doing, say, `id @Int`, we substitute `Int` for `a` to get `Int -> Int`. So `a` itself isn’t a type, it’s a name that refers to a type—a type variable.</div><div class=""><br class=""></div><div class="">Under this interpretation, there’s nothing wrong with your example. In</div><div class=""><br class=""></div></div><blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;" class=""><div class=""><div class="">add (x :: a) (y :: b) = x + y</div></div></blockquote><div class=""><div class=""><br class=""></div><div class="">the variable `a` names `x`’s type, and the variable `b` names `y`’s type, and there’s nothing at all wrong with having two different names that refer to the same thing. It would be like writing this term:</div><div class=""><br class=""></div></div><blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;" class=""><div class=""><div class="">\t -> let { a = t; b = t } in a :-> b</div></div></blockquote><div class=""><div class=""><br class=""></div><div class="">So why does it seem confusing? Well, because if we write</div><div class=""><br class=""></div></div><blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;" class=""><div class=""><div class="">add :: Num t => a -> b -> t</div></div></blockquote><div class=""><br class=""></div>we get a type error, not two aliases for t. This is because GHC will implicitly quantify all the variables in the type signature to get<div class=""><br class=""></div><blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;" class=""><div class="">add :: forall t a b. Num t => a -> b -> t</div></blockquote><div class=""><br class=""></div>so `a` and `b` are now distinct, “lambda-bound” variables, not aliases for `t`.<div class=""><br class=""></div><div class="">This implicit quantification means we don’t often think about the type variables we write in type signatures as being like term variables, which are just names that reference other things. Rather, we think of type variables as being like <i class="">type constants</i>, and in fact GHC encourages this interpretation in its error reporting.</div><div class=""><br class=""></div><div class="">To explain, consider what happens when GHC checks `add` against the type we wrote above. Somehow, it needs to figure out the types of `x` and `y`, but it can’t just say</div><div class=""><br class=""></div><blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;" class=""><div class="">x :: a</div><div class="">y :: b</div></blockquote><div class=""><br class=""></div>because those variables were bound by the `forall` (they are “lambda-bound”), and if we move them into some other context where they appear free, they no longer have any meaning. GHC has to somehow pick actual <i class="">types</i> for x and y, not type variables. A naïve idea would be to invent two new <i class="">solver variables</i><span style="font-style: normal;" class="">, aka “holes to be filled in with inferred types”, yielding something like</span><div class=""><br class=""></div><blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;" class=""><div class="">a1, b1 fresh</div><div class="">x :: a1</div><div class="">y :: b1</div></blockquote><div class=""><br class=""></div>but this doesn’t work. Imagine we did that while typechecking this function:<div class=""><br class=""></div><blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;" class=""><div class="">bad :: a -> a</div><div class="">bad _ = True</div></blockquote><div class=""><br class=""></div>If we invented a new solver variable to use in place of `a` while typechecking `bad`, GHC would just say “oh, I can fill in the hole with Bool” and carry on, which would be quite bad! The whole point of parametricity is that the function must work with <i class="">any</i><span style="font-style: normal;" class=""> type.</span><div class=""><br class=""></div><div class="">So GHC actually does something else. Behind the scenes, it creates two new type definitions, distinct from all other types in the program:</div><div class=""><br class=""></div><blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;" class=""><div class="">data FakeTypeA</div><div class="">data FakeTypeB</div></blockquote><div class=""><br class=""></div>Now it typechecks the function using these fake types substituted for `a` and `b`:<div class=""><br class=""></div><blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;" class=""><div class="">x :: FakeTypeA</div><div class="">y :: FakeTypeB</div></blockquote><div class=""><br class=""></div>Now when GHC goes to typecheck `x + y`, it will raise a type error, since FakeTypeA and FakeTypeB are obviously not the same type. This enforces parametricity.<div class=""><br class=""></div><div class="">Of course, it would be very confusing to the user to see names like “FakeTypeA” in their type errors. So GHC pulls some sleight of hand, and it prints the new types it generates as just “a” and “b”, the same as the forall-bound variables it created them for. These don’t <i class="">look</i><span style="font-style: normal;" class=""> like type constants, because in source Haskell, all type constants start with UpperCase. But behind the scenes, they really are type constants, not type variables. GHC calls these fake type constants “rigid </span>variables” or “skolem variables,” but don’t be fooled: they’re not variables at all, they’re constants.</div><div class=""><br class=""></div><div class="">GHC tells this white lie to keep programmers from having to think about all this behind the scenes business. When GHC says “could not match ‘a’ with ‘b’,” it’s natural for programmers to think “ah, these are the ‘a’ and ‘b’ I wrote in my type signature,” and they learn to treat “type variables” and “rigid type constants” as one and the same. They don’t think about how this is a meaningfully different definition of “variable” than what they’re used to at the term level. This works out okay because the only way to bind type variables in source Haskell 98/2010 is universal quantification.</div><div class=""><br class=""></div><div class="">But now ScopedTypeVariables and ExistentialQuantification enter the picture, and there’s a problem: we suddenly need to be able to bind type variables in a place that isn’t `forall`. Now we have type variable binding sites that look just like type variable uses, but they aren’t implicitly-quantified the way they are in standalone type signatures, and confusion ensues.</div><div class=""><br class=""></div><div class="">This confusion actually goes much deeper than you might expect. Because Haskell historically hasn’t syntactically distinguished type variable uses from binding sites, we have new ambiguous situations like this:</div><div class=""><br class=""></div><blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;" class=""><div class="">\(x :: Maybe a) -> ...</div></blockquote><div class=""><br class=""></div>Is this a type <i class="">annotation</i> on x, which states that x should have type `Maybe a`, where `a` is a type variable already in scope? Or is it a <i class="">pattern binding</i> that matches against x’s type, which asserts that it has the shape `Maybe t` and binds a new variable `a` to the type `t`? And it gets even worse! Suppose we have this:<div class=""><br class=""></div><blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;" class=""><div class="">f (Foo @a x) = ...</div></blockquote><div class=""><br class=""></div>Is `@a` a visible type application of `Foo` to the in-scope type `a`, instantiating the (universally-quantified) `Foo` constructor at a particular type? Or is it a binder for an <span style="font-style: normal;" class="">existential</span> variable “stored inside” `Foo`? Both are reasonable (and useful) interpretations.<div class=""><br class=""></div><div class="">A GHC proposal that presents a design that sorts all these ambiguities out would be most welcome. To my knowledge, no such proposal currently exists, but I could be wrong—I know this issue has been discussed on some proposals (such as visible type applications in patterns), so maybe someone has come up with something I’m not aware of.</div><div class=""><br class=""></div><div class="">Alexis<br class=""><div class=""><div class=""><div class=""><div class=""><div class=""><div class=""><div class=""><div class=""><div class=""><div class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Aug 16, 2020, at 07:17, Tom Ellis <<a href="mailto:tom-lists-haskell-cafe-2017@jaguarpaw.co.uk" class="">tom-lists-haskell-cafe-2017@jaguarpaw.co.uk</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div class="">I have just noticed that with ScopedTypeVariables one can write the<br class="">following bizarre definition<br class=""><br class=""> -- Inferred signature:<br class=""> -- add :: Num a => a -> a -> a<br class=""> add (x :: a) (y :: b) = x + y<br class=""><br class="">The reason for this behaviour is that<br class=""><br class=""><blockquote type="cite" class="">in all patterns other than pattern bindings, a pattern type<br class="">signature may mention a type variable that is not in scope; in this<br class="">case, the signature brings that type variable into scope.<br class=""></blockquote><br class=""><a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#pattern-type-signatures" class="">https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#pattern-type-signatures</a><br class=""><br class="">and the justification is that<br class=""><br class=""><blockquote type="cite" class="">Bringing type variables into scope is particularly important for<br class="">existential data constructors<br class=""></blockquote><br class="">But this leads to a rather puzzling user experience. Was it really<br class="">not possible to design this extension in a way that would allow<br class="">bringing new type variables into scope locally but not allow them to<br class="">unify with other type variables from outer scopes?<br class=""><br class="">To be more concrete, what I would like to see is a design where `k` is<br class="">allowed because the type `a` (bound within a pattern) does not need to<br class="">unify with anything but `k2` is not allowed because `a` is forbidden<br class="">to unify with `b`. `k3` would also be forbidden (the type variables<br class="">called `a` are actually distinct) as would `k4` (the type variable for<br class="">the inferred signature would be fresh and distinct from `a`).<br class=""><br class="">I believe this design might lead to much less puzzling behaviour.<br class="">Would it be possible (within a new extension, of course) or have I<br class="">overlooked something?<br class=""><br class="">Tom<br class=""><br class=""><br class=""> data T = forall a. MkT [a]<br class=""><br class=""> k :: T -> T<br class=""> k (MkT [t::a]) =<br class=""> MkT t3<br class=""> where<br class=""> (t3::[a]) = [t,t,t]<br class=""><br class=""> data T2 a = MkT2 [a]<br class=""><br class=""> k2 :: T2 b -> T2 b<br class=""> k2 (MkT2 [t::a]) =<br class=""> MkT2 t3<br class=""> where<br class=""> (t3::[a]) = [t,t,t]<br class=""><br class=""> k3 :: T2 a -> T2 a<br class=""> k3 (MkT2 [t::a]) =<br class=""> MkT2 t3<br class=""> where<br class=""> (t3::[a]) = [t,t,t]<br class=""><br class=""> k4 (MkT2 [t::a]) =<br class=""> MkT2 t3<br class=""> where<br class=""> (t3::[a]) = [t,t,t]<br class=""></div></div></blockquote></div><br class=""></div></div></div></div></div></div></div></div></div></div></div></body></html>