[GHC] #13959: substTyVar's definition is highly dubious
GHC
ghc-devs at haskell.org
Thu Jul 20 14:23:53 UTC 2017
#13959: substTyVar's definition is highly dubious
-------------------------------------+-------------------------------------
Reporter: RyanGlScott | Owner: (none)
Type: bug | Status: patch
Priority: normal | Milestone:
Component: Compiler | Version: 8.0.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s): Phab:D3732
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by goldfire):
After a long debate, Simon and I came up with the following claim:
'''(*) Claim''': If a tyvar is not in the domain of a substitution, then
no variables in its kind are in the domain of a substitution.
If (*) is true, then the change Phab:D3732 should be a no-op: precisely no
behavior should change. And we can check (*) on the fly, which we should.
Why should (*) be true? Here's some rough reasoning: a tyvar `a` whose
kind mentions `k` must be bound in `k`'s scope. If we are substituting `k`
(to `Type`, say), then we'll need to substitute both in `a`'s binding site
and `a`'s scope. When we subst in the binding site (with
`substTyVarBndr`), the subst is extended mapping `a:k` to `a:Type`; `a` is
thus in the domain of the subst when we get to substing in `a`'s scope. If
we've forgotten to call `substTyVarBndr`, then we're in deep trouble,
anyway. Thus, all tyvars whose kinds mention `k` will be added to the
subst by the time we get to their scope.
But thinking about (*) led us to ponder this strange scenario: `forall k.
forall (b :: k). forall k. forall (a :: k). Proxy [a,b]`. Is that well-
kinded? Surely not, because `a` and `b` have different kinds: `a`'s kind
is the more local `k` while `b`'s kind is the less local `k`. However,
Core Lint is going to have a hard time with this. When Lint looks at
`[a,b]`, it will ask for `a`'s kind and get `k`. It will ask for `b`'s
kind and get `k`. And the `k`s will be equal! Core Lint will accept. This
is horrible. This bug may have been lurking in GHC for ages.
The remedy: don't do this. Specifically, we introduce
'''(+) Rule''': When a tyvar `a` is brought into scope, all vars (type and
term) whose type mentions `a` fall out of scope.
Under (+), `b` is not in scope in `[a,b]` and we have a straightforward
scoping error. We can check this in Lint. We couldn't quite convince
ourselves that (+) is actually always respected during transformations,
etc., but by adding a rule in Lint, we can discover when it doesn't. An
alternative to (+) is
(-) Rule: It is illegal to shadow a variable. That is, if `a` is in scope,
we cannot bind `a`.
(-) implies (+) and is easy to check for. However, it may not be so easy
to uphold; Simon claims several places in GHC knowingly violate (-), and
avoiding this violation may hurt performance. I'm worried that any code
that violates (-) will sometimes fall afoul of (+) (but only in very
obscure scenarios). Simon claims that the violations of (-) but not (+)
involve closed types and therefore are safe w.r.t. (+). So we've agreed to
check for (+) for now and see how that goes.
As a sidenote: comment:7 and comment:9 concern term-level substitutions.
At the term level, we have a real interest in commoning up occurrences (so
that all occurrences of an Id `x` actually point to the same heap object)
because of `IdInfo`. In types, on the other hand, we have an interest in
avoiding forcing a `TCvSubst`'s `InScopeSet`, and so we don't use the same
"look up in in-scope set" approach. So, essentially, comment:7 and
comment:9 are red herrings.
== Bottom line ==
We have these tasks:
1. Check for (*) in `substTyVar`. This is instead of the kind substitution
discussed earlier in this ticket. Document (*).
2. Update Core Lint to check (+). Document (+) in a nice Note.
3. (Optional) If we blindly set `no_kind_change` to `False` in
`substTyVarBndrCallback`, then substitutions ''will'' common up type
variables by looking up type variables in the substitution. There's a
chance this will have a performance benefit. Try and investigate.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/13959#comment:10>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list