[GHC] #13959: substTyVar's definition is highly dubious

GHC ghc-devs at haskell.org
Thu Jul 20 14:23:53 UTC 2017


#13959: substTyVar's definition is highly dubious
-------------------------------------+-------------------------------------
        Reporter:  RyanGlScott       |                Owner:  (none)
            Type:  bug               |               Status:  patch
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  8.0.1
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):  Phab:D3732
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by goldfire):

 After a long debate, Simon and I came up with the following claim:

 '''(*) Claim''': If a tyvar is not in the domain of a substitution, then
 no variables in its kind are in the domain of a substitution.

 If (*) is true, then the change Phab:D3732 should be a no-op: precisely no
 behavior should change. And we can check (*) on the fly, which we should.

 Why should (*) be true? Here's some rough reasoning: a tyvar `a` whose
 kind mentions `k` must be bound in `k`'s scope. If we are substituting `k`
 (to `Type`, say), then we'll need to substitute both in `a`'s binding site
 and `a`'s scope. When we subst in the binding site (with
 `substTyVarBndr`), the subst is extended mapping `a:k` to `a:Type`; `a` is
 thus in the domain of the subst when we get to substing in `a`'s scope. If
 we've forgotten to call `substTyVarBndr`, then we're in deep trouble,
 anyway. Thus, all tyvars whose kinds mention `k` will be added to the
 subst by the time we get to their scope.

 But thinking about (*) led us to ponder this strange scenario: `forall k.
 forall (b :: k). forall k. forall (a :: k). Proxy [a,b]`. Is that well-
 kinded? Surely not, because `a` and `b` have different kinds: `a`'s kind
 is the more local `k` while `b`'s kind is the less local `k`. However,
 Core Lint is going to have a hard time with this. When Lint looks at
 `[a,b]`, it will ask for `a`'s kind and get `k`. It will ask for `b`'s
 kind and get `k`. And the `k`s will be equal! Core Lint will accept. This
 is horrible. This bug may have been lurking in GHC for ages.

 The remedy: don't do this. Specifically, we introduce

 '''(+) Rule''': When a tyvar `a` is brought into scope, all vars (type and
 term) whose type mentions `a` fall out of scope.

 Under (+), `b` is not in scope in `[a,b]` and we have a straightforward
 scoping error. We can check this in Lint. We couldn't quite convince
 ourselves that (+) is actually always respected during transformations,
 etc., but by adding a rule in Lint, we can discover when it doesn't. An
 alternative to (+) is

 (-) Rule: It is illegal to shadow a variable. That is, if `a` is in scope,
 we cannot bind `a`.

 (-) implies (+) and is easy to check for. However, it may not be so easy
 to uphold; Simon claims several places in GHC knowingly violate (-), and
 avoiding this violation may hurt performance. I'm worried that any code
 that violates (-) will sometimes fall afoul of (+) (but only in very
 obscure scenarios). Simon claims that the violations of (-) but not (+)
 involve closed types and therefore are safe w.r.t. (+). So we've agreed to
 check for (+) for now and see how that goes.

 As a sidenote: comment:7 and comment:9 concern term-level substitutions.
 At the term level, we have a real interest in commoning up occurrences (so
 that all occurrences of an Id `x` actually point to the same heap object)
 because of `IdInfo`. In types, on the other hand, we have an interest in
 avoiding forcing a `TCvSubst`'s `InScopeSet`, and so we don't use the same
 "look up in in-scope set" approach. So, essentially, comment:7 and
 comment:9 are red herrings.

 == Bottom line ==

 We have these tasks:

 1. Check for (*) in `substTyVar`. This is instead of the kind substitution
 discussed earlier in this ticket. Document (*).

 2. Update Core Lint to check (+). Document (+) in a nice Note.

 3. (Optional) If we blindly set `no_kind_change` to `False` in
 `substTyVarBndrCallback`, then substitutions ''will'' common up type
 variables by looking up type variables in the substitution. There's a
 chance this will have a performance benefit. Try and investigate.

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/13959#comment:10>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list