[GHC] #8995: When generalising, use levels rather than global tyvars

Mon Apr 14 12:31:30 UTC 2014

#8995: When generalising, use levels rather than global tyvars
------------------------------------+-------------------------------------
       Reporter:  simonpj           |             Owner:
           Type:  bug               |            Status:  new
       Priority:  normal            |         Milestone:
      Component:  Compiler          |           Version:  7.8.2
       Keywords:                    |  Operating System:  Unknown/Multiple
   Architecture:  Unknown/Multiple  |   Type of failure:  None/Unknown
     Difficulty:  Unknown           |         Test Case:
     Blocked By:                    |          Blocking:
Related Tickets:                    |
------------------------------------+-------------------------------------
 A long time ago Didier Remy described how to use 'ranks' or 'levels' to
 make the test "is this type variable free in the environment" when
 generalising a type.  Oleg gives a
 [http://okmij.org/ftp/ML/generalization.html great explanation here].

 GHC still uses the old "find the free type variables in the environment"
 method.  But in fact GHC ''does'' maintain levels, just as Didier
 explains.  The level is called `Untouchables` and is used to make sure we
 don't infer non-principal types for things involving GADTs.  The same
 level stuff is used to prevent skolem-escape.  Details with `Note
 [Untouchable type variables]` and `Note [Skolem escape prevention]` in
 `TcType`.

 So it ought to be straightforward to use the same levels for
 generalisation.  I had a go, and came across the following tricky points:

  * We need to bump the level (`pushUntouchables`) when starting to
 typecheck a let RHS.  Currently we don't.  That will be a bit less
 efficient, because we'll end up deferring unification constraints that
 currently get solved immediately.

  * However the current set up is actually wrong.  In particular, we
 generate a constraint where the implications do not have monotonically
 increasing level numbers (invariant `(ImplicInv)` in `TcType`).  Try this
 program with `-ddump-tc-trace`.
 {{{
 {-# LANGUAGE GADTs #-}
 module Bar where

 data T where
   MkT :: a -> (a->Int) -> T

 f x z = case x of
            MkT _ _ -> let g y = (y, [z,Nothing])
                       in (g True, g 'c')
 }}}
   I don't know how to make this bug cause an actual problem, but it's very
 unsettling.  Bumping the level before doing the RHS would solve this.

  * Once we have bumped the level, we then have to do some more type-
 variable promotion, especially of type variables free in the finally-
 chosen type of the binder.  This came out rather awkwardly in the code.

  * Because we now get more deferred equality constraints, there's a real
 risk that we'll end up quantifying over them.  If we get, say `(gbl ~
 Maybe lcl)`, where `gbl` is a type var free in the envt, we '''do not'''
 want to quantify over it!  (It would be sound, but would lead to horrible
 types.)  A long time ago Didier Remy described how to use 'ranks' or
 'levels' to make the test "is this type variable free in the environment"
 when generalising a type.  Oleg gives a
 [http://okmij.org/ftp/ML/generalization.html great explanation here].

  GHC still uses the old "find the free type variables in the environment"
 method.  But in fact GHC ''does'' maintain levels, just as Didier
 explains.  The level is called `Untouchables` and is used to make sure we
 don't infer non-principal types for things involving GADTs.  The same
 level stuff is used to prevent skolem-escape.  Details with `Note
 [Untouchable type variables]` and `Note [Skolem escape prevention]` in
 `TcType`.

  So it ought to be straightforward to use the same levels for
 generalisation.  I had a go, and came across the following tricky points:

  * We need to bump the level (`pushUntouchables`) when starting to
 typecheck a let RHS.  Currently we don't.  That will be a bit less
 efficient, because we'll end up deferring unification constraints that
 currently get solved immediately.

  * However the current set up is actually wrong.  In particular, we
 generate a constraint where the implications do not have monotonically
 increasing level numbers (invariant `(ImplicInv)` in `TcType`).  Try this
 program with `-ddump-tc-trace`.
 {{{
 {-# LANGUAGE ExistentialQuantification #-}
 module Bar where

 data T = forall a. MkT a

 f x z = case x of
            MkT y -> let g y = [y,Nothing]
                     in g True
 }}}
   Look for the `reportUnsolved` trace and you'll nested implications, both
 with level 1.  I don't know how to make this bug cause an actual problem,
 but it's very unsettling.  Bumping the level before doing the RHS would
 solve this.

  * Once we have bumped the level, we then have to do some more type-
 variable promotion, especially of type variables free in the finally-
 chosen type of the binder.  This came out rather awkwardly in the code.

  * Because we now get more deferred equality constraints, there's a real
 risk that we'll end up quantifying over them.  If we get, say `(gbl ~
 Maybe lcl)`, where `gbl` is a type var free in the envt, we '''do not'''
 want to quantify over it!  (It would be sound, but would lead to horrible
 types.)  Rather we want to promote `lcl`, and not quantify.

   Currently we don't get many deferred equality constraints, becuase they
 all get done "on the fly". But if we do get one, we do quantify over it,
 which yields a rather stupid type.  Try this:
 {{{
 {-# LANGUAGE ExistentialQuantification #-}
 module Bar where

 data T = forall a. MkT a

 f x z = case x of
            MkT _ -> let g y = [z,Nothing]
                     in (g True, g 'c')
 }}}
   We get this `g :: forall t a2. Maybe a ~ Maybe a2 => t -> [Maybe a2]`,
 where `x::a` is in the envt.  Not very clever!  This only shows up with
 existentials and

   In general I think we want to find all the type variables reachable by
 equality constraints from the environment, and not quantify over them.

   * We also use the type environment `getGlobalTyVars` in ''kind''
 generalisation, and kind inference currently does not use the untoucahble
 story.  Adapting it to use levels might be tricky because we currently
 assume we solve all kind equality constraints on-the-fly, and don't gather
 any deferreed constraints.  Maybe it's easy, but needs thought.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8995>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler