[GHC] #9200: Milner-Mycroft failure at the kind level

Fri Jun 13 11:06:00 UTC 2014

#9200: Milner-Mycroft failure at the kind level
----------------------------------------------+----------------------------
        Reporter:  ekmett                     |            Owner:
            Type:  bug                        |           Status:  new
        Priority:  normal                     |        Milestone:
       Component:  Compiler (Type checker)    |          Version:  7.8.2
      Resolution:                             |         Keywords:
Operating System:  Unknown/Multiple           |     Architecture:
 Type of failure:  GHC rejects valid program  |  Unknown/Multiple
       Test Case:                             |       Difficulty:  Unknown
        Blocking:                             |       Blocked By:
                                              |  Related Tickets:
----------------------------------------------+----------------------------

Comment (by simonpj):

 That's very interesting Richard.  I had not followed this particular
 aspect of your implementation of closed type families.

 == The new proposed strategy ==

 I would explain the strategies rather differently to you.

 '''Baseline strategy''' (BASELINE), originally due to Mark Jones. Here is
 the strategy that we currently follow for ordinary, recursive term-level
 functions, and for recursive data types.  I'll describe it for data types,
 with this example:
 {{{
    data S f a b = MkS (T a f) (S f a b)
    data T (a::k) (f::k -> *) :: * where
       MkT :: f a -> S f a Maybe -> S f a Int -> T a f
 }}}
  1. Identify which type constructors have Complete User Type Signatures
 (CUSK).  Extend the environment with these, fixed, kinds:
 {{{
        T :: forall k. k -> (k->*) -> *
 }}}
  2. Perform strongly-connected component analysis on the non-CUSK decls,
 *ignoring* dependencies on a type constructor with a CUSK.  That gives us
 a single recursive component, containing S.

  3. For each s-c component in turn:
     * Bind the type constructor to a fresh meta-kind variable:
 {{{
         S :: kappa0
 }}}
     * Kind-check all the declarations of the component in this
 environment.  This will generate some unifications, so in the end we get
 {{{
         kappa0 ~ (kappa1 -> *) -> kappa1 -> kappa2 -> *
 }}}
      The `kappa1` arises from instantiating `T` at its call site in `S`
     * Generalise.  So we get
 {{{
         S :: forall k1 k2. (k1->*) -> k1 -> k2 -> *
 }}}
  4. Extend the environment with these generalised kind bindings, and kind-
 check the CUSK declarations.

 The Key Point is that we can kind-check S ''without looking at T's
 definition at all'', because we completely know T's kind.  That in turn
 means that we can exploit ''inferred'' polymorphism for S when kind-
 checking T.  As we do here: T uses S in two different ways `(S f a Maybe)`
 and `(S f a Int)`.

 '''Richard's strategy''' (RICHARD). The key idea is that ''all
 polymorphism is declared'', so nothing gets to be kind-polymorphic unless
 you say so.  But the payoff is that you can give partial kind signatures.
 I'm not going to call it !NonParametric becuase I think that's a terrible
 name.  Here's the strategy.

  1. Sort the declarations into strongly-connected components.  No special
 treatment for CUSKs.

  2. For each declaration, extend the environment with a kind binding that
 has a forall for each *explicit* kind variable, but meta-kind variables
 otherwise.  For example
 {{{
       data Foo (a :: k1 -> k1) b c
 }}}
    would get a kind binding
 {{{
       Foo :: forall k1. (k1->k1) -> kappa1 -> kappa2 -> *
 }}}
    Our earlier example would give
 {{{
       T :: forall k. k -> (k->*) -> *
       S :: kappa3 -> kappa4 -> kappa5 -> *
 }}}

  3. Kind-check the declartions in this environment.  At a call of `Foo`,
 say, we'd instantiate the `forall k1` with a fresh meta-kind variable, but
 would share `kappa1`, `kappa2` among all calls to `Foo`.

  4. Default any unconstrained meta kind variables to `*`

 That's it!   No generalisation step.  The *only* polymorphism is that
 declared by the user.

 So our earlier S/T example would be rejected because it relies on S being
 polymorphic in its third parameter. If you want the S/T example to work
 you could write
 {{{
    data S f (a::k1) (b::k2) = MkS (T a f) (S f a b)
    data T (a::k) f where
       MkT :: f a -> S f a Maybe -> S f a Int -> T a f
 }}}
 That's enough to ensure that S and T's kinds start with
 {{{
      S :: forall k1 k2. ...blah...
      T :: forall k. ...blah...
 }}}

 == Type signatures ==

 Another place that we currently (i.e. using (BASELINE)) do kind
 generalisation is in ''type signatures''. If you write
 {{{
 f :: m a -> m a
 f = ...
 }}}
 then the type signature is kind-generalised thus:
 {{{
 This user-written signature
   f :: m a -> m a
 means this (BASELINE)
   f :: forall k (a:k) (m:k->*). m a -> m a
 }}}
 And f's RHS had better ''be'' that polymorphic.

 Under (RICHARD) it would be consistent to say this:
 {{{
 This user-written signature
   f :: m a -> m a
 means this (RICHARD)
   f :: forall (a:*) (m:k->*). m a -> m a
 }}}
 If you want the kind-polymorphic one, you'd have to write thus
 {{{
 This user-written signature
   f :: forall k (a:k) (m:k->*). m a -> m a
 means this (RICHARD)
   f :: forall k (a:k) (m:k->*). m a -> m a
 }}}

 == Declarative typing rules ==

 Happily (RICHARD) has a nice declarative typing rule.

 Here is what the conventional declarative typing rule, ''in the absence of
 polymorphism'' for a single self-recursive function looks like:
 {{{
         G, f:t |- e:t
         G, f:t |- b:t'
       ---------------------------
         G |- letrec f = e in b : t'
 }}}
 Here the "t" is a monotype (no foralls) that the declarative typing rules
 clairvoyantly conjures up out of thin air.

 Once you add Hindley-Milner style polymorphism, the rule gets a bit more
 complicated
 {{{
         G, f:t |- e:t
         G, f:gen(G,t) |- b:t'
       ---------------------------
         G |- letrec f = e in b : t'
 }}}
 where 'gen' is generalising.

 The (RICHARD) rule might look like this:
 {{{
         t = forall vs. sig[t1..tn/_]
         G, f : t |- e : forall vs.t
         G, f : t |- b:t'
       ---------------------------
         G |- letrec f :: forall vs. sig; f = e in b : t'
 }}}
 Here I'm expressing the user-specified knowledge as a signature `forall
 vs.sig`, with '_' for bits you don't want to specify.
 {{{
        f :: forall a. _ -> a -> _
 }}}
 Then the rule intantiates each '_' with a clairvoyantly guessed monotype
 (which might mention the 'vs', or 'a' in this example), and off you go.

 == Reflection ==

 I think we could reasonably switch to (RICHARD) throughout.

 As Richard's comments in `TcHsType` point out, we don't want maximal
 polymorphism.  His example is:
 {{{
     type family F a where
       F Int = Bool
       F Bool = Char
 }}}
 We could generate
 {{{
    F :: forall k1 k2. k1 -> k2
 }}}
 so that `(F Maybe)` is well-kinded, but stuck. But that's probably not
 what we want. It would be better to get `F :: * -> *`

 But what about
 {{{
     type family G a f b where
       G Int  f b = f b
       G Bool f b = Char -> f b
 }}}
 You could just about argue that the programmer intends
 {{{
    F :: forall k. * -> (k->*) -> k -> *
 }}}
 It's quite similar to this:
 {{{
   data PT f a = MkPT (f a)
 }}}
 which today, using (BASELINE), we infer to have kind
 {{{
   PT :: forall k. (k->*) -> k -> *
 }}}
 But I'd be perfectly happy if PT got a ''monomorphic'' inferred kind,
 which is what (RICHARD) would do:
 {{{
   PT :: (*->*) -> * -> *
 }}}
 If you want the poly-kinded PT, use a signature:
 {{{
   -- Any of these would do
   data PT f             (a :: k) = MkPT (f a)
   data PT (f :: k -> *) a        = MkPT (f a)
   data PT (f :: k -> *) (a :: k) = MkPT (f a)
 }}}
 One oddity is that we'd do (BASELINE) for terms and (RICHARD) for types.
 But perhaps that's ok.  They are different.
  * Terms ought to be as polymorphic as possible but arguably not types.
 Examples above.  Also, since kind polymorphism is still in its infancy,
 maybe it's no bad thing if all kind polymorphism is explicitly signalled
 every time a kind-polymorphic binder is introduced.

  * Terms have well-established separate type signatures, but we don't have
 a syntax for separate kind signatures of types and classes.

 If we moved from (BASELINE) to (RICHARD), some programs that work now
 would fail:
  * the original S/T example above
  * a data type like `PT` where the user did actually want the kind-
 polymorphic version.

 But that might be a price worth paying for the simplicity, uniformity, and
 predictability you'd get in exchange.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/9200#comment:3>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler