[GHC] #6018: Injective type families

Tue Oct 14 02:07:26 UTC 2014

#6018: Injective type families
-------------------------------------+-------------------------------------
              Reporter:  lunaris     |            Owner:  jstolarek
                  Type:  feature     |           Status:  new
  request                            |        Milestone:  7.10.1
              Priority:  normal      |          Version:  7.4.1
             Component:  Compiler    |         Keywords:  TypeFamilies,
            Resolution:              |  Injective
      Operating System:              |     Architecture:  Unknown/Multiple
  Unknown/Multiple                   |       Difficulty:  Unknown
       Type of failure:              |       Blocked By:
  None/Unknown                       |  Related Tickets:  #4259
             Test Case:              |
              Blocking:              |
Differential Revisions:  Phab:D202   |
-------------------------------------+-------------------------------------

Comment (by goldfire):

 Replying to [comment:82 jstolarek]:
 > Right. Still, I wondering how this works. I mean there are probably no
 special cases in the typechecker that try to guess whether a type family
 is an identity function?

 Of course, you're right -- GHC doesn't have a special case there. What it
 does do is so-called ''compatibility'' checking. Two equations of a type
 family (open or closed) are ''compatible'' if, whenever the LHSs unify
 with substitution ''s'', applying ''s'' to the RHSs makes them the same.
 Two equations for an open type family do not have a malignant overlap if
 they are compatible. And, more relevant here, GHC does the apartness check
 in closed type families only among incompatible equations. In `F` as
 originally given, all the equations are compatible with one another, and
 so GHC always skips the apartness check. Thus, when it sees `F a`, the
 last equation triggers and reduces `F a` to `a`. This is all described in
 the closed type families paper, and in the manual.

 > > We have to be careful with the word overlap here. I think we want
 "overlaps" to be "is subsumed by".
 >
 > I don't see the difference between "overlaps" and "is subsumed by". :-(

 I would say these equations overlap:

 > type instance G a Int = a
 > type instance G Int a = a

 but neither equation subsumes the other. On the other hand,

 > type instance H a a = a     -- 1
 > type instance H Int Int = Int    -- 2

 equation 1 subsumes equation 2, because anything that matches 2 will also
 surely match 1.

 >
 > Now, in 4c I wrote:
 >
 > > if unification [of the RHSs] succeeds and there are type variables
 involved we substitute unified type variables on the LHS and check whether
 this LHS overlaps with any of the previous equations. If it does we
 proceed to the next equation
 >
 > You replied with this:
 >
 > > you want to know of the equation at hand is reachable, given the LHS
 substitution. Even if it is reachable, the (substituted) LHS may coincide
 with the LHS of the earlier equation whose RHS unified with the current
 RHS.
 >
 > I don't see the difference between your version and mine.

 See new example `E1` on the wiki page.

 >
 > > Even subtler, it's possible that certain values for type variables are
 excluded if the current LHS is reachable (for example, some variable a
 couldn't be Int, because if a were Int, then a previous equation would
 have triggered). Perhaps these exclusions can also be taken into account.
 >
 > Hm... can you give an example where this would be useful?

 See new example `E2` on the wiki page.

 >
 > > '''RAE:''' But it seems that, under this treatment, any self-recursion
 would automatically lead to a conclusion of "not injective", just like any
 other use of a type family. For example:
 > >
 > > {{{#!hs
 > > type family IdNat n where
 > >   IdNat Z = Z
 > >   IdNat (S n) = S (IdNat n)
 > > }}}
 > >
 > > `IdNat` is injective. But, following the algorithm above, GHC would
 see the recursive use of `IdNat`, not know whether `IdNat` is injective,
 and then give up, concluding "not injective". Is there a case where the
 special treatment of self-recursion leads to a conclusion of "injective"?
 '''End RAE'''
 >
 > This example is just like my `NatToMaybe`. My idea here was that RHSs of
 these two equations won't unify - first returns `Z`, second returns `S
 something`. These are distinct constructors that have no chance of being
 unified. (I assumed that we are able to detect that.) There are no calls
 to type families other than self-recursion and so we can declare `IdNat`
 to be injective. I admit I am not 100% certain that allowing self-
 recursion is safe. It's just that I was unable to come up with an example
 that would show that my algorithm: a:) declares injective function to be
 non-injective; b) declares a non-injective function to be injective.
 >

 What are the steps your algorithm is taking? I don't see how `IdNat` can
 be considered injective while `Ban` is considered non-injective. A
 critical step in banning `Ban` is noting that we don't yet know that `Ban`
 is injective when checking it, so we conclude "not injective".  Maybe it's
 because `Ban`'s recursion is at the top-level? In `IdNat` and
 `NatToMaybe`, unification fails before seeing the recursive use of the
 type family. Hmm... maybe. There seems to be something going on here that,
 if unification succeeds up until it sees the recursive use of the type
 family, then the family either must be non-injective or non-terminating...
 because this situation can only arise when the function "collapses" two
 inputs into the same output through recursion. But I'd want a proof first.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/6018#comment:83>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler