[GHC] #15952: Reduce zonking-related invariants in the type checker

Mon Dec 31 04:38:47 UTC 2018

#15952: Reduce zonking-related invariants in the type checker
-------------------------------------+-------------------------------------
        Reporter:  goldfire          |                Owner:  (none)
            Type:  task              |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  8.6.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by goldfire):

 I have not reviewed the code, but just responding to comment:7.

 I agree about adding `ArgFlag` to `FunTy`, but not the way you probably
 want.

 I see `FunTy` as an abbreviation of `ForAllTy`, where the argument is
 relevant (preserved at runtime) and non-dependent (cannot appear in the
 rest of the type). Morally (in my opinion), we should have `ForAllTy`
 store a `TyCoBinder`, which stores either a `Named TyCoVarBinder` or an
 `Anon Type`. The `TyCoVarBinder` is a pair containing a `TyVar` and an
 `ArgFlag`. This `ForAllTy` should be called `PiTy`. It was implemented en
 route to the TypeInType patch but then `FunTy` was re-extracted from it
 for performance reasons.

 "Adding `ArgFlag` to `FunTy`" really means that the primordial `PiTy` now
 stores either a `Named (ArgFlag, TyVar)` or an `Anon (ArgFlag, Type)`.
 (I've rewritted `TyCoVarBinder` as `(ArgFlag, TyVar)` to simplify the
 presentation... and to ignore coercion variables, which can indeed safely
 be ignore in these parts.) But looking at this definition, we see an
 "obvious" refactor: store a `(NamedFlag, ArgFlag, TyVar)` where
 `NamedFlag` is a `Bool` cognate that says whether or not the quantifier is
 dependent. In the end, though, this information is needed only for
 performance reasons, as we can always treat quantifiers as dependent, but
 some usage sites won't take advantage of this capability. So that leaves
 us with `(ArgFlag, TyVar)` as the payload of the binder part of a `PiTy`
 (along with the "body" part, containing a `Type`). In other words:

 {{{#!hs
 data Type = ... | PiTy (ArgFlag, TyVar) Type | ...
 }}}

 Now, consider this code that we might try to compile:

 {{{#!hs
 data Dict :: Constraint -> Type where
   MkDict :: c => Dict c
 foo :: forall (x :: Proxy (MkDict :: Dict c)). Proxy x -> ()
 }}}

 What's the representation of the type of `foo`? It should be something
 like

 {{{#!hs
 foo :: PiTy (Specified, c :: Constraint) $
        PiTy (Inferred, ct :: c) $
        PiTy (Specified, x :: Proxy (Dict c) (MkDict c ct)) $
        PiTy (Required, _ :: Proxy (Proxy (Dict c) (MkDict c ct)) x) $
        unitTy
 }}}

 Now we have a usage site of `foo`. We instantiate this type. But,
 critically, '''how do we instantiate `ct`'''? Do we (a) emit an wanted
 constraint `c`, or (b) create a fresh unification variable `alpha` and
 then unify it with the right part of the type of the passed argument to
 `foo`? This choice matters -- it's conceivable that the solver in strategy
 (a) will find a different solution for `ct` than unification would come up
 with (in a situation with incoherency -- which normally does not imperil
 the type system).

 The bottom line here is that we need to indicate, for implicit arguments,
 precisely how GHC should try to infer them when they are omitted. Note
 that Agda does this: unification is indicated by single braces and solving
 is indicated by double braces. Currently, `ArgFlag` says when an argument
 is implicit. In the future, though, I think it will additionally need to
 specify an instantiation strategy for implicit arguments. To wit, we might
 have

 {{{#!hs
 data InstStrategy = Solve | Unify
 data ArgFlag = Inferred InstStrategy | Specified InstStrategy | Required
 }}}

 or possibly

 {{{#!hs
 data InstStrategy = Solve | Unify
 data VisibleInstAllowed = CanBeVisible | CannotBeVisible
 data ArgFlag = Invisible InstStrategy VisibleInstAllowed | Visible
 }}}

 Zooming back to the present: should we add `ArgFlag` to `FunTy`? I say:
 "no", because the current `ArgFlag` doesn't scale to the situation
 described above. Instead -- if we don't want to buy into the more
 elaborate scheme above -- we should make a new

 {{{#!hs
 data NonDepArgFlag   -- rename me
   = SolvedFor
   | Visible  -- I wish I could just reuse Required but no TDNR
 }}}

 and then add that to `FunTy`. This approach then naturally scales up to
 the more elaborate scheme described above. It also has the added bonus of
 not using the three-way `ArgFlag` to describe a situation which currently
 only has 2 possibilities.

 -------------------------------------

 Frustrating about #15918. But I agree with your analysis here. As long as
 we have two different equality relations (the one where `Type ~
 Constraint` and the one where it doesn't), this problem will persist. And
 I think trying to cheat by avoiding writing all these extra functions will
 fail. As to the idea of simplifying `mkCastTy` in favor of complicating
 `tcSplitXXX`, I think that might work fine... but I don't think it will
 save sweat or tears. We would need to have `tc` variants of all the
 splitting functions instead of all the making functions. And the problems
 we'll run into will be the same. I favor doing the work at construction,
 because I conjecture that we construct less often than we split, and it
 will make inspecting dumps easier (because we'll know that things that
 look different really are different).

 And I don't think that having, e.g., `mkTcCastTy` is as bad as the
 nakedness business. The naked casts were a nasty, fragile hack, using
 `Refl` in a way it was never intended. `mkTcCastTy`, on the other hand, is
 a perfectly sensible operation to build a type in a way so that types we
 consider equal have identical representations. The downside to the new
 approach is that it requires more code duplication, etc., than the naked-
 casts trick.

 --------------------------

 About #15799: That bug requires #12045 to witness, though the underlying
 fault exists in HEAD. I never drilled down to figure out how #12045
 triggers it.

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15952#comment:8>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler