[Git][ghc/ghc][wip/T21623] Fix the equality constraint problem

Richard Eisenberg (@rae) gitlab at gitlab.haskell.org
Thu Oct 13 13:20:15 UTC 2022



Richard Eisenberg pushed to branch wip/T21623 at Glasgow Haskell Compiler / GHC


Commits:
462dd0df by Richard Eisenberg at 2022-10-13T08:51:30-04:00
Fix the equality constraint problem

- - - - -


3 changed files:

- compiler/GHC/Core/DataCon.hs
- compiler/GHC/Tc/TyCl.hs
- compiler/GHC/Types/Id/Make.hs


Changes:

=====================================
compiler/GHC/Core/DataCon.hs
=====================================
@@ -58,7 +58,7 @@ module GHC.Core.DataCon (
         isUnboxedSumDataCon,
         isVanillaDataCon, isNewDataCon, isTypeDataCon,
         classDataCon, dataConCannotMatch,
-        dataConUserTyVarsArePermuted, checkDataConTyVars,
+        dataConUserTyVarsNeedWrapper, checkDataConTyVars,
         isBanged, isMarkedStrict, cbvFromStrictMark, eqHsBang, isSrcStrict, isSrcUnpacked,
         specialPromotedDc,
 
@@ -438,7 +438,7 @@ data DataCon
         -- Reason: less confusing, and easier to generate Iface syntax
 
         -- The type/coercion vars in the order the user wrote them [c,y,x,b]
-        -- INVARIANT(dataConTyVars: the set of tyvars in dcUserTyVarBinders is
+        -- INVARIANT(dataConTyVars): the set of tyvars in dcUserTyVarBinders is
         --    exactly the set of tyvars (*not* covars) of dcExTyCoVars unioned
         --    with the set of dcUnivTyVars whose tyvars do not appear in dcEqSpec
         -- See Note [DataCon user type variable binders]
@@ -613,8 +613,8 @@ A DataCon has two different sets of type variables:
     and as a consequence, they do not come equipped with visibilities
     (that is, they are TyVars/TyCoVars instead of ForAllTyBinders).
 
-Often (dcUnivTyVars ++ dcExTyCoVars) = dcUserTyVarBinder; but they may differ
-for two reasons, coming next:
+Often (dcUnivTyVars ++ dcExTyCoVars) = dcUserTyVarBinders; but they may differ
+for three reasons, coming next:
 
 --- Reason (R1): Order of quantification in GADT syntax ---
 
@@ -660,23 +660,102 @@ according to the rules of TypeApplications, in the absence of `forall` GHC
 performs a stable topological sort on the type variables in the user-written
 type signature, which would place `b` before `a`.
 
---- Reason (R2): GADTs introduce new existentials ---
+--- Reason (R2): GADT constructors quantify over different variables ---
 
-GADTs may also introduce extra existentials.  Consider
+GADT constructors may quantify over different variables than the worker
+would.  Consider
    data T a b where
       MkT :: forall c d. c -> T [c] d
 
-Then we make up a fresh universal, say 'a', to be the first argument to T in
-the result, and get this type for the "true underlying" worker data con:
+The dcUserTyVarBinders must be [c, d] -- that's what the user quantified over.
+But c is actually existential, as it is not equal to either of the two
+universal variables.
+
+Here is what we'll get:
+
+  dcUserTyVarBinders = [c, d]
+  dcUnivTyVars = [a, d]
+  dcExTyCoVars = [c]
+
+Note that dcUnivTyVars contains `a` from the type header (the `data T a b`)
+and `d` from the signature for MkT. This is done because d is used in place
+of b in the result of MkT, and so we use the name d for the universal, as that
+might improve error messages. On the other hand, we need to use a fresh name
+for the first universal (recalling that the result of a worker must be the
+type constructor applied to a sequence of plain variables), so we use `a`, from
+the header. This choice of universals is made in GHC.Tc.TyCl.mkGADTVars.
+
+Because c is not a universal, it is an existential. Here, we see that (even
+ignoring order) dcUserTyVarBinders is not dcUnivTyVars ⋃ dcExTyCoVars, because
+the latter has `a` while the former does not. To understand this better, let's
+look at this type for the "true underlying" worker data con:
 
       MkT :: forall a d. forall c. (a ~# [c]) => c -> T a d
 
-(We don't need to make up a fresh tyvar for the second arg; 'd' will do.)
+We see here that the `a` universal is connected with the `c` existential via
+an equality constraint. It will always be the case (see the code in mkGADTVars)
+that the universals not mentioned in dcUserTyVarBinders will be used in a
+GADT equality -- that is, used on the left-hand side of an element of dcEqSpec:
+
+  dcEqSpec = [a ~# [c]]
+
+Putting this all together, all variables used on the left-hand side of an
+equation in the dcEqSpec will be in dcUnivTyVars but *not* in
+dcUserTyVarBinders.
+
+--- Reason (R3): Kind equalities may have been solved ---
+
+Consider now this case:
+
+  type family F a where
+    F Type = False
+    F _    = True
+  type T :: forall k. (F k ~ True) => k -> k -> Type
+  data T a b where
+    MkT :: T Maybe List
+
+The constraint F k ~ True tells us that T does not want to be indexed by, say,
+Int. Now let's consider the Core types involved:
+
+  axiom for F: axF[0] :: F Type ~ False
+               axF[1] :: forall a. F a ~ True   (a must be apart from Type)
+  tycon: T :: forall k. (F k ~ True) -> k -> k -> Type
+  wrapper: MkT :: T @(Type -> Type) @(Eq# (axF[1] (Type -> Type)) Maybe List
+  worker:  MkT :: forall k (c :: F k ~ True) (a :: k) (b :: k).
+                  (k ~# (Type -> Type), a ~# Maybe, b ~# List) =>
+                  T @k @c a b
+
+The key observation here is that the worker quantifies over c, while the wrapper
+does not. The worker *must* quantify over c, because c is a universal variable,
+and the result of the worker must be the type constructor applied to a sequence
+of plain type variables. But the wrapper certainly does not need to quantify over
+any evidence that F (Type -> Type) ~ True, as no variables are needed there.
+
+(Aside: the c here is a regular type variable, *not* a coercion variable. This
+is because F k ~ True is a *lifted* equality, not the unlifted ~#. This is why
+we see Eq# in the type of the wrapper: Eq# boxes the unlifted ~# to become a
+lifted ~. See also Note [The equality types story] in GHC.Builtin.Types.Prim about
+Eq# and Note [Constraints in kinds] in GHC.Core.TyCo.Rep about having this constraint
+in the first place.)
+
+In this case, we'll have these fields of the DataCon:
+
+  dcUserTyVarBinders = []    -- the wrapper quantifies over nothing
+  dcUnivTyVars = [k, c, a, b]
+  dcExTyCoVars = []  -- no existentials here, but a different constructor might have
+  dcEqSpec = [k ~# (Type -> Type), a ~# Maybe, b ~# List]
+
+Note that c is in the dcUserTyVars, but mentioned neither in the dcUserTyVarBinders nor
+in the dcEqSpec. We thus have Reason (R3): a variable might be missing from the
+dcUserTyVarBinders if its type's kind is Constraint.
+
+(At one point, we thought that the dcEqSpec would have to be non-empty. But that
+wouldn't account for silly cases like type T :: (True ~ True) => Type.)
 
 --- End of Reasons ---
 
 INVARIANT(dataConTyVars): the set of tyvars in dcUserTyVarBinders
-consists precisely of:
+consists of:
 
 * The set of tyvars in dcUnivTyVars whose type variables do not appear in
   dcEqSpec, unioned with:
@@ -684,12 +763,16 @@ consists precisely of:
 * The set of tyvars (*not* covars) in dcExTyCoVars
   No covars here because because they're not user-written
 
+When comparing for equality, we ignore differences concerning type variables
+whose kinds have kind Constraint.
+
 The word "set" is used above because the order in which the tyvars appear in
 dcUserTyVarBinders can be completely different from the order in dcUnivTyVars or
 dcExTyCoVars. That is, the tyvars in dcUserTyVarBinders are a permutation of
 (tyvars of dcExTyCoVars + a subset of dcUnivTyVars). But aside from the
 ordering, they in fact share the same type variables (with the same Uniques). We
-sometimes refer to this as "the dcUserTyVarBinders invariant".
+sometimes refer to this as "the dcUserTyVarBinders invariant". It is checked
+in checkDataConTyVars.
 
 dcUserTyVarBinders, as the name suggests, is the one that users will
 see most of the time. It's used when computing the type signature of a
@@ -1668,7 +1751,7 @@ dataConCannotMatch tys con
 --
 -- This is not a cheap test, so we minimize its use in GHC as much as possible.
 -- Currently, its only call site in the GHC codebase is in 'mkDataConRep' in
--- "MkId", and so 'dataConUserTyVarsArePermuted' is only called at most once
+-- "MkId", and so 'dataConUserTyVarsNeedWrapper' is only called at most once
 -- during a data constructor's lifetime.
 
 dataConResRepTyArgs :: DataCon -> [Type]
@@ -1696,20 +1779,42 @@ dataConResRepTyArgs dc@(MkData { dcRepTyCon = rep_tc, dcOrigResTy = orig_res_ty
 
 checkDataConTyVars :: DataCon -> Bool
 -- Check that the worker and wrapper have the same set of type variables
--- See Note [DataCon user type variable binders], as well as
-checkDataConTyVars dc
-  = mkUnVarSet (dataConUnconstrainedTyVars dc) == mkUnVarSet (dataConUserTyVars dc)
-    -- Worker tyvars ...should be the same set as...wrapper tyvars
+-- See Note [DataCon user type variable binders]
+-- Also ensures that no user tyvar is in the eq_spec (the eq_spec should
+-- only relate fresh universals from (R2) of the note)
+checkDataConTyVars dc@(MkData { dcUnivTyVars = univ_tvs
+                              , dcExTyCoVars = ex_tvs
+                              , dcEqSpec = eq_spec })
+     -- use of sets here: (R1) from the Note
+  = mkUnVarSet depleted_worker_vars == mkUnVarSet depleted_wrapper_vars &&
+    all (not . is_eq_spec_var) wrapper_vars
+  where
+    is_constraint_var v = typeTypeOrConstraint (tyVarKind v) == ConstraintLike
+      -- implements (R3) from the Note
+
+    worker_vars = univ_tvs ++ ex_tvs
+    eq_spec_tvs = mkUnVarSet (map eqSpecTyVar eq_spec)
+    is_eq_spec_var = (`elemUnVarSet` eq_spec_tvs)  -- (R2) from the Note
+    depleted_worker_vars = filterOut (is_eq_spec_var <||> is_constraint_var)
+                                     worker_vars
+
+    wrapper_vars = dataConUserTyVars dc
+    depleted_wrapper_vars = filterOut is_constraint_var wrapper_vars
 
-dataConUserTyVarsArePermuted :: DataCon -> Bool
--- Check whether the worker and wapper have their tpye variables in
--- the same order.  If not, we need a wrapper to swizzle them.
+dataConUserTyVarsNeedWrapper :: DataCon -> Bool
+-- Check whether the worker and wapper have the same type variables
+-- in the same order. If not, we need a wrapper to swizzle them.
 -- See Note [DataCon user type variable binders], as well as
 -- Note [Data con wrappers and GADT syntax] for an explanation of what
 -- mkDataConRep is doing with this function.
-dataConUserTyVarsArePermuted dc
-  = dataConUnconstrainedTyVars dc /= dataConUserTyVars dc
-    -- Worker tyvars                  Wrapper tyvars
+dataConUserTyVarsNeedWrapper dc@(MkData { dcUnivTyVars = univ_tvs
+                                        , dcExTyCoVars = ex_tvs
+                                        , dcEqSpec = eq_spec })
+  = assert (null eq_spec || answer)  -- all GADTs should say "yes" here
+    answer
+  where
+    answer = (univ_tvs ++ ex_tvs) /= dataConUserTyVars dc
+              -- Worker tyvars         Wrapper tyvars
 
 -- | Return the quantified variables (universal and existential)
 --   that are unconstrained by dcEqSpec, in the order that the /worker/ expects


=====================================
compiler/GHC/Tc/TyCl.hs
=====================================
@@ -4037,6 +4037,8 @@ mkGADTVars tmpl_tvs dc_tvs subst
             |  not (r_tv `elem` univs)
             ,  tyVarKind r_tv `eqType` (substTy t_sub (tyVarKind t_tv))
             -> -- simple, well-kinded variable substitution.
+               -- the name of the universal comes from the result of the ctor
+               -- see (R2) of Note [DataCon user type variable binders] in GHC.Core.DataCon
                choose (r_tv:univs) eqs
                       (extendTvSubst t_sub t_tv r_ty')
                       (extendTvSubst r_sub r_tv r_ty')
@@ -4046,7 +4048,8 @@ mkGADTVars tmpl_tvs dc_tvs subst
               r_ty'  = mkTyVarTy r_tv1
 
                -- Not a simple substitution: make an equality predicate
-
+               -- the name of the universal comes from the datatype header
+               -- see (R2) of Note [DataCon user type variable binders] in GHC.Core.DataCon
           _ -> choose (t_tv':univs) eqs'
                       (extendTvSubst t_sub t_tv (mkTyVarTy t_tv'))
                          -- We've updated the kind of t_tv,


=====================================
compiler/GHC/Types/Id/Make.hs
=====================================
@@ -784,16 +784,16 @@ mkDataConRep dc_bang_opts fam_envs wrap_name data_con
         (not new_tycon
                      -- (Most) newtypes have only a worker, with the exception
                      -- of some newtypes written with GADT syntax. See below.
-         && (any isBanged (ev_ibangs ++ arg_ibangs)
+         && (any isBanged (ev_ibangs ++ arg_ibangs)))
                      -- Some forcing/unboxing (includes eq_spec)
-             || (not $ null eq_spec))) -- GADT
       || isFamInstTyCon tycon -- Cast result
-      || dataConUserTyVarsArePermuted data_con
+      || dataConUserTyVarsNeedWrapper data_con
                      -- If the data type was written with GADT syntax and
                      -- orders the type variables differently from what the
                      -- worker expects, it needs a data con wrapper to reorder
                      -- the type variables.
                      -- See Note [Data con wrappers and GADT syntax].
+                     -- NB: All GADTs return true from this function
       || not (null stupid_theta)
                      -- If the data constructor has a datatype context,
                      -- we need a wrapper in order to drop the stupid arguments.



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/462dd0df14af81837ee07feedbf9bbc6a18267df

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/462dd0df14af81837ee07feedbf9bbc6a18267df
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20221013/28fc5377/attachment-0001.html>


More information about the ghc-commits mailing list