[Git][ghc/ghc][master] Use conLikeUserTyVarBinders to quantify field selector types

Marge Bot gitlab at gitlab.haskell.org
Sun Apr 12 15:22:21 UTC 2020



 Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC


Commits:
54ca66a7 by Ryan Scott at 2020-04-12T11:22:10-04:00
Use conLikeUserTyVarBinders to quantify field selector types

This patch:

1. Writes up a specification for how the types of top-level field
   selectors should be determined in a new section of the GHC User's
   Guide, and
2. Makes GHC actually implement that specification by using
   `conLikeUserTyVarBinders` in `mkOneRecordSelector` to preserve the
   order and specificity of type variables written by the user.

Fixes #18023.

- - - - -


12 changed files:

- compiler/GHC/Core/ConLike.hs
- compiler/GHC/Core/DataCon.hs
- compiler/GHC/HsToCore/Expr.hs
- compiler/GHC/Tc/TyCl/Utils.hs
- docs/users_guide/8.12.1-notes.rst
- docs/users_guide/exts/existential_quantification.rst
- + docs/users_guide/exts/field_selectors_and_type_applications.rst
- docs/users_guide/exts/gadt.rst
- docs/users_guide/exts/rank_polymorphism.rst
- docs/users_guide/exts/records.rst
- + testsuite/tests/typecheck/should_compile/T18023.hs
- testsuite/tests/typecheck/should_compile/all.T


Changes:

=====================================
compiler/GHC/Core/ConLike.hs
=====================================
@@ -12,6 +12,7 @@ module GHC.Core.ConLike (
         , conLikeArity
         , conLikeFieldLabels
         , conLikeInstOrigArgTys
+        , conLikeUserTyVarBinders
         , conLikeExTyCoVars
         , conLikeName
         , conLikeStupidTheta
@@ -113,6 +114,18 @@ conLikeInstOrigArgTys (RealDataCon data_con) tys =
 conLikeInstOrigArgTys (PatSynCon pat_syn) tys =
     patSynInstArgTys pat_syn tys
 
+-- | 'TyVarBinder's for the type variables of the 'ConLike'. For pattern
+-- synonyms, this will always consist of the universally quantified variables
+-- followed by the existentially quantified type variables. For data
+-- constructors, the situation is slightly more complicated—see
+-- @Note [DataCon user type variable binders]@ in "GHC.Core.DataCon".
+conLikeUserTyVarBinders :: ConLike -> [TyVarBinder]
+conLikeUserTyVarBinders (RealDataCon data_con) =
+    dataConUserTyVarBinders data_con
+conLikeUserTyVarBinders (PatSynCon pat_syn) =
+    patSynUnivTyVarBinders pat_syn ++ patSynExTyVarBinders pat_syn
+    -- The order here is because of the order in `GHC.Tc.TyCl.PatSyn`.
+
 -- | Existentially quantified type/coercion variables
 conLikeExTyCoVars :: ConLike -> [TyCoVar]
 conLikeExTyCoVars (RealDataCon dcon1) = dataConExTyCoVars dcon1


=====================================
compiler/GHC/Core/DataCon.hs
=====================================
@@ -384,7 +384,9 @@ data DataCon
                 --      MkT :: forall a b. (a ~ [b]) => b -> T a
                 --      MkT :: forall b. b -> T [b]
                 -- Each equality is of the form (a ~ ty), where 'a' is one of
-                -- the universally quantified type variables
+                -- the universally quantified type variables. Moreover, the
+                -- only place in the DataCon where this 'a' will occur is in
+                -- dcUnivTyVars. See [The dcEqSpec domain invariant].
 
                 -- The next two fields give the type context of the data constructor
                 --      (aside from the GADT constraints,
@@ -595,6 +597,35 @@ dcUserTyVarBinders, as the name suggests, is the one that users will see most of
 the time. It's used when computing the type signature of a data constructor (see
 dataConUserType), and as a result, it's what matters from a TypeApplications
 perspective.
+
+Note [The dcEqSpec domain invariant]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Consider this example of a GADT constructor:
+
+  data Y a where
+    MkY :: Bool -> Y Bool
+
+The user-written type of MkY is `Bool -> Y Bool`, but what is the underlying
+Core type for MkY? There are two conceivable possibilities:
+
+1. MkY :: forall a. (a ~# Bool) => Bool -> Y a
+2. MkY :: forall a. (a ~# Bool) => a    -> Y a
+
+In practice, GHC picks (1) as the Core type for MkY. This is because we
+maintain an invariant that the type variables in the domain of dcEqSpec will
+only ever appear in the dcUnivTyVars. As a consequence, the type variables in
+the domain of dcEqSpec will /never/ appear in the dcExTyCoVars, dcOtherTheta,
+dcOrigArgTys, or dcOrigResTy; these can only ever mention variables from
+dcUserTyVarBinders, which excludes things in the domain of dcEqSpec.
+(See Note [DataCon user type variable binders].) This explains why GHC would
+not pick (2) as the Core type, since the argument type `a` mentions a type
+variable in the dcEqSpec.
+
+There are certain parts of the codebase where it is convenient to apply the
+substitution arising from the dcEqSpec to the dcUnivTyVars in order to obtain
+the user-written return type of a GADT constructor. A consequence of the
+dcEqSpec domain invariant is that you /never/ need to apply the substitution
+to any other part of the constructor type, as they don't require it.
 -}
 
 -- | Data Constructor Representation


=====================================
compiler/GHC/HsToCore/Expr.hs
=====================================
@@ -641,11 +641,7 @@ dsExpr expr@(RecordUpd { rupd_expr = record_expr, rupd_flds = fields
     mk_alt upd_fld_env con
       = do { let (univ_tvs, ex_tvs, eq_spec,
                   prov_theta, _req_theta, arg_tys, _) = conLikeFullSig con
-                 user_tvs =
-                   case con of
-                     RealDataCon data_con -> dataConUserTyVars data_con
-                     PatSynCon _          -> univ_tvs ++ ex_tvs
-                       -- The order here is because of the order in `GHC.Tc.TyCl.PatSyn`.
+                 user_tvs  = binderVars $ conLikeUserTyVarBinders con
                  in_subst  = zipTvSubst univ_tvs in_inst_tys
                  out_subst = zipTvSubst univ_tvs out_inst_tys
 


=====================================
compiler/GHC/Tc/TyCl/Utils.hs
=====================================
@@ -872,19 +872,17 @@ mkOneRecordSelector all_cons idDetails fl
 
     -- Selector type; Note [Polymorphic selectors]
     field_ty   = conLikeFieldType con1 lbl
-    data_tvs   = tyCoVarsOfTypesWellScoped inst_tys
-    data_tv_set= mkVarSet data_tvs
+    data_tvbs  = filter (\tvb -> binderVar tvb `elemVarSet` data_tv_set) $
+                 conLikeUserTyVarBinders con1
+    data_tv_set= tyCoVarsOfTypes inst_tys
     is_naughty = not (tyCoVarsOfType field_ty `subVarSet` data_tv_set)
-    (field_tvs, field_theta, field_tau) = tcSplitSigmaTy field_ty
     sel_ty | is_naughty = unitTy  -- See Note [Naughty record selectors]
-           | otherwise  = mkSpecForAllTys data_tvs          $
+           | otherwise  = mkForAllTys data_tvbs             $
                           mkPhiTy (conLikeStupidTheta con1) $   -- Urgh!
-                          mkVisFunTy data_ty                $
-                          mkSpecForAllTys field_tvs         $
-                          mkPhiTy field_theta               $
                           -- req_theta is empty for normal DataCon
                           mkPhiTy req_theta                 $
-                          field_tau
+                          mkVisFunTy data_ty                $
+                          field_ty
 
     -- Make the binding: sel (C2 { fld = x }) = x
     --                   sel (C7 { fld = x }) = x
@@ -936,6 +934,16 @@ mkOneRecordSelector all_cons idDetails fl
     (univ_tvs, _, eq_spec, _, req_theta, _, data_ty) = conLikeFullSig con1
 
     eq_subst = mkTvSubstPrs (map eqSpecPair eq_spec)
+    -- inst_tys corresponds to one of the following:
+    --
+    -- * The arguments to the user-written return type (for GADT constructors).
+    --   In this scenario, eq_subst provides a mapping from the universally
+    --   quantified type variables to the argument types. Note that eq_subst
+    --   does not need to be applied to any other part of the DataCon
+    --   (see Note [The dcEqSpec domain invariant] in GHC.Core.DataCon).
+    -- * The universally quantified type variables
+    --   (for Haskell98-style constructors and pattern synonyms). In these
+    --   scenarios, eq_subst is an empty substitution.
     inst_tys = substTyVars eq_subst univ_tvs
 
     unit_rhs = mkLHsTupleExpr []
@@ -945,13 +953,44 @@ mkOneRecordSelector all_cons idDetails fl
 Note [Polymorphic selectors]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 We take care to build the type of a polymorphic selector in the right
-order, so that visible type application works.
-
-  data Ord a => T a = MkT { field :: forall b. (Num a, Show b) => (a, b) }
-
-We want
-
-  field :: forall a. Ord a => T a -> forall b. (Num a, Show b) => (a, b)
+order, so that visible type application works according to the specification in
+the GHC User's Guide (see the "Field selectors and TypeApplications" section).
+We won't bother rehashing the entire specification in this Note, but the tricky
+part is dealing with GADT constructor fields. Here is an appropriately tricky
+example to illustrate the challenges:
+
+  {-# LANGUAGE PolyKinds #-}
+  data T a b where
+    MkT :: forall b a x.
+           { field1 :: forall c. (Num a, Show c) => (Either a c, Proxy b)
+           , field2 :: x
+           }
+        -> T a b
+
+Our goal is to obtain the following type for `field1`:
+
+  field1 :: forall {k} (b :: k) a.
+            T a b -> forall c. (Num a, Show c) => (Either a c, Proxy b)
+
+(`field2` is naughty, per Note [Naughty record selectors], so we cannot turn
+it into a top-level field selector.)
+
+Some potential gotchas, inspired by #18023:
+
+1. Since the user wrote `forall b a x.` in the type of `MkT`, we want the `b`
+   to appear before the `a` when quantified in the type of `field1`.
+2. On the other hand, we *don't* want to quantify `x` in the type of `field1`.
+   This is because `x` does not appear in the GADT return type, so it is not
+   needed in the selector type.
+3. Because of PolyKinds, the kind of `b` is generalized to `k`. Moreover, since
+   this `k` is not written in the source code, it is inferred (i.e., not
+   available for explicit type applications) and thus written as {k} in the type
+   of `field1`.
+
+In order to address these gotchas, we start by looking at the
+conLikeUserTyVarBinders, which gives the order and specificity of each binder.
+This effectively solves (1) and (3). To solve (2), we filter the binders to
+leave only those that are needed for the selector type.
 
 Note [Naughty record selectors]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


=====================================
docs/users_guide/8.12.1-notes.rst
=====================================
@@ -18,6 +18,34 @@ Full details
 Language
 ~~~~~~~~
 
+* Record field selectors are now given type signatures that preserve the
+  user-written order of quantified type variables. Moreover, field selector
+  type signatures no longer make inferred type variables avaiable for explicit
+  type application. See :ref:`field-selectors-and-type-applications` for more
+  details.
+
+  In certain situations, this will constitute a breaking change as this can
+  affect :extension:`TypeApplications`. For instance, given the following
+  definitions: ::
+
+    {-# LANGUAGE PolyKinds #-}
+
+    newtype P a = MkP { unP :: Proxy a }
+
+    newtype N :: Type -> Type -> Type where
+      MkN :: forall b a. { unN :: Either a b } -> N a b
+
+  Previous versions of GHC would give the following types to ``unP`` and
+  ``unN``: ::
+
+    unP :: forall k (a :: k). P a -> Proxy a
+    unN :: forall a b. N a b -> Either a b
+
+  GHC will now give them the following types instead: ::
+
+    unP :: forall {k} (a :: k). P a -> Proxy a
+    unN :: forall b a. N a b -> Either a b
+
 * In obscure scenarios, GHC now rejects programs it previously accepted, but
   with unhelpful types. For example, if (with ``-XPartialTypeSignatures``) you
   were to write ``x :: forall (f :: forall a (b :: a -> Type). b _). f _``, GHC previously
@@ -40,7 +68,7 @@ Language
 
   There is a chance we will tweak the lookup scheme in the future, to make this
   workaround unnecessary.
-  
+
 Compiler
 ~~~~~~~~
 


=====================================
docs/users_guide/exts/existential_quantification.rst
=====================================
@@ -116,7 +116,11 @@ example: ::
         }
 
 Here ``tag`` is a public field, with a well-typed selector function
-``tag :: Counter a -> a``. The ``self`` type is hidden from the outside;
+``tag :: Counter a -> a``. See :ref:`field-selectors-and-type-applications`
+for a full description of how the types of top-level field selectors are
+determined.
+
+The ``self`` type is hidden from the outside;
 any attempt to apply ``_this``, ``_inc`` or ``_display`` as functions
 will raise a compile-time error. In other words, *GHC defines a record
 selector function only for fields whose type does not mention the


=====================================
docs/users_guide/exts/field_selectors_and_type_applications.rst
=====================================
@@ -0,0 +1,122 @@
+.. _field-selectors-and-type-applications:
+
+Field selectors and ``TypeApplications``
+----------------------------------------
+
+Field selectors can be used in conjunction with :extension:`TypeApplications`,
+as described in :ref:`visible-type-application`. The type of a field selector
+is constructed by using the surrounding definition as context. This section
+provides a specification for how this construction works. We will explain it
+by considering three different forms of field selector, each of which is a
+minor variation of the same general theme.
+
+Field selectors for Haskell98-style data constructors
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Consider the following example: ::
+
+  data T a b = MkT { unT :: forall e. Either e a }
+
+This data type uses a Haskell98-style declaration. The only part of this data
+type that is not Haskell98 code is ``unT``, whose type uses higher-rank
+polymorphism (:ref:`arbitrary-rank-polymorphism`). To construct the type of
+the ``unT`` field selector, we will assemble the following:
+
+1. The type variables quantified by the data type head
+   (``forall a b. <...>``).
+2. The return type of the data constructor
+   (``<...> T a b -> <...>``). By virtue of this being a Haskell98-style
+   declaration, the order of type variables in the return type will always
+   coincide with the order in which they are quantified.
+3. The type of the field
+   (``<...> forall e. Either e a``).
+
+The final type of ``unT`` is therefore
+``forall a b. T a b -> forall e. Either e a``. As a result, one way to use
+``unT`` with :extension:`TypeApplications` is
+``unT @Int @Bool (MkT (Right 1)) @Char``.
+
+Field selectors for GADT constructors
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Field selectors for GADT constructors (:ref:`gadt-style`) are slightly more
+involved. Consider the following example: ::
+
+  data G a b where
+    MkG :: forall x n a. (Eq a, Show n)
+        => { unG1 :: forall e. Either e (a, x), unG2 :: n } -> G a (Maybe x)
+
+The ``MkG`` GADT constructor has two records, ``unG1`` and ``unG2``.
+However, only ``unG1`` can be used as a top-level field selector. ``unG2``
+cannot because it is a "hidden" selector (see :ref:`existential-records`); its
+type mentions a free variable ``n`` that does not appear in the result type
+``G a (Maybe x)``. On the other hand, the only free type variables in the type
+of ``unG1`` are ``a`` and ``x``, so ``unG1`` is fine to use as a top-level
+function.
+
+To construct the type of the ``unG1`` field selector, we will assemble
+the following:
+
+1. The subset of type variables quantified by the GADT constructor that are
+   mentioned in the return type. Note that the order of these variables follows
+   the same principles as in :ref:`ScopedSort`.
+   If the constructor explicitly quantifies its type variables at the beginning
+   of the type, then the field selector type will quantify them in the same
+   order (modulo any variables that are dropped due to not being mentioned in
+   the return type).
+   If the constructor implicitly quantifies its type variables, then the field
+   selector type will quantify them in the left-to-right order that they appear
+   in the field itself.
+
+   In this example, ``MkG`` explicitly quantifies ``forall x n a.``, and of
+   those type variables, ``a`` and ``x`` are mentioned in the return type.
+   Therefore, the type of ``unG1`` starts as ``forall x a. <...>``.
+   If ``MkG`` had not used an explicit ``forall``, then they would have instead
+   been ordered as ``forall a x. <...>``, since ``a`` appears to the left of
+   ``x`` in the field type.
+2. The GADT return type
+   (``<...> G a (Maybe x) -> ...``).
+3. The type of the field
+   (``<...> -> forall e. Either e (a, x)``).
+
+The final type of ``unG1`` is therefore
+``forall x a. G a (Maybe x) -> forall e. Either e (a, x)``. As a result, one
+way to use ``unG1`` with :extension:`TypeApplications` is
+``unG1 @Int @Bool (MkG (Right (True, 42)) ()) @Char``.
+
+Field selectors for pattern synonyms
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Certain record pattern synonyms (:ref:`record-patsyn`) can give rise to
+top-level field selectors. Consider the following example: ::
+
+  pattern P :: forall a. Read a
+            => forall n. (Eq a, Show n)
+            => (forall e. Either e (a, Bool)) -> n -> G a (Maybe Bool)
+  pattern P {unP1, unP2} = MkG unP1 unP2
+
+We can only make field selectors for pattern synonym records that do not
+mention any existential type variables whatsoever in their types, per
+:ref:`record-patsyn`. (This is a stronger requirement than for GADT records,
+whose types can mention existential type variables provided that they are also
+mentioned in the return type.) We can see that ``unP2`` cannot be used as a
+top-level field selector since its type has a free type variable ``n``, which
+is existential. ``unP1`` is fine, on the other hand, as its type only has one
+free variable, the universal type variable ``a``.
+
+To construct the type of the ``unP1`` field selector, we will assemble
+the following:
+
+1. The universal type variables
+   (``forall a. <...>``).
+2. The required constraints
+   (``<...> Read a => <...>``).
+3. The pattern synonym return type
+   (``<...> G a (Maybe Bool) -> <...>``).
+4. The type of the field
+   (``<...> -> forall e. Either e (a, Bool)``).
+
+The final type of ``unP1`` is therefore
+``forall a. Read a => G a (Maybe Bool) -> forall e. Either e (a, Bool)``. As a
+result, one way to use ``unP1`` with :extension:`TypeApplications` is
+``unP1 @Double (MkG (Right (4.5, True)) ()) @Char``.


=====================================
docs/users_guide/exts/gadt.rst
=====================================
@@ -118,6 +118,9 @@ also sets :extension:`GADTSyntax` and :extension:`MonoLocalBinds`.
          num :: Term Int -> Term Int
          arg :: Term Bool -> Term Int
 
+   See :ref:`field-selectors-and-type-applications` for a full description of
+   how the types of top-level field selectors are determined.
+
 -  When pattern-matching against data constructors drawn from a GADT,
    for example in a ``case`` expression, the following rules apply:
 


=====================================
docs/users_guide/exts/rank_polymorphism.rst
=====================================
@@ -1,3 +1,5 @@
+.. _arbitrary-rank-polymorphism:
+
 Arbitrary-rank polymorphism
 ===========================
 


=====================================
docs/users_guide/exts/records.rst
=====================================
@@ -7,6 +7,7 @@ Records
     :maxdepth: 1
 
     traditional_record_syntax
+    field_selectors_and_type_applications
     disambiguate_record_fields
     duplicate_record_fields
     record_puns


=====================================
testsuite/tests/typecheck/should_compile/T18023.hs
=====================================
@@ -0,0 +1,34 @@
+{-# LANGUAGE DataKinds #-}
+{-# LANGUAGE GADTs #-}
+{-# LANGUAGE PolyKinds #-}
+{-# LANGUAGE ScopedTypeVariables #-}
+{-# LANGUAGE TypeApplications #-}
+module T18023 where
+
+import Data.Kind
+import Data.Proxy
+
+newtype N :: Type -> Type -> Type where
+  MkN :: forall b a. { unN :: Either a b } -> N a b
+
+toN :: Either Int Bool -> N Int Bool
+toN = MkN @Bool @Int
+
+fromN :: N Int Bool -> Either Int Bool
+fromN = unN @Bool @Int
+
+newtype P a = MkP { unP :: Proxy a }
+
+toPTrue :: Proxy True -> P True
+toPTrue = MkP @True
+
+fromPTrue :: P True -> Proxy True
+fromPTrue = unP @True
+
+newtype P2 a b = MkP2 { unP2 :: (Proxy a, Proxy b) }
+
+toP2True :: (Proxy True, Proxy True) -> P2 True True
+toP2True = MkP2 @True @True
+
+fromP2True :: P2 True True -> (Proxy True, Proxy True)
+fromP2True = unP2 @True @True


=====================================
testsuite/tests/typecheck/should_compile/all.T
=====================================
@@ -702,3 +702,4 @@ test('T17792', normal, compile, [''])
 test('T17024', normal, compile, [''])
 test('T17021a', normal, compile, [''])
 test('T18005', normal, compile, [''])
+test('T18023', normal, compile, [''])



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/54ca66a7d30d7f7cfbf3753ebe547f5a20d76b96

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/54ca66a7d30d7f7cfbf3753ebe547f5a20d76b96
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20200412/7c3783d1/attachment-0001.html>


More information about the ghc-commits mailing list