[Git][ghc/ghc][wip/romes/isNullaryRepDataCon] 3 commits: codeGen: Fix LFInfo of imported datacon wrappers

Rodrigo Mesquita (@alt-romes) gitlab at gitlab.haskell.org
Fri Apr 28 16:02:48 UTC 2023



Rodrigo Mesquita pushed to branch wip/romes/isNullaryRepDataCon at Glasgow Haskell Compiler / GHC


Commits:
ba2177fa by Rodrigo Mesquita at 2023-04-26T14:24:05+01:00
codeGen: Fix LFInfo of imported datacon wrappers

As noted in #23231 and in the previous commit, we were failing to give a
an LFInfo of LFCon to a nullary datacon wrapper from another module,
failing to properly tag pointers which ultimately led to the
segmentation fault in #23146.

On top of the previous commit which now considers wrappers where we
previously only considered workers, we change the order of the guards so
that we check for the arity of the binding before we check whether it is
a constructor. This allows us to
(1) Correctly assign `LFReEntrant` to imported wrappers whose worker was
nullary, which we previously would fail to do
(2) Remove the `isNullaryRepDataCon` predicate:
    (a) which was previously wrong, since it considered wrappers whose
    workers had zero-width arguments to be non-nullary and would fail to
    give `LFCon` to them
    (b) is now unnecessary, since arity == 0 guarantees
        - that the worker takes no arguments at all
        - and the wrapper takes no arguments and its RHS must be an
          application of the worker to zero-width-args only.
        - we lint these two items with an assertion that the datacon
          `hasNoNonZeroWidthArgs`

We also update `isTagged` to use the new logic in determining the
LFInfos of imported Ids.

The creation of LFInfos for imported Ids and this detail are explained
in Note [The LFInfo of Imported Ids].

Note that before the patch to those issues we would already consider these
nullary wrappers to have `LFCon` lambda form info; but failed to re-construct
that information in `mkLFImported`

Closes #23231, #23146

(I've additionally batched some fixes to documentation I found while
investigating this issue)

- - - - -
1e173d64 by Rodrigo Mesquita at 2023-04-28T12:00:55+01:00
Make LFInfos for DataCons on construction

As a result of the discussion in !10165, we decided to amend the
previous commit which fixed the logic of `mkLFImported` with regard to
datacon workers and wrappers.

Instead of having the logic for the LFInfo of datacons be in
`mkLFImported`, we now construct an LFInfo for all data constructors on
GHC.Types.Id.Make and store it in the `lfInfo` field.

See the new Note [LFInfo of DataCon workers and wrappers] and
ammendments to Note [The LFInfo of Imported Ids]

- - - - -
e876c74a by Rodrigo Mesquita at 2023-04-28T17:02:08+01:00
Precompute static closures for DataCons with zero-width args

Relax the predicate over nullary datacons that determines whether we can
return a precomputed static closure for them, such that we give
precomputed static closures to both datacon workers and wrappers as long
as they only take zero-width arguments (and hence their closure is
comprised of just the constructor info).

Previously, we would only allow datacons that were nullary with regard
to their Core representation, which prevented datacons workers with only
zero-width arguments and wrappers with none from using a precomputed
static closure.

See Note [Precomputed static closures of nullary constructors]

Closes #23158

- - - - -


18 changed files:

- compiler/GHC/Cmm/CLabel.hs
- compiler/GHC/Core/DataCon.hs
- compiler/GHC/Core/Tidy.hs
- compiler/GHC/Stg/InferTags/Rewrite.hs
- compiler/GHC/Stg/Syntax.hs
- compiler/GHC/Stg/Unarise.hs
- compiler/GHC/StgToCmm.hs
- compiler/GHC/StgToCmm/Closure.hs
- compiler/GHC/StgToCmm/DataCon.hs
- compiler/GHC/StgToCmm/Env.hs
- compiler/GHC/StgToCmm/Types.hs
- compiler/GHC/Types/Id.hs
- compiler/GHC/Types/Id/Info.hs
- compiler/GHC/Types/Id/Make.hs
- + testsuite/tests/codeGen/should_run/T23146/T23146_lifted_unlifted.hs
- + testsuite/tests/codeGen/should_run/T23146/T23146_lifted_unlifted.stdout
- + testsuite/tests/codeGen/should_run/T23146/T23146_lifted_unliftedA.hs
- testsuite/tests/codeGen/should_run/T23146/all.T


Changes:

=====================================
compiler/GHC/Cmm/CLabel.hs
=====================================
@@ -1384,6 +1384,7 @@ For a data constructor (such as Just or Nothing), we have:
                    ordinary Haskell function of arity 1 that
                    allocates a (Just x) box:
                       Just = \x -> Just x
+    Just_entry:    The entry code for the worker function
     Just_closure:  The closure for this worker
 
     Nothing_closure: a statically allocated closure for Nothing


=====================================
compiler/GHC/Core/DataCon.hs
=====================================
@@ -111,8 +111,8 @@ import Data.List( find )
 import Language.Haskell.Syntax.Module.Name
 
 {-
-Data constructor representation
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Note [Data constructor representation]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Consider the following Haskell data type declaration
 
         data T = T !Int ![Int]
@@ -213,7 +213,7 @@ Note [Data constructor workers and wrappers]
 
 * The wrapper (if it exists) takes dcOrigArgTys as its arguments.
   The worker takes dataConRepArgTys as its arguments
-  If the worker is absent, dataConRepArgTys is the same as dcOrigArgTys
+  If the wrapper is absent, dataConRepArgTys is the same as dcOrigArgTys
 
 * The 'NoDataConRep' case of DataConRep is important. Not only is it
   efficient, but it also ensures that the wrapper is replaced by the
@@ -586,12 +586,22 @@ Function call 'dataConKindEqSpec' returns [k'~k]
 
 Note [DataCon arities]
 ~~~~~~~~~~~~~~~~~~~~~~
-dcSourceArity does not take constraints into account,
-but dcRepArity does.  For example:
+A `DataCon`'s source arity and core representation arity may differ:
+`dcSourceArity` does not take constraints into account, but `dcRepArity` does.
+
+The additional arguments taken into account by `dcRepArity` include quantified
+dictionaries and coercion arguments, lifted and unlifted (despite the unlifted
+coercion arguments having a zero-width runtime representation).
+For example:
    MkT :: Ord a => a -> T a
     dcSourceArity = 1
     dcRepArity    = 2
 
+   MkU :: (b ~ '[]) => U b
+    dcSourceArity = 0
+    dcRepArity    = 1
+
+
 Note [DataCon user type variable binders]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 A DataCon has two different sets of type variables:
@@ -981,7 +991,7 @@ but the rep type is
         Trep :: Int# -> a -> Void# -> T a
 Actually, the unboxed part isn't implemented yet!
 
-Not that this representation is still *different* from runtime
+Note that this representation is still *different* from runtime
 representation. (Which is what STG uses after unarise).
 
 This is how T would end up being used in STG post-unarise:
@@ -1395,9 +1405,10 @@ dataConSrcBangs = dcSrcBangs
 dataConSourceArity :: DataCon -> Arity
 dataConSourceArity (MkData { dcSourceArity = arity }) = arity
 
--- | Gives the number of actual fields in the /representation/ of the
--- data constructor. This may be more than appear in the source code;
--- the extra ones are the existentially quantified dictionaries
+-- | Gives the number of value arguments (including zero-width coercions)
+-- stored by the given `DataCon`'s worker in its Core representation. This may
+-- differ from the number of arguments that appear in the source code; see also
+-- Note [DataCon arities]
 dataConRepArity :: DataCon -> Arity
 dataConRepArity (MkData { dcRepArity = arity }) = arity
 
@@ -1406,8 +1417,14 @@ dataConRepArity (MkData { dcRepArity = arity }) = arity
 isNullarySrcDataCon :: DataCon -> Bool
 isNullarySrcDataCon dc = dataConSourceArity dc == 0
 
--- | Return whether there are any argument types for this 'DataCon's runtime representation type
--- See Note [DataCon arities]
+-- | Return whether this `DataCon`'s worker, in its Core representation, takes
+-- any value arguments.
+--
+-- In particular, remember that we include coercion arguments in the arity of
+-- the Core representation of the `DataCon` -- both lifted and unlifted
+-- coercions, despite the latter having zero-width runtime representation.
+--
+-- See also Note [DataCon arities].
 isNullaryRepDataCon :: DataCon -> Bool
 isNullaryRepDataCon dc = dataConRepArity dc == 0
 


=====================================
compiler/GHC/Core/Tidy.hs
=====================================
@@ -82,7 +82,7 @@ tidyBind env (Rec prs)
 -- This means the code generator can get the full calling convention by only looking at the function
 -- itself without having to inspect the RHS.
 --
--- The actual logic is in tidyCbvInfo and takes:
+-- The actual logic is in computeCbvInfo and takes:
 -- * The function id
 -- * The functions rhs
 -- And gives us back the function annotated with the marks.
@@ -169,7 +169,7 @@ computeCbvInfo fun_id rhs
                -- seqList: avoid retaining the original rhs
 
              | otherwise
-             = -- pprTraceDebug "tidyCbvInfo: Worker seems to take unboxed tuple/sum types!"
+             = -- pprTraceDebug "computeCbvInfo: Worker seems to take unboxed tuple/sum types!"
                --    (ppr fun_id <+> ppr rhs)
                asNonWorkerLikeId fun_id
 


=====================================
compiler/GHC/Stg/InferTags/Rewrite.hs
=====================================
@@ -36,6 +36,7 @@ import GHC.Core            ( AltCon(..) )
 import GHC.Core.Type
 
 import GHC.StgToCmm.Types
+import GHC.StgToCmm.Closure (mkLFImported)
 
 import GHC.Stg.Utils
 import GHC.Stg.Syntax as StgSyn
@@ -271,13 +272,10 @@ isTagged v = do
                             TagTagged -> True
                             TagTuple _ -> True -- Consider unboxed tuples tagged.
         False -- Imported
-            | Just con <- (isDataConWorkId_maybe v)
-            , isNullaryRepDataCon con
-            -> return True
-            | Just lf_info <- idLFInfo_maybe v
             -> return $!
-                -- Can we treat the thing as tagged based on it's LFInfo?
-                case lf_info of
+                -- Determine whether it is tagged from the LFInfo of the imported id.
+                -- See Note [The LFInfo of Imported Ids]
+                case mkLFImported v of
                     -- Function, applied not entered.
                     LFReEntrant {}
                         -> True
@@ -295,9 +293,6 @@ isTagged v = do
                     -- Shouldn't be possible. I don't think we can export letNoEscapes
                         -> True
 
-            | otherwise
-            -> return False
-
 
 isArgTagged :: StgArg -> RM Bool
 isArgTagged (StgLitArg _) = return True


=====================================
compiler/GHC/Stg/Syntax.hs
=====================================
@@ -245,7 +245,7 @@ literals.
         -- which can't be let-bound
   | StgConApp   DataCon
                 ConstructorNumber
-                [StgArg] -- Saturated
+                [StgArg] -- Saturated. (After Unarisation, [NonVoid StgArg])
                 [Type]   -- See Note [Types in StgConApp] in GHC.Stg.Unarise
 
   | StgOpApp    StgOp    -- Primitive op or foreign call


=====================================
compiler/GHC/Stg/Unarise.hs
=====================================
@@ -956,6 +956,8 @@ ubxSumRubbishArg (VecSlot n e)   = StgLitArg (LitRubbish TypeLike vec_rep)
 --------------------------------------------------------------------------------
 
 {-
+Note [Unarisation of Void binders and arguments]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 For arguments (StgArg) and binders (Id) we have two kind of unarisation:
 
   - When unarising function arg binders and arguments, we don't want to remove


=====================================
compiler/GHC/StgToCmm.hs
=====================================
@@ -239,9 +239,9 @@ cgEnumerationTyCon tycon
              | con <- tyConDataCons tycon]
 
 
+-- | Generate the entry code and associated info table for a constructor.
+-- Where are generating the static closure at all?
 cgDataCon :: ConInfoTableLocation -> DataCon -> FCode ()
--- Generate the entry code, info tables, and (for niladic constructor)
--- the static closure, for a constructor.
 cgDataCon mn data_con
   = do  { massert (not (isUnboxedTupleDataCon data_con || isUnboxedSumDataCon data_con))
         ; profile <- getProfile


=====================================
compiler/GHC/StgToCmm/Closure.hs
=====================================
@@ -22,7 +22,7 @@ module GHC.StgToCmm.Closure (
         argPrimRep,
 
         NonVoid(..), fromNonVoid, nonVoidIds, nonVoidStgArgs,
-        assertNonVoidIds, assertNonVoidStgArgs,
+        assertNonVoidIds, assertNonVoidStgArgs, hasNoNonZeroWidthArgs,
 
         -- * LambdaFormInfo
         LambdaFormInfo,         -- Abstract
@@ -96,6 +96,7 @@ import GHC.Utils.Outputable
 import GHC.Utils.Panic
 import GHC.Utils.Panic.Plain
 import GHC.Utils.Misc
+import GHC.Data.Maybe (isNothing)
 
 import Data.Coerce (coerce)
 import qualified Data.ByteString.Char8 as BS8
@@ -170,6 +171,12 @@ assertNonVoidStgArgs :: [StgArg] -> [NonVoid StgArg]
 assertNonVoidStgArgs args = assert (not (any (isZeroBitTy . stgArgType) args)) $
                             coerce args
 
+-- | Returns whether there are any arguments with a non-zero-width runtime
+-- representation.
+--
+-- Returns True if the datacon has no or /just/ zero-width arguments.
+hasNoNonZeroWidthArgs :: DataCon -> Bool
+hasNoNonZeroWidthArgs = all (isZeroBitTy . scaledThing) . dataConRepArgTys
 
 -----------------------------------------------------------------------------
 --                Representations
@@ -255,33 +262,68 @@ mkApLFInfo id upd_flag arity
         (mightBeFunTy (idType id))
 
 -------------
+-- | Make a 'LambdaFormInfo' for an imported Id.
+--   See Note [The LFInfo of Imported Ids]
 mkLFImported :: Id -> LambdaFormInfo
 mkLFImported id =
     -- See Note [Conveying CAF-info and LFInfo between modules] in
     -- GHC.StgToCmm.Types
     case idLFInfo_maybe id of
       Just lf_info ->
-        -- Use the LambdaFormInfo from the interface
+        -- Use the existing LambdaFormInfo
         lf_info
       Nothing
-        -- Interface doesn't have a LambdaFormInfo, make a conservative one from
-        -- the type.
-        | Just con <- isDataConId_maybe id
-        , isNullaryRepDataCon con
-            -- See Note [Imported nullary datacon wrappers must have correct LFInfo]
-            -- in GHC.StgToCmm.Types
-        -> LFCon con   -- An imported nullary constructor
-                       -- We assume that the constructor is evaluated so that
-                       -- the id really does point directly to the constructor
-
+        -- Doesn't have a LambdaFormInfo, but we know it must be 'LFReEntrant' from its arity
         | arity > 0
         -> LFReEntrant TopLevel arity True ArgUnknown
 
+        -- We can't be sure of the LambdaFormInfo of this imported Id,
+        -- so make a conservative one from the type.
         | otherwise
-        -> mkLFArgument id -- Not sure of exact arity
+        -> assert (isNothing (isDataConId_maybe id)) $ -- See Note [LFInfo of DataCon workers and wrappers] in GHC.Types.Id.Make
+           mkLFArgument id -- Not sure of exact arity
   where
     arity = idFunRepArity id
 
+{-
+Note [The LFInfo of Imported Ids]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+As explained in Note [Conveying CAF-info and LFInfo between modules]
+the LambdaFormInfo records the details of a closure representation and is
+often, when optimisations are enabled, serialized to the interface of a module.
+
+In particular, the `lfInfo` field of the `IdInfo` field of an `Id`:
+* For DataCon workers and wrappers is populated as described in
+Note [LFInfo of DataCon workers and wrappers] in GHC.Types.Id.Make
+* For other Ids defined in this module: is `Nothing`
+* For other imported Ids:
+  * is (Just lf_info) if the LFInfo was serialised into the interface file
+    (typically, when the exporting module was compiled with -O)
+  * is Nothing if it wasn't serialised
+
+The LambdaFormInfo we give an Id is used in determining how to tag its pointer
+(see `litIdInfo`). Therefore, it's crucial we attribute a correct
+LambdaFormInfo to imported Ids, or otherwise risk having pointers incorrectly
+tagged which can lead to performance issues and even segmentation faults (see
+#23231 and #23146). In particular, saturated data constructor applications
+*must* be unambiguously given `LFCon`, and if the LFInfo says LFCon, then it
+really is a static data constructor, and similar for LFReEntrant.
+
+In `mkLFImported`, we construct a LambdaFormInfo for imported Ids as follows:
+
+(1) If the `lfInfo` field contains an LFInfo, we use that LFInfo which is
+correct by construction (the invariant being that if it exists, it is correct):
+  (1.1) Either it was serialised to the interface we're importing the Id from,
+  (1.2) Or it's a DataCon worker or wrapper and its LFInfo was constructed
+        according to Note [LFInfo of DataCon workers and wrappers]
+(2) When the `lfInfo` field is `Nothing`
+  (2.1) If the `idFunRepArity` of the Id is known and is greater than 0, then
+  the Id is unambiguously a function and is given `LFReEntrant`, and pointers
+  to this Id will be tagged (by `litIdInfo`) with the corresponding arity.
+  (2.2) Otherwise, we can make a conservative estimate from the type.
+
+-}
+
 -------------
 mkLFStringLit :: LambdaFormInfo
 mkLFStringLit = LFUnlifted


=====================================
compiler/GHC/StgToCmm/DataCon.hs
=====================================
@@ -41,6 +41,7 @@ import GHC.Data.FastString
 import GHC.Types.Id
 import GHC.Types.Id.Info( CafInfo( NoCafRefs ) )
 import GHC.Types.Name (isInternalName)
+import GHC.Types.Var (varName)
 import GHC.Types.RepType (countConRepArgs)
 import GHC.Types.Literal
 import GHC.Builtin.Utils
@@ -246,7 +247,8 @@ But also at runtime where the GC does the same (but only for
 INT/CHAR closures).
 
 `precomputedStaticConInfo_maybe` checks if a given constructor application
-can be replaced with a reference to a existing static closure.
+can be replaced with a reference to a existing static closure, according
+to the Note [Precomputed static closures of nullary constructors]
 
 If so the code will reference the existing closure when accessing
 the binding.
@@ -317,6 +319,103 @@ This holds for both local and top level bindings.
 
 We don't support this optimization when compiling into Windows DLLs yet
 because they don't support cross package data references well.
+
+Note [Precomputed static closures of nullary constructors]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+We can easily create a precomputed static closure for all data constructors
+that don't take runtime-relevant arguments since their closure is always just
+the constructor info.
+
+Instead of allocating a closure with just the constructor info every time it is
+used, we can instead use the precomputed static closure!
+
+For example, to return from a function the constructor `Nothing`, instead of
+allocating on the heap a word for `Nothing_con_info` and returning the pointer
+to it tagged `+1`, we can simply return `Nothing_closure+1`
+
+We must consider three distinct situations of saturated applications of
+constructors that take no runtime-relevant arguments in which we can use a
+precomputed static closure:
+
+(1) For a data con /worker/ `TCon1` application to no arguments whatsoever we
+can trivially use the static closure of the worker, `TCon1_closure`.
+  Recall that for a worker such as `TCon1`, `TCon1_closure` is just the
+  `TCon1_con_info`:
+        section ""data" . M.TCon1_closure" {
+            M.TCon1_closure:
+                const M.TCon1_con_info;
+        }
+  Invariant: These workers don't have wrappers.
+
+(2) For a data con /wrapper/ `$WTCon2` that takes no arguments whatsoever, we
+can also trivially return the static closure of the wrapper, `$WTCon2_closure`.
+It might be surprising to see a nullary data con /wrapper/ -- they come into
+existence when the worker only takes zero-width arguments. See the example below.
+  As in (1), `$WTCon2_closure` simply points to a `TCon2_con_info`.
+      section ""data" . M.$WTCon2_closure" {
+        M.$WTCon2_closure:
+            const M.TCon2_con_info;
+      }
+
+(3) For a data con /worker/ `TCon2` that takes zero-width arguments only (and
+whose wrapper is `$WTCon2`): because the arguments aren't relevant at runtime,
+closures for it still only have the constructor info -- we can use a
+precomputed static closure instead of allocating them on the heap, nonetheless.
+  However, unlike the worker in (1), `TCon2`, in taking arguments (regardless
+of runtime representation), is unambiguously a function! Therefore, its
+`TCon2_closure` actually contains the info of the function (`TCon2_info`) that returns the
+constructor when called -- and as so it must remain -- if `TCon2` is ever used as
+a function instead of in a saturated data con application, it better be one.
+  To generate in place of a saturated data con application of `TCon2`, we would
+  need something close to:
+      section ""data" . M.TCon2_some_sort_of_closure" {
+        M.TCon2_some_sort_of_closure:
+            const M.TCon2_con_info; -- Must be TCon2_con_info rather than TCon2_info which we have in TCon2_closure
+      }
+  But this turns out to be exactly the definition of this worker's wrapper's
+  static closure (see `$WTCon2_closure`). So, for the kind of worker in (3),
+  the precomputed static closure is the same as the one for the wrapper.
+  Invariant: These workers always have a wrapper of type (2)
+
+The solution that handles all of these cases turns out to be surprisingly
+simple: A data con applied to an empty list of non-void arguments has a
+precomputed static closure which is the tagged closure label of the var name of
+the `dataConWrapId`, both for workers and wrappers.
+  For (1), `dataConWrapId` will return the Id of the worker because the wrapper
+  doesn't exist (i.e. `Wrk_closure+tag`).
+  For (2), `dataConWrapId` will return the Id of the wrapper for the wrapper (i.e. `$Wrp_closure+tag`).
+  For (3), `dataConWrapId` will return the Id of the wrapper, which must exist (i.e. `$Wrp_closure+tag`).
+
+As an example, since (2) and (3) might be hard to visualise, consider the datatype:
+
+  data TCon2 a where
+    TCon2 :: TCon2 ()
+
+and its STG representation post-unarisation:
+
+  G.$WTCon2 :: G.TCon2 ()
+      = G.TCon2! [];
+
+  G.TCon2 :: forall {a}. (a GHC.Prim.~# ()) => G.TCon2 a
+      = {} \r [void_0E] G.TCon2 [];
+
+and the C--:
+
+  section ""data" . G.$WTCon2_closure" {
+      G.$WTCon2_closure:
+          const G.TCon2_con_info; -- Static constructor info
+  }
+
+  section ""data" . G.TCon2_closure" {
+      G.TCon2_closure:
+          const G.TCon2_info;    -- Static function info
+  }
+
+The precomputed static closure for `$WTCon2` is `$WTCon2_closure+1`, and the
+precomputed static closure for `TCon2` is also `$WTCon2_closure+1`; that is,
+all saturated data con applications of `TCon2` and `$WTCon2` are compiled to
+`$WTCon2_closure+1` instead of an allocation on the heap and
+tagging of its pointer.
 -}
 
 -- (precomputedStaticConInfo_maybe cfg id con args)
@@ -326,10 +425,11 @@ because they don't support cross package data references well.
 -- See Note [Precomputed static closures]
 precomputedStaticConInfo_maybe :: StgToCmmConfig -> Id -> DataCon -> [NonVoid StgArg] -> Maybe CgIdInfo
 precomputedStaticConInfo_maybe cfg binder con []
--- Nullary constructors
-  | isNullaryRepDataCon con
-  = Just $ litIdInfo (stgToCmmPlatform cfg) binder (mkConLFInfo con)
-                (CmmLabel (mkClosureLabel (dataConName con) NoCafRefs))
+  -- Nullary constructors (list of nonvoid args is null)
+  -- See Note [Precomputed static closures of nullary constructors]
+  = assert (hasNoNonZeroWidthArgs con) $
+      Just $ litIdInfo (stgToCmmPlatform cfg) binder (mkConLFInfo con)
+                (CmmLabel (mkClosureLabel (varName $ dataConWrapId con) NoCafRefs))
 precomputedStaticConInfo_maybe cfg binder con [arg]
   -- Int/Char values with existing closures in the RTS
   | intClosure || charClosure


=====================================
compiler/GHC/StgToCmm/Env.hs
=====================================
@@ -151,7 +151,7 @@ getCgIdInfo id
               in return $
                   litIdInfo platform id (mkLFImported id) (CmmLabel ext_lbl)
           else
-              cgLookupPanic id -- Bug
+              cgLookupPanic id -- Bug, id is neither in local binds nor is external
         }}}
 
 -- | Retrieve cg info for a name if it already exists.


=====================================
compiler/GHC/StgToCmm/Types.hs
=====================================
@@ -71,12 +71,13 @@ moving parts are:
   -fomit-interface-pragmas or -fno-code; and we won't read it in if you have
   -fignore-interface-pragmas.  (We could revisit this decision.)
 
-Note [Imported nullary datacon wrappers must have correct LFInfo]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Note [Imported unlifted nullary datacon wrappers must have correct LFInfo]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 As described in `Note [Conveying CAF-info and LFInfo between modules]`,
-imported nullary datacons must have their LambdaFormInfo set to reflect the
-fact that they are evaluated . This is necessary are otherwise references
-to them may be passed untagged to code that expects tagged references.
+imported unlifted nullary datacons must have their LambdaFormInfo set to
+reflect the fact that they are evaluated . This is necessary as otherwise
+references to them may be passed untagged to code that expects tagged
+references.
 
 What may be less obvious is that this must be done for not only datacon
 workers but also *wrappers*. The reason is found in this program
@@ -113,6 +114,8 @@ The fix is straightforward: extend the logic in `mkLFImported` to cover
 know that the wrapper of a nullary datacon will be in WHNF, even if it
 includes equalities evidence (since such equalities are not runtime
 relevant). This fixed #23146.
+
+See also Note [The LFInfo of Imported Ids]
 -}
 
 -- | Codegen-generated Id infos, to be passed to downstream via interfaces.
@@ -161,7 +164,7 @@ data LambdaFormInfo
         !StandardFormInfo
         !Bool           -- True <=> *might* be a function type
 
-  | LFCon               -- A saturated constructor application
+  | LFCon               -- A saturated data constructor application
         !DataCon        -- The constructor
 
   | LFUnknown           -- Used for function arguments and imported things.


=====================================
compiler/GHC/Types/Id.hs
=====================================
@@ -697,6 +697,8 @@ idCallArity id = callArityInfo (idInfo id)
 setIdCallArity :: Id -> Arity -> Id
 setIdCallArity id arity = modifyIdInfo (`setCallArityInfo` arity) id
 
+-- | This function counts all arguments post-unarisation, which includes
+-- arguments with no runtime representation -- see Note [Unarisation and arity]
 idFunRepArity :: Id -> RepArity
 idFunRepArity x = countFunRepArgs (idArity x) (idType x)
 


=====================================
compiler/GHC/Types/Id/Info.hs
=====================================
@@ -120,7 +120,8 @@ infixl  1 `setRuleInfo`,
           `setCafInfo`,
           `setDmdSigInfo`,
           `setCprSigInfo`,
-          `setDemandInfo`
+          `setDemandInfo`,
+          `setLFInfo`
 {-
 ************************************************************************
 *                                                                      *
@@ -374,6 +375,7 @@ data IdInfo
         --
         -- See documentation of the getters for what these packed fields mean.
         lfInfo          :: !(Maybe LambdaFormInfo),
+        -- ^ See Note [The LFInfo of Imported Ids] in GHC.StgToCmm.Closure
 
         -- See documentation of the getters for what these packed fields mean.
         tagSig          :: !(Maybe TagSig)
@@ -439,7 +441,7 @@ oneShotInfo :: IdInfo -> OneShotInfo
 oneShotInfo = bitfieldGetOneShotInfo . bitfield
 
 -- | 'Id' arity, as computed by "GHC.Core.Opt.Arity". Specifies how many arguments
--- this 'Id' has to be applied to before it doesn any meaningful work.
+-- this 'Id' has to be applied to before it does any meaningful work.
 arityInfo :: IdInfo -> ArityInfo
 arityInfo = bitfieldGetArityInfo . bitfield
 


=====================================
compiler/GHC/Types/Id/Make.hs
=====================================
@@ -87,6 +87,10 @@ import GHC.Data.FastString
 import GHC.Data.List.SetOps
 import Data.List        ( zipWith4 )
 
+-- A bit of a shame we must import these here
+import GHC.StgToCmm.Types (LambdaFormInfo(..))
+import GHC.Runtime.Heap.Layout (ArgDescr(ArgUnknown))
+
 {-
 ************************************************************************
 *                                                                      *
@@ -595,11 +599,15 @@ mkDataConWorkId wkr_name data_con
                    `setInlinePragInfo`     wkr_inline_prag
                    `setUnfoldingInfo`      evaldUnfolding  -- Record that it's evaluated,
                                                            -- even if arity = 0
+                   `setLFInfo`             wkr_lf_info
           -- No strictness: see Note [Data-con worker strictness] in GHC.Core.DataCon
 
     wkr_inline_prag = defaultInlinePragma { inl_rule = ConLike }
     wkr_arity = dataConRepArity data_con
 
+    -- See Note [LFInfo of DataCon workers and wrappers]
+    wkr_lf_info = if wkr_arity == 0 then LFCon data_con else LFReEntrant TopLevel wkr_arity True ArgUnknown
+
     ----------- Workers for newtypes --------------
     univ_tvs = dataConUnivTyVars data_con
     ex_tcvs  = dataConExTyCoVars data_con
@@ -608,6 +616,7 @@ mkDataConWorkId wkr_name data_con
                   `setArityInfo` 1      -- Arity 1
                   `setInlinePragInfo`     dataConWrapperInlinePragma
                   `setUnfoldingInfo`      newtype_unf
+                  `setLFInfo`            (LFReEntrant TopLevel 1 True ArgUnknown)
     id_arg1      = mkScaledTemplateLocal 1 (head arg_tys)
     res_ty_args  = mkTyCoVarTys univ_tvs
     newtype_unf  = assertPpr (null ex_tcvs && isSingleton arg_tys)
@@ -618,6 +627,85 @@ mkDataConWorkId wkr_name data_con
                    wrapNewTypeBody tycon res_ty_args (Var id_arg1)
 
 {-
+Note [LFInfo of DataCon workers and wrappers]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+As noted in Note [The LFInfo of Imported Ids] in GHC.StgToCmm.Closure, it's
+crucial saturated data con applications are given an LFInfo of `LFCon`.
+
+Since for data constructors we never serialise the worker and the wrapper (only
+the data type declaration), we never serialise their lambda form info either.
+
+Therefore, when making data constructors workers and wrappers, we construct a
+correct LFInfo for them right away. This ensures the critical logic of creating
+the correct LFInfo for a DataCon is done once on creation and we assertain that:
+
+  The `lfInfo` field of a DataCon worker or wrapper is always populated with the correct LFInfo.
+
+which is expected by `mkLFImported`.
+NB: The greater invariant being that if an `lfInfo` field is populated, the
+    LFInfo in it contained is correct 
+
+How do we construct a /correct/ LFInfo for workers and wrappers?
+
+(1) Data constructors with arity > 0 are unambiguously functions and should be
+given `LFReEntrant`, regardless of the runtime relevance of the arguments:
+  - For example, `Just :: a -> Maybe a` is given `LFReEntrant`,
+             and `HNil :: (a ~# '[]) -> HList a` is given `LFReEntrant` too.
+
+(2) Data constructors with arity == 0 should be given `LFCon` because
+they are fully saturated data constructor applications (and pointers to them
+should be tagged with the constructor index).
+
+(2.1) A datacon *wrapper* with zero arity must be a fully saturated application
+of the worker to zero-width arguments only (which are dropped after unarisation)
+
+(2.2) A datacon *worker* with zero arity is trivially fully saturated, it takes
+no arguments whatsoever (not even zero-width args)
+
+For example, consider the following data constructors:
+
+  data T1 a where
+    TCon1 :: {-# UNPACK #-} !(a :~: True) -> T1 a
+
+  data T2 a where
+    TCon2 :: {-# UNPACK #-} !() -> T2 a
+
+  data T3 a where
+    TCon3 :: T3 '[]
+
+`TCon1`'s wrapper has a lifted argument, which is non-zero-width, while
+the worker has an unlifted equality argument, which is zero-width.
+
+`TCon2`'s wrapper has a lifted equality argument, which is non-zero-width,
+while the worker has no arguments.
+
+`TCon3`'s wrapper has no arguments, and the worker has 1 zero-width argument;
+their Core representation:
+
+  $WTCon3 :: T3 '[]
+  $WTCon3 = TCon3 @[] <Refl>
+
+  TCon3 :: forall (a :: * -> *). (a ~# []) => T a
+  TCon3 = /\a. \(co :: a~#[]). TCon3 co
+
+For `TCon1`, both the wrapper and worker will be given `LFReEntrant` since they
+both have arity == 1.
+
+For `TCon2`, the wrapper will be given `LFReEntrant` since it has arity == 1
+while the worker is `LFCon` since its arity == 0
+
+For `TCon3`, the wrapper will be given `LFCon` since its arity == 0 and the
+worker `LFReEntrant` since its arity == 1
+
+One might think we could give *workers* with only zero-width-args the `LFCon`
+LambdaFormInfo, e.g. give `LFCon` to the worker of `TCon1` and `TCon3`.
+However, these workers, albeit rarely used, are unambiguously functions
+-- which makes `LFReEntrant`, the LambdaFormInfo we give them, correct.
+See also the discussion in #23158.
+
+See also the Note [Imported unlifted nullary datacon wrappers must have correct LFInfo]
+in GHC.StgToCmm.Types.
+
 -------------------------------------------------
 --         Data constructor representation
 --
@@ -709,11 +797,15 @@ mkDataConRep dc_bang_opts fam_envs wrap_name data_con
                              -- We need to get the CAF info right here because GHC.Iface.Tidy
                              -- does not tidy the IdInfo of implicit bindings (like the wrapper)
                              -- so it not make sure that the CAF info is sane
+                         `setLFInfo`            wrap_lf_info
 
              -- The signature is purely for passes like the Simplifier, not for
              -- DmdAnal itself; see Note [DmdAnal for DataCon wrappers].
              wrap_sig = mkClosedDmdSig wrap_arg_dmds topDiv
 
+             -- See Note [LFInfo of DataCon workers and wrappers]
+             wrap_lf_info = if wrap_arity == 0 then LFCon data_con else LFReEntrant TopLevel wrap_arity True ArgUnknown
+
              wrap_arg_dmds =
                replicate (length theta) topDmd ++ map mk_dmd arg_ibangs
                -- Don't forget the dictionary arguments when building


=====================================
testsuite/tests/codeGen/should_run/T23146/T23146_lifted_unlifted.hs
=====================================
@@ -0,0 +1,14 @@
+{-# LANGUAGE GADTs, DataKinds #-}
+
+import T23146_lifted_unliftedA
+
+import Data.Type.Equality
+
+fieldsSam :: NP True -> NP True -> Bool
+fieldsSam (x' ::* xs) (y' ::* ys) = fieldsSam xs ys
+fieldsSam (UNil Refl) (UNil Refl) = True
+
+main :: IO ()
+main = print (fieldsSam (UNil Refl) (UNil Refl))
+
+


=====================================
testsuite/tests/codeGen/should_run/T23146/T23146_lifted_unlifted.stdout
=====================================
@@ -0,0 +1 @@
+True


=====================================
testsuite/tests/codeGen/should_run/T23146/T23146_lifted_unliftedA.hs
=====================================
@@ -0,0 +1,13 @@
+{-# OPTIONS_GHC -O1 #-}
+{-# LANGUAGE DataKinds #-}
+
+module T23146_lifted_unliftedA where
+
+import Data.Kind
+import Data.Type.Equality
+
+data NP a where
+  UNil :: {-# UNPACK #-} !(a :~: True) -> NP a
+  (::*) :: Bool -> NP True -> NP True
+
+


=====================================
testsuite/tests/codeGen/should_run/T23146/all.T
=====================================
@@ -1,4 +1,4 @@
 test('T23146', normal, compile_and_run, [''])
 test('T23146_lifted', normal, compile_and_run, [''])
 test('T23146_liftedeq', normal, compile_and_run, [''])
-
+test('T23146_lifted_unlifted', normal, compile_and_run, [''])



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/b75b3f37312493ad0acadffe88dc1b38ca8614c8...e876c74a81987a3586d47a18d0f7dae5b8fbe352

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/b75b3f37312493ad0acadffe88dc1b38ca8614c8...e876c74a81987a3586d47a18d0f7dae5b8fbe352
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20230428/6f5be4c9/attachment-0001.html>


More information about the ghc-commits mailing list