[Git][ghc/ghc][wip/T23146] Make LFInfos for DataCons on construction
Rodrigo Mesquita (@alt-romes)
gitlab at gitlab.haskell.org
Fri Apr 28 11:38:36 UTC 2023
Rodrigo Mesquita pushed to branch wip/T23146 at Glasgow Haskell Compiler / GHC
Commits:
d0fd4454 by Rodrigo Mesquita at 2023-04-28T12:38:25+01:00
Make LFInfos for DataCons on construction
As a result of the discussion in !10165, we decided to amend the
previous commit which fixed the logic of `mkLFImported` with regard to
datacon workers and wrappers.
Instead of having the logic for the LFInfo of datacons be in
`mkLFImported`, we now construct an LFInfo for all data constructors in
GHC.Types.Id.Make and store it in the `lfInfo` field.
See the new Note [LFInfo of DataCon workers and wrappers] and
amendments to Note [The LFInfo of Imported Ids].
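
As a rough illustration of the rule this commit moves into GHC.Types.Id.Make,
here is a minimal, self-contained sketch. The types below are simplified
stand-ins and not GHC's real definitions: `LambdaFormInfo` is cut down to two
constructors, a constructor is named by a plain String, and `dataConLFInfo` is
a hypothetical helper name. The only thing it aims to show is that the choice
between `LFCon` and `LFReEntrant` depends on the arity alone.

```haskell
-- Toy model only: simplified stand-ins for GHC's types (assumption).
data LambdaFormInfo
  = LFCon String      -- saturated constructor, identified here by its name
  | LFReEntrant Int   -- function closure with the given arity
  deriving (Eq, Show)

-- | Hypothetical helper: the LFInfo of a data constructor worker or wrapper,
-- decided purely by its arity.
dataConLFInfo :: String -> Int -> LambdaFormInfo
dataConLFInfo con arity
  | arity == 0 = LFCon con          -- nullary: a saturated application
  | otherwise  = LFReEntrant arity  -- arity > 0: unambiguously a function
```

Under this model, a worker and a wrapper of the same constructor can get
different results whenever their arities differ, which is why the diff below
computes the LFInfo separately for each.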
- - - - -
3 changed files:
- compiler/GHC/StgToCmm/Closure.hs
- compiler/GHC/Types/Id/Info.hs
- compiler/GHC/Types/Id/Make.hs
Changes:
=====================================
compiler/GHC/StgToCmm/Closure.hs
=====================================
@@ -96,6 +96,7 @@ import GHC.Utils.Outputable
import GHC.Utils.Panic
import GHC.Utils.Panic.Plain
import GHC.Utils.Misc
+import GHC.Data.Maybe (isNothing)
import Data.Coerce (coerce)
import qualified Data.ByteString.Char8 as BS8
@@ -255,130 +256,65 @@ mkApLFInfo id upd_flag arity
(mightBeFunTy (idType id))
-------------
+-- | Make a 'LambdaFormInfo' for an imported Id.
+-- See Note [The LFInfo of Imported Ids]
mkLFImported :: Id -> LambdaFormInfo
mkLFImported id =
-- See Note [Conveying CAF-info and LFInfo between modules] in
-- GHC.StgToCmm.Types
case idLFInfo_maybe id of
Just lf_info ->
- -- Use the LambdaFormInfo from the interface
+ -- Use the existing LambdaFormInfo
lf_info
Nothing
- -- Interface doesn't have a LambdaFormInfo, so make a conservative one from the type.
- -- See Note [The LFInfo of Imported Ids]; The order of the guards musn't be changed!
+ -- Doesn't have a LambdaFormInfo, but we know it must be 'LFReEntrant' from its arity
| arity > 0
-> LFReEntrant TopLevel arity True ArgUnknown
- | Just con <- isDataConId_maybe id
- -- See Note [Imported unlifted nullary datacon wrappers must have correct LFInfo] in GHC.StgToCmm.Types
- -- and Note [The LFInfo of Imported Ids] below
- -> assert (hasNoNonZeroWidthArgs con) $
- LFCon con -- An imported nullary constructor
- -- We assume that the constructor is evaluated so that
- -- the id really does point directly to the constructor
-
+ -- We can't be sure of the LambdaFormInfo of this imported Id,
+ -- so make a conservative one from the type.
| otherwise
- -> mkLFArgument id -- Not sure of exact arity
+ -> assert (isNothing (isDataConId_maybe id)) $ -- See Note [LFInfo of DataCon workers and wrappers] in GHC.Types.Id.Make
+ mkLFArgument id -- Not sure of exact arity
where
arity = idFunRepArity id
- hasNoNonZeroWidthArgs = all (isZeroBitTy . scaledThing) . dataConRepArgTys
{-
Note [The LFInfo of Imported Ids]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-As explained in Note [Conveying CAF-info and LFInfo between modules] and
-Note [Imported unlifted nullary datacon wrappers must have correct LFInfo], the
-LambdaFormInfo records the details of a closure representation and is often,
-when optimisations are enabled, serialized to the interface of a module.
-
-In particular, the `lfInfo` field of the `IdInfo` field of an `Id`
-* For Ids defined in this module: is `Nothing`
-* For imported Ids:
+As explained in Note [Conveying CAF-info and LFInfo between modules]
+the LambdaFormInfo records the details of a closure representation and is
+often, when optimisations are enabled, serialized to the interface of a module.
+
+In particular, the `lfInfo` field of the `IdInfo` field of an `Id`:
+* For DataCon workers and wrappers is populated as described in
+Note [LFInfo of DataCon workers and wrappers] in GHC.Types.Id.Make
+* For other Ids defined in this module: is `Nothing`
+* For other imported Ids:
* is (Just lf_info) if the LFInfo was serialised into the interface file
(typically, when the exporting module was compiled with -O)
* is Nothing if it wasn't serialised
-However, when an interface doesn't have a LambdaFormInfo for some imported Id
-(so that its `lfInfo` field is `Nothing`), we can conservatively create one
-using `mkLFImported`.
-
The LambdaFormInfo we give an Id is used in determining how to tag its pointer
-(see `litIdInfo`). Therefore, it's crucial we re-construct a LambdaFormInfo as
-faithfully as possible or otherwise risk having pointers incorrectly tagged,
-which can lead to performance issues and even segmentation faults (see #23231
-and #23146). In particular, saturated data constructor applications *must* be
-unambiguously given `LFCon`, and the invariant
-
- If the LFInfo (serialised or built with mkLFImported) says LFCon, then it
- really is a static data constructor, and similar for LFReEntrant
-
-must be upheld.
-
-In `mkLFImported`, we make a conservative approximation to the real
-LambdaFormInfo as follows:
-
-(1) Ids with an `idFunRepArity > 0` are `LFReEntrant` and pointers to them are
-tagged (by `litIdInfo`) with the corresponding arity.
- - This is also true of data con wrappers and workers with arity > 0,
- regardless of the runtime relevance of the arguments
- - For example, `Just :: a -> Maybe a` is given `LFReEntrant`
- and `HNil :: (a ~# '[]) -> HList a` is given `LFReEntrant` too
-
-(2) Data constructors with `idFunRepArity == 0` should be given `LFCon` because
-they are fully saturated data constructor applications and pointers to them
-should be tagged with the constructor index.
-
-(2.1) A datacon *wrapper* with zero arity must be a fully saturated application
-of the worker to zero-width arguments only (which are dropped after unarisation)
-
-(2.2) A datacon *worker* with zero arity is trivially fully saturated, it takes
-no arguments whatsoever (not even zero-width args)
-
-To ensure we properly give `LFReEntrant` to data constructors with some arity,
-and `LFCon` only to data constructors with zero arity, we must first check for
-`arity > 0` and only afterwards `isDataConId` -- the order of the guards in
-`mkLFImported` is quite important.
-
-As an example, consider the following data constructors:
-
- data T1 a where
- TCon1 :: {-# UNPACK #-} !(a :~: True) -> T1 a
-
- data T2 a where
- TCon2 :: {-# UNPACK #-} !() -> T2 a
-
- data T3 a where
- TCon3 :: T3 '[]
-
-`TCon1`'s wrapper has a lifted equality argument, which is non-zero-width, while
-the worker has an unlifted equality argument, which is zero-width.
-
-`TCon2`'s wrapper has a lifted equality argument, which is non-zero-width,
-while the worker has no arguments.
-
-`TCon3`'s wrapper has no arguments, and the worker has 1 zero-width argument;
-their Core representation:
-
- $WTCon3 :: T3 '[]
- $WTCon3 = TCon3 @[] <Refl>
-
- TCon3 :: forall (a :: * -> *). (a ~# []) => T a
- TCon3 = /\a. \(co :: a~#[]). TCon3 co
-
-For `TCon1`, both the wrapper and worker will be given `LFReEntrant` since they
-both have arity == 1.
-
-For `TCon2`, the wrapper will be given `LFReEntrant` since it has arity == 1
-while the worker is `LFCon` since its arity == 0
-
-For `TCon3`, the wrapper will be given `LFCon` since its arity == 0 and the
-worker `LFReEntrant` since its arity == 1
-
-One might think we could give *workers* with only zero-width-args the `LFCon`
-LambdaFormInfo, e.g. give `LFCon` to the worker of `TCon1` and `TCon3`.
-However, these workers, albeit rarely used, are unambiguously functions
--- which makes `LFReEntrant`, the LambdaFormInfo we give them, correct.
-See also the discussion in #23158.
+(see `litIdInfo`). Therefore, it's crucial we attribute a correct
+LambdaFormInfo to imported Ids, or otherwise risk having pointers incorrectly
+tagged which can lead to performance issues and even segmentation faults (see
+#23231 and #23146). In particular, saturated data constructor applications
+*must* be unambiguously given `LFCon`, and if the LFInfo says LFCon, then it
+really is a static data constructor, and similar for LFReEntrant.
+
+In `mkLFImported`, we construct a LambdaFormInfo for imported Ids as follows:
+
+(1) If the `lfInfo` field contains an LFInfo, we use that LFInfo which is
+correct by construction (the invariant being that if it exists, it is correct):
+ (1.1) Either it was serialised to the interface we're importing the Id from,
+ (1.2) Or it's a DataCon worker or wrapper and its LFInfo was constructed
+ according to Note [LFInfo of DataCon workers and wrappers]
+(2) When the `lfInfo` field is `Nothing`
+ (2.1) If the `idFunRepArity` of the Id is known and is greater than 0, then
+ the Id is unambiguously a function and is given `LFReEntrant`, and pointers
+ to this Id will be tagged (by `litIdInfo`) with the corresponding arity.
+ (2.2) Otherwise, we can make a conservative estimate from the type.
-}
=====================================
compiler/GHC/Types/Id/Info.hs
=====================================
@@ -120,7 +120,8 @@ infixl 1 `setRuleInfo`,
`setCafInfo`,
`setDmdSigInfo`,
`setCprSigInfo`,
- `setDemandInfo`
+ `setDemandInfo`,
+ `setLFInfo`
{-
************************************************************************
* *
=====================================
compiler/GHC/Types/Id/Make.hs
=====================================
@@ -87,6 +87,10 @@ import GHC.Data.FastString
import GHC.Data.List.SetOps
import Data.List ( zipWith4 )
+-- A bit of a shame we must import these here
+import GHC.StgToCmm.Types (LambdaFormInfo(..))
+import GHC.Runtime.Heap.Layout (ArgDescr(ArgUnknown))
+
{-
************************************************************************
* *
@@ -595,11 +599,17 @@ mkDataConWorkId wkr_name data_con
`setInlinePragInfo` wkr_inline_prag
`setUnfoldingInfo` evaldUnfolding -- Record that it's evaluated,
-- even if arity = 0
+ `setLFInfo` wkr_lf_info
-- No strictness: see Note [Data-con worker strictness] in GHC.Core.DataCon
wkr_inline_prag = defaultInlinePragma { inl_rule = ConLike }
wkr_arity = dataConRepArity data_con
+ -- See Note [LFInfo of DataCon workers and wrappers]
+ wkr_lf_info
+ | wkr_arity == 0 = LFCon data_con
+ | otherwise = LFReEntrant TopLevel wkr_arity True ArgUnknown
+
----------- Workers for newtypes --------------
univ_tvs = dataConUnivTyVars data_con
ex_tcvs = dataConExTyCoVars data_con
@@ -608,6 +618,7 @@ mkDataConWorkId wkr_name data_con
`setArityInfo` 1 -- Arity 1
`setInlinePragInfo` dataConWrapperInlinePragma
`setUnfoldingInfo` newtype_unf
+ `setLFInfo` (LFReEntrant TopLevel 1 True ArgUnknown)
id_arg1 = mkScaledTemplateLocal 1 (head arg_tys)
res_ty_args = mkTyCoVarTys univ_tvs
newtype_unf = assertPpr (null ex_tcvs && isSingleton arg_tys)
@@ -618,6 +629,85 @@ mkDataConWorkId wkr_name data_con
wrapNewTypeBody tycon res_ty_args (Var id_arg1)
{-
+Note [LFInfo of DataCon workers and wrappers]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+As noted in Note [The LFInfo of Imported Ids] in GHC.StgToCmm.Closure, it's
+crucial saturated data con applications are given an LFInfo of `LFCon`.
+
+Since for data constructors we never serialise the worker and the wrapper (only
+the data type declaration), we never serialise their lambda form info either.
+
+Therefore, when making data constructor workers and wrappers, we construct a
+correct LFInfo for them right away. This ensures the critical logic of creating
+the correct LFInfo for a DataCon is done once on creation, and guarantees that:
+
+ The `lfInfo` field of a DataCon worker or wrapper is always populated with the correct LFInfo.
+
+which is expected by `mkLFImported`.
+NB: The broader invariant is that if an `lfInfo` field is populated, the
+ LFInfo it contains is correct.
+
+How do we construct a /correct/ LFInfo for workers and wrappers?
+
+(1) Data constructors with arity > 0 are unambiguously functions and should be
+given `LFReEntrant`, regardless of the runtime relevance of the arguments:
+ - For example, `Just :: a -> Maybe a` is given `LFReEntrant`,
+ and `HNil :: (a ~# '[]) -> HList a` is given `LFReEntrant` too.
+
+(2) Data constructors with arity == 0 should be given `LFCon` because
+they are fully saturated data constructor applications (and pointers to them
+should be tagged with the constructor index).
+
+(2.1) A datacon *wrapper* with zero arity must be a fully saturated application
+of the worker to zero-width arguments only (which are dropped after unarisation)
+
+(2.2) A datacon *worker* with zero arity is trivially fully saturated: it takes
+no arguments whatsoever (not even zero-width args)
+
+For example, consider the following data constructors:
+
+ data T1 a where
+ TCon1 :: {-# UNPACK #-} !(a :~: True) -> T1 a
+
+ data T2 a where
+ TCon2 :: {-# UNPACK #-} !() -> T2 a
+
+ data T3 a where
+ TCon3 :: T3 '[]
+
+`TCon1`'s wrapper has a lifted argument, which is non-zero-width, while
+the worker has an unlifted equality argument, which is zero-width.
+
+`TCon2`'s wrapper has a lifted argument, which is non-zero-width,
+while the worker has no arguments.
+
+`TCon3`'s wrapper has no arguments, and the worker has 1 zero-width argument;
+their Core representation:
+
+ $WTCon3 :: T3 '[]
+ $WTCon3 = TCon3 @[] <Refl>
+
+ TCon3 :: forall (a :: * -> *). (a ~# []) => T3 a
+ TCon3 = /\a. \(co :: a~#[]). TCon3 co
+
+For `TCon1`, both the wrapper and worker will be given `LFReEntrant` since they
+both have arity == 1.
+
+For `TCon2`, the wrapper will be given `LFReEntrant` since it has arity == 1
+while the worker is `LFCon` since its arity == 0
+
+For `TCon3`, the wrapper will be given `LFCon` since its arity == 0 and the
+worker `LFReEntrant` since its arity == 1
+
+One might think we could give *workers* with only zero-width-args the `LFCon`
+LambdaFormInfo, e.g. give `LFCon` to the worker of `TCon1` and `TCon3`.
+However, these workers, albeit rarely used, are unambiguously functions
+-- which makes `LFReEntrant`, the LambdaFormInfo we give them, correct.
+See also the discussion in #23158.
+
+See also the Note [Imported unlifted nullary datacon wrappers must have correct LFInfo]
+in GHC.StgToCmm.Types.
+
-------------------------------------------------
-- Data constructor representation
--
@@ -709,11 +799,17 @@ mkDataConRep dc_bang_opts fam_envs wrap_name data_con
-- We need to get the CAF info right here because GHC.Iface.Tidy
-- does not tidy the IdInfo of implicit bindings (like the wrapper)
-- so it does not make sure that the CAF info is sane
+ `setLFInfo` wrap_lf_info
-- The signature is purely for passes like the Simplifier, not for
-- DmdAnal itself; see Note [DmdAnal for DataCon wrappers].
wrap_sig = mkClosedDmdSig wrap_arg_dmds topDiv
+ -- See Note [LFInfo of DataCon workers and wrappers]
+ wrap_lf_info
+ | wrap_arity == 0 = LFCon data_con
+ | otherwise = LFReEntrant TopLevel wrap_arity True ArgUnknown
+
wrap_arg_dmds =
replicate (length theta) topDmd ++ map mk_dmd arg_ibangs
-- Don't forget the dictionary arguments when building
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/d0fd4454d93aff0d193c37593cb37d4872f4b81c
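
For readers skimming the diff, the resulting decision order in `mkLFImported`
can be approximated by the following self-contained sketch. Everything here is
a toy model and an assumption for illustration: `ToyId`, its fields, and
`mkLFImportedToy` are hypothetical names, `LFUnknown` stands in for whatever
`mkLFArgument` would conservatively produce, and `error` stands in for the
`assert` used in the real code.

```haskell
-- Toy model only: simplified stand-ins for GHC's types (assumption).
data LambdaFormInfo
  = LFCon String      -- saturated constructor, identified here by its name
  | LFReEntrant Int   -- function closure with the given arity
  | LFUnknown         -- conservative fallback derived from the type
  deriving (Eq, Show)

-- | Hypothetical, cut-down view of an imported Id.
data ToyId = ToyId
  { toyLFInfo    :: Maybe LambdaFormInfo  -- the `lfInfo` field, if populated
  , toyArity     :: Int                   -- stands in for idFunRepArity
  , toyIsDataCon :: Bool                  -- is this a datacon worker/wrapper?
  }

mkLFImportedToy :: ToyId -> LambdaFormInfo
mkLFImportedToy i =
  case toyLFInfo i of
    -- (1) A populated field is trusted as-is.
    Just lf -> lf
    Nothing
      -- (2.1) Positive arity: unambiguously a function.
      | toyArity i > 0 -> LFReEntrant (toyArity i)
      -- Data constructors are expected to have had their field populated
      -- on construction, so reaching this point would break the invariant.
      | toyIsDataCon i -> error "datacon Id without an LFInfo"
      -- (2.2) Otherwise fall back to a conservative estimate.
      | otherwise      -> LFUnknown
```

The arity check deliberately comes before the datacon check, mirroring the
guard order in the patched function.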