[Git][ghc/ghc][wip/DataToTagSmallOp] 7 commits: apply SPJ's suggestion for DTW4 and DTW5
Matthew Craven (@clyring)
gitlab at gitlab.haskell.org
Sun Dec 10 00:44:15 UTC 2023
Matthew Craven pushed to branch wip/DataToTagSmallOp at Glasgow Haskell Compiler / GHC
Commits:
7d546cc8 by Matthew Craven at 2023-12-09T19:20:32-05:00
apply SPJ's suggestion for DTW4 and DTW5
- - - - -
84cb27f9 by Matthew Craven at 2023-12-09T19:21:33-05:00
Re-flow text in DTW4
- - - - -
018153f4 by Matthew Craven at 2023-12-09T19:28:17-05:00
Fiddle with DTW4 some more
- - - - -
02805af6 by Matthew Craven at 2023-12-09T19:30:26-05:00
Refer to DTW4 from the "Each evaluates" bullet
- - - - -
336bd549 by Matthew Craven at 2023-12-09T19:34:42-05:00
Name the "special handling" bullets
- - - - -
cb0f5620 by Matthew Craven at 2023-12-09T19:40:19-05:00
Refer to DTT3 from DTW5
- - - - -
de1c039c by Matthew Craven at 2023-12-09T19:43:32-05:00
Update wrinkle DTW6
- - - - -
2 changed files:
- compiler/GHC/StgToCmm/Expr.hs
- compiler/GHC/Tc/Instance/Class.hs
Changes:
=====================================
compiler/GHC/StgToCmm/Expr.hs
=====================================
@@ -75,7 +75,7 @@ cgExpr (StgOpApp (StgPrimOp SeqOp) [StgVarArg a, _] _res_ty) =
-- dataToTagSmall# :: a_levpoly -> Int#
-- See Note [DataToTag overview] in GHC.Tc.Instance.Class,
--- particularly wrinkle DTW4
+-- particularly wrinkles H3 and DTW4
cgExpr (StgOpApp (StgPrimOp DataToTagSmallOp) [StgVarArg a] _res_ty) = do
platform <- getPlatform
emitComment (mkFastString "dataToTagSmall#")
@@ -90,7 +90,7 @@ cgExpr (StgOpApp (StgPrimOp DataToTagSmallOp) [StgVarArg a] _res_ty) = do
-- dataToTagLarge# :: a_levpoly -> Int#
-- See Note [DataToTag overview] in GHC.Tc.Instance.Class,
--- particularly wrinkle DTW4
+-- particularly wrinkles H3 and DTW4
cgExpr (StgOpApp (StgPrimOp DataToTagLargeOp) [StgVarArg a] _res_ty) = do
platform <- getPlatform
emitComment (mkFastString "dataToTagLarge#")
=====================================
compiler/GHC/Tc/Instance/Class.hs
=====================================
@@ -716,17 +716,23 @@ these conditions:
These two primops have special handling in several parts of
the compiler:
-- They have a couple of built-in rewrite rules, implemented in
- GHC.Core.Opt.ConstantFold.dataToTagRule
+H1. They have a couple of built-in rewrite rules, implemented in
+ GHC.Core.Opt.ConstantFold.dataToTagRule
-- The simplifier rewrites most case expressions scrutinizing their results.
- See Note [caseRules for dataToTag] in GHC.Core.Opt.ConstantFold.
+H2. The simplifier rewrites most case expressions scrutinizing their results.
+ See Note [caseRules for dataToTag] in GHC.Core.Opt.ConstantFold.
-- Each evaluates its argument; this is implemented via special cases in
- GHC.StgToCmm.Expr.cgExpr.
+H3. Each evaluates its argument. But we want to omit this eval when the
+ actual argument is already evaluated and properly tagged. To do this,
-- Additionally, a special case in GHC.Stg.InferTags.Rewrite.rewriteExpr ensures
- that that any inferred tag information on the argument is retained until then.
+ * We have a special case in GHC.Stg.InferTags.Rewrite.rewriteOpApp
+ ensuring that any inferred tag information on the argument is
+ retained until code generation.
+
+ * We generate code via special cases in GHC.StgToCmm.Expr.cgExpr
+ instead of with the other primops in GHC.StgToCmm.Prim.emitPrimOp;
+ tag info is not readily available in the latter function.
+ (Wrinkle DTW4 describes what we generate after the eval.)
Wrinkles:
@@ -784,30 +790,42 @@ Wrinkles:
keepAlive on the constructor names.
(Contrast with Note [Unused name reporting and HasField].)
-(DTW4) The way tag information is stored at runtime is described in
+(DTW4) Why have two primops, `dataToTagSmall#` and `dataToTagLarge#`?
+ The way tag information is stored at runtime is described in
Note [Tagging big families] in GHC.StgToCmm.Expr. In particular,
for "big data types" we must consult the heap object's info table at
- least in the MAX_PTR_TAG case, while for "small data types" we can
- always just examine the tag bits on the pointer itself.
-
- Although it is always correct to consult the info table, we can
- produce slightly smaller and faster code by not doing so for "small
- data types." Since types and coercions are largely erased in STG,
- the simplest reliable way to achieve this is to produce different
- primops in DataToTag instances depending on the number of data
- constructors the relevant TyCon has.
-
-(DTW5) We consider a call `dataToTagSmall# x` to result in undefined
- behavior whenever the target supports pointer tagging but the actual
- constructor index for `x` is too large to fit in the pointer's tag
- bits. Otherwise, `dataToTagSmall#` behaves identically to
- `dataToTagLarge#`.
+ least in the mAX_PTR_TAG case, while for "small data types" we can
+ always just examine the tag bits on the pointer itself. So:
+
+ * dataToTagSmall# consults the tag bits in the pointer, ignoring the
+ info table. It should, therefore, be used only for data type with
+ few enough contructors that the tag always fits in the pointer.
+
+ * dataToTagLarge# also consults the tag bits in the pointer, but
+ must fall back te examining the info table whenever those tag
+ bits are equal to mAX_PTR_TAG.
+
+ One could imagine having one primop with a small/large tag, or just
+ the data type width, but the PrimOp data type is not currently set
+ up for that. Looking at the type information on the argument during
+ code generation is also possible, but would be less reliable.
+ Remember: type information is not always preserved in STG.
+
+(DTW5) How do the two primops differ in their semantics? We consider
+ a call `dataToTagSmall# x` to result in undefined behavior whenever
+ the target supports pointer tagging but the actual constructor index
+ for `x` is too large to fit in the pointer's tag bits. Otherwise,
+ `dataToTagSmall#` behaves identically to `dataToTagLarge#`.
This allows the rewrites performed in GHC.Core.Opt.ConstantFold to
safely treat `dataToTagSmall#` identically to `dataToTagLarge#`:
the allowed program behaviors for the former is always a superset of
the allowed program behaviors for the latter.
+ This undefined behavior is only observable if a user writes a
+ wrongly-sized primop call. The calls we generate are properly-sized
+ (condition DTT3 above) so that the type system protects us.
+
(DTW6) We make no promises about the primops used to implement
DataToTag instances. Changes to GHC's representation of algebraic
data types at runtime may force us to redesign these primops.
@@ -815,11 +833,12 @@ Wrinkles:
original (no longer existing) "dataToTag#" primop is one of the
main reasons the DataToTag class exists!
- We can currently get away with using the same primop for every
- DataToTag instance because every Haskell-land data constructor use
- gets translated to its own "real" heap or static data object at
- runtime and the index of that constructor is always exposed via
- pointer tagging and via the object's info table.
+ In particular, our current two primop implementations (as described
+ in wrinkle DTW4) are adequate for every DataToTag instance only
+ because every Haskell-land data constructor use gets translated to
+ its own "real" heap or static data object at runtime and the index
+ of that constructor is always exposed via pointer tagging and via
+ the object's info table.
(DTW7) Currently, the generated module GHC.PrimopWrappers in ghc-prim
contains the following non-sense definitions:
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/82795106afb05801c8220c6cc81e667d37303381...de1c039cbf4ed8bb07d33fb0768e55a8e79ef3fd
--
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/82795106afb05801c8220c6cc81e667d37303381...de1c039cbf4ed8bb07d33fb0768e55a8e79ef3fd
You're receiving this email because of your account on gitlab.haskell.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20231209/dedcddb4/attachment-0001.html>
More information about the ghc-commits
mailing list