[Git][ghc/ghc][wip/DataToTagSmallOp] 7 commits: apply SPJ's suggestion for DTW4 and DTW5

Matthew Craven (@clyring) gitlab at gitlab.haskell.org
Sun Dec 10 00:44:15 UTC 2023



Matthew Craven pushed to branch wip/DataToTagSmallOp at Glasgow Haskell Compiler / GHC


Commits:
7d546cc8 by Matthew Craven at 2023-12-09T19:20:32-05:00
apply SPJ's suggestion for DTW4 and DTW5

- - - - -
84cb27f9 by Matthew Craven at 2023-12-09T19:21:33-05:00
Re-flow text in DTW4

- - - - -
018153f4 by Matthew Craven at 2023-12-09T19:28:17-05:00
Fiddle with DTW4 some more

- - - - -
02805af6 by Matthew Craven at 2023-12-09T19:30:26-05:00
Refer to DTW4 from the "Each evaluates" bullet

- - - - -
336bd549 by Matthew Craven at 2023-12-09T19:34:42-05:00
Name the "special handling" bullets

- - - - -
cb0f5620 by Matthew Craven at 2023-12-09T19:40:19-05:00
Refer to DTT3 from DTW5

- - - - -
de1c039c by Matthew Craven at 2023-12-09T19:43:32-05:00
Update wrinkle DTW6

- - - - -


2 changed files:

- compiler/GHC/StgToCmm/Expr.hs
- compiler/GHC/Tc/Instance/Class.hs


Changes:

=====================================
compiler/GHC/StgToCmm/Expr.hs
=====================================
@@ -75,7 +75,7 @@ cgExpr (StgOpApp (StgPrimOp SeqOp) [StgVarArg a, _] _res_ty) =
 
 -- dataToTagSmall# :: a_levpoly -> Int#
 -- See Note [DataToTag overview] in GHC.Tc.Instance.Class,
--- particularly wrinkle DTW4
+-- particularly wrinkles H3 and DTW4
 cgExpr (StgOpApp (StgPrimOp DataToTagSmallOp) [StgVarArg a] _res_ty) = do
   platform <- getPlatform
   emitComment (mkFastString "dataToTagSmall#")
@@ -90,7 +90,7 @@ cgExpr (StgOpApp (StgPrimOp DataToTagSmallOp) [StgVarArg a] _res_ty) = do
 
 -- dataToTagLarge# :: a_levpoly -> Int#
 -- See Note [DataToTag overview] in GHC.Tc.Instance.Class,
--- particularly wrinkle DTW4
+-- particularly wrinkles H3 and DTW4
 cgExpr (StgOpApp (StgPrimOp DataToTagLargeOp) [StgVarArg a] _res_ty) = do
   platform <- getPlatform
   emitComment (mkFastString "dataToTagLarge#")


=====================================
compiler/GHC/Tc/Instance/Class.hs
=====================================
@@ -716,17 +716,23 @@ these conditions:
 These two primops have special handling in several parts of
 the compiler:
 
-- They have a couple of built-in rewrite rules, implemented in
-  GHC.Core.Opt.ConstantFold.dataToTagRule
+H1. They have a couple of built-in rewrite rules, implemented in
+    GHC.Core.Opt.ConstantFold.dataToTagRule
 
-- The simplifier rewrites most case expressions scrutinizing their results.
-  See Note [caseRules for dataToTag] in GHC.Core.Opt.ConstantFold.
+H2. The simplifier rewrites most case expressions scrutinizing their results.
+    See Note [caseRules for dataToTag] in GHC.Core.Opt.ConstantFold.
 
-- Each evaluates its argument; this is implemented via special cases in
-  GHC.StgToCmm.Expr.cgExpr.
+H3. Each evaluates its argument.  But we want to omit this eval when the
+    actual argument is already evaluated and properly tagged.  To do this,
 
-- Additionally, a special case in GHC.Stg.InferTags.Rewrite.rewriteExpr ensures
-  that that any inferred tag information on the argument is retained until then.
+    * We have a special case in GHC.Stg.InferTags.Rewrite.rewriteOpApp
+      ensuring that any inferred tag information on the argument is
+      retained until code generation.
+
+    * We generate code via special cases in GHC.StgToCmm.Expr.cgExpr
+      instead of with the other primops in GHC.StgToCmm.Prim.emitPrimOp;
+      tag info is not readily available in the latter function.
+      (Wrinkle DTW4 describes what we generate after the eval.)
 
 Wrinkles:
 
@@ -784,30 +790,42 @@ Wrinkles:
   keepAlive on the constructor names.
   (Contrast with Note [Unused name reporting and HasField].)
 
-(DTW4) The way tag information is stored at runtime is described in
+(DTW4) Why have two primops, `dataToTagSmall#` and `dataToTagLarge#`?
+  The way tag information is stored at runtime is described in
   Note [Tagging big families] in GHC.StgToCmm.Expr.  In particular,
   for "big data types" we must consult the heap object's info table at
-  least in the MAX_PTR_TAG case, while for "small data types" we can
-  always just examine the tag bits on the pointer itself.
-
-  Although it is always correct to consult the info table, we can
-  produce slightly smaller and faster code by not doing so for "small
-  data types."  Since types and coercions are largely erased in STG,
-  the simplest reliable way to achieve this is to produce different
-  primops in DataToTag instances depending on the number of data
-  constructors the relevant TyCon has.
-
-(DTW5) We consider a call `dataToTagSmall# x` to result in undefined
-  behavior whenever the target supports pointer tagging but the actual
-  constructor index for `x` is too large to fit in the pointer's tag
-  bits.  Otherwise, `dataToTagSmall#` behaves identically to
-  `dataToTagLarge#`.
+  least in the mAX_PTR_TAG case, while for "small data types" we can
+  always just examine the tag bits on the pointer itself. So:
+
+  * dataToTagSmall# consults the tag bits in the pointer, ignoring the
+    info table.  It should, therefore, be used only for data type with
+    few enough contructors that the tag always fits in the pointer.
+
+  * dataToTagLarge# also consults the tag bits in the pointer, but
+    must fall back te examining the info table whenever those tag
+    bits are equal to mAX_PTR_TAG.
+
+  One could imagine having one primop with a small/large tag, or just
+  the data type width, but the PrimOp data type is not currently set
+  up for that.  Looking at the type information on the argument during
+  code generation is also possible, but would be less reliable.
+  Remember: type information is not always preserved in STG.
+
+(DTW5) How do the two primops differ in their semantics?  We consider
+  a call `dataToTagSmall# x` to result in undefined behavior whenever
+  the target supports pointer tagging but the actual constructor index
+  for `x` is too large to fit in the pointer's tag bits.  Otherwise,
+  `dataToTagSmall#` behaves identically to `dataToTagLarge#`.
 
   This allows the rewrites performed in GHC.Core.Opt.ConstantFold to
   safely treat `dataToTagSmall#` identically to `dataToTagLarge#`:
   the allowed program behaviors for the former is always a superset of
   the allowed program behaviors for the latter.
 
+  This undefined behavior is only observable if a user writes a
+  wrongly-sized primop call.  The calls we generate are properly-sized
+  (condition DTT3 above) so that the type system protects us.
+
 (DTW6) We make no promises about the primops used to implement
   DataToTag instances.  Changes to GHC's representation of algebraic
   data types at runtime may force us to redesign these primops.
@@ -815,11 +833,12 @@ Wrinkles:
   original (no longer existing) "dataToTag#" primop is one of the
   main reasons the DataToTag class exists!
 
-  We can currently get away with using the same primop for every
-  DataToTag instance because every Haskell-land data constructor use
-  gets translated to its own "real" heap or static data object at
-  runtime and the index of that constructor is always exposed via
-  pointer tagging and via the object's info table.
+  In particular, our current two primop implementations (as described
+  in wrinkle DTW4) are adequate for every DataToTag instance only
+  because every Haskell-land data constructor use gets translated to
+  its own "real" heap or static data object at runtime and the index
+  of that constructor is always exposed via pointer tagging and via
+  the object's info table.
 
 (DTW7) Currently, the generated module GHC.PrimopWrappers in ghc-prim
   contains the following non-sense definitions:



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/82795106afb05801c8220c6cc81e667d37303381...de1c039cbf4ed8bb07d33fb0768e55a8e79ef3fd

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/82795106afb05801c8220c6cc81e667d37303381...de1c039cbf4ed8bb07d33fb0768e55a8e79ef3fd
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20231209/dedcddb4/attachment-0001.html>


More information about the ghc-commits mailing list