[Git][ghc/ghc][wip/T23146] Merge outdated Note [Data con representation] into Note [Data constructor representation]

Rodrigo Mesquita (@alt-romes) gitlab at gitlab.haskell.org
Fri May 5 14:55:18 UTC 2023



Rodrigo Mesquita pushed to branch wip/T23146 at Glasgow Haskell Compiler / GHC


Commits:
c77f81c8 by Rodrigo Mesquita at 2023-05-05T15:55:10+01:00
Merge outdated Note [Data con representation] into Note [Data constructor representation]

Introduce new Note [Constructor applications in STG] to better support
the merge, and reference it from the relevant bits in the STG syntax.

- - - - -


2 changed files:

- compiler/GHC/Core/DataCon.hs
- compiler/GHC/Stg/Syntax.hs


Changes:

=====================================
compiler/GHC/Core/DataCon.hs
=====================================
@@ -141,7 +141,19 @@ becomes
         case e of { T a' b -> let a = I# a' in ... }
 
 To keep ourselves sane, we name the different versions of the data constructor
-differently, as follows.
+differently, as follows in Note [Data Constructor Naming].
+
+The `dcRepType` field of a `DataCon` contains the type of the representation of
+the constructor /worker/, also called the Core representation.
+
+The Core representation may differ from the type of the constructor /wrapper/
+(built by `mkDataConId`). Besides unpacking (as seen in the example above),
+dictionaries and coercions become explict arguments in the Core representation
+of a constructor.
+
+Note that this representation is still *different* from runtime
+representation. (Which is what STG uses after unarise).
+See Note [Constructor applications in STG] in GHC.Stg.Syntax.
 
 
 Note [Data Constructor Naming]
@@ -528,7 +540,7 @@ data DataCon
                                 --      forall a x y. (a~(x,y), x~y, Ord x) =>
                                 --        x -> y -> T a
                                 -- (this is *not* of the constructor wrapper Id:
-                                --  see Note [Data con representation] below)
+                                --  see Note [Data constructor representation])
         -- Notice that the existential type parameters come *second*.
         -- Reason: in a case expression we may find:
         --      case (e :: T t) of
@@ -985,51 +997,6 @@ we consult HsImplBang:
 The boolean flag is used only for this warning.
 See #11270 for motivation.
 
-Note [Data con representation]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-The dcRepType field contains the type of the representation of a constructor
-This may differ from the type of the constructor *Id* (built
-by MkId.mkDataConId) for two reasons:
-        a) the constructor Id may be overloaded, but the dictionary isn't stored
-           e.g.    data Eq a => T a = MkT a a
-
-        b) the constructor may store an unboxed version of a strict field.
-
-So whenever this module talks about the representation of a data constructor
-what it means is the DataCon with all Unpacking having been applied.
-We can think of this as the Core representation.
-
-Here's an example illustrating the Core representation:
-        data Ord a => T a = MkT Int! a Void#
-Here
-        T :: Ord a => Int -> a -> Void# -> T a
-but the rep type is
-        Trep :: Int# -> a -> Void# -> T a
-Actually, the unboxed part isn't implemented yet!
-
-Note that this representation is still *different* from runtime
-representation. (Which is what STG uses after unarise).
-
-This is how T would end up being used in STG post-unarise:
-
-  let x = T 1# y
-  in ...
-      case x of
-        T int a -> ...
-
-The Void# argument is dropped and the boxed int is replaced by an unboxed
-one. In essence we only generate binders for runtime relevant values.
-
-We also flatten out unboxed tuples in this process. See the unarise
-pass for details on how this is done. But as an example consider
-`data S = MkS Bool (# Bool | Char #)` which when matched on would
-result in an alternative with three binders like this
-
-    MkS bool tag tpl_field ->
-
-See Note [Translating unboxed sums to unboxed tuples] and Note [Unarisation]
-for the details of this transformation.
-
 
 ************************************************************************
 *                                                                      *


=====================================
compiler/GHC/Stg/Syntax.hs
=====================================
@@ -237,6 +237,52 @@ StgConApp and StgPrimApp --- saturated applications
 
 There are specialised forms of application, for constructors, primitives, and
 literals.
+
+Note [Constructor applications in STG]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+After the unarisation pass:
+* In `StgConApp` and `StgRhsCon` and `StgAlt` we filter out the void arguments,
+  leaving only non-void ones.
+* In `StgApp` and `StgOpApp` we retain void arguments.
+
+We can do this because we know that `StgConApp` and `StgRhsCon` are saturated applications,
+so we lose no information by dropping those void args.  In contrast, in `StgApp` we need the
+ void argument to compare the number of args in the call with the arity of the function.
+
+This is an open design choice.  We could instead choose to treat all these applications
+consistently (keeping the void args).  But for some reason we don't, and this Note simply
+documents that design choice.
+
+As an example, consider:
+
+        data T a = MkT Int! a Void#
+
+The wrapper's representation and the worker's representation (i.e. the
+datacon's Core representation) are respectively:
+
+        $WT :: Int  -> a -> Void# -> T a
+        T   :: Int# -> a -> Void# -> T a
+
+T would end up being used in STG post-unarise as:
+
+  let x = T 1# y
+  in ...
+      case x of
+        T int a -> ...
+
+The Void# argument is dropped. In essence we only generate binders for runtime
+relevant values.
+
+We also flatten out unboxed tuples in this process. See the unarise
+pass for details on how this is done. But as an example consider
+`data S = MkS Bool (# Bool | Char #)` which when matched on would
+result in an alternative with three binders like this
+
+    MkS bool tag tpl_field ->
+
+See Note [Translating unboxed sums to unboxed tuples] and Note [Unarisation]
+for the details of this transformation.
+
 -}
 
   | StgLit      Literal
@@ -245,7 +291,7 @@ literals.
         -- which can't be let-bound
   | StgConApp   DataCon
                 ConstructorNumber
-                [StgArg] -- Saturated. (After Unarisation, [NonVoid StgArg])
+                [StgArg] -- Saturated. See Note [Constructor applications in STG]
                 [Type]   -- See Note [Types in StgConApp] in GHC.Stg.Unarise
 
   | StgOpApp    StgOp    -- Primitive op or foreign call
@@ -422,7 +468,7 @@ important):
                         -- are not allocated.
         ConstructorNumber
         [StgTickish]
-        [StgArg]        -- Args
+        [StgArg]        -- Saturated Args. See Note [Constructor applications in STG]
         Type            -- Type, for rewriting to an StgRhsClosure
 
 -- | Like 'GHC.Hs.Extension.NoExtField', but with an 'Outputable' instance that



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/c77f81c87e43b1dd1d34d7c928b323b675cab8c5

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/c77f81c87e43b1dd1d34d7c928b323b675cab8c5
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20230505/a363fd1e/attachment-0001.html>


More information about the ghc-commits mailing list