[Git][ghc/ghc][wip/T17521] Rewrite and document
Jaro Reinders (@Noughtmare)
gitlab at gitlab.haskell.org
Fri Aug 11 11:19:58 UTC 2023
Jaro Reinders pushed to branch wip/T17521 at Glasgow Haskell Compiler / GHC
Commits:
8d90cc76 by Jaro Reinders at 2023-08-11T13:19:50+02:00
Rewrite and document
- - - - -
2 changed files:
- compiler/GHC/Core.hs
- compiler/GHC/Core/Utils.hs
Changes:
=====================================
compiler/GHC/Core.hs
=====================================
@@ -399,11 +399,9 @@ At top level, however, there are three exceptions to this rule:
S1 = S1
We allow this top-level unlifted binding to exist.
-(TL3) A boxed top-level binding is allowed to bind the application of a data
- constructor worker to trivial arguments. These bindings are guaranteed
- to not require any evaluation and can thus be compiled to static data.
- Unboxed top-level bindings are still not allowed because references
- to them might have to be pointers.
+(TL3) A boxed top-level binding is allowed to bind the application of
+ unlifted data constructor values.
+ See Note [Core top-level unlifted data-con values].
Note [Core let-can-float invariant]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -498,6 +496,46 @@ parts of the compilation pipeline.
in the object file, the content of the exported literal is given a label with
the _bytes suffix.
+Note [Core top-level unlifted data-con values]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+As another exception to the usual rule that top-level binders must be lifted,
+we allow binding unlifted data constructor values at the top level. This allows
+us to store these values directly as data in the binary that we produce, instead
+of allocating them potentially many times if they're inside a tight loop.
+
+However, we have to be very careful that we only allow data constructors that
+are really values.
+
+* We only consider data constructor workers and not wrappers, because wrappers
+ are generally not fully evaluated. See Note [The need for a wrapper].
+
+* Even data constructor workers might still be expanded by the STG rewriter to
+ perform some work, if they have arguments that are marked strict.
+ See Note [Data-con worker strictness].
+
+* If the data constructor has unlifted arguments, then those could cause further
+ evaluation to be necessary, unless they are fully evaluated data constructor
+ values themselves.
+
+Furthermore, there is another complication. The data constructor worker may be
+applied to a variable which is defined in another module, or worse, in an
+hs-boot file. So, we cannot always get all the information we need and even for
+variables defined in the same module it might still be hard or computationally
+expensive to collect the necessary information.
+
+So, for the first incarnation of this feature we choose very restrictive
+conditions, which are still useful in practice. We allow top-level unlifted
+data constructor workers if they are applied to arguments that are one of:
+
+* A Literal
+
+* Any expressions of lifted type, but only if that argument is not
+ marked strict.
+
+* An unlifted variable
+
+In the future, we hope to relax this condition (#23811).
+
Note [NON-BOTTOM-DICTS invariant]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
It is a global invariant (not checkable by Lint) that
=====================================
compiler/GHC/Core/Utils.hs
=====================================
@@ -2017,43 +2017,65 @@ exprIsTopLevelBindable expr ty
-- consequently we must use 'mightBeUnliftedType' rather than 'isUnliftedType',
-- as the latter would panic.
|| exprIsTickedString expr
- || isBoxedType ty && exprIsDefinitelyWHNF expr
+ || isBoxedType ty && exprIsDataConValue expr
-- | Check if the expression is zero or more Ticks wrapped around a literal
-- string.
exprIsTickedString :: CoreExpr -> Bool
exprIsTickedString = isJust . exprIsTickedString_maybe
--- | This function is a very conservative approximation that only deals with
--- trivial expressions and datacon worker applications. Additionally, this
--- conservatively rejects data constructors that are applied to lifted variables
--- in argument positions that are marked strict.
--- In the future, we hope to loosen this requirement (See #23811).
-exprIsDefinitelyWHNF :: CoreExpr -> Bool
-exprIsDefinitelyWHNF x = is_definitely_whnf False x
+-- | This function checks if its argument is a data constructor value (WHNF).
+-- We use this function to determine if unlifted expressions can be floated
+-- to the top level.
+-- See Note [Core top-level unlifted data-con values] in GHC.Core
+exprIsDataConValue :: CoreExpr -> Bool
+exprIsDataConValue x = is_datacon_app x
where
- -- The first argument of 'is_definitely_whnf' indicates whether the
- -- expression is forced due to being a strict argument of an enclosing data
- -- constructor. If that is the case and the expression is a variable, then
- -- we are only sure it is fully evaluated if the variable has an unlifted
- -- type.
- is_definitely_whnf True v at Var{} = definitelyUnliftedType (exprType v)
- is_definitely_whnf _ x
+ -- We cannot use @exprIsHNF@ because it does not handle
+ -- data constructor worker strictness properly.
+ -- See Note [Data-con worker strictness] in GHC.Core.DataCon.
+ is_datacon_app x
| (Var v, xs) <- collect_args x
, Just dc <- isDataConWorkId_maybe v
- = and (zipWith is_definitely_whnf (map isMarkedStrict (dataConRepStrictness dc)) xs)
- | otherwise = exprIsTrivial x
-
- -- This is slightly different from 'collectArgs' as it looks through ticks and casts
- -- and it only collects run-time arguments.
+ = and $
+ zipWith3 arg_ok
+ (map (typeLevity_maybe . scaledThing) (dataConRepArgTys dc))
+ (map isMarkedStrict (dataConRepStrictness dc))
+ xs
+ is_datacon_app _ = False
+
+ arg_ok _ _ Lit{} = True
+ arg_ok (Just Lifted) False _ = True
+ arg_ok (Just Unlifted) _ Var{} = True
+ -- We allow nested constructor applications; we trust that they will
+ -- be ANFised in the future. E.g. given the unlifted data type:
+ --
+ -- data UNat :: UnliftedType = UZero | USucc UNat
+ --
+ -- If we lift an expression to the top level we might get:
+ --
+ -- foo = USucc (USucc UNil)
+ --
+ -- But that needs to be expanded (ANFised) to:
+ --
+ -- foo = USucc foo1
+ -- foo1 = USucc foo2
+ -- foo2 = UNil
+ --
+ -- This does always seem to happen and is checked by the Core linter.
+ arg_ok (Just Unlifted) _ x = is_datacon_app x
+ arg_ok _ _ _ = False
+
+ -- This is slightly different from 'collectArgs' as it looks through ticks
+ -- and casts and it only collects run-time arguments.
collect_args expr = go expr []
where
go (App f a) as
| isRuntimeArg a = go f (a:as)
- | otherwise = go f as
- go (Tick _ e) as = go e as
- go (Cast e _) as = go e as
- go e as = (e, as)
+ | otherwise = go f as
+ go (Tick _ e) as = go e as
+ go (Cast e _) as = go e as
+ go e as = (e, as)
-- | Extract a literal string from an expression that is zero or more Ticks
-- wrapped around a literal string. Returns Nothing if the expression has a
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/8d90cc76d4b3d2156bddf35a48ec8906ca2e7a9c
--
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/8d90cc76d4b3d2156bddf35a48ec8906ca2e7a9c
You're receiving this email because of your account on gitlab.haskell.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20230811/b80d973f/attachment-0001.html>
More information about the ghc-commits
mailing list