[Git][ghc/ghc][wip/T22935] Make pseq robust by using seq#
Matthew Craven (@clyring)
gitlab at gitlab.haskell.org
Thu Jul 27 16:33:01 UTC 2023
Matthew Craven pushed to branch wip/T22935 at Glasgow Haskell Compiler / GHC
Commits:
7d2e5b17 by Matthew Craven at 2023-07-27T12:31:51-04:00
Make pseq robust by using seq#
Fixes #23699, fixes #23233.
Note [seq# magic] is expanded, fixing #22935.
The two tests for T15226 do not really test issue #15226;
instead they check that we can unbox through seq# in some
conditions. The way we previously achieved that for these
two test cases is generally bogus; see the new wrinkle S6 in
Note [seq# magic]. Safely unboxing through seq# is probably
possible under the right conditions but deserves a new primop
and is left as future work.
Metric Increase:
T15226
T15226a
- - - - -
2 changed files:
- compiler/GHC/Core/Opt/ConstantFold.hs
- libraries/base/GHC/Conc/Sync.hs
Changes:
=====================================
compiler/GHC/Core/Opt/ConstantFold.hs
=====================================
@@ -823,7 +823,6 @@ primOpRules nm = \case
AddrAddOp -> mkPrimOpRule nm 2 [ rightIdentityPlatform zeroi ]
- SeqOp -> mkPrimOpRule nm 4 [ seqRule ]
SparkOp -> mkPrimOpRule nm 4 [ sparkRule ]
_ -> Nothing
@@ -2080,36 +2079,86 @@ mechanism for 'evaluate'
evaluate a = IO $ \s -> seq# a s
The semantics of seq# is
+ * wait for all earlier actions in the State#-token-thread to complete
* evaluate its first argument
* and return it
Things to note
-* Why do we need a primop at all? That is, instead of
- case seq# x s of (# x, s #) -> blah
- why not instead say this?
- case x of { DEFAULT -> blah)
+S1. Why do we need a primop at all? That is, instead of
+ case seq# x s of (# x, s #) -> blah
+ why not instead say this?
+ case x of { DEFAULT -> blah)
- Reason (see #5129): if we saw
- catch# (\s -> case x of { DEFAULT -> raiseIO# exn s }) handler
+ Reason (see #5129): if we saw
+ catch# (\s -> case x of { DEFAULT -> raiseIO# exn s }) handler
- then we'd drop the 'case x' because the body of the case is bottom
- anyway. But we don't want to do that; the whole /point/ of
- seq#/evaluate is to evaluate 'x' first in the IO monad.
+ then we'd drop the 'case x' because the body of the case is bottom
+ anyway. But we don't want to do that; the whole /point/ of
+ seq#/evaluate is to evaluate 'x' first in the IO monad.
- In short, we /always/ evaluate the first argument and never
- just discard it.
+ In short, we /always/ evaluate the first argument and never
+ just discard it.
-* Why return the value? So that we can control sharing of seq'd
- values: in
- let x = e in x `seq` ... x ...
- We don't want to inline x, so better to represent it as
- let x = e in case seq# x RW of (# _, x' #) -> ... x' ...
- also it matches the type of rseq in the Eval monad.
+S2. Why return the value? So that we can control sharing of seq'd
+ values: in
+ let x = e in x `seq` ... x ...
+ We don't want to inline x, so better to represent it as
+ let x = e in case seq# x RW of (# _, x' #) -> ... x' ...
+ also it matches the type of rseq in the Eval monad.
-Implementing seq#. The compiler has magic for SeqOp in
+S3. seq# is also used to implement pseq#:
+
+ pseq !x y = case seq# x realWorld# of
+ (# s, _ #) -> case seq# y s of
+ (# _, y' #) -> y'
+
+ The state token threading between `seq# x realWorld#` and `seq# y s`
+ should ensure that `y` is not evaluated before `x`. Historically we
+ used instead ``pseq x y = x `seq` lazy y``, but this was not robust;
+ see #23233 and #23699.
+
+S4. We do not consider `seq# x s` to raise a precise exception even
+ when `x` certainly raises an exception; doing so would cause
+ `evaluate` and `pseq` to interfere with unrelated strictness properties.
+
+S5. Because seq# waits for all earlier actions in its
+ State#-token-thread to complete before evaluating its first
+ argument, it is (perhaps surprisingly) NOT unconditionally strict
+ in that first argument. Making it strict in its first argument
+ would semantically permit us to re-order the eval with respect to
+ earlier actions in the State#-token-thread, undermining the
+ utility of `evaluate` for sequencing evaluation with respect to
+ side-effects.
+
+ A user who wants strictness can ask for it as easily as writing
+ `evaluate $! x` instead of `evaluate x`:
+
+ * `evaluate x` means "evaluate `x` at exactly this moment"
+ * `evaluate $! x` means "evaluate `x` at this moment /or earlier/"
+
+S6. We used to rewrite `seq# x<whnf> s` to `(# s, x #)` with the
+ reasoning that x has already been evaluated, so seq# has nothing
+ to do. But doing so defeats the ordering that seq# provides.
+ Consider this example:
+
+ (start)
+ evaluate $! x
+
+ (inline $! and evaluate)
+ case x of !x' -> IO $ \s -> seq# x' s
-- GHC.Core.Opt.ConstantFold.seqRule: eliminate (seq# <whnf> s)
+ (x' is whnf)
+ case x of !x' -> IO (\s -> (# s, x' #))
+
+ Note that this last expression is equivalent to `pure $! x`. Its
+ evaluation of `x` is completely untethered to the state thread of
+ any enclosing sequence of IO actions; as such the simplifier may
+ drop the eval entirely if `x` is used strictly later in the
+ sequence of IO actions.
+
+
+Implementing seq#. The compiler has magic for SeqOp in
- GHC.StgToCmm.Expr.cgExpr, and cgCase: special case for seq#
@@ -2121,15 +2170,14 @@ Implementing seq#. The compiler has magic for SeqOp in
in GHC.Core.Opt.Simplify
-}
-seqRule :: RuleM CoreExpr
-seqRule = do
- [Type _ty_a, Type _ty_s, a, s] <- getArgs
- guard $ exprIsHNF a
- return $ mkCoreUnboxedTuple [s, a]
-- spark# :: forall a s . a -> State# s -> (# State# s, a #)
sparkRule :: RuleM CoreExpr
-sparkRule = seqRule -- reduce on HNF, just the same
+sparkRule = do
+ [Type _ty_a, Type _ty_s, a, s] <- getArgs
+ guard $ exprIsHNF a
+ return $ mkCoreUnboxedTuple [s, a]
+ -- reduce on HNF
-- XXX perhaps we shouldn't do this, because a spark eliminated by
-- this rule won't be counted as a dud at runtime?
=====================================
libraries/base/GHC/Conc/Sync.hs
=====================================
@@ -519,19 +519,21 @@ labelThreadByteArray# (ThreadId t) str =
-- but 'seq' is now defined in GHC.Prim
--
-- "pseq" is defined a bit weirdly (see below)
---
--- The reason for the strange "lazy" call is that
--- it fools the compiler into thinking that pseq and par are non-strict in
--- their second argument (even if it inlines pseq at the call site).
--- If it thinks pseq is strict in "y", then it often evaluates
--- "y" before "x", which is totally wrong.
-
+-- The state token threading is meant to ensure that
+-- we cannot move the eval of y before the eval of x
{-# INLINE pseq #-}
pseq :: a -> b -> b
-pseq x y = x `seq` lazy y
+pseq !x y = case seq# x realWorld# of
+ (# s, _ #) -> case seq# y s of
+ (# _, y' #) -> y'
{-# INLINE par #-}
par :: a -> b -> b
+-- The reason for the strange "lazy" call is that
+-- it fools the compiler into thinking that par are non-strict in
+-- their second argument (even if it inlines par at the call site).
+-- If it thinks par is strict in "y", then it often evaluates
+-- "y" before "x", which is totally wrong.
par x y = case (par# x) of { _ -> lazy y }
-- | Internal function used by the RTS to run sparks.
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/7d2e5b17f1467c3181a43b82af5165a8106a03a5
--
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/7d2e5b17f1467c3181a43b82af5165a8106a03a5
You're receiving this email because of your account on gitlab.haskell.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20230727/7a59fc87/attachment-0001.html>
More information about the ghc-commits
mailing list