[Git][ghc/ghc][wip/T22935] Make pseq robust by using seq#

Thu Jul 27 16:33:01 UTC 2023

Matthew Craven pushed to branch wip/T22935 at Glasgow Haskell Compiler / GHC


Commits:
7d2e5b17 by Matthew Craven at 2023-07-27T12:31:51-04:00
Make pseq robust by using seq#

Fixes #23699, fixes #23233.
Note [seq# magic] is expanded, fixing #22935.

The two tests for T15226 do not really test issue #15226;
instead they check that we can unbox through seq# in some
conditions.  The way we previously achieved that for these
two test cases is generally bogus; see the new wrinkle S6 in
Note [seq# magic].  Safely unboxing through seq# is probably
possible under the right conditions but deserves a new primop
and is left as future work.

Metric Increase:
    T15226
    T15226a

- - - - -


2 changed files:

- compiler/GHC/Core/Opt/ConstantFold.hs
- libraries/base/GHC/Conc/Sync.hs


Changes:

=====================================
compiler/GHC/Core/Opt/ConstantFold.hs
=====================================
@@ -823,7 +823,6 @@ primOpRules nm = \case
 
    AddrAddOp  -> mkPrimOpRule nm 2 [ rightIdentityPlatform zeroi ]
 
-   SeqOp      -> mkPrimOpRule nm 4 [ seqRule ]
    SparkOp    -> mkPrimOpRule nm 4 [ sparkRule ]
 
    _          -> Nothing
@@ -2080,36 +2079,86 @@ mechanism for 'evaluate'
    evaluate a = IO $ \s -> seq# a s
 
 The semantics of seq# is
+  * wait for all earlier actions in the State#-token-thread to complete
   * evaluate its first argument
   * and return it
 
 Things to note
 
-* Why do we need a primop at all?  That is, instead of
-      case seq# x s of (# x, s #) -> blah
-  why not instead say this?
-      case x of { DEFAULT -> blah)
+S1. Why do we need a primop at all?  That is, instead of
+        case seq# x s of (# x, s #) -> blah
+    why not instead say this?
+        case x of { DEFAULT -> blah)
 
-  Reason (see #5129): if we saw
-    catch# (\s -> case x of { DEFAULT -> raiseIO# exn s }) handler
+    Reason (see #5129): if we saw
+      catch# (\s -> case x of { DEFAULT -> raiseIO# exn s }) handler
 
-  then we'd drop the 'case x' because the body of the case is bottom
-  anyway. But we don't want to do that; the whole /point/ of
-  seq#/evaluate is to evaluate 'x' first in the IO monad.
+    then we'd drop the 'case x' because the body of the case is bottom
+    anyway. But we don't want to do that; the whole /point/ of
+    seq#/evaluate is to evaluate 'x' first in the IO monad.
 
-  In short, we /always/ evaluate the first argument and never
-  just discard it.
+    In short, we /always/ evaluate the first argument and never
+    just discard it.
 
-* Why return the value?  So that we can control sharing of seq'd
-  values: in
-     let x = e in x `seq` ... x ...
-  We don't want to inline x, so better to represent it as
-       let x = e in case seq# x RW of (# _, x' #) -> ... x' ...
-  also it matches the type of rseq in the Eval monad.
+S2. Why return the value?  So that we can control sharing of seq'd
+    values: in
+       let x = e in x `seq` ... x ...
+    We don't want to inline x, so better to represent it as
+         let x = e in case seq# x RW of (# _, x' #) -> ... x' ...
+    also it matches the type of rseq in the Eval monad.
 
-Implementing seq#.  The compiler has magic for SeqOp in
+S3. seq# is also used to implement pseq#:
+
+      pseq !x y = case seq# x realWorld# of
+        (# s, _ #) -> case seq# y s of
+          (# _, y' #) -> y'
+
+    The state token threading between `seq# x realWorld#` and `seq# y s`
+    should ensure that `y` is not evaluated before `x`.  Historically we
+    used instead ``pseq x y = x `seq` lazy y``, but this was not robust;
+    see #23233 and #23699.
+
+S4. We do not consider `seq# x s` to raise a precise exception even
+    when `x` certainly raises an exception; doing so would cause
+    `evaluate` and `pseq` to interfere with unrelated strictness properties.
+
+S5. Because seq# waits for all earlier actions in its
+    State#-token-thread to complete before evaluating its first
+    argument, it is (perhaps surprisingly) NOT unconditionally strict
+    in that first argument.  Making it strict in its first argument
+    would semantically permit us to re-order the eval with respect to
+    earlier actions in the State#-token-thread, undermining the
+    utility of `evaluate` for sequencing evaluation with respect to
+    side-effects.
+
+    A user who wants strictness can ask for it as easily as writing
+    `evaluate $! x` instead of `evaluate x`:
+
+    * `evaluate x` means "evaluate `x` at exactly this moment"
+    * `evaluate $! x` means "evaluate `x` at this moment /or earlier/"
+
+S6. We used to rewrite `seq# x<whnf> s` to `(# s, x #)` with the
+    reasoning that x has already been evaluated, so seq# has nothing
+    to do.  But doing so defeats the ordering that seq# provides.
+    Consider this example:
+
+       (start)
+       evaluate $! x
+
+       (inline $! and evaluate)
+       case x of !x' -> IO $ \s -> seq# x' s
 
-- GHC.Core.Opt.ConstantFold.seqRule: eliminate (seq# <whnf> s)
+       (x' is whnf)
+       case x of !x' -> IO (\s -> (# s, x' #))
+
+    Note that this last expression is equivalent to `pure $! x`.  Its
+    evaluation of `x` is completely untethered to the state thread of
+    any enclosing sequence of IO actions; as such the simplifier may
+    drop the eval entirely if `x` is used strictly later in the
+    sequence of IO actions.
+
+
+Implementing seq#.  The compiler has magic for SeqOp in
 
 - GHC.StgToCmm.Expr.cgExpr, and cgCase: special case for seq#
 
@@ -2121,15 +2170,14 @@ Implementing seq#.  The compiler has magic for SeqOp in
   in GHC.Core.Opt.Simplify
 -}
 
-seqRule :: RuleM CoreExpr
-seqRule = do
-  [Type _ty_a, Type _ty_s, a, s] <- getArgs
-  guard $ exprIsHNF a
-  return $ mkCoreUnboxedTuple [s, a]
 
 -- spark# :: forall a s . a -> State# s -> (# State# s, a #)
 sparkRule :: RuleM CoreExpr
-sparkRule = seqRule -- reduce on HNF, just the same
+sparkRule = do
+  [Type _ty_a, Type _ty_s, a, s] <- getArgs
+  guard $ exprIsHNF a
+  return $ mkCoreUnboxedTuple [s, a]
+  -- reduce on HNF
   -- XXX perhaps we shouldn't do this, because a spark eliminated by
   -- this rule won't be counted as a dud at runtime?
 


=====================================
libraries/base/GHC/Conc/Sync.hs
=====================================
@@ -519,19 +519,21 @@ labelThreadByteArray# (ThreadId t) str =
 --                 but 'seq' is now defined in GHC.Prim
 --
 -- "pseq" is defined a bit weirdly (see below)
---

--- The reason for the strange "lazy" call is that
--- it fools the compiler into thinking that pseq  and par are non-strict in
--- their second argument (even if it inlines pseq at the call site).
--- If it thinks pseq is strict in "y", then it often evaluates
--- "y" before "x", which is totally wrong.
-
+-- The state token threading is meant to ensure that
+-- we cannot move the eval of y before the eval of x
 {-# INLINE pseq  #-}
 pseq :: a -> b -> b
-pseq  x y = x `seq` lazy y
+pseq !x y = case seq# x realWorld# of
+  (# s, _ #) -> case seq# y s of
+    (# _, y' #) -> y'
 
 {-# INLINE par  #-}
 par :: a -> b -> b
+-- The reason for the strange "lazy" call is that
+-- it fools the compiler into thinking that  par are non-strict in
+-- their second argument (even if it inlines par at the call site).
+-- If it thinks par is strict in "y", then it often evaluates
+-- "y" before "x", which is totally wrong.
 par  x y = case (par# x) of { _ -> lazy y }
 
 -- | Internal function used by the RTS to run sparks.



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/7d2e5b17f1467c3181a43b82af5165a8106a03a5

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/7d2e5b17f1467c3181a43b82af5165a8106a03a5
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20230727/7a59fc87/attachment-0001.html>