[Git][ghc/ghc][wip/marge_bot_batch_merge_job] 4 commits: StgToCmm: Detect some no-op case-continuations

Marge Bot (@marge-bot) gitlab at gitlab.haskell.org
Mon Jan 1 11:55:42 UTC 2024



Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC


Commits:
bc4d67b7 by Matthew Craven at 2023-12-31T06:15:42-05:00
StgToCmm: Detect some no-op case-continuations

...and generate no code for them. Fixes #24264.

- - - - -
5b603139 by Krzysztof Gogolewski at 2023-12-31T06:16:18-05:00
Revert "testsuite: mark jspace as fragile on i386."

This reverts commit 0038d052c8c80b4b430bb2aa1c66d5280be1aa95.

The atomicity bug should be fixed by !11802.

- - - - -
2cfb3828 by Krzysztof Gogolewski at 2024-01-01T06:54:54-05:00
Refactor: store [[PrimRep]] rather than [Type] in STG

StgConApp stored a list of types. This list was used exclusively
during unarisation of unboxed sums (mkUbxSum).
However, this is at a wrong level of abstraction:
STG shouldn't be concerned with Haskell types, only PrimReps.
Update the code to store a [[PrimRep]]. Also, there's no point in storing
this list when we're not dealing with an unboxed sum.

- - - - -
8d4c5048 by Ömer Sinan Ağacan at 2024-01-01T06:55:03-05:00
Kind signatures docs: mention that they're allowed in newtypes

- - - - -


11 changed files:

- compiler/GHC/CoreToStg.hs
- compiler/GHC/Stg/Syntax.hs
- compiler/GHC/Stg/Unarise.hs
- compiler/GHC/StgToCmm/Expr.hs
- docs/users_guide/exts/kind_signatures.rst
- + testsuite/tests/codeGen/should_compile/T24264.hs
- + testsuite/tests/codeGen/should_compile/T24264.stderr
- testsuite/tests/codeGen/should_compile/all.T
- + testsuite/tests/codeGen/should_run/T24264run.hs
- testsuite/tests/codeGen/should_run/all.T
- testsuite/tests/driver/j-space/all.T


Changes:

=====================================
compiler/GHC/CoreToStg.hs
=====================================
@@ -56,7 +56,6 @@ import GHC.Utils.Misc (HasDebugCallStack)
 import GHC.Utils.Panic
 
 import Control.Monad (ap)
-import Data.Maybe (fromMaybe)
 
 -- Note [Live vs free]
 -- ~~~~~~~~~~~~~~~~~~~
@@ -531,8 +530,10 @@ coreToStgApp f args ticks = do
         res_ty = exprType (mkApps (Var f) args)
         app = case idDetails f of
                 DataConWorkId dc
-                  | saturated    -> StgConApp dc NoNumber args'
-                                      (dropRuntimeRepArgs (fromMaybe [] (tyConAppArgs_maybe res_ty)))
+                  | saturated    -> if isUnboxedSumDataCon dc then
+                                      StgConApp dc NoNumber args' (sumPrimReps args)
+                                    else
+                                      StgConApp dc NoNumber args' []
 
                 -- Some primitive operator that might be implemented as a library call.
                 -- As noted by Note [Eta expanding primops] in GHC.Builtin.PrimOps
@@ -560,6 +561,16 @@ coreToStgApp f args ticks = do
     -- profiling for #4367
     app `seq` return tapp
 
+
+-- Given Core arguments to an unboxed sum datacon, return the 'PrimRep's
+-- of every alternative. For example, in (#_|#) @LiftedRep @IntRep @Int @Int# 0
+-- the arguments are [Type LiftedRep, Type IntRep, Type Int, Type Int#, 0]
+-- and we return the list [[LiftedRep], [IntRep]].
+-- See Note [Representations in StgConApp] in GHC.Stg.Unarise.
+sumPrimReps :: [CoreArg] -> [[PrimRep]]
+sumPrimReps (Type ty : args) | isRuntimeRepKindedTy ty
+  = runtimeRepPrimRep (text "sumPrimReps") ty : sumPrimReps args
+sumPrimReps _ = []
 -- ---------------------------------------------------------------------------
 -- Argument lists
 -- This is the guy that turns applications into A-normal form


=====================================
compiler/GHC/Stg/Syntax.hs
=====================================
@@ -310,7 +310,7 @@ for the details of this transformation.
   | StgConApp   DataCon
                 ConstructorNumber
                 [StgArg] -- Saturated. See Note [Constructor applications in STG]
-                [Type]   -- See Note [Types in StgConApp] in GHC.Stg.Unarise
+                [[PrimRep]]   -- See Note [Representations in StgConApp] in GHC.Stg.Unarise
 
   | StgOpApp    StgOp    -- Primitive op or foreign call
                 [StgArg] -- Saturated.


=====================================
compiler/GHC/Stg/Unarise.hs
=====================================
@@ -166,8 +166,8 @@ avoid #19645. Other alternatives considered include:
     way to fix what is ultimately a corner-case.
 
 
-Note [Types in StgConApp]
-~~~~~~~~~~~~~~~~~~~~~~~~~
+Note [Representations in StgConApp]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Suppose we have this unboxed sum term:
 
   (# 123 | #)
@@ -180,9 +180,21 @@ type of this term. For example, these are all valid tuples for this:
   (# 1#, 123, rubbish, rubbish #)
                          -- when type is (# Int | (# Int, Int, Int #) #)
 
-So we pass type arguments of the DataCon's TyCon in StgConApp to decide what
-layout to use. Note that unlifted values can't be let-bound, so we don't need
-types in StgRhsCon.
+Therefore, in StgConApp we store a list [[PrimRep]] of representations
+to decide what layout to use.
+Given (# T_1 | ... | T_n #), this list will be
+[typePrimRep T_1, ..., typePrimRep T_n].
+For example, given type
+  (# Int | String #)              we will store [[LiftedRep], [LiftedRep]]
+  (# Int | Float# #)              we will store [[LiftedRep], [FloatRep]]
+  (# Int | (# Int, Int, Int #) #) we will store [[LiftedRep], [LiftedRep, LiftedRep, LiftedRep]].
+
+This field is used for unboxed sums only and it's an empty list otherwise.
+Perhaps it would be more elegant to have a separate StgUnboxedSumCon,
+but that would require duplication of code in cases where the logic is shared.
+
+Note that unlifted values can't be let-bound, so we don't need
+representations in StgRhsCon.
 
 Note [Casting slot arguments]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -527,7 +539,7 @@ unariseExpr rho (StgConApp dc n args ty_args)
           -> return $ (mkTuple args')
   | otherwise =
       let args' = unariseConArgs rho args in
-      return $ (StgConApp dc n args' (map stgArgType args'))
+      return $ (StgConApp dc n args' [])
 
 unariseExpr rho (StgOpApp op args ty)
   = return (StgOpApp op (unariseFunArgs rho args) ty)
@@ -572,7 +584,7 @@ unariseExpr rho (StgTick tick e)
   = StgTick tick <$> unariseExpr rho e
 
 -- Doesn't return void args.
-unariseUbxSumOrTupleArgs :: UnariseEnv -> UniqSupply -> DataCon -> [InStgArg] -> [Type]
+unariseUbxSumOrTupleArgs :: UnariseEnv -> UniqSupply -> DataCon -> [InStgArg] -> [[PrimRep]]
                    -> ( [OutStgArg]           -- Arguments representing the unboxed sum
                       , Maybe (StgExpr -> StgExpr)) -- Transformation to apply to the arguments, to bring them
                                                     -- into the right Rep
@@ -860,7 +872,7 @@ mkCast arg_in cast_op out_id out_ty in_rhs =
 --
 -- Example, for (# x | #) :: (# (# #) | Int #) we call
 --
---   mkUbxSum (# _ | #) [ (# #), Int ] [ voidPrimId ]
+--   mkUbxSum (# _ | #) [ [], [LiftedRep] ] [ voidPrimId ]
 --
 -- which returns
 --
@@ -869,7 +881,7 @@ mkCast arg_in cast_op out_id out_ty in_rhs =
 mkUbxSum
   :: HasDebugCallStack
   => DataCon      -- Sum data con
-  -> [Type]       -- Type arguments of the sum data con
+  -> [[PrimRep]]  -- Representations of type arguments of the sum data con
   -> [OutStgArg]  -- Actual arguments of the alternative.
   -> UniqSupply
   -> ([OutStgArg] -- Final tuple arguments
@@ -877,7 +889,7 @@ mkUbxSum
      )
 mkUbxSum dc ty_args args0 us
   = let
-      _ :| sum_slots = ubxSumRepType (map typePrimRep ty_args)
+      _ :| sum_slots = ubxSumRepType ty_args
       -- drop tag slot
       field_slots = (mapMaybe (repSlotTy . stgArgRep) args0)
       tag = dataConTag dc
@@ -1121,7 +1133,7 @@ isUnboxedTupleBndr :: Id -> Bool
 isUnboxedTupleBndr = isUnboxedTupleType . idType
 
 mkTuple :: [StgArg] -> StgExpr
-mkTuple args = StgConApp (tupleDataCon Unboxed (length args)) NoNumber args (map stgArgType args)
+mkTuple args = StgConApp (tupleDataCon Unboxed (length args)) NoNumber args []
 
 tagAltTy :: AltType
 tagAltTy = PrimAlt IntRep


=====================================
compiler/GHC/StgToCmm/Expr.hs
=====================================
@@ -570,6 +570,58 @@ cgCase (StgOpApp (StgPrimOp SeqOp) [StgVarArg a, _] _) bndr alt_type alts
     -- Use the same return convention as vanilla 'a'.
     cgCase (StgApp a []) bndr alt_type alts
 
+{-
+Note [Eliminate trivial Solo# continuations]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+If we have code like this:
+
+    case scrut of bndr {
+      alt -> Solo# bndr
+    }
+
+The RHS of the only branch does nothing except wrap the case-binder
+returned by 'scrut' in a unary unboxed tuple.  But unboxed tuples
+don't exist at run-time, i.e. the branch is a no-op!  So we can
+generate code as if we just had 'scrut' instead of a case-expression.
+
+This situation can easily arise for IO or ST code, where the last
+operation a function performs is commonly 'pure $! someExpr'.
+See also #24264 and !11778.  More concretely, as of December 2023,
+when building a stage2 "perf+no_profiled_libs" ghc:
+
+ * The special case is reached 398 times.
+ * Of these, 158 have scrutinees that call a function or enter a
+   potential thunk, and would need to push a useless stack frame if
+   not for this optimisation.
+
+We might consider rewriting such case expressions in GHC.Stg.CSE as a
+slight extension of Note [All alternatives are the binder].  But the
+RuntimeReps of 'bndr' and 'Solo# bndr' are not exactly the same, and
+per Note [Typing the STG language] in GHC.Stg.Lint, we do expect Stg
+code to remain RuntimeRep-correct.  So we just detect the situation in
+StgToCmm instead.
+
+Crucially, the return conventions for 'ty' and '(# ty #)' are compatible:
+The returned value is passed in the same register(s) or stack slot in
+both conventions, and the set of allowed return values for 'ty'
+is a subset of the allowed return values for '(# ty #)':
+
+ * For a lifted type 'ty', the return convention for 'ty' promises to
+   return an evaluated-properly-tagged heap pointer, while a return
+   type '(# ty #)' only promises to return a heap pointer to an object
+   that can be evaluated later if need be.
+
+ * If 'ty' is unlifted, the allowed return
+   values for 'ty' and '(# ty #)' are identical.
+-}
+
+cgCase scrut bndr _alt_type [GenStgAlt { alt_rhs = rhs}]
+  -- see Note [Eliminate trivial Solo# continuations]
+  | StgConApp dc _ [StgVarArg v] _ <- rhs
+  , isUnboxedTupleDataCon dc
+  , v == bndr
+  = cgExpr scrut
+
 cgCase scrut bndr alt_type alts
   = -- the general case
     do { platform <- getPlatform


=====================================
docs/users_guide/exts/kind_signatures.rst
=====================================
@@ -37,6 +37,10 @@ This extension enables kind signatures in the following places:
 
          data Set (cxt :: Type -> Type) a = Set [a]
 
+-  ``newtype`` declarations: ::
+
+         newtype Set (cxt :: Type -> Type) a = Set [a]
+
 -  ``type`` declarations: ::
 
          type T (f :: Type -> Type) = f Int


=====================================
testsuite/tests/codeGen/should_compile/T24264.hs
=====================================
@@ -0,0 +1,42 @@
+{-# LANGUAGE MagicHash #-}
+
+module T24264 where
+
+import Control.Exception (evaluate)
+import GHC.Exts (seq#, noinline)
+import GHC.IO (IO(..))
+
+fun1 :: a -> IO a
+{-# OPAQUE fun1 #-}
+fun1 x = do
+  pure ()
+  pure $! x
+  -- This should not push a continuation to the stack before entering 'x'
+
+fun2 :: a -> IO a
+{-# OPAQUE fun2 #-}
+fun2 x = do
+  pure ()
+  evaluate x
+  -- This should not push a continuation to the stack before entering 'x'
+
+fun3 :: a -> IO a
+{-# OPAQUE fun3 #-}
+fun3 x = do
+  pure ()
+  -- "evaluate $! x"
+  case x of !x' -> IO (noinline seq# x')
+                    -- noinline to work around the bogus seqRule
+  -- This ideally also should not push a continuation to the stack
+  -- before entering 'x'.  It currently does, but let's wait for
+  -- !11515 to land before worrying about that.
+
+funPair :: a -> IO (a, a)
+{-# OPAQUE funPair #-}
+funPair x = do
+  pure ()
+  x' <- pure $! x
+  -- This should push a continuation to the stack before entering 'x',
+  -- so the pair can be returned instead.  (It's here to make sure
+  -- that the 'returns to' detection continues working correctly.)
+  pure (x', x')


=====================================
testsuite/tests/codeGen/should_compile/T24264.stderr
=====================================
@@ -0,0 +1,142 @@
+
+==================== Cmm produced by codegen ====================
+[T24264.fun1_entry() { //  [R2]
+         { info_tbls: [(cQO,
+                        label: T24264.fun1_info
+                        rep: HeapRep static { Fun {arity: 2 fun_type: ArgSpec 5} }
+                        srt: Nothing)]
+           stack_info: arg_space: 8
+         }
+     {offset
+       cQO: // global
+           _sQw::P64 = R2;   // CmmAssign
+           goto cQM;   // CmmBranch
+       cQM: // global
+           if ((old + 0) - <highSp> < SpLim) (likely: False) goto cQP; else goto cQQ;   // CmmCondBranch
+       cQP: // global
+           R2 = _sQw::P64;   // CmmAssign
+           R1 = T24264.fun1_closure;   // CmmAssign
+           call (stg_gc_fun)(R2, R1) args: 8, res: 0, upd: 8;   // CmmCall
+       cQQ: // global
+           goto cQL;   // CmmBranch
+       cQL: // global
+           // slowCall
+           R1 = _sQw::P64;   // CmmAssign
+           call stg_ap_0_fast(R1) args: 8, res: 0, upd: 8;   // CmmCall
+     }
+ }]
+
+
+
+==================== Cmm produced by codegen ====================
+[T24264.fun2_entry() { //  [R2]
+         { info_tbls: [(cQY,
+                        label: T24264.fun2_info
+                        rep: HeapRep static { Fun {arity: 2 fun_type: ArgSpec 5} }
+                        srt: Nothing)]
+           stack_info: arg_space: 8
+         }
+     {offset
+       cQY: // global
+           _sQz::P64 = R2;   // CmmAssign
+           goto cQW;   // CmmBranch
+       cQW: // global
+           if ((old + 0) - <highSp> < SpLim) (likely: False) goto cQZ; else goto cR0;   // CmmCondBranch
+       cQZ: // global
+           R2 = _sQz::P64;   // CmmAssign
+           R1 = T24264.fun2_closure;   // CmmAssign
+           call (stg_gc_fun)(R2, R1) args: 8, res: 0, upd: 8;   // CmmCall
+       cR0: // global
+           goto cQV;   // CmmBranch
+       cQV: // global
+           // slowCall
+           R1 = _sQz::P64;   // CmmAssign
+           call stg_ap_0_fast(R1) args: 8, res: 0, upd: 8;   // CmmCall
+     }
+ }]
+
+
+
+==================== Cmm produced by codegen ====================
+[T24264.fun3_entry() { //  [R2]
+         { info_tbls: [(cRb,
+                        label: T24264.fun3_info
+                        rep: HeapRep static { Fun {arity: 2 fun_type: ArgSpec 5} }
+                        srt: Nothing)]
+           stack_info: arg_space: 8
+         }
+     {offset
+       cRb: // global
+           _sQB::P64 = R2;   // CmmAssign
+           goto cR6;   // CmmBranch
+       cR6: // global
+           if ((old + 0) - <highSp> < SpLim) (likely: False) goto cRc; else goto cRd;   // CmmCondBranch
+       cRc: // global
+           R2 = _sQB::P64;   // CmmAssign
+           R1 = T24264.fun3_closure;   // CmmAssign
+           call (stg_gc_fun)(R2, R1) args: 8, res: 0, upd: 8;   // CmmCall
+       cRd: // global
+           goto cR5;   // CmmBranch
+       cR5: // global
+           // slowCall
+           I64[(young<cR8> + 8)] = cR8;   // CmmStore
+           R1 = _sQB::P64;   // CmmAssign
+           call stg_ap_0_fast(R1) returns to cR8, args: 8, res: 8, upd: 8;   // CmmCall
+       cR8: // global
+           _sQD::P64 = R1;   // CmmAssign
+           // slow_call for _sQB::P64 with pat stg_ap_0
+           R1 = _sQD::P64;   // CmmAssign
+           call (P64[(old + 8)])(R1) args: 8, res: 0, upd: 8;   // CmmCall
+     }
+ }]
+
+
+
+==================== Cmm produced by codegen ====================
+[T24264.funPair_entry() { //  [R2]
+         { info_tbls: [(cRq,
+                        label: T24264.funPair_info
+                        rep: HeapRep static { Fun {arity: 2 fun_type: ArgSpec 5} }
+                        srt: Nothing)]
+           stack_info: arg_space: 8
+         }
+     {offset
+       cRq: // global
+           _sQE::P64 = R2;   // CmmAssign
+           goto cRl;   // CmmBranch
+       cRl: // global
+           if ((old + 0) - <highSp> < SpLim) (likely: False) goto cRr; else goto cRs;   // CmmCondBranch
+       cRr: // global
+           R2 = _sQE::P64;   // CmmAssign
+           R1 = T24264.funPair_closure;   // CmmAssign
+           call (stg_gc_fun)(R2, R1) args: 8, res: 0, upd: 8;   // CmmCall
+       cRs: // global
+           goto cRk;   // CmmBranch
+       cRk: // global
+           // slowCall
+           I64[(young<cRn> + 8)] = cRn;   // CmmStore
+           R1 = _sQE::P64;   // CmmAssign
+           call stg_ap_0_fast(R1) returns to cRn, args: 8, res: 8, upd: 8;   // CmmCall
+       cRn: // global
+           _sQG::P64 = R1;   // CmmAssign
+           // slow_call for _sQE::P64 with pat stg_ap_0
+           Hp = Hp + 24;   // CmmAssign
+           if (Hp > HpLim) (likely: False) goto cRv; else goto cRu;   // CmmCondBranch
+       cRv: // global
+           HpAlloc = 24;   // CmmAssign
+           goto cRt;   // CmmBranch
+       cRt: // global
+           R1 = _sQG::P64;   // CmmAssign
+           call stg_gc_unpt_r1(R1) returns to cRn, args: 8, res: 8, upd: 8;   // CmmCall
+       cRu: // global
+           // allocHeapClosure
+           I64[Hp - 16] = (,)_con_info;   // CmmStore
+           P64[Hp - 8] = _sQG::P64;   // CmmStore
+           P64[Hp] = _sQG::P64;   // CmmStore
+           _cRp::P64 = Hp - 15;   // CmmAssign
+           R1 = _cRp::P64;   // CmmAssign
+           call (P64[(old + 8)])(R1) args: 8, res: 0, upd: 8;   // CmmCall
+     }
+ }]
+
+


=====================================
testsuite/tests/codeGen/should_compile/all.T
=====================================
@@ -128,3 +128,5 @@ test('T21710a', [ unless(tables_next_to_code(), skip) , when(wordsize(32), skip)
                 , grep_errmsg('(call)',[1]) ]
                 , compile, ['-ddump-cmm -dno-typeable-binds'])
 test('T23002', normal, compile, ['-fregs-graph'])
+test('T24264', [req_cmm, grep_errmsg(r'(.*\().*(\) returns to)', [1,2])],
+     compile, ['-O -ddump-cmm-from-stg -dno-typeable-binds'])


=====================================
testsuite/tests/codeGen/should_run/T24264run.hs
=====================================
@@ -0,0 +1,32 @@
+module Main where
+
+import Control.Exception (evaluate)
+import GHC.Exts (lazy, noinline)
+
+data StrictPair a b = !a :*: !b
+
+tailEval1 :: a -> IO a
+{-# OPAQUE tailEval1 #-}
+tailEval1 = lazy $ \x -> do
+  pure ()
+  pure $! x
+
+tailEval2 :: a -> IO a
+{-# OPAQUE tailEval2 #-}
+tailEval2 x = evaluate x
+
+go :: [a] -> IO ()
+go = noinline mapM_ $ \x -> do
+  y1 <- tailEval1 x
+  y2 <- tailEval2 x
+  evaluate (y1 :*: y2)
+
+main :: IO ()
+main = do
+  let ints :: [Int]
+      ints = take 1000 $ noinline iterate (\x -> x * 35) 1
+  go ints
+  go [LT, EQ, GT]
+  go $ noinline map (toEnum @Ordering . flip mod 3) ints
+  go $ noinline map Left ints
+  go $ noinline map (+)  ints


=====================================
testsuite/tests/codeGen/should_run/all.T
=====================================
@@ -240,3 +240,4 @@ test('MulMayOflo_full',
         ignore_stdout],
      multi_compile_and_run,
      ['MulMayOflo', [('MulMayOflo_full.cmm', '')], ''])
+test('T24264run', normal, compile_and_run, [''])


=====================================
testsuite/tests/driver/j-space/all.T
=====================================
@@ -1 +1 @@
-test('jspace', [extra_files(['genJspace']), req_target_smp, req_ghc_smp, js_broken(22573),when(arch('i386'), fragile(24261))], makefile_test, ['jspace'])
+test('jspace', [extra_files(['genJspace']), req_target_smp, req_ghc_smp, js_broken(22573)], makefile_test, ['jspace'])



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/8cf80131a6ad5d60114879db17738be445e07075...8d4c5048b555654c6ce1ae0037eb6f4092afc59f

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/8cf80131a6ad5d60114879db17738be445e07075...8d4c5048b555654c6ce1ae0037eb6f4092afc59f
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240101/84dc73d8/attachment-0001.html>


More information about the ghc-commits mailing list