[Git][ghc/ghc][wip/marge_bot_batch_merge_job] 7 commits: rts: Use pthread_setname_np correctly on Darwin

Marge Bot (@marge-bot) gitlab at gitlab.haskell.org
Mon Oct 17 10:48:11 UTC 2022



Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC


Commits:
ee0deb80 by Ben Gamari at 2022-10-14T18:29:20-04:00
rts: Use pthread_setname_np correctly on Darwin

As noted in #22206, pthread_setname_np on Darwin only supports
setting the name of the calling thread. Consequently we must introduce
a trampoline which first sets the thread name before entering the thread
entrypoint.
- - - - -
8eff62a4 by Ben Gamari at 2022-10-14T18:29:57-04:00
testsuite: Add test for #22282

This will complement mpickering's more general port of foundation's
numerical testsuite, providing a test for the specific case found
in #22282.

- - - - -
62a55001 by Ben Gamari at 2022-10-14T18:29:57-04:00
ncg/aarch64: Fix sub-word sign extension yet again

In adc7f108141a973b6dcb02a7836eed65d61230e8 we fixed a number of issues
to do with sign extension in the AArch64 NCG found by ghc/test-primops>.
However, this patch made a critical error, assuming that getSomeReg
would allocate a fresh register for the result of its evaluation.
However, this is not the case as `getSomeReg (CmmReg r) == r`.
Consequently, any mutation of the register returned by `getSomeReg` may
have unwanted side-effects on other expressions also mentioning `r`. In
the fix listed above, this manifested as the registers containing the
operands of binary arithmetic operations being incorrectly
sign-extended. This resulted in #22282.

Sadly, the rather simple structure of the tests generated
by `test-primops` meant that this particular case was not exercised.
Even more surprisingly, none of our testsuite caught this case.

Here we fix this by ensuring that intermediate sign extension is
performed in a fresh register.

Fixes #22282.

- - - - -
54e41b16 by Teo Camarasu at 2022-10-15T18:09:24+01:00
rts: ensure we are below maxHeapSize after returning megablocks

When the heap is heavily block fragmented the live byte size might be
low while the memory usage is high. We want to ensure that heap overflow
triggers in these cases.

We do so by checking that we can return enough megablocks to
under maxHeapSize at the end of GC.

- - - - -
29bb90db by Teo Camarasu at 2022-10-15T18:09:24+01:00
rts: trigger a major collection if megablock usage exceeds maxHeapSize

When the heap is suffering from block fragmentation, live bytes might be
low while megablock usage is high.

If megablock usage exceeds maxHeapSize, we want to trigger a major GC to
try to recover some memory otherwise we will die from a heapOverflow at
the end of the GC.

Fixes #21927

- - - - -
4a4641ca by Teo Camarasu at 2022-10-15T18:11:29+01:00
Add realease note for #21927

- - - - -
e09bd1d3 by Sebastian Graf at 2022-10-17T06:48:00-04:00
DmdAnal: Look through unfoldings of DataCon wrappers (#22241)

Previously, the demand signature we computed upfront for a DataCon wrapper

lacked boxity information and was much less precise than the demand transformer

for the DataCon worker.

In this patch we adopt the solution to look through unfoldings of DataCon

wrappers during Demand Analysis, but still attach a demand signature for other

passes such as the Simplifier.

See `Note [DmdAnal for DataCon wrappers]` for more details.

Fixes #22241.

- - - - -


15 changed files:

- compiler/GHC/CmmToAsm/AArch64/CodeGen.hs
- compiler/GHC/Core/Opt/CprAnal.hs
- compiler/GHC/Core/Opt/DmdAnal.hs
- compiler/GHC/Types/Id/Make.hs
- docs/users_guide/9.6.1-notes.rst
- rts/Schedule.c
- rts/posix/OSThreads.c
- rts/sm/GC.c
- + testsuite/tests/numeric/should_run/T22282.hs
- + testsuite/tests/numeric/should_run/T22282.stdout
- + testsuite/tests/numeric/should_run/T22282A.hs
- testsuite/tests/numeric/should_run/all.T
- + testsuite/tests/stranal/sigs/T22241.hs
- + testsuite/tests/stranal/sigs/T22241.stderr
- testsuite/tests/stranal/sigs/all.T


Changes:

=====================================
compiler/GHC/CmmToAsm/AArch64/CodeGen.hs
=====================================
@@ -664,10 +664,11 @@ getRegister' config plat expr
         -- See Note [Signed arithmetic on AArch64].
         negate code w reg = do
             let w' = opRegWidth w
+            (reg', code_sx) <- signExtendReg w w' reg
             return $ Any (intFormat w) $ \dst ->
                 code `appOL`
-                signExtendReg w w' reg `snocOL`
-                NEG (OpReg w' dst) (OpReg w' reg) `appOL`
+                code_sx `snocOL`
+                NEG (OpReg w' dst) (OpReg w' reg') `appOL`
                 truncateReg w' w dst
 
         ss_conv from to reg code =
@@ -817,15 +818,17 @@ getRegister' config plat expr
               -- should be performed.
               let w' = opRegWidth w
                   signExt r
-                    | not is_signed  = nilOL
+                    | not is_signed  = return (r, nilOL)
                     | otherwise      = signExtendReg w w' r
+              (reg_x_sx, code_x_sx) <- signExt reg_x
+              (reg_y_sx, code_y_sx) <- signExt reg_y
               return $ Any (intFormat w) $ \dst ->
                   code_x `appOL`
                   code_y `appOL`
                   -- sign-extend both operands
-                  signExt reg_x `appOL`
-                  signExt reg_y `appOL`
-                  op (OpReg w' dst) (OpReg w' reg_x) (OpReg w' reg_y) `appOL`
+                  code_x_sx `appOL`
+                  code_y_sx `appOL`
+                  op (OpReg w' dst) (OpReg w' reg_x_sx) (OpReg w' reg_y_sx) `appOL`
                   truncateReg w' w dst -- truncate back to the operand's original width
 
           floatOp w op = do
@@ -1021,16 +1024,21 @@ getRegister' config plat expr
 
 -- | Instructions to sign-extend the value in the given register from width @w@
 -- up to width @w'@.
-signExtendReg :: Width -> Width -> Reg -> OrdList Instr
+signExtendReg :: Width -> Width -> Reg -> NatM (Reg, OrdList Instr)
 signExtendReg w w' r =
     case w of
-      W64 -> nilOL
+      W64 -> noop
       W32
-        | w' == W32 -> nilOL
-        | otherwise -> unitOL $ SXTH (OpReg w' r) (OpReg w' r)
-      W16           -> unitOL $ SXTH (OpReg w' r) (OpReg w' r)
-      W8            -> unitOL $ SXTB (OpReg w' r) (OpReg w' r)
+        | w' == W32 -> noop
+        | otherwise -> extend SXTH
+      W16           -> extend SXTH
+      W8            -> extend SXTB
       _             -> panic "intOp"
+  where
+    noop = return (r, nilOL)
+    extend instr = do
+        r' <- getNewRegNat II64
+        return (r', unitOL $ instr (OpReg w' r') (OpReg w' r))
 
 -- | Instructions to truncate the value in the given register from width @w@
 -- down to width @w'@.


=====================================
compiler/GHC/Core/Opt/CprAnal.hs
=====================================
@@ -553,13 +553,16 @@ analysing their unfolding. A few reasons for the change:
      *workers*, because their transformers need to adapt to CPR for their
      arguments in 'cprTransformDataConWork' to enable Note [Nested CPR].
      Better keep it all in this module! The alternative would be that
-     'GHC.Types.Id.Make' depends on DmdAnal.
+     'GHC.Types.Id.Make' depends on CprAnal.
   3. In the future, Nested CPR could take a better account of incoming args
      in cprAnalApp and do some beta-reduction on the fly, like !1866 did. If
      any of those args had the CPR property, then we'd even get Nested CPR for
      DataCon wrapper calls, for free. Not so if we simply give the wrapper a
      single CPR sig in 'GHC.Types.Id.Make.mkDataConRep'!
 
+DmdAnal also looks through the wrapper's unfolding:
+See Note [DmdAnal for DataCon wrappers].
+
 Note [Trimming to mAX_CPR_SIZE]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 We do not treat very big tuples as CPR-ish:


=====================================
compiler/GHC/Core/Opt/DmdAnal.hs
=====================================
@@ -985,6 +985,10 @@ dmdTransform env var sd
   | isDataConWorkId var
   = -- pprTraceWith "dmdTransform:DataCon" (\ty -> ppr var $$ ppr sd $$ ppr ty) $
     dmdTransformDataConSig (idArity var) sd
+  -- See Note [DmdAnal for DataCon wrappers]
+  | isDataConWrapId var, let rhs = uf_tmpl (realIdUnfolding var)
+  , WithDmdType dmd_ty _rhs' <- dmdAnal env sd rhs
+  = dmd_ty
   -- Dictionary component selectors
   -- Used to be controlled by a flag.
   -- See #18429 for some perf measurements.
@@ -1388,6 +1392,45 @@ Now f's optimised RHS will be \x.a, but if we change g to (error "..")
 disaster.  But regardless, #18638 was a more complicated version of
 this, that actually happened in practice.
 
+Note [DmdAnal for DataCon wrappers]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+We give DataCon wrappers a (necessarily flat) demand signature in
+`GHC.Types.Id.Make.mkDataConRep`, so that passes such as the Simplifier can
+exploit it via the call to `GHC.Core.Opt.Simplify.Utils.isStrictArgInfo` in
+`GHC.Core.Opt.Simplify.Iteration.rebuildCall`. But during DmdAnal, we *ignore*
+the demand signature of a DataCon wrapper, and instead analyse its unfolding at
+every call site.
+
+The reason is that DataCon *worker*s have very precise demand transformers,
+computed by `dmdTransformDataConSig`. It would be awkward if DataCon *wrappers*
+would behave much less precisely during DmdAnal. Example:
+
+   data T1 = MkT1 { get_x1 :: Int,  get_y1 :: Int }
+   data T2 = MkT2 { get_x2 :: !Int, get_y2 :: Int }
+   f1 x y = get_x1 (MkT1 x y)
+   f2 x y = get_x2 (MkT2 x y)
+
+Here `MkT1` has no wrapper. `get_x1` puts a demand `!P(1!L,A)` on its argument,
+and `dmdTransformDataConSig` will transform that demand to an absent demand on
+`y` in `f1` and an unboxing demand on `x`.
+But `MkT2` has a wrapper (to evaluate the first field). If demand analysis deals
+with `MkT2` only through its demand signature, demand signatures can't transform
+an incoming demand `P(1!L,A)` in a useful way, so we won't get an absent demand
+on `y` in `f2` or see that `x` can be unboxed. That's a serious loss.
+
+The example above will not actually occur, because $WMkT2 would be inlined.
+Nevertheless, we can get interesting sub-demands on DataCon wrapper
+applications in boring contexts; see T22241.
+
+You might worry about the efficiency cost of demand-analysing datacon wrappers
+at every call site. But in fact they are inlined /anyway/ in the Final phase,
+which happens before DmdAnal, so few wrappers remain. And analysing the
+unfoldings for the remaining calls (which are those in a boring context) will be
+exactly as (in)efficent as if we'd inlined those calls. It turns out to be not
+measurable in practice.
+
+See also Note [CPR for DataCon wrappers] in `GHC.Core.Opt.CprAnal`.
+
 Note [Boxity for bottoming functions]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Consider (A)


=====================================
compiler/GHC/Types/Id/Make.hs
=====================================
@@ -477,9 +477,9 @@ mkDictSelId name clas
                -- See Note [Type classes and linear types]
 
     base_info = noCafIdInfo
-                `setArityInfo`          1
-                `setDmdSigInfo`     strict_sig
-                `setCprSigInfo`            topCprSig
+                `setArityInfo`  1
+                `setDmdSigInfo` strict_sig
+                `setCprSigInfo` topCprSig
 
     info | new_tycon
          = base_info `setInlinePragInfo` alwaysInlinePragma
@@ -697,6 +697,8 @@ mkDataConRep dc_bang_opts fam_envs wrap_name data_con
                              -- does not tidy the IdInfo of implicit bindings (like the wrapper)
                              -- so it not make sure that the CAF info is sane
 
+             -- The signature is purely for passes like the Simplifier, not for
+             -- DmdAnal itself; see Note [DmdAnal for DataCon wrappers].
              wrap_sig = mkClosedDmdSig wrap_arg_dmds topDiv
 
              wrap_arg_dmds =
@@ -1321,9 +1323,9 @@ mkFCallId uniq fcall ty
     name = mkFCallName uniq occ_str
 
     info = noCafIdInfo
-           `setArityInfo`          arity
-           `setDmdSigInfo`     strict_sig
-           `setCprSigInfo`            topCprSig
+           `setArityInfo`  arity
+           `setDmdSigInfo` strict_sig
+           `setCprSigInfo` topCprSig
 
     (bndrs, _) = tcSplitPiTys ty
     arity      = count isAnonTyCoBinder bndrs


=====================================
docs/users_guide/9.6.1-notes.rst
=====================================
@@ -121,6 +121,12 @@ Runtime system
   by library authors directly, who may wrap them a safe API that maintains the
   necessary invariants. See the documentation in ``GHC.Prim`` for more details.
 
+- The behaviour of the ``-M`` flag has been made more strict. It will now trigger
+  a heap overflow if the total amount of memory used by the Haskell heap exceeds the limit.
+  Previously only live blocks were taken into account.
+  This makes it more likely to trigger promptly when the heap is highly fragmented.
+
+
 ``base`` library
 ~~~~~~~~~~~~~~~~
 


=====================================
rts/Schedule.c
=====================================
@@ -1592,9 +1592,13 @@ scheduleDoGC (Capability **pcap, Task *task USED_IF_THREADS,
 
     heap_census = scheduleNeedHeapProfile(true);
 
+    // We force a major collection if the size of the heap exceeds maxHeapSize.
+    // We will either return memory until we are below maxHeapSize or trigger heapOverflow.
+    bool mblock_overflow = RtsFlags.GcFlags.maxHeapSize != 0 && mblocks_allocated > BLOCKS_TO_MBLOCKS(RtsFlags.GcFlags.maxHeapSize);
+
     // Figure out which generation we are collecting, so that we can
     // decide whether this is a parallel GC or not.
-    collect_gen = calcNeeded(force_major || heap_census, NULL);
+    collect_gen = calcNeeded(force_major || heap_census || mblock_overflow , NULL);
     major_gc = (collect_gen == RtsFlags.GcFlags.generations-1);
 
 #if defined(THREADED_RTS)


=====================================
rts/posix/OSThreads.c
=====================================
@@ -186,22 +186,48 @@ shutdownThread(void)
   pthread_exit(NULL);
 }
 
-int
-createOSThread (OSThreadId* pId, char *name STG_UNUSED,
-                OSThreadProc *startProc, void *param)
+struct ThreadDesc {
+    OSThreadProc *startProc;
+    void *param;
+    char *name;
+};
+
+// N.B. Darwin's pthread_setname_np only allows the name of the
+// calling thread to be set. Consequently we must use this
+// trampoline.
+static void *
+start_thread (void *param)
 {
-  int result = pthread_create(pId, NULL, startProc, param);
-  if (!result) {
-    pthread_detach(*pId);
+    struct ThreadDesc desc = *(struct ThreadDesc *) param;
+    stgFree(param);
+
 #if defined(HAVE_PTHREAD_SET_NAME_NP)
-    pthread_set_name_np(*pId, name);
+    pthread_set_name_np(pthread_self(), desc.name);
 #elif defined(HAVE_PTHREAD_SETNAME_NP)
-    pthread_setname_np(*pId, name);
+    pthread_setname_np(pthread_self(), desc.name);
 #elif defined(HAVE_PTHREAD_SETNAME_NP_DARWIN)
-    pthread_setname_np(name);
+    pthread_setname_np(desc.name);
 #elif defined(HAVE_PTHREAD_SETNAME_NP_NETBSD)
-    pthread_setname_np(*pId, "%s", name);
+    pthread_setname_np(pthread_self(), "%s", desc.name);
 #endif
+
+    return desc.startProc(desc.param);
+}
+
+int
+createOSThread (OSThreadId* pId, char *name STG_UNUSED,
+                OSThreadProc *startProc, void *param)
+{
+  struct ThreadDesc *desc = stgMallocBytes(sizeof(struct ThreadDesc), "createOSThread");
+  desc->startProc = startProc;
+  desc->param = param;
+  desc->name = name;
+
+  int result = pthread_create(pId, NULL, start_thread, desc);
+  if (!result) {
+      pthread_detach(*pId);
+  } else {
+      stgFree(desc);
   }
   return result;
 }


=====================================
rts/sm/GC.c
=====================================
@@ -1061,6 +1061,13 @@ GarbageCollect (uint32_t collect_gen,
           returned = returnMemoryToOS(got - need);
       }
       traceEventMemReturn(cap, got, need, returned);
+
+      // Ensure that we've returned enough mblocks to place us under maxHeapSize.
+      // This may fail for instance due to block fragmentation.
+      W_ after = got - returned;
+      if (RtsFlags.GcFlags.maxHeapSize != 0 && after > BLOCKS_TO_MBLOCKS(RtsFlags.GcFlags.maxHeapSize)) {
+        heapOverflow();
+      }
   }
 
   // extra GC trace info


=====================================
testsuite/tests/numeric/should_run/T22282.hs
=====================================
@@ -0,0 +1,3 @@
+import T22282A
+
+main = print $ testF 217 161


=====================================
testsuite/tests/numeric/should_run/T22282.stdout
=====================================
@@ -0,0 +1,2 @@
+217
+


=====================================
testsuite/tests/numeric/should_run/T22282A.hs
=====================================
@@ -0,0 +1,18 @@
+{-# OPTIONS_GHC -O1 #-}
+{-# LANGUAGE MagicHash #-}
+module T22282A where
+
+import Data.Word
+import GHC.Prim
+import GHC.Word
+
+wtestF :: GHC.Prim.Word8# -> GHC.Prim.Word8# -> GHC.Prim.Word8#
+wtestF a b = case word8ToWord# b of
+  0## -> a
+  _   -> plusWord8# (timesWord8# (quotWord8# a b) b) (remWord8# a b)
+{-# NOINLINE wtestF #-}
+
+testF :: Word8 -> Word8 -> Word8
+testF (W8# a) (W8# b) = W8# (wtestF a b)
+{-# INLINE testF #-}
+


=====================================
testsuite/tests/numeric/should_run/all.T
=====================================
@@ -78,3 +78,4 @@ test('T19931', normal, compile_and_run, ['-O2'])
 test('IntegerToFloat', normal, compile_and_run, [''])
 
 test('T20291', normal, compile_and_run, [''])
+test('T22282', normal, compile_and_run, [''])


=====================================
testsuite/tests/stranal/sigs/T22241.hs
=====================================
@@ -0,0 +1,31 @@
+module T22241 where
+
+data D = D { unD :: !Int }
+
+-- We should unbox y here, which only happens if DmdAnal sees that $WD will
+-- unbox it.
+f :: Bool -> Int -> D
+f x y = D (go x)
+  where
+    go False = y
+    go True  = go False
+{-# NOINLINE f #-}
+
+
+
+data T a = T Int !a
+get (T _ x) = x
+
+-- Here, the goal is to discard `unD (f True z)` and thus `z` as absent by
+-- looking through $WT in `j` *during the first pass of DmdAnal*!
+g :: Bool -> Int -> Int -> Bool
+g x y z | even y    = get (fst t)
+        | y > 13    = not (get (fst t))
+        | otherwise = False
+  where
+    t | x         = j (unD (f True z))
+      | otherwise = j (unD (f False z))
+      where
+        j a = (T a x, True)
+        {-# NOINLINE j #-}
+{-# NOINLINE g #-}


=====================================
testsuite/tests/stranal/sigs/T22241.stderr
=====================================
@@ -0,0 +1,24 @@
+
+==================== Strictness signatures ====================
+T22241.f: <1L><S!P(L)>
+T22241.g: <L><1!P(L)><A>
+T22241.get: <1!P(A,1L)>
+T22241.unD: <1!P(L)>
+
+
+
+==================== Cpr signatures ====================
+T22241.f: 1
+T22241.g:
+T22241.get:
+T22241.unD: 1
+
+
+
+==================== Strictness signatures ====================
+T22241.f: <1L><1!P(SL)>
+T22241.g: <ML><1!P(L)><A>
+T22241.get: <1!P(A,1L)>
+T22241.unD: <1!P(L)>
+
+


=====================================
testsuite/tests/stranal/sigs/all.T
=====================================
@@ -37,3 +37,4 @@ test('T21717', normal, compile, [''])
 test('T21754', normal, compile, [''])
 test('T21888', normal, compile, [''])
 test('T21888a', normal, compile, [''])
+test('T22241', normal, compile, [''])



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/eb334f8e669a0d7ae3f6ba3e3089af7a116d224d...e09bd1d388e06a978410886d3d998ba5ce64e1cf

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/eb334f8e669a0d7ae3f6ba3e3089af7a116d224d...e09bd1d388e06a978410886d3d998ba5ce64e1cf
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20221017/ee400d9a/attachment-0001.html>


More information about the ghc-commits mailing list