[Git][ghc/ghc][wip/marge_bot_batch_merge_job] 6 commits: base: Document GHC versions associated with past base versions in the changelog

Marge Bot (@marge-bot) gitlab at gitlab.haskell.org
Fri Mar 24 23:32:42 UTC 2023



Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC


Commits:
ac9736f0 by Li-yao Xia at 2023-03-24T19:32:21-04:00
base: Document GHC versions associated with past base versions in the changelog

- - - - -
1ffa4bdb by Teo Camarasu at 2023-03-24T19:32:23-04:00
Add regression test for #17574

This test currently fails in the nonmoving way

- - - - -
822ef5d0 by Teo Camarasu at 2023-03-24T19:32:23-04:00
fix: account for large and compact object stats with nonmoving gc

Make sure that we keep track of the size of large and compact objects that have been moved onto the nonmoving heap.
We keep track of their size and add it to the amount of live bytes in nonmoving segments to get the total size of the live nonmoving heap.

Resolves #17574

- - - - -
25e085c1 by David Feuer at 2023-03-24T19:32:26-04:00
Modify ThreadId documentation and comments

For a long time, `GHC.Conc.Sync` has said

```haskell
-- ToDo: data ThreadId = ThreadId (Weak ThreadId#)
-- But since ThreadId# is unlifted, the Weak type must use open
-- type variables.
```

We are now actually capable of using `Weak# ThreadId#`, but the
world has moved on. To support the `Show` and `Ord` instances, we'd
need to store the thread ID number in the `ThreadId`. And it seems
very difficult to continue to support `threadStatus` in that regime,
since it needs to be able to explain how threads died. In addition,
garbage collection of weak references can be quite expensive, and it
would be hard to evaluate the cost over he whole ecosystem. As discussed
in
[this CLC issue](https://github.com/haskell/core-libraries-committee/issues/125),
it doesn't seem very likely that we'll actually switch to weak
references here.

- - - - -
38250725 by Ben Gamari at 2023-03-24T19:32:26-04:00
rts: Fix barriers of IND and IND_STATIC

Previously IND and IND_STATIC lacked the acquire barriers enjoyed by
BLACKHOLE. As noted in the (now updated) Note [Heap memory barriers],
this barrier is critical to ensure that the indirectee is visible to the
entering core.

Fixes #22872.

- - - - -
fe1bf91c by Bodigrim at 2023-03-24T19:32:33-04:00
Improve documentation of atomicModifyMutVar2#

- - - - -


13 changed files:

- compiler/GHC/Builtin/primops.txt.pp
- libraries/base/GHC/Conc/Sync.hs
- libraries/base/changelog.md
- rts/StgMiscClosures.cmm
- rts/include/stg/SMP.h
- rts/sm/NonMoving.c
- rts/sm/NonMoving.h
- rts/sm/NonMovingMark.c
- rts/sm/NonMovingMark.h
- rts/sm/Storage.c
- + testsuite/tests/rts/T17574.hs
- + testsuite/tests/rts/T17574.stdout
- testsuite/tests/rts/all.T


Changes:

=====================================
compiler/GHC/Builtin/primops.txt.pp
=====================================
@@ -2528,11 +2528,23 @@ primop  WriteMutVarOp "writeMutVar#"  GenPrimOp
 primop  AtomicModifyMutVar2Op "atomicModifyMutVar2#" GenPrimOp
    MutVar# s a -> (a -> c) -> State# s -> (# State# s, a, c #)
    { Modify the contents of a 'MutVar#', returning the previous
-     contents and the result of applying the given function to the
-     previous contents. Note that this isn't strictly
-     speaking the correct type for this function; it should really be
-     @'MutVar#' s a -> (a -> (a,b)) -> 'State#' s -> (# 'State#' s, a, (a, b) #)@,
-     but we don't know about pairs here. }
+     contents @x :: a@ and the result of applying the given function to the
+     previous contents @f x :: c at .
+
+     The @data@ type @c@ (not a @newtype@!) must be a record whose first field
+     is of lifted type @a :: Type@ and is not unpacked. For example, product
+     types @c ~ Solo a@ or @c ~ (a, b)@ work well. If the record type is both
+     monomorphic and strict in its first field, it's recommended to mark the
+     latter @{-# NOUNPACK #-}@ explicitly.
+
+     Under the hood 'atomicModifyMutVar2#' atomically replaces a pointer to an
+     old @x :: a@ with a pointer to a selector thunk @fst r@, where
+     @fst@ is a selector for the first field of the record and @r@ is a
+     function application thunk @r = f x at .
+
+     @atomicModifyIORef2Native@ from @atomic-modify-general@ package makes an
+     effort to reflect restrictions on @c@ faithfully, providing a
+     well-typed high-level wrapper.}
    with
    out_of_line = True
    has_side_effects = True


=====================================
libraries/base/GHC/Conc/Sync.hs
=====================================
@@ -133,9 +133,6 @@ infixr 0 `par`, `pseq`
 -----------------------------------------------------------------------------
 
 data ThreadId = ThreadId ThreadId#
--- ToDo: data ThreadId = ThreadId (Weak ThreadId#)
--- But since ThreadId# is unlifted, the Weak type must use open
--- type variables.
 {- ^
 A 'ThreadId' is an abstract type representing a handle to a thread.
 'ThreadId' is an instance of 'Eq', 'Ord' and 'Show', where
@@ -146,10 +143,9 @@ useful when debugging or diagnosing the behaviour of a concurrent
 program.
 
 /Note/: in GHC, if you have a 'ThreadId', you essentially have
-a pointer to the thread itself.  This means the thread itself can\'t be
-garbage collected until you drop the 'ThreadId'.
-This misfeature will hopefully be corrected at a later date.
-
+a pointer to the thread itself. This means the thread itself can\'t be
+garbage collected until you drop the 'ThreadId'. This misfeature would
+be difficult to correct while continuing to support 'threadStatus'.
 -}
 
 -- | @since 4.2.0.0


=====================================
libraries/base/changelog.md
=====================================
@@ -16,10 +16,11 @@
       ([CLC proposal #57](https://github.com/haskell/core-libraries-committee/issues/57))
 
 ## 4.18.0.0 *TBA*
-
+  * Shipped with GHC 9.6.1
   * `Foreign.C.ConstPtr.ConstrPtr` was added to encode `const`-qualified
     pointer types in foreign declarations when using `CApiFFI` extension. ([CLC proposal #117](https://github.com/haskell/core-libraries-committee/issues/117))
   * Add `forall a. Functor (p a)` superclass for `Bifunctor p` ([CLC proposal #91](https://github.com/haskell/core-libraries-committee/issues/91))
+  * Add `forall a. Functor (p a)` superclass for `Bifunctor p`.
   * Add Functor instances for `(,,,,) a b c d`, `(,,,,,) a b c d e` and
     `(,,,,,) a b c d e f`.
   * Exceptions thrown by weak pointer finalizers can now be reported by setting
@@ -91,6 +92,8 @@
 
 ## 4.17.0.0 *August 2022*
 
+  * Shipped with GHC 9.4.1
+
   * Add explicitly bidirectional `pattern TypeRep` to `Type.Reflection`.
 
   * Add `Generically` and `Generically1` to `GHC.Generics` for deriving generic
@@ -200,6 +203,8 @@
 
 ## 4.16.0.0 *Nov 2021*
 
+  * Shipped with GHC 9.2.1
+
   * The unary tuple type, `Solo`, is now exported by `Data.Tuple`.
 
   * Add a `Typeable` constraint to `fromStaticPtr` in the class `GHC.StaticPtr.IsStatic`.
@@ -260,6 +265,8 @@
 
 ## 4.15.0.0 *Feb 2021*
 
+  * Shipped with GHC 9.0.1
+
   * `openFile` now calls the `open` system call with an `interruptible` FFI
     call, ensuring that the call can be interrupted with `SIGINT` on POSIX
     systems.


=====================================
rts/StgMiscClosures.cmm
=====================================
@@ -521,6 +521,7 @@ INFO_TABLE(stg_IND,1,0,IND,"IND","IND")
     (P_ node)
 {
     TICK_ENT_DYN_IND(); /* tick */
+    ACQUIRE_FENCE;
     node = UNTAG(StgInd_indirectee(node));
     TICK_ENT_VIA_NODE();
     jump %GET_ENTRY(node) (node);
@@ -529,6 +530,7 @@ INFO_TABLE(stg_IND,1,0,IND,"IND","IND")
     /* explicit stack */
 {
     TICK_ENT_DYN_IND(); /* tick */
+    ACQUIRE_FENCE;
     R1 = UNTAG(StgInd_indirectee(R1));
     TICK_ENT_VIA_NODE();
     jump %GET_ENTRY(R1) [R1];
@@ -539,6 +541,7 @@ INFO_TABLE(stg_IND_STATIC,1,0,IND_STATIC,"IND_STATIC","IND_STATIC")
     /* explicit stack */
 {
     TICK_ENT_STATIC_IND(); /* tick */
+    ACQUIRE_FENCE;
     R1 = UNTAG(StgInd_indirectee(R1));
     TICK_ENT_VIA_NODE();
     jump %GET_ENTRY(R1) [R1];


=====================================
rts/include/stg/SMP.h
=====================================
@@ -214,23 +214,22 @@ EXTERN_INLINE void load_load_barrier(void);
  * examining a thunk being updated can see the indirectee. Consequently, a
  * thunk update (see rts/Updates.h) does the following:
  *
- *  1. Use a release-fence to ensure that the indirectee is visible
- *  2. Use a relaxed-store to place the new indirectee into the thunk's
+ *  1. Use a relaxed-store to place the new indirectee into the thunk's
  *     indirectee field
- *  3. use a release-store to set the info table to stg_BLACKHOLE (which
+ *  2. use a release-store to set the info table to stg_BLACKHOLE (which
  *     represents an indirection)
  *
  * Blackholing a thunk (either eagerly, by GHC.StgToCmm.Bind.emitBlackHoleCode,
  * or lazily, by ThreadPaused.c:threadPaused) is done similarly.
  *
- * Conversely, thunk entry (see the entry code of stg_BLACKHOLE in
- * rts/StgMiscClosure) does the following:
+ * Conversely, indirection entry (see the entry code of stg_BLACKHOLE, stg_IND,
+ * and stg_IND_STATIC in rts/StgMiscClosure.cmm) does the following:
  *
- *  1. We jump into the entry code for stg_BLACKHOLE; this of course implies
- *     that we have already read the thunk's info table pointer, which is done
- *     with a relaxed load.
+ *  1. We jump into the entry code for, e.g., stg_BLACKHOLE; this of course
+ *     implies that we have already read the thunk's info table pointer, which
+ *     is done with a relaxed load.
  *  2. use an acquire-fence to ensure that our view on the thunk is
- *     up-to-date. This synchronizes with step (3) in the update
+ *     up-to-date. This synchronizes with step (2) in the update
  *     procedure.
  *  3. relaxed-load the indirectee. Since thunks are updated at most
  *     once we know that the fence in the last step has given us


=====================================
rts/sm/NonMoving.c
=====================================
@@ -395,7 +395,8 @@ Mutex concurrent_coll_finished_lock;
  * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  * The nonmoving collector uses an approximate heuristic for reporting live
  * data quantity. Specifically, during mark we record how much live data we
- * find in nonmoving_live_words. At the end of mark we declare this amount to
+ * find in nonmoving_segment_live_words. At the end of mark this is combined with nonmoving_large_words
+ * and nonmoving_compact_words, and we declare this amount to
  * be how much live data we have on in the nonmoving heap (by setting
  * oldest_gen->live_estimate).
  *
@@ -540,7 +541,7 @@ Mutex concurrent_coll_finished_lock;
  *
  */
 
-memcount nonmoving_live_words = 0;
+memcount nonmoving_segment_live_words = 0;
 
 // See Note [Sync phase marking budget].
 MarkBudget sync_phase_marking_budget = 200000;
@@ -682,10 +683,11 @@ static void nonmovingPrepareMark(void)
         dbl_link_onto(bd, &nonmoving_large_objects);
     }
     n_nonmoving_large_blocks += oldest_gen->n_large_blocks;
+    nonmoving_large_words += oldest_gen->n_large_words;
     oldest_gen->large_objects = NULL;
     oldest_gen->n_large_words = 0;
     oldest_gen->n_large_blocks = 0;
-    nonmoving_live_words = 0;
+    nonmoving_segment_live_words = 0;
 
     // Clear compact object mark bits
     for (bdescr *bd = nonmoving_compact_objects; bd; bd = bd->link) {
@@ -700,6 +702,7 @@ static void nonmovingPrepareMark(void)
         dbl_link_onto(bd, &nonmoving_compact_objects);
     }
     n_nonmoving_compact_blocks += oldest_gen->n_compact_blocks;
+    nonmoving_compact_words += oldest_gen->n_compact_blocks * BLOCK_SIZE_W;
     oldest_gen->n_compact_blocks = 0;
     oldest_gen->compact_objects = NULL;
     // TODO (osa): what about "in import" stuff??
@@ -1053,7 +1056,9 @@ concurrent_marking:
     freeMarkQueue(mark_queue);
     stgFree(mark_queue);
 
-    oldest_gen->live_estimate = nonmoving_live_words;
+    nonmoving_large_words = countOccupied(nonmoving_marked_large_objects);
+    nonmoving_compact_words = n_nonmoving_marked_compact_blocks * BLOCK_SIZE_W;
+    oldest_gen->live_estimate = nonmoving_segment_live_words + nonmoving_large_words + nonmoving_compact_words;
     oldest_gen->n_old_blocks = 0;
     resizeGenerations();
 


=====================================
rts/sm/NonMoving.h
=====================================
@@ -122,7 +122,7 @@ struct NonmovingHeap {
 
 extern struct NonmovingHeap nonmovingHeap;
 
-extern memcount nonmoving_live_words;
+extern memcount nonmoving_segment_live_words;
 
 #if defined(THREADED_RTS)
 extern bool concurrent_coll_running;


=====================================
rts/sm/NonMovingMark.c
=====================================
@@ -76,6 +76,10 @@ static bool is_nonmoving_weak(StgWeak *weak);
  * consequently will trace the pointers of only one object per block. However,
  * this is okay since the only type of pinned object supported by GHC is the
  * pinned ByteArray#, which has no pointers.
+ *
+ * We need to take care that the stats department is made aware of the amount of
+ * live large (and compact) objects, since they no longer live on gen[i]->large_objects.
+ * Failing to do so caused #17574.
  */
 
 bdescr *nonmoving_large_objects = NULL;
@@ -83,6 +87,9 @@ bdescr *nonmoving_marked_large_objects = NULL;
 memcount n_nonmoving_large_blocks = 0;
 memcount n_nonmoving_marked_large_blocks = 0;
 
+memcount nonmoving_large_words = 0;
+memcount nonmoving_compact_words = 0;
+
 bdescr *nonmoving_compact_objects = NULL;
 bdescr *nonmoving_marked_compact_objects = NULL;
 memcount n_nonmoving_compact_blocks = 0;
@@ -1745,7 +1752,7 @@ mark_closure (MarkQueue *queue, const StgClosure *p0, StgClosure **origin)
         struct NonmovingSegment *seg = nonmovingGetSegment((StgPtr) p);
         nonmoving_block_idx block_idx = nonmovingGetBlockIdx((StgPtr) p);
         nonmovingSetMark(seg, block_idx);
-        nonmoving_live_words += nonmovingSegmentBlockSize(seg) / sizeof(W_);
+        nonmoving_segment_live_words += nonmovingSegmentBlockSize(seg) / sizeof(W_);
     }
 
     // If we found a indirection to shortcut keep going.


=====================================
rts/sm/NonMovingMark.h
=====================================
@@ -127,6 +127,11 @@ extern bdescr *nonmoving_large_objects, *nonmoving_marked_large_objects,
 extern memcount n_nonmoving_large_blocks, n_nonmoving_marked_large_blocks,
                 n_nonmoving_compact_blocks, n_nonmoving_marked_compact_blocks;
 
+// The size of live large/compact objects in words.
+// Only updated at the end of nonmoving GC.
+extern memcount nonmoving_large_words,
+                nonmoving_compact_words;
+
 extern StgTSO *nonmoving_old_threads;
 extern StgWeak *nonmoving_old_weak_ptr_list;
 extern StgTSO *nonmoving_threads;


=====================================
rts/sm/Storage.c
=====================================
@@ -42,6 +42,7 @@
 #include "GC.h"
 #include "Evac.h"
 #include "NonMovingAllocate.h"
+#include "sm/NonMovingMark.h"
 #if defined(ios_HOST_OS) || defined(darwin_HOST_OS)
 #include "Hash.h"
 #endif
@@ -1615,7 +1616,12 @@ W_ genLiveWords (generation *gen)
 
 W_ genLiveBlocks (generation *gen)
 {
-    return gen->n_blocks + gen->n_large_blocks + gen->n_compact_blocks;
+  W_ nonmoving_blocks = 0;
+  // The nonmoving heap contains some blocks that live outside the regular generation structure.
+  if (gen == oldest_gen && RtsFlags.GcFlags.useNonmoving){
+    nonmoving_blocks = n_nonmoving_large_blocks + n_nonmoving_marked_large_blocks + n_nonmoving_compact_blocks + n_nonmoving_marked_compact_blocks;
+  }
+  return gen->n_blocks + gen->n_large_blocks + gen->n_compact_blocks + nonmoving_blocks;
 }
 
 W_ gcThreadLiveWords (uint32_t i, uint32_t g)
@@ -1711,6 +1717,9 @@ StgWord calcTotalLargeObjectsW (void)
     for (g = 0; g < RtsFlags.GcFlags.generations; g++) {
         totalW += generations[g].n_large_words;
     }
+
+    totalW += nonmoving_large_words;
+
     return totalW;
 }
 
@@ -1722,6 +1731,9 @@ StgWord calcTotalCompactW (void)
     for (g = 0; g < RtsFlags.GcFlags.generations; g++) {
         totalW += generations[g].n_compact_blocks * BLOCK_SIZE_W;
     }
+
+    totalW += nonmoving_compact_words;
+
     return totalW;
 }
 


=====================================
testsuite/tests/rts/T17574.hs
=====================================
@@ -0,0 +1,40 @@
+-- | Check that large objects are properly accounted for by GHC.Stats
+module Main (main) where
+
+import Control.Monad
+import Control.Exception
+import Control.Concurrent
+import System.Mem
+import System.Exit
+import GHC.Stats
+import GHC.Compact
+import Data.List (replicate)
+
+import qualified Data.ByteString.Char8 as BS
+
+doGC :: IO ()
+doGC = do
+  performMajorGC
+  threadDelay 1000 -- small delay to allow GC to run when using concurrent gc
+
+main :: IO ()
+main = do
+  let size = 4096*2
+  largeString <- evaluate $ BS.replicate size 'A'
+  compactString <- compact $ replicate size 'A'
+  doGC
+  doGC -- run GC twice to make sure the objects end up in the oldest gen
+  stats <- getRTSStats
+  let large_obj_bytes = gcdetails_large_objects_bytes $ gc stats
+  let compact_obj_bytes = gcdetails_compact_bytes $ gc stats
+  -- assert that large_obj_bytes is at least as big as size
+  -- this indicates that `largeString` is being accounted for by the stats department
+  when (large_obj_bytes < fromIntegral size) $ do
+    putStrLn $ "large_obj_bytes is: " <> show large_obj_bytes <> " but expected at least: " <> show size
+    exitFailure
+  when (compact_obj_bytes < fromIntegral size) $ do
+    putStrLn $ "compact_obj_bytes is: " <> show large_obj_bytes <> " but expected at least: " <> show size
+    exitFailure
+  -- keep them alive
+  print $ BS.length largeString
+  print $ length $ getCompact compactString


=====================================
testsuite/tests/rts/T17574.stdout
=====================================
@@ -0,0 +1,2 @@
+8192
+8192


=====================================
testsuite/tests/rts/all.T
=====================================
@@ -573,3 +573,5 @@ test('decodeMyStack_emptyListForMissingFlag',
 test('T22795a', [only_ways(['normal']), js_skip], compile_and_run, ['-threaded'])
 test('T22795b', [only_ways(['normal']), js_skip], compile_and_run, ['-single-threaded'])
 test('T22795c', [only_ways(['normal']), js_skip], compile_and_run, ['-threaded -single-threaded'])
+
+test('T17574', [js_skip], compile_and_run, ['-with-rtsopts -T'])



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/08285a6a2f84613b1ec439cd47a9cc6ed4427aac...fe1bf91cd3ab288f29ea956f772d3d6b2134a56c

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/08285a6a2f84613b1ec439cd47a9cc6ed4427aac...fe1bf91cd3ab288f29ea956f772d3d6b2134a56c
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20230324/c72edef6/attachment-0001.html>


More information about the ghc-commits mailing list