[Git][ghc/ghc][master] 2 commits: Add HasCallStack to T23221

Marge Bot (@marge-bot) gitlab at gitlab.haskell.org
Thu Jul 4 15:10:05 UTC 2024



Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC


Commits:
4664997d by Teo Camarasu at 2024-07-04T11:09:18-04:00
Add HasCallStack to T23221

This makes the test a bit easier to debug

- - - - -
66919dcc by Teo Camarasu at 2024-07-04T11:09:18-04:00
rts: use live words to estimate heap size

We use live words rather than live blocks to determine the size of the
heap for determining memory retention.

Most of the time these two metrics align, but they can come apart in
normal usage when using the nonmoving collector.

The nonmoving collector leads to a lot of partially occupied blocks. So,
using live words is more accurate.

They can also come apart when the heap is suffering from high levels
fragmentation caused by small pinned objects, but in this case, the
block size is the more accurate metric. Since this case is best avoided
anyway. It is ok to accept the trade-off that we might try (and
probably) fail to return more memory in this case.

See also the Note [Statistics for retaining memory]

Resolves #23397

- - - - -


5 changed files:

- docs/users_guide/9.12.1-notes.rst
- rts/sm/GC.c
- rts/sm/Storage.c
- rts/sm/Storage.h
- testsuite/tests/rts/T23221.hs


Changes:

=====================================
docs/users_guide/9.12.1-notes.rst
=====================================
@@ -91,6 +91,11 @@ Runtime system
 
 - Reduce fragmentation incurred by the nonmoving GC's segment allocator. In one application this reduced resident set size by 26%. See :ghc-ticket:`24150`.
 
+- Memory return logic now uses live bytes rather than live blocks to measure the size of the heap.
+  This primarily affects the non-moving GC, which should now be more willing to return memory to the OS.
+  Users who have fine-tuned the :rts-flag:`-F ⟨factor⟩`, :rts-flag:`-Fd ⟨factor⟩`, or :rts-flag:`-O ⟨size⟩` flags,
+  and use the non-moving GC, should see if adjustments are needed in light of this change.
+
 - The new runtime flag :rts-flag:`--read-tix-file=\<yes|no\>` allows to modify whether a preexisting .tix file is read in at the beginning of a program run.
   The default is currently ``--read-tix-file=yes`` but will change to ``--read-tix-file=no`` in a future version of GHC.
   For this reason, a warning is emitted if a .tix file is read in implicitly. You can silence this warning by explicitly passing ``--read-tix-file=yes``.


=====================================
rts/sm/GC.c
=====================================
@@ -1005,10 +1005,15 @@ GarbageCollect (struct GcConfig config,
       need_copied_live = 0;
       need_uncopied_live = 0;
       for (i = 0; i < RtsFlags.GcFlags.generations; i++) {
-          need_copied_live += genLiveCopiedBlocks(&generations[i]);
-          need_uncopied_live += genLiveUncopiedBlocks(&generations[i]);
+          need_copied_live += genLiveCopiedWords(&generations[i]);
+          need_uncopied_live += genLiveUncopiedWords(&generations[i]);
       }
 
+      // Convert the live words into live blocks
+      // See Note [Statistics for retaining memory]
+      need_copied_live = BLOCK_ROUND_UP(need_copied_live) / BLOCK_SIZE_W;
+      need_uncopied_live = BLOCK_ROUND_UP(need_uncopied_live) / BLOCK_SIZE_W;
+
       debugTrace(DEBUG_gc, "(before) copied_live: %d; uncopied_live: %d", need_copied_live, need_uncopied_live );
 
 
@@ -1032,7 +1037,7 @@ GarbageCollect (struct GcConfig config,
 
       ASSERT(need_uncopied_live + need_copied_live >= RtsFlags.GcFlags.minOldGenSize );
 
-      debugTrace(DEBUG_gc, "(after) copyied_live: %d; uncopied_live: %d", need_copied_live, need_uncopied_live );
+      debugTrace(DEBUG_gc, "(after) copied_live: %d; uncopied_live: %d", need_copied_live, need_uncopied_live );
 
       need_prealloc = 0;
       for (i = 0; i < n_nurseries; i++) {
@@ -1070,7 +1075,7 @@ GarbageCollect (struct GcConfig config,
 
       W_ scaled_needed = ((scaled_factor + unavoidable_copied_need_factor) * need_copied_live)
                        + ((scaled_factor + unavoidable_uncopied_need_factor) * need_uncopied_live);
-      debugTrace(DEBUG_gc, "factors_2: %f %d", ((scaled_factor + unavoidable_copied_need_factor) * need_copied_live), ((scaled_factor + unavoidable_uncopied_need_factor) * need_uncopied_live));
+      debugTrace(DEBUG_gc, "factors_2: %f %f", ((scaled_factor + unavoidable_copied_need_factor) * need_copied_live), ((scaled_factor + unavoidable_uncopied_need_factor) * need_uncopied_live));
       need = need_prealloc + scaled_needed;
 
       /* Also, if user set heap size, do not drop below it.
@@ -2426,3 +2431,39 @@ bool doIdleGCWork(Capability *cap STG_UNUSED, bool all)
  *
 
 */
+
+/* Note [Statistics for retaining memory]
+*  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+*
+* At the end of GC, we want to determine the size of the heap in order to
+* determine the amount of memory we wish to return to the OS, or if we want
+* to increase the heap size to the minimum.
+*
+* There's two promising candidates for this metric: live words, and live blocks.
+*
+* Measuring live blocks is promising because blocks are the smallest unit
+* that the storage manager can (de)allocate.
+* Most of the time live words and live blocks are very similar.
+*
+* But the two metrics can come apart when the heap is dominated
+* by small pinned objects, or when using the non-moving collector.
+*
+* In both cases, this happens because objects cannot be copied, so
+* block occupancy can fall as objects in a block become garbage.
+* In situations like this, using live blocks to determine memory
+* retention behaviour can lead to us being overly conservative.
+*
+* Instead we use live words rounded up to the block size to measure
+* heap size. This gives us a more accurate picture of the heap.
+*
+* This works particularly well with the nonmoving collector as we
+* can reuse the space taken up by dead heap objects. This choice is less good
+* for fragmentation caused by a few pinned objects retaining blocks.
+* In that case, the block can only be reused if it is deallocated in its entirety.
+* And therefore using live blocks would be more accurate in this case.
+* We assume that this is relatively rare and when it does happen,
+* this fragmentation is a problem that should be addressed in its own right.
+*
+* See: #23397
+*
+*/


=====================================
rts/sm/Storage.c
=====================================
@@ -1645,11 +1645,36 @@ W_ countOccupied (bdescr *bd)
     return words;
 }
 
-// Returns the total number of live blocks
+// Returns the total number of live words
 W_ genLiveWords (generation *gen)
 {
-    return (gen->live_estimate ? gen->live_estimate : gen->n_words) +
-        gen->n_large_words + gen->n_compact_blocks * BLOCK_SIZE_W;
+    return genLiveCopiedWords(gen) + genLiveUncopiedWords(gen);
+}
+
+// The number of live words which will be copied by the copying collector.
+W_ genLiveCopiedWords (generation *gen)
+{
+  if (gen == oldest_gen && RtsFlags.GcFlags.useNonmoving){
+    // the non-moving generation doesn't contain any copied data
+    return 0;
+  } else {
+    return gen->live_estimate ? gen->live_estimate : gen->n_words;
+  }
+}
+
+// The number of live words which will not be copied by the copying collector
+// This includes data living in non-moving collector segments, compact blocks and large/pinned blocks.
+W_ genLiveUncopiedWords(generation *gen)
+{
+  W_ nonmoving_blocks = 0;
+  // The nonmoving heap contains some blocks that live outside the regular generation structure.
+  if (gen == oldest_gen && RtsFlags.GcFlags.useNonmoving){
+    nonmoving_blocks =
+        (gen->live_estimate ? gen->live_estimate : gen->n_words)
+      + nonmoving_large_words
+      + nonmoving_compact_words;
+  }
+  return gen->n_large_words + gen->n_compact_blocks * BLOCK_SIZE_W + nonmoving_blocks;
 }
 
 // The number of live blocks which will be copied by the copying collector.


=====================================
rts/sm/Storage.h
=====================================
@@ -111,6 +111,8 @@ StgWord gcThreadLiveWords  (uint32_t i, uint32_t g);
 StgWord gcThreadLiveBlocks (uint32_t i, uint32_t g);
 
 StgWord genLiveWords  (generation *gen);
+StgWord genLiveCopiedWords (generation *gen);
+StgWord genLiveUncopiedWords (generation *gen);
 StgWord genLiveBlocks (generation *gen);
 StgWord genLiveCopiedBlocks (generation *gen);
 StgWord genLiveUncopiedBlocks (generation *gen);


=====================================
testsuite/tests/rts/T23221.hs
=====================================
@@ -10,6 +10,7 @@ import Debug.Trace
 import Control.Monad
 import GHC.Stats
 import Data.Word
+import GHC.Stack (HasCallStack)
 
 -- This test is for checking the memory return behaviour of blocks which will be
 -- copied and blocks which are not copied (#23221)
@@ -25,6 +26,7 @@ main = do
 
 -- The upper_bound is the upper bound on how much total memory should be live at the end
 -- of the test as a factor of the expected live bytes.
+loop :: HasCallStack => (Int -> IO a) -> Double -> Double -> Int -> IO ()
 loop f lower_bound upper_bound n = do
   ba <- mapM (\_ -> f 128) [0..n]
   traceMarkerIO "Allocated_all"
@@ -39,7 +41,7 @@ loop f lower_bound upper_bound n = do
   evaluate (length (reverse ba'))
   return total_mem
 
-checkStats :: Double -> Double -> Int -> IO ()
+checkStats :: HasCallStack => Double -> Double -> Int -> IO ()
 checkStats lower_bound upper_bound n = do
   stats <- getRTSStats
   let expected_live_memory = fromIntegral n -- How many objects



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/0d170eafacba55325dc00d0434d4462275d4376e...66919dcca33b11eca182671ae299fe09effa92fb

-- 
This project does not include diff previews in email notifications.
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/0d170eafacba55325dc00d0434d4462275d4376e...66919dcca33b11eca182671ae299fe09effa92fb
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240704/bcdee05f/attachment-0001.html>


More information about the ghc-commits mailing list