[Git][ghc/ghc][wip/gc/T18016] nonmoving: Fix handling of dirty objects

Ben Gamari gitlab at gitlab.haskell.org
Thu Apr 30 16:44:08 UTC 2020



Ben Gamari pushed to branch wip/gc/T18016 at Glasgow Haskell Compiler / GHC


Commits:
3bc91b7a by Ben Gamari at 2020-04-30T12:43:54-04:00
nonmoving: Fix handling of dirty objects

Previously we (incorrectly) relied on failed_to_evac to be "precise".
That is, we expected it to be true only if *all* of an object's fields
lived outside of the non-moving heap. However, this does not match the
behavior of failed_to_evac, which is true if *any* of the object's
fields weren't promoted (meaning that some of the other fields *may*
live in the non-moving heap).

This is problematic as we skip the non-moving write barrier for dirty
objects (which we can only safely do if *all* fields point outside of
the non-moving heap).

Clearly this arises due to a fundamental difference in the behavior
expected of failed_to_evac by the moving and non-moving collectors.
For example, in the moving collector it is always safe to conservatively
set failed_to_evac=true, whereas in the non-moving collector the safe
value is false.

This issue went unnoticed as I never wrote down the dirtiness
invariant enforced by the non-moving collector. We now define this
invariant as

    An object being marked as dirty implies that all of its fields are
    on the mark queue (or, equivalently, update remembered set).

To maintain this invariant we teach nonmovingScavengeOne to push the
fields of objects which we fail to evacuate to the update remembered
set. This is a simple and reasonably cheap solution and avoids the
complexity and fragility that stricter alternative invariants would
require.

All of this is described in a new Note, Note [Dirty flags in the
non-moving collector] in NonMoving.c.

- - - - -
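The interplay between the dirty flag, the barrier skip, and the eager push during scavenging can be sketched in miniature. The following is a hypothetical, self-contained model (none of these types or names are the actual RTS ones; `MutVar`, `push_upd_rem_set`, and friends are simplified stand-ins) showing why the dirtiness invariant makes skipping the barrier on dirty objects safe:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of the dirtiness invariant -- NOT the actual GHC RTS
 * types. A "mut var" has one pointer field and a dirty bit; the
 * update remembered set is a fixed-size array. */
typedef struct { int *field; bool dirty; } MutVar;

#define UPD_REM_SET_CAP 16
static int *upd_rem_set[UPD_REM_SET_CAP];
static size_t upd_rem_set_len = 0;

static void push_upd_rem_set(int *p) {
    assert(upd_rem_set_len < UPD_REM_SET_CAP);
    upd_rem_set[upd_rem_set_len++] = p;
}

/* Write barrier: the snapshot invariant only needs the old value
 * recorded on the *first* write after collection begins, so dirty
 * objects skip the push. This is sound only if dirtiness implies the
 * fields are already on the remembered set. */
static void write_barrier_and_set(MutVar *mv, int *new_val) {
    if (!mv->dirty) {
        push_upd_rem_set(mv->field);  /* record the old value */
        mv->dirty = true;
    }
    mv->field = new_val;
}

/* The fix from this commit, in miniature: when scavenging marks an
 * object dirty because it failed to evacuate, eagerly push its fields
 * so the invariant holds before mutation resumes. */
static void scavenge_failed_to_evac(MutVar *mv) {
    mv->dirty = true;
    push_upd_rem_set(mv->field);
}
```

Without the eager push in `scavenge_failed_to_evac`, a post-GC mutation of an already-dirty object would skip the barrier and the old field value would never reach the remembered set.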


4 changed files:

- rts/sm/NonMoving.c
- rts/sm/NonMovingMark.c
- rts/sm/NonMovingScav.c
- rts/sm/Storage.c


Changes:

=====================================
rts/sm/NonMoving.c
=====================================
@@ -228,6 +228,10 @@ Mutex concurrent_coll_finished_lock;
  *  - Note [Static objects under the nonmoving collector] (Storage.c) describes
  *    treatment of static objects.
  *
+ *  - Note [Dirty flags in the non-moving collector] (NonMoving.c) describes
+ *    how we use the DIRTY flags associated with MUT_VARs and TVARs to improve
+ *    barrier efficiency.
+ *
  *
  * Note [Concurrent non-moving collection]
  * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -369,6 +373,7 @@ Mutex concurrent_coll_finished_lock;
  * approximate due to concurrent collection and ultimately seems more costly
  * than the problem demands.
  *
+ *
  * Note [Spark management under the nonmoving collector]
  * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  * Every GC, both minor and major, prunes the spark queue (using
@@ -387,6 +392,88 @@ Mutex concurrent_coll_finished_lock;
  *    BF_EVACUATED flag won't be set on the nursery blocks) and will consequently
  *    only prune dead sparks living in the non-moving heap.
  *
+ *
+ * Note [Dirty flags in the non-moving collector]
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ * Some mutable object types (e.g. MUT_VARs, TVARs) have a one-bit dirty flag
+ * encoded in their info table pointer. The moving collector uses this flag
+ * to minimize redundant mut_list entries. The flag preserves the following
+ * simple invariant:
+ *
+ *     An object being marked as dirty implies that the object is on mut_list.
+ *
+ * This allows a nice optimisation in the write barrier (e.g. dirty_MUT_VAR):
+ * if we write to an already-dirty object there is no need to
+ * push it to the mut_list as we know it's already there.
+ *
+ * During GC (scavenging) we will then keep track of whether all of the
+ * object's references have been promoted. If so we can mark the object as clean.
+ * If not then we re-add it to mut_list and mark it as dirty.
+ *
+ * In the non-moving collector we use the same dirty flag to implement a
+ * related optimisation on the non-moving write barrier: Specifically, the
+ * snapshot invariant only requires that the non-moving write barrier applies
+ * to the *first* mutation to an object after collection begins. To achieve this,
+ * we impose the following invariant:
+ *
+ *     An object being marked as dirty implies that all of its fields are on
+ *     the mark queue (or, equivalently, update remembered set).
+ *
+ * With this guarantee we can safely make the write barriers on dirty objects
+ * no-ops. We perform this optimisation for the following object types:
+ *
+ *  - MVAR
+ *  - TVAR
+ *  - MUT_VAR
+ *
+ * However, maintaining this invariant requires great care. For instance,
+ * consider the case of an MVar (which has two pointer fields) before
+ * preparatory collection:
+ *
+ *    Non-moving heap     ┊      Moving heap
+ *         gen 1          ┊         gen 0
+ *  ──────────────────────┼────────────────────────────────
+ *                        ┊
+ *         MVAR A  ────────────────→ X
+ *        (dirty)  ───────────╮
+ *                        ┊   ╰────→ Y
+ *                        ┊          │
+ *                        ┊          │
+ *           ╭───────────────────────╯
+ *           │            ┊
+ *           ↓            ┊
+ *           Z            ┊
+ *                        ┊
+ *
+ * During the preparatory collection we promote Y to the nonmoving heap but
+ * fail to promote X. Since the failed_to_evac field is conservative (being set
+ * if *any* of the fields are not promoted), this gives us:
+ *
+ *    Non-moving heap     ┊      Moving heap
+ *         gen 1          ┊         gen 0
+ *  ──────────────────────┼────────────────────────────────
+ *                        ┊
+ *         MVAR A  ────────────────→ X
+ *        (dirty)         ┊
+ *           │            ┊
+ *           │            ┊
+ *           ↓            ┊
+ *           Y            ┊
+ *           │            ┊
+ *           │            ┊
+ *           ↓            ┊
+ *           Z            ┊
+ *                        ┊
+ *
+ * This is bad. When we resume mutation, a mutator may mutate MVAR A; since it's
+ * already dirty we would fail to add Y to the update remembered set, breaking the
+ * snapshot invariant and potentially losing track of the liveness of Z.
+ *
+ * To avoid this, nonmovingScavengeOne eagerly pushes the values of the
+ * fields of all objects which it fails to evacuate (e.g. MVAR A) to the update
+ * remembered set during the preparatory GC. This allows us to safely skip the
+ * non-moving write barrier without jeopardizing the snapshot invariant.
+ *
  */
 
 memcount nonmoving_live_words = 0;


=====================================
rts/sm/NonMovingMark.c
=====================================
@@ -27,6 +27,7 @@
 #include "sm/Storage.h"
 #include "CNF.h"
 
+static bool check_in_nonmoving_heap(StgClosure *p);
 static void mark_closure (MarkQueue *queue, const StgClosure *p, StgClosure **origin);
 static void mark_tso (MarkQueue *queue, StgTSO *tso);
 static void mark_stack (MarkQueue *queue, StgStack *stack);
@@ -450,10 +451,17 @@ push (MarkQueue *q, const MarkQueueEnt *ent)
 void
 markQueuePushClosureGC (MarkQueue *q, StgClosure *p)
 {
+    if (!check_in_nonmoving_heap(p)) {
+        return;
+    }
+
     /* We should not make it here if we are doing a deadlock detect GC.
      * See Note [Deadlock detection under nonmoving collector].
+     * This is actually no longer true due to call in nonmovingScavengeOne
+     * introduced due to Note [Dirty flags in the non-moving collector]
+     * (see NonMoving.c).
      */
-    ASSERT(!deadlock_detect_gc);
+    //ASSERT(!deadlock_detect_gc);
 
     // Are we at the end of the block?
     if (q->top->head == MARK_QUEUE_BLOCK_ENTRIES) {


=====================================
rts/sm/NonMovingScav.c
=====================================
@@ -31,6 +31,11 @@ nonmovingScavengeOne (StgClosure *q)
         gct->eager_promotion = saved_eager_promotion;
         if (gct->failed_to_evac) {
             mvar->header.info = &stg_MVAR_DIRTY_info;
+
+            // Note [Dirty flags in the non-moving collector] in NonMoving.c
+            markQueuePushClosureGC(&gct->cap->upd_rem_set.queue, (StgClosure *) mvar->head);
+            markQueuePushClosureGC(&gct->cap->upd_rem_set.queue, (StgClosure *) mvar->tail);
+            markQueuePushClosureGC(&gct->cap->upd_rem_set.queue, (StgClosure *) mvar->value);
         } else {
             mvar->header.info = &stg_MVAR_CLEAN_info;
         }
@@ -46,6 +51,10 @@ nonmovingScavengeOne (StgClosure *q)
         gct->eager_promotion = saved_eager_promotion;
         if (gct->failed_to_evac) {
             tvar->header.info = &stg_TVAR_DIRTY_info;
+
+            // Note [Dirty flags in the non-moving collector] in NonMoving.c
+            markQueuePushClosureGC(&gct->cap->upd_rem_set.queue, (StgClosure *) tvar->current_value);
+            markQueuePushClosureGC(&gct->cap->upd_rem_set.queue, (StgClosure *) tvar->first_watch_queue_entry);
         } else {
             tvar->header.info = &stg_TVAR_CLEAN_info;
         }
@@ -160,16 +169,21 @@ nonmovingScavengeOne (StgClosure *q)
     }
 
     case MUT_VAR_CLEAN:
-    case MUT_VAR_DIRTY:
+    case MUT_VAR_DIRTY: {
+        StgMutVar *mv = (StgMutVar *) p;
         gct->eager_promotion = false;
-        evacuate(&((StgMutVar *)p)->var);
+        evacuate(&mv->var);
         gct->eager_promotion = saved_eager_promotion;
         if (gct->failed_to_evac) {
             ((StgClosure *)q)->header.info = &stg_MUT_VAR_DIRTY_info;
+
+            // Note [Dirty flags in the non-moving collector] in NonMoving.c
+            markQueuePushClosureGC(&gct->cap->upd_rem_set.queue, (StgClosure *) mv->var);
         } else {
             ((StgClosure *)q)->header.info = &stg_MUT_VAR_CLEAN_info;
         }
         break;
+    }
 
     case BLOCKING_QUEUE:
     {


=====================================
rts/sm/Storage.c
=====================================
@@ -1304,6 +1304,7 @@ dirty_MUT_VAR(StgRegTable *reg, StgMutVar *mvar, StgClosure *old)
         mvar->header.info = &stg_MUT_VAR_DIRTY_info;
         recordClosureMutated(cap, (StgClosure *) mvar);
         IF_NONMOVING_WRITE_BARRIER_ENABLED {
+            // See Note [Dirty flags in the non-moving collector] in NonMoving.c
             updateRemembSetPushClosure_(reg, old);
         }
     }
@@ -1326,6 +1327,7 @@ dirty_TVAR(Capability *cap, StgTVar *p,
         p->header.info = &stg_TVAR_DIRTY_info;
         recordClosureMutated(cap,(StgClosure*)p);
         IF_NONMOVING_WRITE_BARRIER_ENABLED {
+            // See Note [Dirty flags in the non-moving collector] in NonMoving.c
             updateRemembSetPushClosure(cap, old);
         }
     }
@@ -1407,6 +1409,7 @@ update_MVAR(StgRegTable *reg, StgClosure *p, StgClosure *old_val)
 {
     Capability *cap = regTableToCapability(reg);
     IF_NONMOVING_WRITE_BARRIER_ENABLED {
+        // See Note [Dirty flags in the non-moving collector] in NonMoving.c
         StgMVar *mvar = (StgMVar *) p;
         updateRemembSetPushClosure(cap, old_val);
         updateRemembSetPushClosure(cap, (StgClosure *) mvar->head);



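The MVAR A scenario from the new Note can also be modeled in a few lines. This is a hypothetical sketch (the `Obj` type, `push_mark_queue`, and `scavenge_mvar` are invented stand-ins, not RTS code): field X stays in the moving heap, field Y was promoted to the non-moving heap, and failed_to_evac is conservatively set because of X alone:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (not actual RTS code) of the MVAR A example in
 * Note [Dirty flags in the non-moving collector]. */
typedef struct { bool in_nonmoving; bool on_mark_queue; } Obj;

/* Analogue of markQueuePushClosureGC: only objects residing in the
 * non-moving heap need to go on the mark queue / update remembered
 * set; others are ignored. */
static void push_mark_queue(Obj *o) {
    if (o->in_nonmoving)
        o->on_mark_queue = true;
}

/* Scavenge an MVar-like object with two fields. failed_to_evac is
 * conservative: true when *any* field was not promoted. The fix is
 * to eagerly push *all* fields when marking the object dirty, so the
 * dirtiness invariant holds. Returns the resulting dirty flag. */
static bool scavenge_mvar(Obj *field1, Obj *field2) {
    bool failed_to_evac = !field1->in_nonmoving || !field2->in_nonmoving;
    if (failed_to_evac) {
        push_mark_queue(field1);
        push_mark_queue(field2);
    }
    return failed_to_evac;
}
```

With the eager push, the promoted field (Y in the Note's diagram) is guaranteed to be on the mark queue whenever the object is dirty, so skipping the write barrier on the next mutation cannot lose track of Y's successors (Z).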
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/3bc91b7a9e33962b816ec81fd2155819c56d210f
