[Git][ghc/ghc][wip/tsan/fixes] 2 commits: Use relaxed accesses in ticky bumping

Ben Gamari (@bgamari) gitlab at gitlab.haskell.org
Tue Apr 11 12:43:09 UTC 2023



Ben Gamari pushed to branch wip/tsan/fixes at Glasgow Haskell Compiler / GHC


Commits:
73c0e306 by Ben Gamari at 2023-04-11T08:28:58-04:00
Use relaxed accesses in ticky bumping

- - - - -
ec7473e2 by Ben Gamari at 2023-04-11T08:42:23-04:00
Fix thunk update ordering

Previously we attempted to ensure soundness of concurrent thunk update
by synchronizing on the access of the thunk's info table pointer field.
This was believed to be sufficient since the indirectee (which may
expose a closure allocated by another core) would not be examined
until the info table pointer update is complete.

However, it turns out that this can result in data races in the presence
of multiple threads racing a update a single thunk. For instance,
consider this interleaving under the old scheme:

            Thread A                             Thread B
            ---------                            ---------
    t=0     Enter t
      1     Push update frame
      2     Begin evaluation

      4     Pause thread
      5     t.indirectee=tso
      6     Release t.info=BLACKHOLE

      7     ... (e.g. GC)

      8     Resume thread
      9     Finish evaluation
      10    Relaxed t.indirectee=x

      11                                         Load t.info
      12                                         Acquire fence
      13                                         Inspect t.indirectee

      14    Release t.info=BLACKHOLE

Here Thread A enters thunk `t` but is soon paused, resulting in `t`
being lazily blackholed at t=6. Then, at t=10 Thread A finishes
evaluation and updates `t.indirectee` with a relaxed store.

Meanwhile, Thread B enters the blackhole. Under the old scheme this
would introduce an acquire-fence but this would only synchronize with
Thread A at t=6. Consequently, the result of the evaluation, `x`, is not
visible to Thread B, introducing a data race.

We fix this by treating the `indirectee` field as we do all other
mutable fields. This means we must always access this field with
acquire-loads and release-stores.

See #23185.

- - - - -


17 changed files:

- compiler/GHC/StgToCmm/Bind.hs
- compiler/GHC/StgToCmm/Ticky.hs
- compiler/GHC/StgToCmm/Utils.hs
- rts/Apply.cmm
- rts/Compact.cmm
- rts/Heap.c
- rts/Interpreter.c
- rts/PrimOps.cmm
- rts/StableName.c
- rts/StgMiscClosures.cmm
- rts/ThreadPaused.c
- rts/Threads.c
- rts/Updates.cmm
- rts/Updates.h
- rts/include/stg/SMP.h
- rts/sm/NonMovingMark.c
- utils/genapply/Main.hs


Changes:

=====================================
compiler/GHC/StgToCmm/Bind.hs
=====================================
@@ -702,11 +702,19 @@ emitBlackHoleCode node = do
 
   when eager_blackholing $ do
     whenUpdRemSetEnabled $ emitUpdRemSetPushThunk node
-    emitStore (cmmOffsetW platform node (fixedHdrSizeW profile)) (currentTSOExpr platform)
+    emitAtomicStore platform OrderingSeqCst
+        (cmmOffsetW platform node (fixedHdrSizeW profile))
+        (currentTSOExpr platform)
     -- See Note [Heap memory barriers] in SMP.h.
-    let w = wordWidth platform
-    emitPrimCall [] (MO_AtomicWrite w MemOrderRelease)
-        [node, CmmReg (CmmGlobal $ GlobalRegUse EagerBlackholeInfo $ bWord platform)]
+    emitAtomicStore platform OrderingRelaxed
+        node
+        (CmmReg (CmmGlobal $ GlobalRegUse EagerBlackholeInfo $ bWord platform))
+
+emitAtomicStore :: Platform -> MemoryOrdering -> CmmExpr -> CmmExpr -> FCode ()
+emitAtomicStore platform mord addr val =
+    emitPrimCall [] (MO_AtomicWrite w mord) [addr, val]
+  where
+    w = typeWidth $ cmmExprType platform val
 
 setupUpdate :: ClosureInfo -> LocalReg -> FCode () -> FCode ()
         -- Nota Bene: this function does not change Node (even if it's a CAF),


=====================================
compiler/GHC/StgToCmm/Ticky.hs
=====================================
@@ -829,25 +829,34 @@ bumpTickyLit :: CmmLit -> FCode ()
 bumpTickyLit lhs = bumpTickyLitBy lhs 1
 
 bumpTickyLitBy :: CmmLit -> Int -> FCode ()
-bumpTickyLitBy lhs n = do
-  platform <- getPlatform
-  emit (addToMem (bWord platform) (CmmLit lhs) n)
+bumpTickyLitBy lhs n = emitAddToMem (CmmLit lhs) n
 
 bumpTickyLitByE :: CmmLit -> CmmExpr -> FCode ()
-bumpTickyLitByE lhs e = do
-  platform <- getPlatform
-  emit (addToMemE (bWord platform) (CmmLit lhs) e)
+bumpTickyLitByE lhs e = emitAddToMemE (CmmLit lhs) e
 
 bumpHistogram :: FastString -> Int -> FCode ()
 bumpHistogram lbl n = do
     platform <- getPlatform
     let offset = n `min` (pc_TICKY_BIN_COUNT (platformConstants platform) - 1)
-    emit (addToMem (bWord platform)
-           (cmmIndexExpr platform
+    let addr = 
+           cmmIndexExpr platform
                 (wordWidth platform)
                 (CmmLit (CmmLabel (mkRtsCmmDataLabel lbl)))
-                (CmmLit (CmmInt (fromIntegral offset) (wordWidth platform))))
-           1)
+                (CmmLit (CmmInt (fromIntegral offset) (wordWidth platform)))
+    emitAddToMem addr 1
+
+emitAddToMem :: CmmExpr -> Int -> FCode ()
+emitAddToMem lhs n = do
+  platform <- getPlatform
+  emitAddToMemE lhs (mkIntExpr platform n)
+
+emitAddToMemE :: CmmExpr -> CmmExpr -> FCode ()
+emitAddToMemE lhs n = do
+  platform <- getPlatform
+  val <- newTemp (bWord platform)
+  emitAtomicRead MemOrderRelaxed val lhs
+  let val' = cmmOffsetExpr platform (CmmReg (CmmLocal val)) n
+  emitAtomicWrite MemOrderRelaxed lhs val'
 
 ------------------------------------------------------------------
 -- Showing the "type category" for ticky-ticky profiling


=====================================
compiler/GHC/StgToCmm/Utils.hs
=====================================
@@ -37,6 +37,7 @@ module GHC.StgToCmm.Utils (
         cmmUntag, cmmIsTagged,
 
         addToMem, addToMemE, addToMemLblE, addToMemLbl,
+        emitAtomicRead, emitAtomicWrite,
 
         -- * Update remembered set operations
         whenUpdRemSetEnabled,
@@ -59,6 +60,7 @@ import GHC.Platform.Regs
 import GHC.Cmm.CLabel
 import GHC.Cmm.Utils
 import GHC.Cmm.Switch
+import {-# SOURCE #-} GHC.StgToCmm.Foreign (emitPrimCall)
 import GHC.StgToCmm.CgUtils
 
 import GHC.Types.ForeignCall
@@ -118,6 +120,29 @@ addToMemE :: CmmType    -- rep of the counter
 addToMemE rep ptr n
   = mkStore ptr (CmmMachOp (MO_Add (typeWidth rep)) [CmmLoad ptr rep NaturallyAligned, n])
 
+-------------------------------------------------------------------------
+--      Atomic loads and stores
+-------------------------------------------------------------------------
+
+emitAtomicRead
+  :: MemoryOrdering
+  -> LocalReg -- ^ result register
+  -> CmmExpr  -- ^ address
+  -> FCode ()
+emitAtomicRead mord res addr
+  = void $ emitPrimCall [res] (MO_AtomicRead w mord) [addr]
+  where
+    w = typeWidth $ localRegType res
+
+emitAtomicWrite
+  :: MemoryOrdering
+  -> CmmExpr  -- ^ address
+  -> CmmExpr  -- ^ value
+  -> FCode ()
+emitAtomicWrite mord addr val
+  = do platform <- getPlatform
+       let w = typeWidth $ cmmExprType platform val
+       void $ emitPrimCall [] (MO_AtomicWrite w mord) [addr, val]
 
 -------------------------------------------------------------------------
 --


=====================================
rts/Apply.cmm
=====================================
@@ -108,7 +108,7 @@ again:
             IND,
             IND_STATIC:
         {
-            fun = StgInd_indirectee(fun);
+            fun = %acquire StgInd_indirectee(fun);
             goto again;
         }
         case BCO:
@@ -693,7 +693,7 @@ INFO_TABLE(stg_AP_STACK,/*special layout*/0,0,AP_STACK,"AP_STACK","AP_STACK")
   }
   // Can't add StgInd_indirectee(ap) to UpdRemSet here because the old value is
   // not reachable.
-  StgInd_indirectee(ap) = CurrentTSO;
+  %release StgInd_indirectee(ap) = CurrentTSO;
   SET_INFO_RELEASE(ap, __stg_EAGER_BLACKHOLE_info);
 
   /* ensure there is at least AP_STACK_SPLIM words of headroom available


=====================================
rts/Compact.cmm
=====================================
@@ -100,7 +100,7 @@ eval:
 
     // Follow indirections:
     case IND, IND_STATIC: {
-        p = StgInd_indirectee(p);
+        p = %acquire StgInd_indirectee(p);
         goto eval;
     }
 


=====================================
rts/Heap.c
=====================================
@@ -173,7 +173,7 @@ StgWord collect_pointers(StgClosure *closure, StgClosure *ptrs[]) {
         case IND:
         case IND_STATIC:
         case BLACKHOLE:
-            ptrs[nptrs++] = (StgClosure *)(((StgInd *)closure)->indirectee);
+            ptrs[nptrs++] = (StgClosure *) ACQUIRE_LOAD(&((StgInd *)closure)->indirectee);
             break;
 
         case MUT_ARR_PTRS_CLEAN:


=====================================
rts/Interpreter.c
=====================================
@@ -401,7 +401,7 @@ eval_obj:
     case IND:
     case IND_STATIC:
     {
-        tagged_obj = ((StgInd*)obj)->indirectee;
+        tagged_obj = ACQUIRE_LOAD(&((StgInd*)obj)->indirectee);
         goto eval_obj;
     }
 


=====================================
rts/PrimOps.cmm
=====================================
@@ -1770,7 +1770,7 @@ loop:
     qinfo = GET_INFO_ACQUIRE(q);
     if (qinfo == stg_IND_info ||
         qinfo == stg_MSG_NULL_info) {
-        q = StgInd_indirectee(q);
+        q = %acquire StgInd_indirectee(q);
         goto loop;
     }
 
@@ -1838,7 +1838,7 @@ loop:
 
     if (qinfo == stg_IND_info ||
         qinfo == stg_MSG_NULL_info) {
-        q = StgInd_indirectee(q);
+        q = %acquire StgInd_indirectee(q);
         goto loop;
     }
 
@@ -1940,7 +1940,7 @@ loop:
 
     if (qinfo == stg_IND_info ||
         qinfo == stg_MSG_NULL_info) {
-        q = StgInd_indirectee(q);
+        q = %acquire StgInd_indirectee(q);
         goto loop;
     }
 
@@ -2029,7 +2029,7 @@ loop:
 
     if (qinfo == stg_IND_info ||
         qinfo == stg_MSG_NULL_info) {
-        q = StgInd_indirectee(q);
+        q = %acquire StgInd_indirectee(q);
         goto loop;
     }
 
@@ -2309,7 +2309,7 @@ loop:
     //Possibly IND added by removeFromMVarBlockedQueue
     if (StgHeader_info(q) == stg_IND_info ||
         StgHeader_info(q) == stg_MSG_NULL_info) {
-        q = StgInd_indirectee(q);
+        q = %acquire StgInd_indirectee(q);
         goto loop;
     }
 


=====================================
rts/StableName.c
=====================================
@@ -156,11 +156,11 @@ removeIndirections (StgClosure* p)
         switch (get_itbl(q)->type) {
         case IND:
         case IND_STATIC:
-            p = ((StgInd *)q)->indirectee;
+            p = ACQUIRE_LOAD(&((StgInd *)q)->indirectee);
             continue;
 
         case BLACKHOLE:
-            p = ((StgInd *)q)->indirectee;
+            p = ACQUIRE_LOAD(&((StgInd *)q)->indirectee);
             if (GET_CLOSURE_TAG(p) != 0) {
                 continue;
             } else {


=====================================
rts/StgMiscClosures.cmm
=====================================
@@ -521,8 +521,8 @@ INFO_TABLE(stg_IND,1,0,IND,"IND","IND")
     (P_ node)
 {
     TICK_ENT_DYN_IND(); /* tick */
-    ACQUIRE_FENCE_ON(R1 + OFFSET_StgHeader_info);
-    node = UNTAG(StgInd_indirectee(node));
+    node = %acquire StgInd_indirectee(node);
+    node = UNTAG(node);
     TICK_ENT_VIA_NODE();
     jump %GET_ENTRY(node) (node);
 }
@@ -530,8 +530,9 @@ INFO_TABLE(stg_IND,1,0,IND,"IND","IND")
     /* explicit stack */
 {
     TICK_ENT_DYN_IND(); /* tick */
-    ACQUIRE_FENCE_ON(R1 + OFFSET_StgHeader_info);
-    R1 = UNTAG(StgInd_indirectee(R1));
+    P_ p;
+    p = %acquire StgInd_indirectee(R1);
+    R1 = UNTAG(p);
     TICK_ENT_VIA_NODE();
     jump %GET_ENTRY(R1) [R1];
 }
@@ -541,8 +542,9 @@ INFO_TABLE(stg_IND_STATIC,1,0,IND_STATIC,"IND_STATIC","IND_STATIC")
     /* explicit stack */
 {
     TICK_ENT_STATIC_IND(); /* tick */
-    ACQUIRE_FENCE_ON(R1 + OFFSET_StgHeader_info);
-    R1 = UNTAG(StgInd_indirectee(R1));
+    P_ p;
+    p = %acquire StgInd_indirectee(R1);
+    R1 = UNTAG(p);
     TICK_ENT_VIA_NODE();
     jump %GET_ENTRY(R1) [R1];
 }
@@ -567,8 +569,7 @@ INFO_TABLE(stg_BLACKHOLE,1,0,BLACKHOLE,"BLACKHOLE","BLACKHOLE")
 retry:
     // Synchronizes with the release-store in updateWithIndirection.
     // See Note [Heap memory barriers] in SMP.h.
-    ACQUIRE_FENCE_ON(node + OFFSET_StgHeader_info);
-    p = %relaxed StgInd_indirectee(node);
+    p = %acquire StgInd_indirectee(node);
     if (GETTAG(p) != 0) {
         return (p);
     }


=====================================
rts/ThreadPaused.c
=====================================
@@ -352,7 +352,7 @@ threadPaused(Capability *cap, StgTSO *tso)
             OVERWRITING_CLOSURE_SIZE(bh, closure_sizeW_(bh, INFO_PTR_TO_STRUCT(bh_info)));
 
             // The payload of the BLACKHOLE points to the TSO
-            ((StgInd *)bh)->indirectee = (StgClosure *)tso;
+            RELEASE_STORE(&((StgInd *)bh)->indirectee, (StgClosure *)tso);
             SET_INFO_RELEASE(bh,&stg_BLACKHOLE_info);
 
             // .. and we need a write barrier, since we just mutated the closure:


=====================================
rts/Threads.c
=====================================
@@ -437,7 +437,7 @@ checkBlockingQueues (Capability *cap, StgTSO *tso)
         p = UNTAG_CLOSURE(bq->bh);
         const StgInfoTable *pinfo = ACQUIRE_LOAD(&p->header.info);
         if (pinfo != &stg_BLACKHOLE_info ||
-            ((StgInd *)p)->indirectee != (StgClosure*)bq)
+            (RELAXED_LOAD(&((StgInd *)p)->indirectee) != (StgClosure*)bq))
         {
             wakeBlockingQueue(cap,bq);
         }


=====================================
rts/Updates.cmm
=====================================
@@ -59,7 +59,7 @@ INFO_TABLE_RET ( stg_marked_upd_frame, UPDATE_FRAME,
     ASSERT(HpAlloc == 0); // Note [HpAlloc]
 
     // we know the closure is a BLACKHOLE
-    v = StgInd_indirectee(updatee);
+    v = %acquire StgInd_indirectee(updatee);
 
     if (GETTAG(v) != 0) (likely: False) {
         // updated by someone else: discard our value and use the


=====================================
rts/Updates.h
=====================================
@@ -59,8 +59,8 @@
     }                                                           \
                                                                 \
     OVERWRITING_CLOSURE(p1);                                    \
-    %relaxed StgInd_indirectee(p1) = p2;                        \
-    SET_INFO_RELEASE(p1, stg_BLACKHOLE_info);                   \
+    %release StgInd_indirectee(p1) = p2;                        \
+    %relaxed SET_INFO(p1, stg_BLACKHOLE_info);                  \
     LDV_RECORD_CREATE(p1);                                      \
     and_then;
 


=====================================
rts/include/stg/SMP.h
=====================================
@@ -178,6 +178,7 @@ EXTERN_INLINE void load_load_barrier(void);
  *   - StgSmallMutArrPtrs: payload
  *   - StgThunk although this is a somewhat special case; see below
  *   - StgTSO: block_info
+ *   - StgInd: indirectee
  *
  * Writing to a mutable pointer field must be done via a release-store.
  * Reading from such a field is done via an acquire-load.
@@ -222,9 +223,9 @@ EXTERN_INLINE void load_load_barrier(void);
  * can see the indirectee. Consequently, a thunk update (see rts/Updates.h)
  * does the following:
  *
- *  1. Use a relaxed-store to place the new indirectee into the thunk's
+ *  1. Use a release-store to place the new indirectee into the thunk's
  *     indirectee field
- *  2. use a release-store to set the info table to stg_BLACKHOLE (which
+ *  2. use a relaxed-store to set the info table to stg_BLACKHOLE (which
  *     represents an indirection)
  *
  * Blackholing a thunk (either eagerly, by GHC.StgToCmm.Bind.emitBlackHoleCode,
@@ -237,13 +238,10 @@ EXTERN_INLINE void load_load_barrier(void);
  *  1. We jump into the entry code of the indirection (e.g. stg_BLACKHOLE);
  *     this of course implies that we have already read the thunk's info table
  *     pointer, which is done with a relaxed load.
- *  2. use an acquire-fence to ensure that our view on the thunk is
- *     up-to-date. This synchronizes with step (2) in the update
- *     procedure.
- *  3. relaxed-load the indirectee. Since thunks are updated at most
+ *  2. acquire-load the indirectee. Since thunks are updated at most
  *     once we know that the fence in the last step has given us
  *     an up-to-date view of the indirectee closure.
- *  4. enter the indirectee (or block if the indirectee is a TSO)
+ *  3. enter the indirectee (or block if the indirectee is a TSO)
  *
  * Other closures
  * --------------
@@ -270,7 +268,7 @@ EXTERN_INLINE void load_load_barrier(void);
  *    in this primops.
  *
  *  - Sending a Message to another capability:
- *    This is protected by the acquition and release of the target capability's
+ *    This is protected by the acquision and release of the target capability's
  *    lock in Messages.c:sendMessage.
  *
  * N.B. recordClosureMutated places a reference to the mutated object on


=====================================
rts/sm/NonMovingMark.c
=====================================
@@ -681,8 +681,9 @@ void updateRemembSetPushThunkEager(Capability *cap,
     case IND:
     {
         StgInd *ind = (StgInd *) thunk;
-        if (check_in_nonmoving_heap(ind->indirectee)) {
-            push_closure(queue, ind->indirectee, NULL);
+        StgClosure *indirectee = ACQUIRE_LOAD(&ind->indirectee);
+        if (check_in_nonmoving_heap(indirectee)) {
+            push_closure(queue, indirectee, NULL);
         }
         break;
     }


=====================================
utils/genapply/Main.hs
=====================================
@@ -783,7 +783,11 @@ genApply regstatus args =
         text "case IND,",
         text "     IND_STATIC: {",
         nest 4 (vcat [
-          text "R1 = StgInd_indirectee(R1);",
+          -- N.B. annoyingly the %acquire syntax must place its result in a local register
+          -- as it is a Cmm prim call node.
+          text "P_ p;",
+          text "p = %acquire StgInd_indirectee(R1);",
+          text "R1 = p;",
             -- An indirection node might contain a tagged pointer
           text "goto again;"
          ]),



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/61646bbd8f67f97daa96c7450c00f2511d5ac1f7...ec7473e25d148e2386ca9bfe35c32e727cdea615

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/61646bbd8f67f97daa96c7450c00f2511d5ac1f7...ec7473e25d148e2386ca9bfe35c32e727cdea615
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20230411/4ddfbc3f/attachment-0001.html>


More information about the ghc-commits mailing list