[Git][ghc/ghc][wip/tsan/fixes-2] 22 commits: rts: Silence spurious data races in ticky counters

Ben Gamari (@bgamari) gitlab at gitlab.haskell.org
Tue Jun 20 11:03:50 UTC 2023



Ben Gamari pushed to branch wip/tsan/fixes-2 at Glasgow Haskell Compiler / GHC


Commits:
5482c162 by Ben Gamari at 2023-06-20T07:03:45-04:00
rts: Silence spurious data races in ticky counters

Previously we would use non-atomic accesses when bumping ticky counters,
which would result in spurious data race reports from ThreadSanitizer
when the threaded RTS was in use.
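
As a rough illustration of the idea (not the actual RTS code), a counter
bump can be made race-free in ThreadSanitizer's eyes by using a relaxed
atomic read-modify-write. The counter and function names below are
hypothetical and chosen only for this sketch:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical counter; not the name of any real ticky counter. */
static _Atomic uint64_t example_ticky_ctr;

/* Bump the counter with a relaxed atomic add. No ordering with other
 * memory accesses is implied, but the access itself is atomic, so
 * ThreadSanitizer does not flag concurrent bumps as a data race. */
static void bump_example_ticky(void)
{
    atomic_fetch_add_explicit(&example_ticky_ctr, 1, memory_order_relaxed);
}

/* Read the counter, again with relaxed ordering. */
static uint64_t read_example_ticky(void)
{
    return atomic_load_explicit(&example_ticky_ctr, memory_order_relaxed);
}

/* Single-threaded usage example: two bumps raise the count by two. */
static void example_ticky_usage(void)
{
    uint64_t before = read_example_ticky();
    bump_example_ticky();
    bump_example_ticky();
    assert(read_example_ticky() == before + 2);
}
```

Relaxed ordering is enough in this sketch because the counter is only a
statistic; nothing else is published through it.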

- - - - -
974ff355 by Ben Gamari at 2023-06-20T07:03:45-04:00
Improve TSAN documentation

- - - - -
cd0e3354 by Ben Gamari at 2023-06-20T07:03:45-04:00
rts: Fix data race in Interpreter's preemption check

- - - - -
28bfa51a by Ben Gamari at 2023-06-20T07:03:45-04:00
rts: Fix data race in threadStatus#

- - - - -
31bbe4b2 by Ben Gamari at 2023-06-20T07:03:45-04:00
rts: Fix data race in CHECK_GC

- - - - -
f42ae688 by Ben Gamari at 2023-06-20T07:03:45-04:00
base: use atomic write when updating timer manager

- - - - -
180deed0 by Ben Gamari at 2023-06-20T07:03:45-04:00
Use relaxed atomics to manipulate TSO status fields

- - - - -
2a9a9bff by Ben Gamari at 2023-06-20T07:03:45-04:00
rts: Add necessary barriers when manipulating TSO owner

- - - - -
850bf588 by Ben Gamari at 2023-06-20T07:03:45-04:00
rts: Fix synchronization on thread blocking state

- - - - -
2b7be9ba by Ben Gamari at 2023-06-20T07:03:45-04:00
rts: Relaxed load MutVar info table

- - - - -
cc3e1e58 by Ben Gamari at 2023-06-20T07:03:45-04:00
hadrian: More debug information

- - - - -
1edef489 by Ben Gamari at 2023-06-20T07:03:45-04:00
hadrian: More selective TSAN instrumentation

- - - - -
44899037 by Ben Gamari at 2023-06-20T07:03:45-04:00
codeGen/tsan: Rework handling of spilling

- - - - -
6c24a7d2 by Ben Gamari at 2023-06-20T07:03:45-04:00
codeGen: Ensure that TSAN is aware of writeArray# write barriers

- - - - -
b4447d88 by Ben Gamari at 2023-06-20T07:03:45-04:00
codeGen: Ensure that array reads have necessary barriers

This was the cause of #23541.

- - - - -
885f76ca by Ben Gamari at 2023-06-20T07:03:45-04:00
Wordsmith TSAN Note

- - - - -
b8402eb2 by Ben Gamari at 2023-06-20T07:03:46-04:00
codeGen: Use relaxed accesses in ticky bumping

- - - - -
399b64da by Ben Gamari at 2023-06-20T07:03:46-04:00
codeGen: Use relaxed-read in closureInfoPtr

- - - - -
15f0d664 by Ben Gamari at 2023-06-20T07:03:46-04:00
Fix thunk update ordering

Previously we attempted to ensure soundness of concurrent thunk update
by synchronizing on the access of the thunk's info table pointer field.
This was believed to be sufficient since the indirectee (which may
expose a closure allocated by another core) would not be examined
until the info table pointer update is complete.

However, it turns out that this can result in data races in the presence
of multiple threads racing to update a single thunk. For instance,
consider this interleaving under the old scheme:

            Thread A                             Thread B
            ---------                            ---------
    t=0     Enter t
      1     Push update frame
      2     Begin evaluation

      4     Pause thread
      5     t.indirectee=tso
      6     Release t.info=BLACKHOLE

      7     ... (e.g. GC)

      8     Resume thread
      9     Finish evaluation
      10    Relaxed t.indirectee=x

      11                                         Load t.info
      12                                         Acquire fence
      13                                         Inspect t.indirectee

      14    Release t.info=BLACKHOLE

Here Thread A enters thunk `t` but is soon paused, resulting in `t`
being lazily blackholed at t=6. Then, at t=10 Thread A finishes
evaluation and updates `t.indirectee` with a relaxed store.

Meanwhile, Thread B enters the blackhole. Under the old scheme this
would introduce an acquire fence, but that fence would only synchronize
with Thread A's release store at t=6. Consequently, the result of the
evaluation, `x`, is not guaranteed to be visible to Thread B,
introducing a data race.

We fix this by treating the `indirectee` field as we do all other
mutable fields. This means we must always access this field with
acquire-loads and release-stores.

See #23185.
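
A minimal sketch of the new discipline, using C11 atomics, might look
like the following. The struct layout and all `example_`/`Example`
names are assumptions made for illustration; the real closure layout
and RTS helpers differ:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a thunk closure. */
typedef struct {
    _Atomic(const void *) info;        /* info table pointer */
    _Atomic(void *)       indirectee;  /* owning TSO while blackholed,
                                          then the evaluation result */
} ExampleThunk;

/* Stand-in for the blackhole info table; only its address is used. */
static const char EXAMPLE_BLACKHOLE_INFO = 0;

static void example_thunk_init(ExampleThunk *t)
{
    atomic_init(&t->info, NULL);
    atomic_init(&t->indirectee, NULL);
}

/* Lazy blackholing (t=5..6 in the table above): record the owner,
 * then publish the blackhole info pointer. */
static void example_blackhole_thunk(ExampleThunk *t, void *owner_tso)
{
    assert(t != NULL);
    atomic_store_explicit(&t->indirectee, owner_tso, memory_order_release);
    atomic_store_explicit(&t->info, &EXAMPLE_BLACKHOLE_INFO,
                          memory_order_release);
}

/* Thunk update (t=10 and t=14): the indirectee is now written with a
 * release store rather than a relaxed one. */
static void example_update_thunk(ExampleThunk *t, void *result)
{
    assert(t != NULL);
    atomic_store_explicit(&t->indirectee, result, memory_order_release);
    atomic_store_explicit(&t->info, &EXAMPLE_BLACKHOLE_INFO,
                          memory_order_release);
}

/* Reader (t=13): an acquire load pairs with whichever release store
 * wrote the value that is observed. */
static void *example_read_indirectee(ExampleThunk *t)
{
    return atomic_load_explicit(&t->indirectee, memory_order_acquire);
}
```

With this pairing, a reader that observes `x` in `indirectee` also
observes the writes that initialized `x`, regardless of which store to
`info` it happened to see.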

- - - - -
03fa110a by Ben Gamari at 2023-06-20T07:03:46-04:00
STM: Use acquire loads when possible

Full sequential consistency is not needed here.

- - - - -
647b1ce6 by Ubuntu at 2023-06-20T07:03:46-04:00
ghc-prim: Use C11 atomics

- - - - -
e485e238 by Ubuntu at 2023-06-20T07:03:46-04:00
Run script

- - - - -


30 changed files:

- compiler/GHC/Cmm/Info.hs
- compiler/GHC/Cmm/Parser.y
- compiler/GHC/Cmm/ThreadSanitizer.hs
- compiler/GHC/StgToCmm/Bind.hs
- compiler/GHC/StgToCmm/Prim.hs
- compiler/GHC/StgToCmm/Ticky.hs
- compiler/GHC/StgToCmm/Utils.hs
- hadrian/src/Flavour.hs
- libraries/base/GHC/Event/Thread.hs
- libraries/ghc-prim/cbits/atomic.c
- rts/Apply.cmm
- rts/Compact.cmm
- rts/Exception.cmm
- rts/Heap.c
- rts/HeapStackCheck.cmm
- rts/Interpreter.c
- rts/Messages.c
- rts/PrimOps.cmm
- rts/RaiseAsync.c
- rts/STM.c
- rts/Schedule.c
- rts/StableName.c
- rts/StgMiscClosures.cmm
- rts/StgStartup.cmm
- rts/ThreadPaused.c
- rts/Threads.c
- rts/TraverseHeap.c
- rts/Updates.cmm
- rts/Updates.h
- rts/include/Cmm.h


The diff was not included because it is too large.


View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/e5038a94be33dd96ef0040dde0585efcb6e0cdb4...e485e238e2ae4a9e9a9b6124613d1d43ad523725


