[Git][ghc/ghc][wip/tsan-ghc-8.10] 79 commits: CmmToLlvm: Declare signature for memcmp

Ben Gamari gitlab at gitlab.haskell.org
Mon Nov 30 17:27:15 UTC 2020



Ben Gamari pushed to branch wip/tsan-ghc-8.10 at Glasgow Haskell Compiler / GHC


Commits:
70ac4ed8 by Moritz Angermann at 2020-11-25T10:41:34+08:00
CmmToLlvm: Declare signature for memcmp

Otherwise `opt` fails with:

    error: use of undefined value '@memcmp$def'

- - - - -
7b8856f6 by Ben Gamari at 2020-11-30T12:21:35-05:00
SMP.h: Add C11-style atomic operations

- - - - -
15116c24 by Ben Gamari at 2020-11-30T12:21:35-05:00
rts: Infrastructure for testing with ThreadSanitizer

- - - - -
6f51dabf by Ben Gamari at 2020-11-30T12:21:35-05:00
rts/CNF: Initialize all bdescrs in group

It seems wise and cheap to ensure that the whole bdescr of all blocks of
a compact group is valid, even if most cases only look at the flags
field.

- - - - -
d1b8cb4f by Ben Gamari at 2020-11-30T12:21:35-05:00
rts/Capability: Initialize interrupt field

Previously this was left uninitialized.

Also clarify some comments.

- - - - -
88eb3e67 by Ben Gamari at 2020-11-30T12:21:35-05:00
rts/Task: Make comments proper Notes

- - - - -
a4e20e6d by Ben Gamari at 2020-11-30T12:21:35-05:00
rts/SpinLock: Move to proper atomics

This is fairly straightforward; we just needed to use relaxed operations
for the PROF_SPIN counters and a release store instead of a write
barrier.
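
As a rough illustration of the pattern described above, here is a minimal
C11 sketch of a spin lock that takes the lock with an acquire operation,
drops it with a release store, and bumps a profiling counter with a relaxed
increment. The type and function names (`DemoSpinLock`, `demo_acquire`, and
so on) are hypothetical and are not the RTS's own definitions.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical spin lock; names and layout are illustrative only. */
typedef struct {
    atomic_uintptr_t lock;        /* 1 = free, 0 = held */
    atomic_uint_fast64_t spin;    /* profiling counter: failed attempts */
} DemoSpinLock;

static void demo_init(DemoSpinLock *l) {
    atomic_init(&l->lock, 1);
    atomic_init(&l->spin, 0);
}

static void demo_acquire(DemoSpinLock *l) {
    for (;;) {
        uintptr_t expected = 1;
        /* Acquire on success so the critical section cannot float above it. */
        if (atomic_compare_exchange_weak_explicit(
                &l->lock, &expected, 0,
                memory_order_acquire, memory_order_relaxed)) {
            return;
        }
        /* The counter is statistics only, so relaxed ordering suffices. */
        atomic_fetch_add_explicit(&l->spin, 1, memory_order_relaxed);
    }
}

static void demo_release(DemoSpinLock *l) {
    /* A release store takes the place of "write barrier, then plain store". */
    atomic_store_explicit(&l->lock, 1, memory_order_release);
}
```

The relaxed counter keeps the increment atomic (so the race detector stays
quiet) without imposing any ordering on the surrounding code.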

- - - - -
3e979160 by Ben Gamari at 2020-11-30T12:21:35-05:00
rts/OSThreads: Fix data race

Previously we would race on the cached processor count. Avoiding this is
straightforward; just use relaxed operations.
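
A sketch of what such a fix might look like, assuming the cache is a single
word where zero means "not yet computed". The names `cached_count`,
`query_processor_count`, and `demo_processor_count` are hypothetical
stand-ins, not the functions in OSThreads.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Cached result; 0 means "not yet computed". */
static _Atomic uint32_t cached_count;

/* Stand-in for the real (possibly expensive) OS query. */
static uint32_t query_processor_count(void) {
    return 4;
}

uint32_t demo_processor_count(void) {
    uint32_t n = atomic_load_explicit(&cached_count, memory_order_relaxed);
    if (n == 0) {
        /* Several threads may get here at once; each computes the same
         * value, so the duplicated work is harmless. */
        n = query_processor_count();
        atomic_store_explicit(&cached_count, n, memory_order_relaxed);
    }
    return n;
}
```

Relaxed ordering is enough here because the cached word is the only data
being communicated; no other memory is published through it.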

- - - - -
70d00e09 by Ben Gamari at 2020-11-30T12:21:35-05:00
rts/ClosureMacros: Use relaxed atomics

- - - - -
f065898a by Ben Gamari at 2020-11-30T12:21:35-05:00
testsuite: Fix thread leak in hs_try_putmvar00[13]

- - - - -
51f48fd2 by Ben Gamari at 2020-11-30T12:21:35-05:00
rts: Introduce SET_HDR_RELEASE

Also ensure that we store the info table pointer last so that the
synchronization covers all preceding stores.
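
The idea of writing the header last can be sketched as follows. This is a
simplified model under assumed types (`DemoInfo`, `DemoClosure`) and is not
the actual SET_HDR_RELEASE macro: the payload is written with plain stores,
then the info pointer is published with a release store, and a reader that
acquire-loads the info pointer is guaranteed to see the payload.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical info table and closure layout, for illustration only. */
typedef struct { const char *name; } DemoInfo;

typedef struct {
    _Atomic(const DemoInfo *) info;  /* header word, written last */
    int payload;
} DemoClosure;

static const DemoInfo demo_info = { "demo" };

/* Writer: fill in the payload first, then publish the header. */
static void demo_publish(DemoClosure *c, int value) {
    c->payload = value;
    atomic_store_explicit(&c->info, &demo_info, memory_order_release);
}

/* Reader: returns 1 and fills *out once the closure has been published. */
static int demo_read(DemoClosure *c, int *out) {
    const DemoInfo *i = atomic_load_explicit(&c->info, memory_order_acquire);
    if (i == NULL) {
        return 0;   /* not published yet */
    }
    *out = c->payload;  /* ordered after the acquire load above */
    return 1;
}
```

If the header were written first, a reader could observe a valid info
pointer and still read an uninitialized payload.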

- - - - -
9316c4a4 by Ben Gamari at 2020-11-30T12:21:35-05:00
rts: Factor out logic to identify a good capability for running a task

Not only does this make the control flow a bit clearer but it also
allows us to add a TSAN suppression on this logic, which requires
(harmless) data races.

- - - - -
53f24b8a by Ben Gamari at 2020-11-30T12:21:35-05:00
rts: Annotate benign race in waitForCapability

- - - - -
4a3597b4 by Ben Gamari at 2020-11-30T12:21:35-05:00
rts: Clarify locking behavior of releaseCapability_

- - - - -
1de8c691 by Ben Gamari at 2020-11-30T12:21:35-05:00
rts: Add assertions for task ownership of capabilities

- - - - -
b7d9eb41 by Ben Gamari at 2020-11-30T12:21:35-05:00
rts: Use relaxed atomics on n_returning_tasks

This mitigates the warning of a benign race on n_returning_tasks in
shouldYieldCapability.

See #17261.

- - - - -
47423a5d by Ben Gamari at 2020-11-30T12:21:36-05:00
rts: Mitigate races in capability interruption logic

- - - - -
c858edad by Ben Gamari at 2020-11-30T12:21:36-05:00
rts/Capability: Use relaxed operations for last_free_capability

- - - - -
c2a64f77 by Ben Gamari at 2020-11-30T12:21:36-05:00
rts: Use relaxed operations for cap->running_task (TODO)

This shouldn't be necessary since only the owning thread of the capability
should be touching this.

- - - - -
86598ea5 by Ben Gamari at 2020-11-30T12:21:36-05:00
rts/Schedule: Use relaxed operations for sched_state

- - - - -
7111ed98 by Ben Gamari at 2020-11-30T12:21:36-05:00
rts: Accept data race in work-stealing implementation

This race is okay since the task is owned by the capability pushing it.
By Note [Ownership of Task] this means that the capability is free to
write to `task->cap` without taking `task->lock`.

Fixes #17276.

- - - - -
d1667414 by Ben Gamari at 2020-11-30T12:21:36-05:00
rts: Eliminate data races on pending_sync

- - - - -
c7085820 by Ben Gamari at 2020-11-30T12:21:36-05:00
rts/Schedule: Eliminate data races on recent_activity

We cannot safely use relaxed atomics here.

- - - - -
7b479a36 by Ben Gamari at 2020-11-30T12:21:36-05:00
rts: Avoid data races in message handling

- - - - -
367df852 by Ben Gamari at 2020-11-30T12:21:36-05:00
rts/Messages: Drop incredibly fishy write barrier

executeMessage previously had a write barrier at the beginning of its
loop apparently in an attempt to synchronize with another thread's
writes to the Message. I would guess that the author had intended to use
a load barrier here given that there are no globally-visible writes done
in executeMessage.

I've removed the redundant barrier since the necessary load barrier is
now provided by the ACQUIRE_LOAD.

- - - - -
2e3c82f3 by Ben Gamari at 2020-11-30T12:21:36-05:00
rts/ThreadPaused: Avoid data races

- - - - -
fae4a32d by Ben Gamari at 2020-11-30T12:21:36-05:00
rts/Schedule: Eliminate data races in run queue management

- - - - -
4f3ad188 by Ben Gamari at 2020-11-30T12:21:36-05:00
rts: Eliminate shutdown data race on task counters

- - - - -
3e595d5a by Ben Gamari at 2020-11-30T12:21:36-05:00
rts/Threads: Avoid data races (TODO)

Replace barriers with appropriate ordering. Drop redundant barrier in
tryWakeupThread (the RELEASE barrier will be provided by sendMessage's
mutex release).

We use relaxed operations on why_blocked and the stack although it's not
clear to me why this is necessary.

- - - - -
5099274d by Ben Gamari at 2020-11-30T12:21:36-05:00
rts/Messages: Annotate benign race

- - - - -
1236fbe0 by Ben Gamari at 2020-11-30T12:21:36-05:00
rts/RaiseAsync: Synchronize what_next read

- - - - -
476c4a8a by Ben Gamari at 2020-11-30T12:21:36-05:00
rts/Task: Move debugTrace to avoid data race

Specifically, we need to hold all_tasks_mutex to read taskCount.

- - - - -
8e8c7adf by Ben Gamari at 2020-11-30T12:21:36-05:00
Disable flawed assertion

- - - - -
3eebe524 by Ben Gamari at 2020-11-30T12:21:36-05:00
Document schedulePushWork race

- - - - -
cb1eb0e8 by Ben Gamari at 2020-11-30T12:21:36-05:00
Capability: Properly fix data race on n_returning_tasks

This is a real data race, but it can be made safe by using proper atomic
(but relaxed) accesses.

- - - - -
9863350a by Ben Gamari at 2020-11-30T12:21:36-05:00
rts: Make write of to_cap->inbox atomic

This is necessary since emptyInbox may read from to_cap->inbox without
taking cap->lock.

- - - - -
f81c1b02 by Ben Gamari at 2020-11-30T12:21:36-05:00
gitlab-ci: Add nightly-x86_64-linux-deb9-tsan job

- - - - -
b5855f96 by GHC GitLab CI at 2020-11-30T12:21:36-05:00
testsuite: Mark setnumcapabilities001 as broken with TSAN

Due to #18808.

- - - - -
21130520 by GHC GitLab CI at 2020-11-30T12:21:36-05:00
testsuite: Skip divbyzero and derefnull under TSAN

ThreadSanitizer changes the output of these tests.

- - - - -
227dc381 by Ben Gamari at 2020-11-30T12:21:36-05:00
testsuite: Skip high memory usage tests with TSAN

ThreadSanitizer significantly increases the memory footprint of tests,
so much so that it can send machines into OOM.

- - - - -
3930c9fe by Ben Gamari at 2020-11-30T12:21:36-05:00
testsuite: Mark hie002 as high_memory_usage

This test has a peak residency of 1 GB; this is large enough to
classify as "high" in my book.

- - - - -
c7424362 by Ben Gamari at 2020-11-30T12:21:36-05:00
testsuite: Mark T9872[abc] as high_memory_usage

These all have a maximum residency of over 2 GB.

- - - - -
8b418d39 by Ben Gamari at 2020-11-30T12:21:36-05:00
gitlab-ci: Disable documentation in TSAN build

Haddock chews through enough memory to cause the CI builders to OOM and
there's frankly no reason to build documentation in this job anyways.

- - - - -
cd0e033b by Ben Gamari at 2020-11-30T12:21:36-05:00
TSANUtils: Ensure that C11 atomics are supported

- - - - -
12bf63a7 by Ben Gamari at 2020-11-30T12:21:36-05:00
testsuite: Mark T3807 as broken with TSAN

Due to #18883.

- - - - -
f70cab90 by Ben Gamari at 2020-11-30T12:27:05-05:00
testsuite: Mark T13702 as broken with TSAN due to #18884

- - - - -
398f1ea9 by Ben Gamari at 2020-11-30T12:27:05-05:00
rts/BlockAlloc: Use relaxed operations

- - - - -
a3d1f39b by Ben Gamari at 2020-11-30T12:27:05-05:00
rts: Rework handling of mutlist scavenging statistics

- - - - -
e3526565 by Ben Gamari at 2020-11-30T12:27:05-05:00
rts: Avoid data races in StablePtr implementation

This fixes two potentially problematic data races in the StablePtr
implementation:

 * We would fail to RELEASE the stable pointer table when enlarging it,
   causing other cores to potentially see uninitialized memory.

 * We would fail to ACQUIRE when dereferencing a stable pointer.
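
The two halves of this fix can be pictured with a small sketch. It assumes
a table that grows by publishing an enlarged copy; `demo_table`,
`demo_append`, and `demo_deref` are hypothetical names, and the real
StablePtr code is organized differently.

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

static _Atomic(void **) demo_table;  /* hypothetical pointer table */
static size_t demo_table_len;        /* touched only by the locked writer */

/* Writer: assumed to run with the table lock held. Appends an entry by
 * publishing an enlarged copy; returns the new index, or -1 on failure. */
long demo_append(void *value) {
    void **old = atomic_load_explicit(&demo_table, memory_order_relaxed);
    void **bigger = malloc((demo_table_len + 1) * sizeof *bigger);
    if (bigger == NULL) {
        return -1;
    }
    if (demo_table_len > 0) {
        memcpy(bigger, old, demo_table_len * sizeof *bigger);
    }
    bigger[demo_table_len] = value;
    /* RELEASE: the copied entries become visible before the new pointer. */
    atomic_store_explicit(&demo_table, bigger, memory_order_release);
    /* The old table is deliberately not freed in this sketch, since
     * lock-free readers may still be using it. */
    return (long)demo_table_len++;
}

/* Reader: takes no lock. ACQUIRE pairs with the release store above. */
void *demo_deref(size_t ix) {
    void **tab = atomic_load_explicit(&demo_table, memory_order_acquire);
    return tab[ix];
}
```

Without the release/acquire pair, a reader on another core could see the
new table pointer before the entries copied into it.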

- - - - -
e1c1552f by Ben Gamari at 2020-11-30T12:27:05-05:00
rts/Storage: Use atomics

- - - - -
5182aac5 by Ben Gamari at 2020-11-30T12:27:05-05:00
rts/Updates: Use proper atomic operations

- - - - -
72481992 by Ben Gamari at 2020-11-30T12:27:05-05:00
rts/Weak: Eliminate data races

By taking all_tasks_mutex in stat_exit. Also better-document the fact
that the task statistics are protected by all_tasks_mutex.

- - - - -
199419f6 by Ben Gamari at 2020-11-30T12:27:06-05:00
rts/GC: Use atomics

- - - - -
d7cd64b9 by Ben Gamari at 2020-11-30T12:27:06-05:00
rts: Use RELEASE ordering in unlockClosure

- - - - -
26638062 by Ben Gamari at 2020-11-30T12:27:06-05:00
rts/Storage: Accept races on heap size counters

- - - - -
5d82ecd1 by Ben Gamari at 2020-11-30T12:27:06-05:00
rts: Join to concurrent mark thread during shutdown

Previously we would take all capabilities but fail to join on the thread
itself, potentially resulting in a leaked thread.

- - - - -
3f519f3f by GHC GitLab CI at 2020-11-30T12:27:06-05:00
rts: Fix race in GC CPU time accounting

Ensure that the GC leader synchronizes with workers before calling
stat_endGC.

- - - - -
a6938732 by Ben Gamari at 2020-11-30T12:27:06-05:00
rts/SpinLock: Separate out slow path

Not only is this in general a good idea, but it turns out that GCC
unrolls the retry loop, resulting in massive code bloat in critical
parts of the RTS (e.g. `evacuate`).
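
A minimal sketch of such a split, assuming a GCC- or Clang-style
`noinline` attribute. The names (`SlowPathLock`, `fast_acquire`,
`slowpath_acquire`) are hypothetical: the inlined fast path is a single
atomic exchange, and the retry loop lives in one out-of-line function.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_bool held; } SlowPathLock;

/* Slow path: kept out of line so the retry loop is emitted only once.
 * __attribute__((noinline)) is a GCC/Clang extension. */
__attribute__((noinline))
void slowpath_acquire(SlowPathLock *l) {
    while (atomic_exchange_explicit(&l->held, true, memory_order_acquire)) {
        /* spin; a real implementation would likely pause or yield here */
    }
}

/* Fast path: small enough to inline at every call site. */
static inline void fast_acquire(SlowPathLock *l) {
    if (atomic_exchange_explicit(&l->held, true, memory_order_acquire)) {
        slowpath_acquire(l);   /* contended: fall back to the loop */
    }
}

static inline void fast_release(SlowPathLock *l) {
    atomic_store_explicit(&l->held, false, memory_order_release);
}
```

Each call site then pays for one exchange and one rarely taken call rather
than a full copy of the loop.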

- - - - -
120a1fba by Ben Gamari at 2020-11-30T12:27:06-05:00
rts: Use relaxed ordering on spinlock counters

- - - - -
dcc0916a by Ben Gamari at 2020-11-30T12:27:06-05:00
rts: Annotate hopefully "benign" races in freeGroup

- - - - -
2131a961 by Ben Gamari at 2020-11-30T12:27:06-05:00
Strengthen ordering in releaseGCThreads

- - - - -
e90d9cfa by Ben Gamari at 2020-11-30T12:27:06-05:00
rts/WSDeque: Rewrite with proper atomics

After a few attempts at shoring up the previous implementation, I ended
up turning to the literature and now use the proven implementation,

> N.M. Lê, A. Pop, A. Cohen, and F.Z. Nardelli. "Correct and Efficient
> Work-Stealing for Weak Memory Models". PPoPP'13, February 2013,
> ACM 978-1-4503-1922/13/02.

Not only is this approach formally proven correct under C11 semantics
but it also proved to be a bit faster in practice.
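
For readers unfamiliar with the algorithm, the following is a simplified
sketch in the spirit of the cited paper's C11 deque. It is not the code in
WSDeque.c: it assumes a fixed capacity (no resizing), stores non-negative
`int` elements so that negative values can serve as sentinels, and uses
hypothetical names throughout.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_CAP   8      /* fixed capacity: a simplification, no resizing */
#define DEMO_EMPTY (-1)   /* sentinel: nothing to take or steal */
#define DEMO_ABORT (-2)   /* sentinel: lost a race; the caller may retry */

typedef struct {
    _Atomic int64_t top;            /* next index to steal from */
    _Atomic int64_t bottom;         /* next index the owner pushes to */
    _Atomic int buffer[DEMO_CAP];   /* elements must be non-negative */
} DemoDeque;

static void demo_deque_init(DemoDeque *q) {
    atomic_init(&q->top, 0);
    atomic_init(&q->bottom, 0);
    for (int i = 0; i < DEMO_CAP; i++) atomic_init(&q->buffer[i], 0);
}

/* Owner only: push at the bottom. Returns false when the deque is full. */
static bool demo_push(DemoDeque *q, int x) {
    int64_t b = atomic_load_explicit(&q->bottom, memory_order_relaxed);
    int64_t t = atomic_load_explicit(&q->top, memory_order_acquire);
    if (b - t >= DEMO_CAP) return false;
    atomic_store_explicit(&q->buffer[b % DEMO_CAP], x, memory_order_relaxed);
    /* Make the element visible before the new bottom. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&q->bottom, b + 1, memory_order_relaxed);
    return true;
}

/* Owner only: take from the bottom. */
static int demo_take(DemoDeque *q) {
    int64_t b = atomic_load_explicit(&q->bottom, memory_order_relaxed) - 1;
    atomic_store_explicit(&q->bottom, b, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    int64_t t = atomic_load_explicit(&q->top, memory_order_relaxed);
    int x = DEMO_EMPTY;
    if (t <= b) {
        x = atomic_load_explicit(&q->buffer[b % DEMO_CAP],
                                 memory_order_relaxed);
        if (t == b) {
            /* Last element: compete with thieves for it. */
            if (!atomic_compare_exchange_strong_explicit(
                    &q->top, &t, t + 1,
                    memory_order_seq_cst, memory_order_relaxed)) {
                x = DEMO_EMPTY;
            }
            atomic_store_explicit(&q->bottom, b + 1, memory_order_relaxed);
        }
    } else {
        /* Deque was already empty: restore bottom. */
        atomic_store_explicit(&q->bottom, b + 1, memory_order_relaxed);
    }
    return x;
}

/* Any thread: steal from the top. */
static int demo_steal(DemoDeque *q) {
    int64_t t = atomic_load_explicit(&q->top, memory_order_acquire);
    atomic_thread_fence(memory_order_seq_cst);
    int64_t b = atomic_load_explicit(&q->bottom, memory_order_acquire);
    if (t >= b) return DEMO_EMPTY;
    int x = atomic_load_explicit(&q->buffer[t % DEMO_CAP],
                                 memory_order_relaxed);
    if (!atomic_compare_exchange_strong_explicit(
            &q->top, &t, t + 1,
            memory_order_seq_cst, memory_order_relaxed)) {
        return DEMO_ABORT;
    }
    return x;
}
```

The owner works at the bottom without contention in the common case; only
the race for the last remaining element needs a compare-and-swap on `top`.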

- - - - -
e52ab4fa by Ben Gamari at 2020-11-30T12:27:06-05:00
rts: Use relaxed atomics for whitehole spin stats

- - - - -
e8705f29 by Ben Gamari at 2020-11-30T12:27:06-05:00
rts: Avoid lock order inversion during fork

Fixes #17275.

- - - - -
2af2f0be by GHC GitLab CI at 2020-11-30T12:27:06-05:00
rts: Use proper relaxed operations in getCurrentThreadCPUTime

Here we are doing lazy initialization; it's okay if we do the check more
than once, hence a relaxed operation is fine.

- - - - -
469872ec by Ben Gamari at 2020-11-30T12:27:06-05:00
rts/STM: Use atomics

This fixes a potentially harmful race where we failed to synchronize
before looking at a TVar's current_value.

Also did a bit of refactoring to abstract over the management of
max_commits.

- - - - -
5948bc76 by Ben Gamari at 2020-11-30T12:27:06-05:00
rts/stm: Strengthen orderings to SEQ_CST instead of volatile

Previously the `current_value`, `first_watch_queue_entry`, and
`num_updates` fields of `StgTVar` were marked as `volatile` in an
attempt to provide strong ordering. Of course, this isn't sufficient.

We now use proper atomic operations. In most of these cases I strengthen
the ordering all the way to SEQ_CST although it's possible that some
could be weakened with some thought.
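
To make the distinction concrete, here is a small sketch of a TVar-like
structure accessed through sequentially consistent atomics instead of
`volatile` fields. The field names follow the commit message, but the
struct layout and the helper functions (`tvar_init`, `tvar_write`,
`tvar_read`) are assumptions for illustration and are not the STM.c code.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for StgTVar; the layout is assumed. */
typedef struct {
    _Atomic(void *) current_value;
    _Atomic uint64_t num_updates;
} DemoTVar;

static void tvar_init(DemoTVar *tv) {
    atomic_init(&tv->current_value, NULL);
    atomic_init(&tv->num_updates, 0);
}

/* `volatile` only stops the compiler from eliding or merging accesses;
 * it provides neither atomicity nor inter-thread ordering. A SEQ_CST
 * atomic provides both. */
static void tvar_write(DemoTVar *tv, void *v) {
    atomic_store_explicit(&tv->current_value, v, memory_order_seq_cst);
    atomic_fetch_add_explicit(&tv->num_updates, 1, memory_order_seq_cst);
}

static void *tvar_read(DemoTVar *tv) {
    return atomic_load_explicit(&tv->current_value, memory_order_seq_cst);
}
```

SEQ_CST is the conservative choice; individual accesses could be weakened
to acquire or release only after checking which orderings each call site
actually relies on.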

- - - - -
311b0184 by Ben Gamari at 2020-11-30T12:27:06-05:00
Mitigate data races in event manager startup/shutdown

- - - - -
513e9fe6 by Ben Gamari at 2020-11-30T12:27:06-05:00
Suppress data race due to close

This suppresses the other side of a race during shutdown.

- - - - -
899c985c by Ben Gamari at 2020-11-30T12:27:06-05:00
rts: Accept benign races in Proftimer

- - - - -
270ad47f by Ben Gamari at 2020-11-30T12:27:06-05:00
rts: Pause timer while changing capability count

This avoids #17289.

- - - - -
37601b97 by Ben Gamari at 2020-11-30T12:27:06-05:00
Fix #17289

- - - - -
4718c285 by Ben Gamari at 2020-11-30T12:27:06-05:00
suppress #17289 (ticker) race

- - - - -
982993ad by Ben Gamari at 2020-11-30T12:27:06-05:00
rts: Fix timer initialization

Previously `initScheduler` would attempt to pause the ticker and in so
doing acquire the ticker mutex. However, initTicker, which is
responsible for initializing said mutex, hadn't been called
yet.

- - - - -
95a8bd76 by Ben Gamari at 2020-11-30T12:27:06-05:00
rts: Fix races in Pthread timer backend shutdown

We can generally be pretty relaxed in the barriers here since the timer
thread is a loop.

- - - - -
4b5f3764 by Ben Gamari at 2020-11-30T12:27:06-05:00
rts/Stats: Hide a few unused unnecessarily global functions

- - - - -
e7b0b74f by Ben Gamari at 2020-11-30T12:27:06-05:00
rts/Stats: Protect with mutex

While on face value this seems a bit heavy, I think it's far better than
enforcing ordering on every access.
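
The trade-off can be sketched as follows, assuming a statistics structure
guarded by a single pthread mutex. `DemoStats`, `demo_record_gc`, and
`demo_get_stats` are hypothetical names; the real code in Stats.c covers
many more fields.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical statistics block; the fields are illustrative. */
typedef struct {
    uint64_t gcs;
    uint64_t bytes_copied;
} DemoStats;

static DemoStats demo_stats;
static pthread_mutex_t demo_stats_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Writer: update several fields as one consistent unit. */
void demo_record_gc(uint64_t bytes) {
    pthread_mutex_lock(&demo_stats_mutex);
    demo_stats.gcs += 1;
    demo_stats.bytes_copied += bytes;
    pthread_mutex_unlock(&demo_stats_mutex);
}

/* Reader: take a snapshot under the lock and return it by value. */
DemoStats demo_get_stats(void) {
    pthread_mutex_lock(&demo_stats_mutex);
    DemoStats snapshot = demo_stats;
    pthread_mutex_unlock(&demo_stats_mutex);
    return snapshot;
}
```

With the mutex, the fields stay plain integers and a reader always sees a
mutually consistent snapshot, which per-field atomics alone would not
guarantee.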

- - - - -
1fed9ab5 by Ben Gamari at 2020-11-30T12:27:06-05:00
rts: Tear down stats_mutex after exitHeapProfiling

Since the latter wants to call getRTSStats.

- - - - -
f6f0343b by Ben Gamari at 2020-11-30T12:27:06-05:00
rts/Stats: Reintroduce mut_user_time

Fix the previous backport; this function was dead code in master but is
still needed due to ProfHeap.c in ghc-8.10.

- - - - -


30 changed files:

- .gitlab-ci.yml
- compiler/llvmGen/LlvmCodeGen/Base.hs
- hadrian/hadrian.cabal
- hadrian/src/Flavour.hs
- hadrian/src/Settings.hs
- + hadrian/src/Settings/Flavours/ThreadSanitizer.hs
- includes/Rts.h
- includes/rts/OSThreads.h
- includes/rts/SpinLock.h
- includes/rts/StablePtr.h
- + includes/rts/TSANUtils.h
- includes/rts/storage/ClosureMacros.h
- includes/rts/storage/Closures.h
- includes/rts/storage/GC.h
- includes/stg/SMP.h
- libraries/base/GHC/Event/Control.hs
- + rts/.tsan-suppressions
- rts/Capability.c
- rts/Capability.h
- rts/Messages.c
- rts/Proftimer.c
- rts/RaiseAsync.c
- rts/RtsStartup.c
- rts/SMPClosureOps.h
- rts/STM.c
- rts/Schedule.c
- rts/Schedule.h
- rts/Sparks.c
- + rts/SpinLock.c
- rts/StablePtr.c


The diff was not included because it is too large.


View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/a3152aa057644dac7b8df4c30c3034d3ab180748...f6f0343bd8dc1ea5d8085ed551bcb3400b72b694


