[Git][ghc/ghc][wip/T18857] 132 commits: SMP.h: Add C11-style atomic operations

Moritz Angermann gitlab at gitlab.haskell.org
Fri Nov 13 02:39:45 UTC 2020



Moritz Angermann pushed to branch wip/T18857 at Glasgow Haskell Compiler / GHC


Commits:
b9d4dd9c by Ben Gamari at 2020-10-24T20:44:17-04:00
SMP.h: Add C11-style atomic operations

- - - - -
ccf2d4b0 by Ben Gamari at 2020-10-24T20:59:39-04:00
rts: Infrastructure for testing with ThreadSanitizer

- - - - -
a61f66d6 by Ben Gamari at 2020-10-24T20:59:39-04:00
rts/CNF: Initialize all bdescrs in group

It seems wise and cheap to ensure that the whole bdescr of all blocks of
a compact group is valid, even if most cases only look at the flags
field.

- - - - -
65136c13 by Ben Gamari at 2020-10-24T20:59:39-04:00
rts/Capability: Intialize interrupt field

Previously this was left uninitialized.

Also clarify some comments.

- - - - -
b3ce6aca by Ben Gamari at 2020-10-24T20:59:39-04:00
rts/Task: Make comments proper Notes

- - - - -
d3890ac7 by Ben Gamari at 2020-10-24T20:59:39-04:00
rts/SpinLock: Move to proper atomics

This is fairly straightforward; we just needed to use relaxed operations
for the PROF_SPIN counters and a release store instead of a write
barrier.

- - - - -
ef88712f by Ben Gamari at 2020-10-24T20:59:39-04:00
rts/OSThreads: Fix data race

Previously we would race on the cached processor count. Avoiding this is
straightforward; just use relaxed operations.

- - - - -
33a719c3 by Ben Gamari at 2020-10-24T20:59:39-04:00
rts/ClosureMaros: Use relaxed atomics

- - - - -
f08951fd by Ben Gamari at 2020-10-24T20:59:39-04:00
configure: Bump minimum-supported gcc version to 4.7

Since the __atomic_* builtins are not supported until gcc 4.7. Given
that this version was released in 2012 I think this is acceptable.

- - - - -
d584923a by Ben Gamari at 2020-10-24T20:59:39-04:00
testsuite: Fix thread leak in hs_try_putmvar00[13]

- - - - -
bf1b0bc7 by Ben Gamari at 2020-10-24T20:59:39-04:00
rts: Introduce SET_HDR_RELEASE

Also ensure that we also store the info table pointer last to ensure
that the synchronization covers all stores.

- - - - -
1a2e9f5e by Ben Gamari at 2020-10-24T21:00:19-04:00
gitlab-ci: Add nightly-x86_64-linux-deb9-tsan job

- - - - -
58a5b0e5 by GHC GitLab CI at 2020-10-24T21:00:19-04:00
testsuite: Mark setnumcapabilities001 as broken with TSAN

Due to #18808.

- - - - -
d9bc7dea by GHC GitLab CI at 2020-10-24T21:00:19-04:00
testsuite: Skip divbyzero and derefnull under TSAN

ThreadSanitizer changes the output of these tests.

- - - - -
fcc42a10 by Ben Gamari at 2020-10-24T21:00:19-04:00
testsuite: Skip high memory usage tests with TSAN

ThreadSanitizer significantly increases the memory footprint of tests,
so much so that it can send machines into OOM.

- - - - -
cae4bb3e by Ben Gamari at 2020-10-24T21:00:19-04:00
testsuite: Mark hie002 as high_memory_usage

This test has a peak residency of 1GByte; this is large enough to
classify as "high" in my book.

- - - - -
dae1b86a by Ben Gamari at 2020-10-24T21:00:19-04:00
testsuite: Mark T9872[abc] as high_memory_usage

These all have a maximum residency of over 2 GB.

- - - - -
c5a0bb22 by Ben Gamari at 2020-10-24T21:00:19-04:00
gitlab-ci: Disable documentation in TSAN build

Haddock chews through enough memory to cause the CI builders to OOM and
there's frankly no reason to build documentation in this job anyways.

- - - - -
4cb1232e by Ben Gamari at 2020-10-24T21:00:19-04:00
TSANUtils: Ensure that C11 atomics are supported

- - - - -
7ed15f7f by Ben Gamari at 2020-10-24T21:00:19-04:00
testsuite: Mark T3807 as broken with TSAN

Due to #18883.

- - - - -
f7e6f012 by Ben Gamari at 2020-10-24T21:00:19-04:00
testsuite: Mark T13702 as broken with TSAN due to #18884

- - - - -
16b136b0 by Ben Gamari at 2020-10-24T21:00:36-04:00
rts: Factor out logic to identify a good capability for running a task

Not only does this make the control flow a bit clearer but it also
allows us to add a TSAN suppression on this logic, which requires
(harmless) data races.

- - - - -
2781d68c by Ben Gamari at 2020-10-24T21:00:36-04:00
rts: Annotate benign race in waitForCapability

- - - - -
f6b4b492 by Ben Gamari at 2020-10-24T21:00:36-04:00
rts: Clarify locking behavior of releaseCapability_

- - - - -
65219810 by Ben Gamari at 2020-10-24T21:00:36-04:00
rts: Add assertions for task ownership of capabilities

- - - - -
31fa87ec by Ben Gamari at 2020-10-24T21:00:36-04:00
rts: Use relaxed atomics on n_returning_tasks

This mitigates the warning of a benign race on n_returning_tasks in
shouldYieldCapability.

See #17261.

- - - - -
6517a2ea by Ben Gamari at 2020-10-24T21:00:36-04:00
rts: Mitigate races in capability interruption logic

- - - - -
2e9ba3f2 by Ben Gamari at 2020-10-24T21:00:36-04:00
rts/Capability: Use relaxed operations for last_free_capability

- - - - -
e10dde37 by Ben Gamari at 2020-10-24T21:00:37-04:00
rts: Use relaxed operations for cap->running_task (TODO)

This shouldn't be necessary since only the owning thread of the capability
should be touching this.

- - - - -
855325cd by Ben Gamari at 2020-10-24T21:00:37-04:00
rts/Schedule: Use relaxed operations for sched_state

- - - - -
811f915d by Ben Gamari at 2020-10-24T21:00:37-04:00
rts: Accept data race in work-stealing implementation

This race is okay since the task is owned by the capability pushing it.
By Note [Ownership of Task] this means that the capability is free to
write to `task->cap` without taking `task->lock`.

Fixes #17276.

- - - - -
8d2b3c3d by Ben Gamari at 2020-10-24T21:00:37-04:00
rts: Eliminate data races on pending_sync

- - - - -
f8871018 by Ben Gamari at 2020-10-24T21:00:37-04:00
rts/Schedule: Eliminate data races on recent_activity

We cannot safely use relaxed atomics here.

- - - - -
d079b943 by Ben Gamari at 2020-10-24T21:00:37-04:00
rts: Avoid data races in message handling

- - - - -
06f80497 by Ben Gamari at 2020-10-24T21:00:37-04:00
rts/Messages: Drop incredibly fishy write barrier

executeMessage previously had a write barrier at the beginning of its
loop apparently in an attempt to synchronize with another thread's
writes to the Message. I would guess that the author had intended to use
a load barrier here given that there are no globally-visible writes done
in executeMessage.

I've removed the redundant barrier since the necessary load barrier is
now provided by the ACQUIRE_LOAD.

- - - - -
d4a87779 by Ben Gamari at 2020-10-24T21:00:38-04:00
rts/ThreadPaused: Avoid data races

- - - - -
56778ab3 by Ben Gamari at 2020-10-24T21:00:38-04:00
rts/Schedule: Eliminate data races in run queue management

- - - - -
086521f7 by Ben Gamari at 2020-10-24T21:00:38-04:00
rts: Eliminate shutdown data race on task counters

- - - - -
abad9778 by Ben Gamari at 2020-10-24T21:00:38-04:00
rts/Threads: Avoid data races (TODO)

Replace barriers with appropriate ordering. Drop redundant barrier in
tryWakeupThread (the RELEASE barrier will be provided by sendMessage's
mutex release).

We use relaxed operations on why_blocked and the stack although it's not
clear to me why this is necessary.

- - - - -
2f56be8a by Ben Gamari at 2020-10-24T21:00:39-04:00
rts/Messages: Annotate benign race

- - - - -
7c0cdab1 by Ben Gamari at 2020-10-24T21:00:39-04:00
rts/RaiseAsync: Synchronize what_next read

- - - - -
6cc2a8a5 by Ben Gamari at 2020-10-24T21:00:39-04:00
rts/Task: Move debugTrace to avoid data race

Specifically, we need to hold all_tasks_mutex to read taskCount.

- - - - -
bbaec97d by Ben Gamari at 2020-10-24T21:00:39-04:00
Disable flawed assertion

- - - - -
dd175a92 by Ben Gamari at 2020-10-24T21:00:39-04:00
Document schedulePushWork race

- - - - -
3416244b by Ben Gamari at 2020-10-24T21:00:40-04:00
Capabiliity: Properly fix data race on n_returning_tasks

There is a real data race but can be made safe by using proper atomic
(but relaxed) accesses.

- - - - -
dffd9432 by Ben Gamari at 2020-10-24T21:00:40-04:00
rts: Make write of to_cap->inbox atomic

This is necessary since emptyInbox may read from to_cap->inbox without
taking cap->lock.

- - - - -
1f4cbc29 by Ben Gamari at 2020-10-24T21:00:57-04:00
rts/BlockAlloc: Use relaxed operations

- - - - -
d0d07cff by Ben Gamari at 2020-10-24T21:00:57-04:00
rts: Rework handling of mutlist scavenging statistics

- - - - -
9e5c7f6d by Ben Gamari at 2020-10-24T21:00:57-04:00
rts: Avoid data races in StablePtr implementation

This fixes two potentially problematic data races in the StablePtr
implementation:

 * We would fail to RELEASE the stable pointer table when enlarging it,
   causing other cores to potentially see uninitialized memory.

 * We would fail to ACQUIRE when dereferencing a stable pointer.

- - - - -
316add67 by Ben Gamari at 2020-10-24T21:00:57-04:00
rts/Storage: Use atomics

- - - - -
5c23bc4c by Ben Gamari at 2020-10-24T21:00:58-04:00
rts/Updates: Use proper atomic operations

- - - - -
3d0f033c by Ben Gamari at 2020-10-24T21:00:58-04:00
rts/Weak: Eliminate data races

By taking all_tasks_mutex in stat_exit. Also better-document the fact
that the task statistics are protected by all_tasks_mutex.

- - - - -
edb4b92b by Ben Gamari at 2020-10-24T21:01:18-04:00
rts/WSDeque: Rewrite with proper atomics

After a few attempts at shoring up the previous implementation, I ended
up turning to the literature and now use the proven implementation,

> N.M. Lê, A. Pop, A.Cohen, and F.Z. Nardelli. "Correct and Efficient
> Work-Stealing for Weak Memory Models". PPoPP'13, February 2013,
> ACM 978-1-4503-1922/13/02.

Note only is this approach formally proven correct under C11 semantics
but it is also proved to be a bit faster in practice.

- - - - -
d39bbd3d by Ben Gamari at 2020-10-24T21:01:33-04:00
rts: Use relaxed atomics for whitehole spin stats

- - - - -
8f802f38 by Ben Gamari at 2020-10-24T21:01:33-04:00
rts: Avoid lock order inversion during fork

Fixes #17275.

- - - - -
cef667b0 by GHC GitLab CI at 2020-10-24T21:01:34-04:00
rts: Use proper relaxe operations in getCurrentThreadCPUTime

Here we are doing lazy initialization; it's okay if we do the check more
than once, hence relaxed operation is fine.

- - - - -
8cf50eb1 by Ben Gamari at 2020-10-24T21:01:54-04:00
rts/STM: Use atomics

This fixes a potentially harmful race where we failed to synchronize
before looking at a TVar's current_value.

Also did a bit of refactoring to avoid abstract over management of
max_commits.

- - - - -
88a7ce38 by Ben Gamari at 2020-10-24T21:01:54-04:00
rts/stm: Strengthen orderings to SEQ_CST instead of volatile

Previously the `current_value`, `first_watch_queue_entry`, and
`num_updates` fields of `StgTVar` were marked as `volatile` in an
attempt to provide strong ordering. Of course, this isn't sufficient.

We now use proper atomic operations. In most of these cases I strengthen
the ordering all the way to SEQ_CST although it's possible that some
could be weakened with some thought.

- - - - -
f97c59ce by Ben Gamari at 2020-10-24T21:02:11-04:00
Mitigate data races in event manager startup/shutdown

- - - - -
c7c3f8aa by Ben Gamari at 2020-10-24T21:02:22-04:00
rts: Accept benign races in Proftimer

- - - - -
5a98dfca by Ben Gamari at 2020-10-24T21:02:22-04:00
rts: Pause timer while changing capability count

This avoids #17289.

- - - - -
01d95525 by Ben Gamari at 2020-10-24T21:02:22-04:00
Fix #17289

- - - - -
9a528985 by Ben Gamari at 2020-10-24T21:02:23-04:00
suppress #17289 (ticker) race

- - - - -
1726ec41 by Ben Gamari at 2020-10-24T21:02:23-04:00
rts: Fix timer initialization

Previously `initScheduler` would attempt to pause the ticker and in so
doing acquire the ticker mutex. However, initTicker, which is
responsible for initializing said mutex, hadn't been called
yet.

- - - - -
bfbe4366 by Ben Gamari at 2020-10-24T21:02:23-04:00
rts: Fix races in Pthread timer backend shudown

We can generally be pretty relaxed in the barriers here since the timer
thread is a loop.

- - - - -
297acc71 by Ben Gamari at 2020-10-24T21:02:44-04:00
rts/Stats: Hide a few unused unnecessarily global functions

- - - - -
aad1f803 by Ben Gamari at 2020-10-30T00:41:14-04:00
rts/GC: Use atomics

- - - - -
d0bc0517 by Ben Gamari at 2020-10-30T00:41:14-04:00
rts: Use RELEASE ordering in unlockClosure

- - - - -
d44f5232 by Ben Gamari at 2020-10-30T00:41:14-04:00
rts/Storage: Accept races on heap size counters

- - - - -
4e4a7386 by Ben Gamari at 2020-10-30T00:41:14-04:00
rts: Join to concurrent mark thread during shutdown

Previously we would take all capabilities but fail to join on the thread
itself, potentially resulting in a leaked thread.

- - - - -
a80cc857 by GHC GitLab CI at 2020-10-30T00:41:14-04:00
rts: Fix race in GC CPU time accounting

Ensure that the GC leader synchronizes with workers before calling
stat_endGC.

- - - - -
105d43db by Ben Gamari at 2020-10-30T14:02:19-04:00
rts/SpinLock: Separate out slow path

Not only is this in general a good idea, but it turns out that GCC
unrolls the retry loop, resulting is massive code bloat in critical
parts of the RTS (e.g. `evacuate`).

- - - - -
f7b45cde by Ben Gamari at 2020-10-30T14:02:19-04:00
rts: Use relaxed ordering on spinlock counters

- - - - -
bd4abdc9 by Ben Gamari at 2020-11-01T01:10:31-04:00
testsuite: Add performance test for #18698

- - - - -
dfd27445 by Hécate at 2020-11-01T01:11:09-04:00
Add the proper HLint rules and remove redundant keywords from compiler

- - - - -
ce1bb995 by Hécate at 2020-11-01T08:52:08-05:00
Fix a leak in `transpose`

This patch was authored by David Feuer <david.feuer at gmail.com>

- - - - -
e63db32c by Ben Gamari at 2020-11-01T08:52:44-05:00
Scav: Use bd->gen_no instead of bd->gen->no

This potentially saves a cache miss per scavenge.

- - - - -
b1dda153 by Ben Gamari at 2020-11-01T12:58:36-05:00
rts/Stats: Protect with mutex

While on face value this seems a bit heavy, I think it's far better than
enforcing ordering on every access.

- - - - -
5c2e6bce by Ben Gamari at 2020-11-01T12:58:36-05:00
rts: Tear down stats_mutex after exitHeapProfiling

Since the latter wants to call getRTSStats.

- - - - -
ef25aaa1 by Ben Gamari at 2020-11-01T13:02:11-05:00
rts: Annotate hopefully "benign" races in freeGroup

- - - - -
3a181553 by Ben Gamari at 2020-11-01T13:02:18-05:00
Strengthen ordering in releaseGCThreads

- - - - -
af474f62 by Ben Gamari at 2020-11-01T13:05:38-05:00
Suppress data race due to close

This suppresses the other side of a race during shutdown.

- - - - -
b4686bff by Ben Gamari at 2020-11-01T13:09:59-05:00
Merge branch 'wip/tsan/ci' into wip/tsan/all

- - - - -
b8e66e0e by Ben Gamari at 2020-11-01T13:10:01-05:00
Merge branch 'wip/tsan/storage' into wip/tsan/all

- - - - -
375512cf by Ben Gamari at 2020-11-01T13:10:02-05:00
Merge branch 'wip/tsan/wsdeque' into wip/tsan/all

- - - - -
65ebf07e by Ben Gamari at 2020-11-01T13:10:03-05:00
Merge branch 'wip/tsan/misc' into wip/tsan/all

- - - - -
55c375d0 by Ben Gamari at 2020-11-01T13:10:04-05:00
Merge branch 'wip/tsan/stm' into wip/tsan/all

- - - - -
a9f75fe2 by Ben Gamari at 2020-11-01T13:10:06-05:00
Merge branch 'wip/tsan/event-mgr' into wip/tsan/all

- - - - -
8325d658 by Ben Gamari at 2020-11-01T13:10:24-05:00
Merge branch 'wip/tsan/timer' into wip/tsan/all

- - - - -
07e82ba5 by Ben Gamari at 2020-11-01T13:10:35-05:00
Merge branch 'wip/tsan/stats' into wip/tsan/all

- - - - -
4ce2f7d6 by GHC GitLab CI at 2020-11-02T23:45:06-05:00
testsuite: Add --top flag to driver

This allows us to make `config.top` a proper Path. Previously it was a
str, which caused the Ghostscript detection logic to break.

- - - - -
0b772221 by Ben Gamari at 2020-11-02T23:45:42-05:00
Document that ccall convention doesn't support varargs

We do not support foreign "C" imports of varargs functions. While this
works on amd64, in general the platform's calling convention may need
more type information that our Cmm representation can currently provide.
For instance, this is the case with Darwin's AArch64 calling convention.
Document this fact in the users guide and fix T5423 which makes use of a
disallowed foreign import.

Closes #18854.

- - - - -
81006a06 by David Eichmann at 2020-11-02T23:46:19-05:00
RtsAPI: pause and resume the RTS

The `rts_pause` and `rts_resume` functions have been added to `RtsAPI.h` and
allow an external process to completely pause and resume the RTS.

Co-authored-by: Sven Tennie <sven.tennie at gmail.com>
Co-authored-by: Matthew Pickering <matthewtpickering at gmail.com>
Co-authored-by: Ben Gamari <bgamari.foss at gmail.com>

- - - - -
bfb1e272 by Ryan Scott at 2020-11-02T23:46:55-05:00
Display results of GHC.Core.Lint.lint* functions consistently

Previously, the functions in `GHC.Core.Lint` used a patchwork of
different ways to display Core Lint errors:

* `lintPassResult` (which is the source of most Core Lint errors) renders
  Core Lint errors with a distinctive banner (e.g.,
  `*** Core Lint errors : in result of ... ***`) that sets them apart
  from ordinary GHC error messages.
* `lintAxioms`, in contrast, uses a completely different code path that
  displays Core Lint errors in a rather confusing manner. For example,
  the program in #18770 would give these results:

  ```
  Bug.hs:1:1: error:
      Bug.hs:12:1: warning:
          Non-*-like kind when *-like expected: RuntimeRep
          when checking the body of forall: 'TupleRep '[r]
          In the coercion axiom Bug.N:T :: []. Bug.T ~_R Any
          Substitution: [TCvSubst
                           In scope: InScope {r}
                           Type env: [axl :-> r]
                           Co env: []]
    |
  1 | {-# LANGUAGE DataKinds #-}
    | ^
  ```
* Further digging reveals that `GHC.IfaceToCore` displays Core Lint
  errors for iface unfoldings as though they were a GHC panic. See, for
  example, this excerpt from #17723:

  ```
  ghc: panic! (the 'impossible' happened)
    (GHC version 8.8.2 for x86_64-unknown-linux):
          Iface Lint failure
    In interface for Lib
    ...
  ```

This patch makes all of these code paths display Core Lint errors and
warnings consistently. I decided to adopt the conventions that
`lintPassResult` currently uses, as they appear to have been around the
longest (and look the best, in my subjective opinion). We now use the
`displayLintResult` function for all three scenarios mentioned above.
For example, here is what the Core Lint error for the program in #18770 looks
like after this patch:

```
[1 of 1] Compiling Bug              ( Bug.hs, Bug.o )
*** Core Lint errors : in result of TcGblEnv axioms ***
Bug.hs:12:1: warning:
    Non-*-like kind when *-like expected: RuntimeRep
    when checking the body of forall: 'TupleRep '[r_axn]
    In the coercion axiom N:T :: []. T ~_R Any
    Substitution: [TCvSubst
                     In scope: InScope {r_axn}
                     Type env: [axn :-> r_axn]
                     Co env: []]
*** Offending Program ***
axiom N:T :: T = Any -- Defined at Bug.hs:12:1
*** End of Offense ***

<no location info>: error:
Compilation had errors
```

Fixes #18770.

- - - - -
a9e5f52c by Simon Peyton Jones at 2020-11-02T23:47:31-05:00
Expand type synonyms with :kind!

The User's Guide claims that `:kind!` should expand type synonyms,
but GHCi wasn't doing this in practice. Let's just update the implementation
to match the specification in the User's Guide.

Fixes #13795. Fixes #18828.

Co-authored-by: Ryan Scott <ryan.gl.scott at gmail.com>

- - - - -
1370eda7 by Ben Gamari at 2020-11-02T23:48:06-05:00
hadrian: Don't capture RunTest output

There are a few reasons why capturing the output of the RunTest builder
is undesirable:

 * there is a large amount of output which then gets unnecessarily
   duplicated by Hadrian if the builder fails

 * the output may contain codepoints which are unrepresentable in the
   current codepage on Windows, causing Hadrian to crash

 * capturing the output causes the testsuite driver to disable
   its colorisation logic, making the output less legible.

- - - - -
78f2767d by Matthew Pickering at 2020-11-03T17:39:53-05:00
Update inlining flags documentation

- - - - -
14ce454f by Sylvain Henry at 2020-11-03T17:40:34-05:00
Linker: reorganize linker related code

Move linker related code into GHC.Linker. Previously it was scattered
into GHC.Unit.State, GHC.Driver.Pipeline, GHC.Runtime.Linker, etc.

Add documentation in GHC.Linker

- - - - -
616bec0d by Alan Zimmerman at 2020-11-03T17:41:10-05:00
Restrict Linear arrow %1 to exactly literal 1 only

This disallows `a %001 -> b`, and makes sure the type literal is
printed from its SourceText so it is clear why.

Closes #18888

- - - - -
3486ebe6 by Sylvain Henry at 2020-11-03T17:41:48-05:00
Hadrian: don't fail if ghc-tarballs dir doesn't exist

- - - - -
37f0434d by Sylvain Henry at 2020-11-03T17:42:26-05:00
Constant-folding: don't pass through GHC's Int/Word (fix #11704)

Constant-folding rules for integerToWord/integerToInt were performing
the following coercions at compilation time:

    integerToWord: target's Integer -> ghc's Word -> target's Word
    integerToInt : target's Integer -> ghc's Int -> target's Int

1) It was wrong for cross-compilers when GHC's word size is smaller than
   the target one. This patch avoids passing through GHC's word-sized
   types:

    integerToWord: target's Integer -> ghc's Integer -> target's Word
    integerToInt : target's Integer -> ghc's Integer -> target's Int

2) Additionally we didn't wrap the target word/int literal to make it
   fit into the target's range! This broke the invariant of literals
   only containing values in range.

   The existing code is wrong only with a 64-bit cross-compiling GHC,
   targeting a 32-bit platform, and performing constant folding on a
   literal that doesn't fit in a 32-bit word. If GHC was built with
   DEBUG, the assertion in GHC.Types.Literal.mkLitWord would fail.
   Otherwise the bad transformation would go unnoticed.

- - - - -
bff74de7 by Sylvain Henry at 2020-11-03T17:43:03-05:00
Bignum: make GMP's bignat_add not recursive

bignat_add was a loopbreaker with an INLINE pragma (spotted by
@mpickering). This patch makes it non recursive to avoid the issue.

- - - - -
bb100805 by Andreas Klebinger at 2020-11-04T16:47:24-05:00
NCG: Fix 64bit int comparisons on 32bit x86

We no compare these by doing 64bit subtraction and
checking the resulting flags.

We used to do this differently but the old approach was
broken when the high bits compared equal and the comparison
was one of >= or <=.

The new approach should be both correct and faster.

- - - - -
b790b7f9 by Andreas Klebinger at 2020-11-04T16:47:59-05:00
Testsuite: Support for user supplied package dbs

We can now supply additional package dbs to the testsuite.
For make the package db can be supplied by
passing PACKAGE_DB=/path/to/db.

In the testsuite driver it's passed via the --test-package-db
argument.

- - - - -
81560981 by Sylvain Henry at 2020-11-04T16:48:42-05:00
Don't use LEA with 8-bit registers (#18614)

- - - - -
17d5c518 by Viktor Dukhovni at 2020-11-05T00:50:23-05:00
Naming, value types and tests for Addr# atomics

The atomic Exchange and CAS operations on integral types are updated to
take and return more natural `Word#` rather than `Int#` values.  These
are bit-block not arithmetic operations, and the sign bit plays no
special role.

Standardises the names to `atomic<OpType><ValType>Addr#`, where `OpType` is one
of `Cas` or `Exchange` and `ValType` is presently either `Word` or `Addr`.
Eventually, variants for `Word32` and `Word64` can and should be added,
once #11953 and related issues (e.g. #13825) are resolved.

Adds tests for `Addr#` CAS that mirror existing tests for
`MutableByteArray#`.

- - - - -
2125b1d6 by Ryan Scott at 2020-11-05T00:51:01-05:00
Add a regression test for #18920

Commit f594a68a5500696d94ae36425bbf4d4073aca3b2
(`Use level numbers for generalisation`) ended up fixing #18920. Let's add a
regression test to ensure that it stays fixed.

Fixes #18920.

- - - - -
e07e383a by Ryan Scott at 2020-11-06T03:45:28-05:00
Replace HsImplicitBndrs with HsOuterTyVarBndrs

This refactors the GHC AST to remove `HsImplicitBndrs` and replace it with
`HsOuterTyVarBndrs`, a type which records whether the outermost quantification
in a type is explicit (i.e., with an outermost, invisible `forall`) or
implicit. As a result of this refactoring, it is now evident in the AST where
the `forall`-or-nothing rule applies: it's all the places that use
`HsOuterTyVarBndrs`. See the revamped `Note [forall-or-nothing rule]` in
`GHC.Hs.Type` (previously in `GHC.Rename.HsType`).

Moreover, the places where `ScopedTypeVariables` brings lexically scoped type
variables into scope are a subset of the places that adhere to the
`forall`-or-nothing rule, so this also makes places that interact with
`ScopedTypeVariables` easier to find. See the revamped
`Note [Lexically scoped type variables]` in `GHC.Hs.Type` (previously in
`GHC.Tc.Gen.Sig`).

`HsOuterTyVarBndrs` are used in type signatures (see `HsOuterSigTyVarBndrs`)
and type family equations (see `HsOuterFamEqnTyVarBndrs`). The main difference
between the former and the latter is that the former cares about specificity
but the latter does not.

There are a number of knock-on consequences:

* There is now a dedicated `HsSigType` type, which is the combination of
  `HsOuterSigTyVarBndrs` and `HsType`. `LHsSigType` is now an alias for an
  `XRec` of `HsSigType`.
* Working out the details led us to a substantial refactoring of
  the handling of explicit (user-written) and implicit type-variable
  bindings in `GHC.Tc.Gen.HsType`.

  Instead of a confusing family of higher order functions, we now
  have a local data type, `SkolemInfo`, that controls how these
  binders are kind-checked.

  It remains very fiddly, not fully satisfying. But it's better
  than it was.

Fixes #16762. Bumps the Haddock submodule.

Co-authored-by: Simon Peyton Jones <simonpj at microsoft.com>
Co-authored-by: Richard Eisenberg <rae at richarde.dev>
Co-authored-by: Zubin Duggal <zubin at cmi.ac.in>

- - - - -
c85f4928 by Sylvain Henry at 2020-11-06T03:46:08-05:00
Refactor -dynamic-too handling

1) Don't modify DynFlags (too much) for -dynamic-too: now when we
   generate dynamic outputs for "-dynamic-too", we only set "dynamicNow"
   boolean field in DynFlags instead of modifying several other fields.
   These fields now have accessors that take dynamicNow into account.

2) Use DynamicTooState ADT to represent -dynamic-too state. It's much
   clearer than the undocumented "DynamicTooConditional" that was used
   before.

As a result, we can finally remove the hscs_iface_dflags field in
HscRecomp. There was a comment on this field saying:

   "FIXME (osa): I don't understand why this is necessary, but I spent
   almost two days trying to figure this out and I couldn't .. perhaps
   someone who understands this code better will remove this later."

I don't fully understand the details, but it was needed because of the
changes made to the DynFlags for -dynamic-too.

There is still something very dubious in GHC.Iface.Recomp: we have to
disable the "dynamicNow" flag at some point for some Backpack's "heinous
hack" to continue to work. It may be because interfaces for indefinite
units are always non-dynamic, or because we mix and match dynamic and
non-dynamic interfaces (#9176), or something else, who knows?

- - - - -
2cb87909 by Moritz Angermann at 2020-11-06T03:46:44-05:00
[AArch64] Aarch64 Always PIC

- - - - -
b1d2c1f3 by Ben Gamari at 2020-11-06T03:47:19-05:00
rts/Sanity: Avoid nasty race in weak pointer sanity-checking

See Note [Racing weak pointer evacuation] for all of the gory details.

- - - - -
638f38c5 by Ben Gamari at 2020-11-08T09:29:16-05:00
Merge remote-tracking branch 'origin/wip/tsan/all'

- - - - -
22888798 by Ben Gamari at 2020-11-08T12:08:40-05:00
Fix haddock submodule

The previous merge mistakenly reverted it.

- - - - -
d445cf05 by Ben Gamari at 2020-11-10T10:26:20-05:00
rts/linker: Fix relocation overflow in PE linker

Previously the overflow check for the IMAGE_REL_AMD64_ADDR32NB
relocation failed to account for the signed nature of the value.
Specifically, the overflow check was:

    uint64_t v;
    v = S + A;
    if (v >> 32) { ... }

However, `v` ultimately needs to fit into 32-bits as a signed value.
Consequently, values `v > 2^31` in fact overflow yet this is not caught
by the existing overflow check.

Here we rewrite the overflow check to rather ensure that
`INT32_MIN <= v <= INT32_MAX`. There is now quite a bit of repetition
between the `IMAGE_REL_AMD64_REL32` and `IMAGE_REL_AMD64_ADDR32` cases
but I am leaving fixing this for future work.

This bug was first noticed by @awson.

Fixes #15808.

- - - - -
4c407f6e by Sylvain Henry at 2020-11-10T10:27:00-05:00
Export SPEC from GHC.Exts (#13681)

- - - - -
7814cd5b by David Eichmann at 2020-11-10T10:27:35-05:00
ghc-heap: expose decoding from heap representation

Co-authored-by: Sven Tennie <sven.tennie at gmail.com>
Co-authored-by: Matthew Pickering <matthewtpickering at gmail.com>
Co-authored-by: Ben Gamari <bgamari.foss at gmail.com>

- - - - -
fa344d33 by Richard Eisenberg at 2020-11-10T10:28:10-05:00
Add test case for #17186.

This got fixed sometime recently; not worth it trying to
figure out which commit.

- - - - -
2e63a0fb by David Eichmann at 2020-11-10T10:28:46-05:00
Add code comments for StgInfoTable and StgStack structs

- - - - -
fcfda909 by Ben Gamari at 2020-11-11T03:19:59-05:00
nativeGen: Make makeImportsDoc take an NCGConfig rather than DynFlags

It appears this was an oversight as there is no reason the full DynFlags
is necessary.

- - - - -
6e23695e by Ben Gamari at 2020-11-11T03:19:59-05:00
Move this_module into NCGConfig

In various places in the NCG we need the Module currently being
compiled. Let's move this into the environment instead of chewing threw
another register.

- - - - -
c6264a2d by Ben Gamari at 2020-11-11T03:20:00-05:00
codeGen: Produce local symbols for module-internal functions

It turns out that some important native debugging/profiling tools (e.g.
perf) rely only on symbol tables for function name resolution (as
opposed to using DWARF DIEs). However, previously GHC would emit
temporary symbols (e.g. `.La42b`) to identify module-internal
entities. Such symbols are dropped during linking and therefore not
visible to runtime tools (in addition to having rather un-helpful unique
names). For instance, `perf report` would often end up attributing all
cost to the libc `frame_dummy` symbol since Haskell code was no covered
by any proper symbol (see #17605).

We now rather follow the model of C compilers and emit
descriptively-named local symbols for module internal things. Since this
will increase object file size this behavior can be disabled with the
`-fno-expose-internal-symbols` flag.

With this `perf record` can finally be used against Haskell executables.
Even more, with `-g3` `perf annotate` provides inline source code.

- - - - -
584058dd by Ben Gamari at 2020-11-11T03:20:00-05:00
Enable -fexpose-internal-symbols when debug level >=2

This seems like a reasonable default as the object file size increases
by around 5%.

- - - - -
c34a4b98 by Ömer Sinan Ağacan at 2020-11-11T03:20:35-05:00
Fix and enable object unloading in GHCi

Fixes #16525 by tracking dependencies between object file symbols and
marking symbol liveness during garbage collection

See Note [Object unloading] in CheckUnload.c for details.

- - - - -
2782487f by Ray Shih at 2020-11-11T03:20:35-05:00
Add loadNativeObj and unloadNativeObj

(This change is originally written by niteria)

This adds two functions:
* `loadNativeObj`
* `unloadNativeObj`
and implements them for Linux.

They are useful if you want to load a shared object with Haskell code
using the system linker and have GHC call dlclose() after the
code is no longer referenced from the heap.

Using the system linker allows you to load the shared object
above outside the low-mem region. It also loads the DWARF sections
in a way that `perf` understands.

`dl_iterate_phdr` is what makes this implementation Linux specific.

- - - - -
7a65f9e1 by GHC GitLab CI at 2020-11-11T03:20:35-05:00
rts: Introduce highMemDynamic

- - - - -
e9e1b2e7 by GHC GitLab CI at 2020-11-11T03:20:35-05:00
Introduce test for dynamic library unloading

This uses the highMemDynamic flag introduced earlier to verify that
dynamic objects are properly unloaded.

- - - - -
5506f134 by Krzysztof Gogolewski at 2020-11-11T03:21:14-05:00
Force argument in setIdMult (#18925)

- - - - -
38aedc7a by Ben Gamari at 2020-11-12T21:39:41-05:00
CmmToLlvm: Declare signature for memcmp

Otherwise `opt` fails with:

    error: use of undefined value '@memcmp$def'

Fixes #18857.

- - - - -
e6f1d610 by Ben Gamari at 2020-11-12T21:39:41-05:00
gitlab-ci: Run LLVM job on appropriately-labelled MRs

Namely, those marked with the ~"LLVM backend" label

- - - - -
fe45f2c7 by Ben Gamari at 2020-11-12T21:39:41-05:00
gitlab-ci: Run LLVM builds on Debian 10

The current Debian 9 image doesn't provide LLVM 7.

- - - - -
40e89e37 by Ben Gamari at 2020-11-12T21:39:41-05:00
hadrian: Don't use -fllvm to bootstrap under LLVM flavour

Previously Hadrian's LLVM build flavours would use `-fllvm` for all
invocations, even those to stage0 GHC. This meant that we needed to keep
two LLVM versions around in all of the CI images. Moreover, it differed
from the behavior of the old make build system's llvm flavours.

Change this to reflect the behavior of the `make` build system, using
`-fllvm` only with the stage1 and stage2 compilers.

- - - - -
3c7f5854 by Moritz Angermann at 2020-11-12T21:39:41-05:00
fixup ShortText

- - - - -


30 changed files:

- .gitlab-ci.yml
- compiler/.hlint.yaml
- compiler/GHC.hs
- compiler/GHC/Builtin/primops.txt.pp
- compiler/GHC/Cmm/CLabel.hs
- compiler/GHC/Cmm/Graph.hs
- compiler/GHC/Cmm/Info/Build.hs
- compiler/GHC/Cmm/LayoutStack.hs
- compiler/GHC/Cmm/Pipeline.hs
- compiler/GHC/Cmm/ProcPoint.hs
- compiler/GHC/CmmToAsm.hs
- compiler/GHC/CmmToAsm/BlockLayout.hs
- compiler/GHC/CmmToAsm/CFG/Dominators.hs
- compiler/GHC/CmmToAsm/Config.hs
- compiler/GHC/CmmToAsm/Monad.hs
- compiler/GHC/CmmToAsm/PIC.hs
- compiler/GHC/CmmToAsm/PPC/CodeGen.hs
- compiler/GHC/CmmToAsm/Reg/Graph/Spill.hs
- compiler/GHC/CmmToAsm/SPARC/CodeGen.hs
- compiler/GHC/CmmToAsm/X86/CodeGen.hs
- compiler/GHC/CmmToAsm/X86/Cond.hs
- compiler/GHC/CmmToAsm/X86/Ppr.hs
- compiler/GHC/CmmToLlvm/Base.hs
- compiler/GHC/CmmToLlvm/CodeGen.hs
- compiler/GHC/Core.hs
- compiler/GHC/Core/Coercion.hs
- compiler/GHC/Core/Lint.hs
- compiler/GHC/Core/Make.hs
- compiler/GHC/Core/Opt/ConstantFold.hs
- compiler/GHC/Core/Opt/Exitify.hs


The diff was not included because it is too large.


View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/71fdb441d0af4767cb9fc797c3460ce1365e317f...3c7f5854d0c5c84de0cf735b511402539a243e72

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/71fdb441d0af4767cb9fc797c3460ce1365e317f...3c7f5854d0c5c84de0cf735b511402539a243e72
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20201112/54653f7f/attachment-0001.html>


More information about the ghc-commits mailing list