[Git][ghc/ghc][wip/boxed-rep] 148 commits: nativeGen: Make makeImportsDoc take an NCGConfig rather than DynFlags

Ben Gamari gitlab at gitlab.haskell.org
Mon Dec 14 23:49:07 UTC 2020



Ben Gamari pushed to branch wip/boxed-rep at Glasgow Haskell Compiler / GHC


Commits:
fcfda909 by Ben Gamari at 2020-11-11T03:19:59-05:00
nativeGen: Make makeImportsDoc take an NCGConfig rather than DynFlags

It appears this was an oversight as there is no reason the full DynFlags
is necessary.

- - - - -
6e23695e by Ben Gamari at 2020-11-11T03:19:59-05:00
Move this_module into NCGConfig

In various places in the NCG we need the Module currently being
compiled. Let's move this into the environment instead of chewing threw
another register.

- - - - -
c6264a2d by Ben Gamari at 2020-11-11T03:20:00-05:00
codeGen: Produce local symbols for module-internal functions

It turns out that some important native debugging/profiling tools (e.g.
perf) rely only on symbol tables for function name resolution (as
opposed to using DWARF DIEs). However, previously GHC would emit
temporary symbols (e.g. `.La42b`) to identify module-internal
entities. Such symbols are dropped during linking and therefore not
visible to runtime tools (in addition to having rather un-helpful unique
names). For instance, `perf report` would often end up attributing all
cost to the libc `frame_dummy` symbol since Haskell code was no covered
by any proper symbol (see #17605).

We now rather follow the model of C compilers and emit
descriptively-named local symbols for module internal things. Since this
will increase object file size this behavior can be disabled with the
`-fno-expose-internal-symbols` flag.

With this `perf record` can finally be used against Haskell executables.
Even more, with `-g3` `perf annotate` provides inline source code.

- - - - -
584058dd by Ben Gamari at 2020-11-11T03:20:00-05:00
Enable -fexpose-internal-symbols when debug level >=2

This seems like a reasonable default as the object file size increases
by around 5%.

- - - - -
c34a4b98 by Ömer Sinan Ağacan at 2020-11-11T03:20:35-05:00
Fix and enable object unloading in GHCi

Fixes #16525 by tracking dependencies between object file symbols and
marking symbol liveness during garbage collection

See Note [Object unloading] in CheckUnload.c for details.

- - - - -
2782487f by Ray Shih at 2020-11-11T03:20:35-05:00
Add loadNativeObj and unloadNativeObj

(This change is originally written by niteria)

This adds two functions:
* `loadNativeObj`
* `unloadNativeObj`
and implements them for Linux.

They are useful if you want to load a shared object with Haskell code
using the system linker and have GHC call dlclose() after the
code is no longer referenced from the heap.

Using the system linker allows you to load the shared object
above outside the low-mem region. It also loads the DWARF sections
in a way that `perf` understands.

`dl_iterate_phdr` is what makes this implementation Linux specific.

- - - - -
7a65f9e1 by GHC GitLab CI at 2020-11-11T03:20:35-05:00
rts: Introduce highMemDynamic

- - - - -
e9e1b2e7 by GHC GitLab CI at 2020-11-11T03:20:35-05:00
Introduce test for dynamic library unloading

This uses the highMemDynamic flag introduced earlier to verify that
dynamic objects are properly unloaded.

- - - - -
5506f134 by Krzysztof Gogolewski at 2020-11-11T03:21:14-05:00
Force argument in setIdMult (#18925)

- - - - -
787e93ae by Ben Gamari at 2020-11-11T23:14:11-05:00
testsuite: Add testcase for #18733

- - - - -
5353fd50 by Ben Gamari at 2020-11-12T10:05:30-05:00
compiler: Fix recompilation checking

In ticket #18733 we noticed a rather serious deficiency in the current
fingerprinting logic for recursive groups. I have described the old
fingerprinting story and its problems in Note [Fingerprinting recursive
groups] and have reworked the story accordingly to avoid these issues.

Fixes #18733.

- - - - -
63fa3997 by Sebastian Graf at 2020-11-13T14:29:39-05:00
Arity: Rework `ArityType` to fix monotonicity (#18870)

As we found out in #18870, `andArityType` is not monotone, with
potentially severe consequences for termination of fixed-point
iteration. That showed in an abundance of "Exciting arity" DEBUG
messages that are emitted whenever we do more than one step in
fixed-point iteration.

The solution necessitates also recording `OneShotInfo` info for
`ABot` arity type. Thus we get the following definition for `ArityType`:

```
data ArityType = AT [OneShotInfo] Divergence
```

The majority of changes in this patch are the result of refactoring use
sites of `ArityType` to match the new definition.

The regression test `T18870` asserts that we indeed don't emit any DEBUG
output anymore for a function where we previously would have.
Similarly, there's a regression test `T18937` for #18937, which we
expect to be broken for now.

Fixes #18870.

- - - - -
197d59fa by Sebastian Graf at 2020-11-13T14:29:39-05:00
Arity: Emit "Exciting arity" warning only after second iteration (#18937)

See Note [Exciting arity] why we emit the warning at all and why we only
do after the second iteration now.

Fixes #18937.

- - - - -
de7ec9dd by David Eichmann at 2020-11-13T14:30:16-05:00
Add rts_listThreads and rts_listMiscRoots to RtsAPI.h

These are used to find the current roots of the garbage collector.

Co-authored-by: Sven Tennie's avatarSven Tennie <sven.tennie at gmail.com>
Co-authored-by: Matthew Pickering's avatarMatthew Pickering <matthewtpickering at gmail.com>
Co-authored-by: default avatarBen Gamari <bgamari.foss at gmail.com>

- - - - -
24a86f09 by Ben Gamari at 2020-11-13T14:30:51-05:00
gitlab-ci: Cache cabal store in linting job

- - - - -
0a7e592c by Ben Gamari at 2020-11-15T03:35:45-05:00
nativeGen/dwarf: Fix procedure end addresses

Previously the `.debug_aranges` and `.debug_info` (DIE) DWARF
information would claim that procedures (represented with a
`DW_TAG_subprogram` DIE) would only span the range covered by their entry
block. This omitted all of the continuation blocks (represented by
`DW_TAG_lexical_block` DIEs), confusing `perf`. Fix this by introducing
a end-of-procedure label and using this as the `DW_AT_high_pc` of
procedure `DW_TAG_subprogram` DIEs

Fixes #17605.

- - - - -
1e19183d by Ben Gamari at 2020-11-15T03:35:45-05:00
nativeGen/dwarf: Only produce DW_AT_source_note DIEs in -g3

Standard debugging tools don't know how to understand these so let's not
produce them unless asked.

- - - - -
ad73370f by Ben Gamari at 2020-11-15T03:35:45-05:00
nativeGen/dwarf: Use DW_AT_linkage instead of DW_AT_MIPS_linkage

- - - - -
a2539650 by Ben Gamari at 2020-11-15T03:35:45-05:00
gitlab-ci: Add DWARF release jobs for Debian 10, Fedora27

- - - - -
d61adb3d by Ryan Scott at 2020-11-15T03:36:21-05:00
Name (tc)SplitForAll- functions more consistently

There is a zoo of `splitForAll-` functions in `GHC.Core.Type` (as well as
`tcSplitForAll-` functions in `GHC.Tc.Utils.TcType`) that all do very similar
things, but vary in the particular form of type variable that they return. To
make things worse, the names of these functions are often quite misleading.
Some particularly egregious examples:

* `splitForAllTys` returns `TyCoVar`s, but `splitSomeForAllTys` returns
  `VarBndr`s.
* `splitSomeForAllTys` returns `VarBndr`s, but `tcSplitSomeForAllTys` returns
  `TyVar`s.
* `splitForAllTys` returns `TyCoVar`s, but `splitForAllTysInvis` returns
  `InvisTVBinder`s. (This in particular arose in the context of #18939, and
  this finally motivated me to bite the bullet and improve the status quo
  vis-à-vis how we name these functions.)

In an attempt to bring some sanity to how these functions are named, I have
opted to rename most of these functions en masse to use consistent suffixes
that describe the particular form of type variable that each function returns.
In concrete terms, this amounts to:

* Functions that return a `TyVar` now use the suffix `-TyVar`.
  This caused the following functions to be renamed:
  * `splitTyVarForAllTys` -> `splitForAllTyVars`
  * `splitForAllTy_ty_maybe` -> `splitForAllTyVar_maybe`
  * `tcSplitForAllTys` -> `tcSplitForAllTyVars`
  * `tcSplitSomeForAllTys` -> `tcSplitSomeForAllTyVars`
* Functions that return a `CoVar` now use the suffix `-CoVar`.
  This caused the following functions to be renamed:
  * `splitForAllTy_co_maybe` -> `splitForAllCoVar_maybe`
* Functions that return a `TyCoVar` now use the suffix `-TyCoVar`.
  This caused the following functions to be renamed:
  * `splitForAllTy` -> `splitForAllTyCoVar`
  * `splitForAllTys` -> `splitForAllTyCoVars`
  * `splitForAllTys'` -> `splitForAllTyCoVars'`
  * `splitForAllTy_maybe` -> `splitForAllTyCoVar_maybe`
* Functions that return a `VarBndr` now use the suffix corresponding to the
  most relevant type synonym. This caused the following functions to be renamed:
  * `splitForAllVarBndrs` -> `splitForAllTyCoVarBinders`
  * `splitForAllTysInvis` -> `splitForAllInvisTVBinders`
  * `splitForAllTysReq` -> `splitForAllReqTVBinders`
  * `splitSomeForAllTys` -> `splitSomeForAllTyCoVarBndrs`
  * `tcSplitForAllVarBndrs` -> `tcSplitForAllTyVarBinders`
  * `tcSplitForAllTysInvis` -> `tcSplitForAllInvisTVBinders`
  * `tcSplitForAllTysReq` -> `tcSplitForAllReqTVBinders`
  * `tcSplitForAllTy_maybe` -> `tcSplitForAllTyVarBinder_maybe`

Note that I left the following functions alone:

* Functions that split apart things besides `ForAllTy`s, such as `splitFunTys`
  or `splitPiTys`. Thankfully, there are far fewer of these functions than
  there are functions that split apart `ForAllTy`s, so there isn't much of a
  pressing need to apply the new naming convention elsewhere.
* Functions that split apart `ForAllCo`s in `Coercion`s, such as
  `GHC.Core.Coercion.splitForAllCo_maybe`. We could theoretically apply the new
  naming convention here, but then we'd have to figure out how to disambiguate
  `Type`-splitting functions from `Coercion`-splitting functions. Ultimately,
  the `Coercion`-splitting functions aren't used nearly as much as the
  `Type`-splitting functions, so I decided to leave the former alone.

This is purely refactoring and should cause no change in behavior.

- - - - -
645444af by Ryan Scott at 2020-11-15T03:36:21-05:00
Use tcSplitForAllInvisTyVars (not tcSplitForAllTyVars) in more places

The use of `tcSplitForAllTyVars` in `tcDataFamInstHeader` was the immediate
cause of #18939, and replacing it with a new `tcSplitForAllInvisTyVars`
function (which behaves like `tcSplitForAllTyVars` but only splits invisible
type variables) fixes the issue. However, this led me to realize that _most_
uses of `tcSplitForAllTyVars` in GHC really ought to be
`tcSplitForAllInvisTyVars` instead. While I was in town, I opted to replace
most uses of `tcSplitForAllTys` with `tcSplitForAllTysInvis` to reduce the
likelihood of such bugs in the future.

I say "most uses" above since there is one notable place where we _do_ want
to use `tcSplitForAllTyVars`: in `GHC.Tc.Validity.forAllTyErr`, which produces
the "`Illegal polymorphic type`" error message if you try to use a higher-rank
`forall` without having `RankNTypes` enabled. Here, we really do want to split
all `forall`s, not just invisible ones, or we run the risk of giving an
inaccurate error message in the newly added `T18939_Fail` test case.

I debated at some length whether I wanted to name the new function
`tcSplitForAllInvisTyVars` or `tcSplitForAllTyVarsInvisible`, but in the end,
I decided that I liked the former better. For consistency's sake, I opted to
rename the existing `splitPiTysInvisible` and `splitPiTysInvisibleN` functions
to `splitInvisPiTys` and `splitPiTysInvisN`, respectively, so that they use the
same naming convention. As a consequence, this ended up requiring a `haddock`
submodule bump.

Fixes #18939.

- - - - -
8887102f by Moritz Angermann at 2020-11-15T03:36:56-05:00
AArch64/arm64 adjustments

This addes the necessary logic to support aarch64 on elf, as well
as aarch64 on mach-o, which Apple calls arm64.

We change architecture name to AArch64, which is the official arm
naming scheme.

- - - - -
fc644b1a by Ben Gamari at 2020-11-15T03:37:31-05:00
ghc-bin: Build with eventlogging by default

We now have all sorts of great facilities using the
eventlog which were previously unavailable without
building a custom GHC. Fix this by linking with
`-eventlog` by default.
- - - - -
52114fa0 by Sylvain Henry at 2020-11-16T11:48:47+01:00
Add Addr# atomic primops (#17751)

This reuses the codegen used for ByteArray#'s atomic primops.

- - - - -
8150f654 by Sebastian Graf at 2020-11-18T23:38:40-05:00
PmCheck: Print types of uncovered patterns (#18932)

In order to avoid confusion as in #18932, we display the type of the
match variables in the non-exhaustiveness warning, e.g.

```
T18932.hs:14:1: warning: [-Wincomplete-patterns]
    Pattern match(es) are non-exhaustive
    In an equation for ‘g’:
        Patterns of type  ‘T a’, ‘T a’, ‘T a’ not matched:
            (MkT2 _) (MkT1 _) (MkT1 _)
            (MkT2 _) (MkT1 _) (MkT2 _)
            (MkT2 _) (MkT2 _) (MkT1 _)
            (MkT2 _) (MkT2 _) (MkT2 _)
            ...
   |
14 | g (MkT1 x) (MkT1 _) (MkT1 _) = x
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```

It also allows us to omit the type signature on wildcard matches which
we previously showed in only some situations, particularly
`-XEmptyCase`.

Fixes #18932.

- - - - -
165352a2 by Krzysztof Gogolewski at 2020-11-20T02:08:36-05:00
Export indexError from GHC.Ix (#18579)

- - - - -
b57845c3 by Kamil Dworakowski at 2020-11-20T02:09:16-05:00
Clarify interruptible FFI wrt masking state

- - - - -
321d1bd8 by Sebastian Graf at 2020-11-20T02:09:51-05:00
Fix strictness signatures of `prefetchValue*#` primops

Their strictness signatures said the primops are strict in their first
argument, which is wrong: Handing it a thunk will prefetch the pointer
to the thunk, but not evaluate it. Hence not strict.

The regression test `T8256` actually tests for laziness in the first
argument, so GHC apparently never exploited the strictness signature.

See also https://gitlab.haskell.org/ghc/ghc/-/issues/8256#note_310867,
where this came up.

- - - - -
0aec78b6 by Sebastian Graf at 2020-11-20T02:09:51-05:00
Demand: Interleave usage and strictness demands (#18903)

As outlined in #18903, interleaving usage and strictness demands not
only means a more compact demand representation, but also allows us to
express demands that we weren't easily able to express before.

Call demands are *relative* in the sense that a call demand `Cn(cd)`
on `g` says "`g` is called `n` times. *Whenever `g` is called*, the
result is used according to `cd`". Example from #18903:

```hs
h :: Int -> Int
h m =
  let g :: Int -> (Int,Int)
      g 1 = (m, 0)
      g n = (2 * n, 2 `div` n)
      {-# NOINLINE g #-}
  in case m of
    1 -> 0
    2 -> snd (g m)
    _ -> uncurry (+) (g m)
```

Without the interleaved representation, we would just get `L` for the
strictness demand on `g`. Now we are able to express that whenever
`g` is called, its second component is used strictly in denoting `g`
by `1C1(P(1P(U),SP(U)))`. This would allow Nested CPR to unbox the
division, for example.

Fixes #18903.
While fixing regressions, I also discovered and fixed #18957.

Metric Decrease:
    T13253-spj

- - - - -
3a55b3a2 by Sebastian Graf at 2020-11-20T02:09:51-05:00
Update user's guide entry on demand analysis and worker/wrapper

The demand signature notation has been undocumented for a long time.
The only source to understand it, apart from reading the `Outputable`
instance, has been an outdated wiki page.

Since the previous commits have reworked the demand lattice, I took
it as an opportunity to also write some documentation about notation.

- - - - -
fc963932 by Greg Steuck at 2020-11-20T02:10:31-05:00
Find hadrian location more reliably in cabal-install output

Fix #18944

- - - - -
9f40cf6c by Ben Gamari at 2020-11-20T02:11:07-05:00
rts/linker: Align bssSize to page size when mapping symbol extras

We place symbol_extras right after bss. We also need
to ensure that symbol_extras can be mprotect'd independently from the
rest of the image. To ensure this we round up the size of bss to a page
boundary, thus ensuring that symbol_extras is also page-aligned.

- - - - -
b739c319 by Ben Gamari at 2020-11-20T02:11:43-05:00
gitlab-ci: Add usage message to ci.sh

- - - - -
802e9180 by Ben Gamari at 2020-11-20T02:11:43-05:00
gitlab-ci: Add VERBOSE environment variable

And change the make build system's default behavior to V=0, greatly
reducing build log sizes.

- - - - -
2a8a979c by Ben Gamari at 2020-11-21T01:13:26-05:00
users-guide: A bit of clean-up in profiling flag documentation

- - - - -
56804e33 by Ben Gamari at 2020-11-21T01:13:26-05:00
testsuite: Refactor CountParserDeps

- - - - -
53ad67ea by Ben Gamari at 2020-11-21T01:13:26-05:00
Introduce -fprof-callers flag

This introducing a new compiler flag to provide a convenient way to
introduce profiler cost-centers on all occurrences of the named
identifier.

Closes #18566.

- - - - -
ecfd0278 by Sylvain Henry at 2020-11-21T01:14:09-05:00
Move Plugins into HscEnv (#17957)

Loaded plugins have nothing to do in DynFlags so this patch moves them
into HscEnv (session state).

"DynFlags plugins" become "Driver plugins" to still be able to register
static plugins.

Bump haddock submodule

- - - - -
72f2257c by Sylvain Henry at 2020-11-21T01:14:09-05:00
Don't initialize plugins in the Core2Core pipeline

Some plugins can be added via TH (cf addCorePlugin). Initialize them in
the driver instead of in the Core2Core pipeline.

- - - - -
ddbeeb3c by Ryan Scott at 2020-11-21T01:14:44-05:00
Add regression test for #10504

This issue was fixed at some point between GHC 8.0 and 8.2. Let's add a
regression test to ensure that it stays fixed.

Fixes #10504.

- - - - -
a4a6dc2a by Ben Gamari at 2020-11-21T01:15:21-05:00
dwarf: Apply info table offset consistently

Previously we failed to apply the info table offset to the aranges and
DIEs, meaning that we often failed to unwind in gdb. For some reason
this only seemed to manifest in the RTS's Cmm closures. Nevertheless,
now we can unwind completely up to `main`

- - - - -
69bfbc21 by Ben Gamari at 2020-11-21T01:15:56-05:00
hadrian: Disable stripping when debug information is enabled

- - - - -
7e93ae8b by Ben Gamari at 2020-11-21T13:13:29-05:00
rts: Post ticky entry counts to the eventlog

We currently only post the entry counters, not the other global
counters as in my experience the former are more useful. We use the heap
profiler's census period to decide when to dump.

Also spruces up the documentation surrounding ticky-ticky a bit.

- - - - -
bc9c3916 by Ben Gamari at 2020-11-22T06:28:10-05:00
Implement -ddump-c-backend argument

To dump output of the C backend.

- - - - -
901bc220 by Ben Gamari at 2020-11-22T12:39:02-05:00
Bump time submodule to 1.11.1

Also bumps directory, Cabal, hpc, time, and unix submodules.

Closes #18847.

- - - - -
92c0afbf by Ben Gamari at 2020-11-22T12:39:38-05:00
hadrian: Dump STG when ticky is enabled

This changes the "ticky" modifier to enable dumping of final STG as this
is generally needed to make sense of the ticky profiles.

- - - - -
d23fef68 by Ben Gamari at 2020-11-22T12:39:38-05:00
hadrian: Introduce notion of flavour transformers

This extends Hadrian's notion of "flavour", as described in #18942.

- - - - -
179d0bec by Ben Gamari at 2020-11-22T12:39:38-05:00
hadrian: Add a viaLlvmBackend modifier

Note that this also slightly changes the semantics of these flavours as
we only use LLVM for >= stage1 builds.

- - - - -
d4d95e51 by Ben Gamari at 2020-11-22T12:39:38-05:00
hadrian: Add profiled_ghc and no_dynamic_ghc modifiers

- - - - -
6815603f by Ben Gamari at 2020-11-22T12:39:38-05:00
hadrian: Drop redundant flavour definitions

Drop the profiled, LLVM, and ThreadSanitizer flavour definitions as
these can now be realized with flavour transformers.

- - - - -
f88f4339 by Ben Gamari at 2020-11-24T02:43:20-05:00
rts: Flush eventlog buffers from flushEventLog

As noted in #18043, flushTrace failed flush anything beyond the writer.
This means that a significant amount of data sitting in capability-local
event buffers may never get flushed, despite the users' pleads for us to
flush.

Fix this by making flushEventLog flush all of the event buffers before
flushing the writer.

Fixes #18043.

- - - - -
7c03cc50 by Ben Gamari at 2020-11-24T02:43:55-05:00
gitlab-ci: Run LLVM job on appropriately-labelled MRs

Namely, those marked with the ~"LLVM backend" label

- - - - -
9b95d815 by Ben Gamari at 2020-11-24T02:43:55-05:00
gitlab-ci: Run LLVM builds on Debian 10

The current Debian 9 image doesn't provide LLVM 7.

- - - - -
2ed3e6c0 by Ben Gamari at 2020-11-24T02:43:55-05:00
CmmToLlvm: Declare signature for memcmp

Otherwise `opt` fails with:

    error: use of undefined value '@memcmp$def'

- - - - -
be5d74ca by Moritz Angermann at 2020-11-26T16:00:32-05:00
[Sized Cmm] properly retain sizes.

This replaces all Word<N> = W<N># Word# and Int<N> = I<N># Int#  with
Word<N> = W<N># Word<N># and Int<N> = I<N># Int<N>#, thus providing us
with properly sized primitives in the codegenerator instead of pretending
they are all full machine words.

This came up when implementing darwinpcs for arm64.  The darwinpcs reqires
us to pack function argugments in excess of registers on the stack.  While
most procedure call standards (pcs) assume arguments are just passed in
8 byte slots; and thus the caller does not know the exact signature to make
the call, darwinpcs requires us to adhere to the prototype, and thus have
the correct sizes.  If we specify CInt in the FFI call, it should correspond
to the C int, and not just be Word sized, when it's only half the size.

This does change the expected output of T16402 but the new result is no
less correct as it eliminates the narrowing (instead of the `and` as was
previously done).

Bumps the array, bytestring, text, and binary submodules.

Co-Authored-By: Ben Gamari <ben at well-typed.com>

Metric Increase:
    T13701
    T14697

- - - - -
a84e53f9 by Andreas Klebinger at 2020-11-26T16:00:32-05:00
RTS: Fix failed inlining of copy_tag.

On windows using gcc-10 gcc failed to inline copy_tag into evacuate.

To fix this we now set the always_inline attribute for the various
copy* functions in Evac.c. The main motivation here is not the
overhead of the function call, but rather that this allows the code
to "specialize" for the size of the closure we copy which is often
known at compile time.

An earlier commit also tried to avoid evacuate_large inlining. But
didn't quite succeed. So I also marked evacuate_large as noinline.

Fixes #12416

- - - - -
cdbd16f5 by Sylvain Henry at 2020-11-26T16:00:33-05:00
Fix toArgRep to support 64-bit reps on all systems

[This is @Ericson2314 writing a commit message for @hsyl20's patch.]

(Progress towards #11953, #17377, #17375)

`Int64Rep` and `Word64Rep` are currently broken on 64-bit systems.  This
is because they should use "native arg rep" but instead use "large arg
rep" as they do on 32-bit systems, which is either a non-concept or a
128-bit rep depending on one's vantage point.

Now, these reps currently aren't used during 64-bit compilation, so the
brokenness isn't observed, but I don't think that constitutes reasons
not to fix it. Firstly, the linked issues there is a clearly expressed
desire to use explicit-bitwidth constructs in more places. Secondly, per
[1], there are other bugs that *do* manifest from not threading
explicit-bitwidth information all the way through the compilation
pipeline. One can therefore view this as one piece of the larger effort
to do that, improve ergnomics, and squash remaining bugs.

Also, this is needed for !3658. I could just merge this as part of that,
but I'm keen on merging fixes "as they are ready" so the fixes that
aren't ready are isolated and easier to debug.

[1]: https://mail.haskell.org/pipermail/ghc-devs/2020-October/019332.html

- - - - -
a9378e69 by Tim Barnes at 2020-11-26T16:00:34-05:00
Set dynamic users-guide TOC spacing (fixes #18554)

- - - - -
86a59d93 by Ben Gamari at 2020-11-26T16:00:34-05:00
rts: Use RTS_LIKELY in CHECK

Most compilers probably already infer that
`barf` diverges but it nevertheless doesn't
hurt to be explicit.
- - - - -
5757e82b by Matthew Pickering at 2020-11-26T16:00:35-05:00
Remove special case for GHC.ByteCode.Instr

This was added in
https://github.com/nomeata/ghc-heap-view/commit/34935206e51b9c86902481d84d2f368a6fd93423

GHC.ByteCode.Instr.BreakInfo no longer exists so the special case is dead code.

Any check like this can be easily dealt with in client code.

- - - - -
d9c8b5b4 by Matthew Pickering at 2020-11-26T16:00:35-05:00
Split Up getClosureDataFromHeapRep

Motivation

1. Don't enforce the repeated decoding of an info table, when the client
can cache it (ghc-debug)
2. Allow the constructor information decoding to be overridden, this
casues segfaults in ghc-debug

- - - - -
3e3555cc by Andreas Klebinger at 2020-11-26T16:00:35-05:00
RegAlloc: Add missing raPlatformfield to RegAllocStatsSpill

Fixes #18994

Co-Author: Benjamin Maurer <maurer.benjamin at gmail.com>

- - - - -
a1a75aa9 by Ben Gamari at 2020-11-27T06:20:41-05:00
rts: Allocate MBlocks with MAP_TOP_DOWN on Windows

As noted in #18991, we would previously allocate heap in low memory.
Due to this the linker, which typically *needs* low memory, would end up
competing with the heap. In longer builds we end up running out of
low memory entirely, leading to linking failures.

- - - - -
75fc1ed5 by Sylvain Henry at 2020-11-28T15:40:23-05:00
Hadrian: fix detection of ghc-pkg for cross-compilers

- - - - -
7cb5df96 by Sylvain Henry at 2020-11-28T15:40:23-05:00
hadrian: fix ghc-pkg uses (#17601)

Make sure ghc-pkg doesn't read the compiler "settings" file by passing
--no-user-package-db.

- - - - -
e3fd4226 by Ben Gamari at 2020-11-28T15:40:23-05:00
gitlab-ci: Introduce a nightly cross-compilation job

This adds a job to test cross-compilation from x86-64 to AArch64 with
Hadrian.

Fixes #18234

- - - - -
698d3d96 by Ben Gamari at 2020-11-28T15:41:00-05:00
gitlab-ci: Only deploy GitLab Pages in ghc/ghc>

The deployments are quite large and yet are currently only served for
the ghc/ghc> project.

- - - - -
625726f9 by David Eichmann at 2020-11-28T15:41:37-05:00
ghc-heap: partial TSO/STACK decoding

Co-authored-by: Sven Tennie <sven.tennie at gmail.com>
Co-authored-by: Matthew Pickering <matthewtpickering at gmail.com>
Co-authored-by: Ben Gamari <bgamari.foss at gmail.com>

- - - - -
22ea9c29 by Andreas Klebinger at 2020-11-28T15:42:13-05:00
Small optimization to CmmSink.

Inside `regsUsedIn` we can avoid some thunks by specializing the
recursion. In particular we avoid the thunk for `(f e z)` in the
MachOp/Load branches, where we know this will evaluate to z.

Reduces allocations for T3294 by ~1%.

- - - - -
bba42c62 by John Ericson at 2020-11-28T15:42:49-05:00
Make primop handler indentation more consistent

- - - - -
c82bc8e9 by John Ericson at 2020-11-28T15:42:49-05:00
Cleanup some primop constructor names

Harmonize the internal (big sum type) names of the native vs fixed-sized
number primops a bit. (Mainly by renaming the former.)

No user-facing names are changed.

- - - - -
ae14f160 by Ben Gamari at 2020-11-28T15:43:25-05:00
testsuite: Mark T14702 as fragile on Windows

Due to #18953.

- - - - -
1bc104b0 by Ben Gamari at 2020-11-29T15:33:18-05:00
withTimings: Emit allocations counter

This will allow us to back out the allocations per compiler pass from
the eventlog. Note that we dump the allocation counter rather than the
difference since this will allow us to determine how much work is done
*between* `withTiming` blocks.

- - - - -
e992ea84 by GHC GitLab CI at 2020-11-29T15:33:54-05:00
ThreadPaused: Don't zero slop until free vars are pushed

When threadPaused blackholes a thunk it calls `OVERWRITING_CLOSURE` to
zero the slop for the benefit of the sanity checker. Previously this was
done *before* pushing the thunk's free variables to the update
remembered set. Consequently we would pull zero'd pointers to the update
remembered set.

- - - - -
e82cd140 by GHC GitLab CI at 2020-11-29T15:33:54-05:00
nonmoving: Fix regression from TSAN work

The TSAN rework (specifically aad1f803) introduced a subtle regression
in GC.c, swapping `g0` in place of `gen`. Whoops!

Fixes #18997.

- - - - -
35a5207e by GHC GitLab CI at 2020-11-29T15:33:54-05:00
rts/Messages: Add missing write barrier in THROWTO message update

After a THROWTO message has been handle the message closure is
overwritten by a NULL message. We must ensure that the original
closure's pointers continue to be visible to the nonmoving GC.

- - - - -
0120829f by GHC GitLab CI at 2020-11-29T15:33:54-05:00
nonmoving: Add missing write barrier in shrinkSmallByteArray

- - - - -
8a4d8fb6 by GHC GitLab CI at 2020-11-29T15:33:54-05:00
Updates: Don't zero slop until closure has been pushed

Ensure that the the free variables have been pushed to the update
remembered set before we zero the slop.

- - - - -
2793cfdc by GHC GitLab CI at 2020-11-29T15:33:54-05:00
OSThreads: Fix error code checking

pthread_join returns its error code and apparently doesn't set errno.

- - - - -
e391a16f by GHC GitLab CI at 2020-11-29T15:33:54-05:00
nonmoving: Don't join to mark_thread on shutdown

The mark thread is not joinable as we detach from it on creation.

- - - - -
60d088ab by Ben Gamari at 2020-11-29T15:33:54-05:00
nonmoving: Add reference to Ueno 2016

- - - - -
3aa60362 by GHC GitLab CI at 2020-11-29T15:33:54-05:00
nonmoving: Ensure that evacuated large objects are marked

See Note [Non-moving GC: Marking evacuated objects].

- - - - -
8d304a99 by Ben Gamari at 2020-11-30T10:15:22-05:00
rts/m32: Refactor handling of allocator seeding

Previously, in an attempt to reduce fragmentation, each new allocator
would map a region of M32_MAX_PAGES fresh pages to seed itself. However,
this ends up being extremely wasteful since it turns out that we often
use fewer than this.  Consequently, these pages end up getting freed
which, ends up fragmenting our address space more than than we would
have if we had naively allocated pages on-demand.

Here we refactor m32 to avoid this waste while achieving the
fragmentation mitigation previously desired. In particular, we move all
page allocation into the global m32_alloc_page, which will pull a page
from the free page pool. If the free page pool is empty we then refill
it by allocating a region of M32_MAP_PAGES and adding them to the pool.

Furthermore, we do away with the initial seeding entirely. That is, the
allocator starts with no active pages: pages are rather allocated on an
as-needed basis.

On the whole this ends up being a pleasingly simple change,
simultaneously making m32 more efficient, more robust, and simpler.

Fixes #18980.

- - - - -
b6629289 by Ben Gamari at 2020-11-30T10:15:58-05:00
rts: Use CHECK instead of assert

Use the GHC wrappers instead of <assert.h>.

- - - - -
9f4efa6a by Ben Gamari at 2020-11-30T10:15:58-05:00
rts/linker: Replace some ASSERTs with CHECK

In the past some people have confused ASSERT, which is for checking
internal invariants, which CHECK, which should be used when checking
things that might fail due to bad input (and therefore should be enabled
even in the release compiler). Change some of these cases in the linker
to use CHECK.

- - - - -
0f8a4655 by Ryan Scott at 2020-11-30T10:16:34-05:00
Allow deploy:pages job to fail

See #18973.

- - - - -
49ebe369 by chessai at 2020-11-30T19:47:40-05:00
Optimisations in Data.Foldable (T17867)

This PR concerns the following functions from `Data.Foldable`:
* minimum
* maximum
* sum
* product
* minimumBy
* maximumBy

- Default implementations of these functions now use `foldl'` or `foldMap'`.
- All have been marked with INLINEABLE to make room for further optimisations.

- - - - -
4d79ef65 by chessai at 2020-11-30T19:47:40-05:00
Apply suggestion to libraries/base/Data/Foldable.hs
- - - - -
6af074ce by chessai at 2020-11-30T19:47:40-05:00
Apply suggestion to libraries/base/Data/Foldable.hs
- - - - -
ab334262 by Viktor Dukhovni at 2020-11-30T19:48:17-05:00
dirty MVAR after mutating TSO queue head

While the original head and tail of the TSO queue may be in the same
generation as the MVAR, interior elements of the queue could be younger
after a GC run and may then be exposed by putMVar operation that updates
the queue head.

Resolves #18919

- - - - -
5eb163f3 by Ben Gamari at 2020-11-30T19:48:53-05:00
rts/linker: Don't allow shared libraries to be loaded multiple times

- - - - -
490aa14d by Ben Gamari at 2020-11-30T19:48:53-05:00
rts/linker: Initialise CCSs from native shared objects

- - - - -
6ac3db5f by Ben Gamari at 2020-11-30T19:48:53-05:00
rts/linker: Move shared library loading logic into Elf.c

- - - - -
b6698d73 by GHC GitLab CI at 2020-11-30T19:48:53-05:00
rts/linker: Don't declare dynamic objects with image_mapped

This previously resulted in warnings due to spurious unmap failures.

- - - - -
b94a65af by jneira at 2020-11-30T19:49:31-05:00
Include tried paths in findToolDir error

- - - - -
72a87fbc by Richard Eisenberg at 2020-12-01T19:57:41-05:00
Move core flattening algorithm to Core.Unify

This sets the stage for a later change, where this
algorithm will be needed from GHC.Core.InstEnv.

This commit also splits GHC.Core.Map into
GHC.Core.Map.Type and GHC.Core.Map.Expr,
in order to avoid module import cycles
with GHC.Core.

- - - - -
0dd45d0a by Richard Eisenberg at 2020-12-01T19:57:41-05:00
Bump the # of commits searched for perf baseline

The previous value of 75 meant that a feature branch with
more than 75 commits would get spurious CI passes.

This affects #18692, but does not fix that ticket, because
if a baseline cannot be found, we should fail, not succeed.

- - - - -
8bb52d91 by Richard Eisenberg at 2020-12-01T19:57:41-05:00
Remove flattening variables

This patch redesigns the flattener to simplify type family applications
directly instead of using flattening meta-variables and skolems. The key new
innovation is the CanEqLHS type and the new CEqCan constraint (Ct). A CanEqLHS
is either a type variable or exactly-saturated type family application; either
can now be rewritten using a CEqCan constraint in the inert set.

Because the flattener no longer reduces all type family applications to
variables, there was some performance degradation if a lengthy type family
application is now flattened over and over (not making progress). To
compensate, this patch contains some extra optimizations in the flattener,
leading to a number of performance improvements.

Close #18875.
Close #18910.

There are many extra parts of the compiler that had to be affected in writing
this patch:

* The family-application cache (formerly the flat-cache) sometimes stores
  coercions built from Given inerts. When these inerts get kicked out, we must
  kick out from the cache as well. (This was, I believe, true previously, but
  somehow never caused trouble.) Kicking out from the cache requires adding a
  filterTM function to TrieMap.

* This patch obviates the need to distinguish "blocking" coercion holes from
  non-blocking ones (which, previously, arose from CFunEqCans). There is thus
  some simplification around coercion holes.

* Extra commentary throughout parts of the code I read through, to preserve
  the knowledge I gained while working.

* A change in the pure unifier around unifying skolems with other types.
  Unifying a skolem now leads to SurelyApart, not MaybeApart, as documented
  in Note [Binding when looking up instances] in GHC.Core.InstEnv.

* Some more use of MCoercion where appropriate.

* Previously, class-instance lookup automatically noticed that e.g. C Int was
  a "unifier" to a target [W] C (F Bool), because the F Bool was flattened to
  a variable. Now, a little more care must be taken around checking for
  unifying instances.

* Previously, tcSplitTyConApp_maybe would split (Eq a => a). This is silly,
  because (=>) is not a tycon in Haskell. Fixed now, but there are some
  knock-on changes in e.g. TrieMap code and in the canonicaliser.

* New function anyFreeVarsOf{Type,Co} to check whether a free variable
  satisfies a certain predicate.

* Type synonyms now remember whether or not they are "forgetful"; a forgetful
  synonym drops at least one argument. This is useful when flattening; see
  flattenView.

* The pattern-match completeness checker invokes the solver. This invocation
  might need to look through newtypes when checking representational equality.
  Thus, the desugarer needs to keep track of the in-scope variables to know
  what newtype constructors are in scope. I bet this bug was around before but
  never noticed.

* Extra-constraints wildcards are no longer simplified before printing.
  See Note [Do not simplify ConstraintHoles] in GHC.Tc.Solver.

* Whether or not there are Given equalities has become slightly subtler.
  See the new HasGivenEqs datatype.

* Note [Type variable cycles in Givens] in GHC.Tc.Solver.Canonical
  explains a significant new wrinkle in the new approach.

* See Note [What might match later?] in GHC.Tc.Solver.Interact, which
  explains the fix to #18910.

* The inert_count field of InertCans wasn't actually used, so I removed
  it.

Though I (Richard) did the implementation, Simon PJ was very involved
in design and review.

This updates the Haddock submodule to avoid #18932 by adding
a type signature.

-------------------------
Metric Decrease:
    T12227
    T5030
    T9872a
    T9872b
    T9872c
Metric Increase:
    T9872d
-------------------------

- - - - -
d66660ba by Richard Eisenberg at 2020-12-01T19:57:41-05:00
Rename the flattener to become the rewriter.

Now that flattening doesn't produce flattening variables,
it's not really flattening anything: it's rewriting. This
change also means that the rewriter can no longer be confused
the core flattener (in GHC.Core.Unify), which is sometimes used
during type-checking.

- - - - -
add0aeae by Ben Gamari at 2020-12-01T19:58:17-05:00
rts: Introduce mmapAnonForLinker

Previously most of the uses of mmapForLinker were mapping anonymous
memory, resulting in a great deal of unnecessary repetition. Factor this
out into a new helper.

Also fixes a few places where error checking was missing or suboptimal.

- - - - -
97d71646 by Ben Gamari at 2020-12-01T19:58:17-05:00
rts/linker: Introduce munmapForLinker

Consolidates munmap calls to ensure consistent error handling.

- - - - -
d8872af0 by Ben Gamari at 2020-12-01T19:58:18-05:00
rts/Linker: Introduce Windows implementations for mmapForLinker, et al.

- - - - -
c35d0e03 by Ben Gamari at 2020-12-01T19:58:18-05:00
rts/m32: Introduce NEEDS_M32 macro

Instead of relying on RTS_LINKER_USE_MMAP

- - - - -
41c64eb5 by Ben Gamari at 2020-12-01T19:58:18-05:00
rts/linker: Use m32 to allocate symbol extras in PEi386

- - - - -
e0b08c5f by Ben Gamari at 2020-12-03T13:01:47-05:00
gitlab-ci: Fix copy-paste error

Also be more consistent in quoting.

- - - - -
33ec3a06 by Ben Gamari at 2020-12-03T23:11:31-05:00
gitlab-ci: Run linters through ci.sh

Ensuring that the right toolchain is used.

- - - - -
4a437bc1 by Shayne Fletcher at 2020-12-05T09:06:38-05:00
Fix bad span calculations of post qualified imports

- - - - -
8fac4b93 by Ben Gamari at 2020-12-05T09:07:13-05:00
testsuite: Add a test for #18923

- - - - -
62ed6957 by Simon Peyton Jones at 2020-12-08T15:31:41-05:00
Fix kind inference for data types. Again.

This patch fixes several aspects of kind inference for data type
declarations, especially data /instance/ declarations

Specifically

1. In kcConDecls/kcConDecl make it clear that the tc_res_kind argument
   is only used in the H98 case; and in that case there is no result
   kind signature; and hence no need for the disgusting splitPiTys in
   kcConDecls (now thankfully gone).

   The GADT case is a bit different to before, and much nicer.
   This is what fixes #18891.

   See Note [kcConDecls: kind-checking data type decls]

2. Do not look at the constructor decls of a data/newtype instance
   in tcDataFamInstanceHeader. See GHC.Tc.TyCl.Instance
   Note [Kind inference for data family instances].  This was a
   new realisation that arose when doing (1)

   This causes a few knock-on effects in the tests suite, because
   we require more information than before in the instance /header/.

   New user-manual material about this in "Kind inference in data type
   declarations" and "Kind inference for data/newtype instance
   declarations".

3. Minor improvement in kcTyClDecl, combining GADT and H98 cases

4. Fix #14111 and #8707 by allowing the header of a data instance
   to affect kind inferece for the the data constructor signatures;
   as described at length in Note [GADT return types] in GHC.Tc.TyCl

   This led to a modest refactoring of the arguments (and argument
   order) of tcConDecl/tcConDecls.

5. Fix #19000 by inverting the sense of the test in new_locs
   in GHC.Tc.Solver.Canonical.canDecomposableTyConAppOK.

- - - - -
0abe3ddf by Adam Sandberg Ericsson at 2020-12-08T15:32:19-05:00
hadrian: build the _l and _thr_l rts flavours in the develN flavours

The ghc binary requires the eventlog rts since
fc644b1a643128041cfec25db84e417851e28bab

- - - - -
51e3bb6d by Andreas Klebinger at 2020-12-08T22:43:21-05:00
CodeGen: Make folds User/DefinerOfRegs INLINEABLE.

Reduces allocation for the test case I was looking at by about 1.2%.
Mostly from avoiding allocation of some folding functions which turn
into let-no-escape bindings which just reuse their environment instead.

We also force inlining in a few key places in CmmSink which helps a bit
more.

- - - - -
69ae10c3 by Andreas Klebinger at 2020-12-08T22:43:21-05:00
CmmSink: Force inlining of foldRegsDefd

Helps avoid allocating the folding function. Improves
perf for T3294 by about 1%.

- - - - -
6e3da800 by Andreas Klebinger at 2020-12-08T22:43:21-05:00
Cmm: Make a few types and utility function slightly stricter.

About 0.6% reduction in allocations for the code I was looking at.

Not a huge difference but no need to throw away performance.

- - - - -
aef44d7f by Andreas Klebinger at 2020-12-08T22:43:21-05:00
Cmm.Sink: Optimize retaining of assignments, live sets.

Sinking requires us to track live local regs after each
cmm statement. We used to do this via "Set LocalReg".

However we can replace this with a solution based on IntSet
which is overall more efficient without losing much. The thing
we lose is width of the variables, which isn't used by the sinking
pass anyway.

I also reworked how we keep assignments to regs mentioned in
skipped assignments. I put the details into
Note [Keeping assignemnts mentioned in skipped RHSs].

The gist of it is instead of keeping track of it via the use count
which is a `IntMap Int` we now use the live regs set (IntSet) which
is quite a bit faster.

I think it also matches the semantics a lot better. The skipped
(not discarded) assignment does in fact keep the regs on it's rhs
alive so keeping track of this in the live set seems like the clearer
solution as well.

Improves allocations for T3294 by yet another 1%.

- - - - -
59f2249b by Andreas Klebinger at 2020-12-08T22:43:21-05:00
GHC.Cmm.Opt: Be stricter in results.

Optimization either returns Nothing if nothing is to be done or
`Just <cmmExpr>` otherwise. There is no point in being lazy in
`cmmExpr`. We usually inspect this element so the thunk gets forced
not long after.

We might eliminate it as dead code once in a blue moon but that's
not a case worth optimizing for.

Overall the impact of this is rather low. As Cmm.Opt doesn't allocate
much (compared to the rest of GHC) to begin with.

- - - - -
54b88eac by Andreas Klebinger at 2020-12-08T22:43:57-05:00
Bump time submodule.

This should fix #19002.

- - - - -
35e7b0c6 by Kirill Elagin at 2020-12-10T01:45:54-05:00
doc: Clarify the default for -fomit-yields

“Yield points enabled” is confusing (and probably wrong?
I am not 100% sure what it means). Change it to a simple “on”.

Undo this change from 2c23fff2e03e77187dc4d01f325f5f43a0e7cad2.
- - - - -
3551c554 by Kirill Elagin at 2020-12-10T01:45:54-05:00
doc: Extra-clarify -fomit-yields

Be more clear on what this optimisation being on by default means
in terms of yields.
- - - - -
6484f0d7 by Sergei Trofimovich at 2020-12-10T01:46:33-05:00
rts/linker/Elf.c: add missing <dlfcn.h> include (musl support)

The change fixes build failure on musl:

```
rts/linker/Elf.c:2031:3: error:
     warning: implicit declaration of function 'dlclose'; did you mean 'close'? [-Wimplicit-function-declaration]
     2031 |   dlclose(nc->dlopen_handle);
          |   ^~~~~~~
          |   close
```

Signed-off-by: Sergei Trofimovich <slyfox at gentoo.org>

- - - - -
ab24ed9b by Ben Gamari at 2020-12-11T03:55:51-05:00
users guide: Fix syntax errors

Fixes errors introduced by 3a55b3a2574f913d046f3a6f82db48d7f6df32e3.

- - - - -
d3a24d31 by Ben Gamari at 2020-12-11T03:55:51-05:00
users guide: Describe GC lifecycle events

Every time I am asked about how to interpret these events I need to
figure it out from scratch. It's well past time that the users guide
properly documents these.

- - - - -
741309b9 by Ben Gamari at 2020-12-11T03:56:27-05:00
gitlab-ci: Fix incorrect Docker image for nightly cross job

Also refactor the job definition to eliminate the bug by construction.

- - - - -
19703bc8 by Ben Gamari at 2020-12-11T03:56:27-05:00
gitlab-ci: Fix name of flavour in ThreadSanitizer job

It looks like I neglected to update this after introduce flavour
transformers.

- - - - -
381eb660 by Sylvain Henry at 2020-12-11T12:57:35-05:00
Display FFI labels (fix #18539)

- - - - -
4548d1f8 by Aaron Allen at 2020-12-11T12:58:14-05:00
Elide extraneous messages for :doc command (#15784)

Do not print `<has no documentation>` alongside a valid doc.
Additionally, if two matching symbols lack documentation then the
message will only be printed once. Hence, `<has no documentation>` will
be printed at most once and only if all matching symbols are lacking
docs.

- - - - -
5eba91b6 by Aaron Allen at 2020-12-11T12:58:14-05:00
Add :doc test case for duplicate record fields

Tests that the output of the `:doc` command is correct for duplicate
record fields defined using -XDuplicateRecordFields.

- - - - -
5feb9b2d by Ryan Scott at 2020-12-11T22:39:29-05:00
Delete outdated Note [Kind-checking tyvar binders for associated types]

This Note has severely bitrotted, as it has no references anywhere in the
codebase, and none of the functions that it mentions exist anymore. Let's just
delete this. While I was in town, I deleted some outdated comments from
`checkFamPatBinders` of a similar caliber.

Fixes #19008.

[ci skip]

- - - - -
f9f9f030 by Sylvain Henry at 2020-12-11T22:40:08-05:00
Arrows: correctly query arrow methods (#17423)

Consider the following code:

    proc (C x y) -> ...

Before this patch, the evidence binding for the Arrow dictionary was
attached to the C pattern:

    proc (C x y) { $dArrow = ... } -> ...

But then when we desugar this, we use arrow operations ("arr", ">>>"...)
specialised for this arrow:

    let
        arr_xy = arr $dArrow -- <-- Not in scope!
        ...
    in
        arr_xy (\(C x y) { $dArrow = ... } -> ...)

This patch allows arrow operations to be type-checked before the proc
itself, avoiding this issue.

Fix #17423

- - - - -
aaa8f00f by Sylvain Henry at 2020-12-11T22:40:48-05:00
Validate script: fix configure command when using stack

- - - - -
b4a929a1 by Sylvain Henry at 2020-12-11T22:41:30-05:00
Hadrian: fix libffi tarball parsing

Fix parsing of "libffi-3.3.tar.gz".

NB: switch to a newer libffi isn't done in this patch

- - - - -
690c8946 by Sylvain Henry at 2020-12-11T22:42:09-05:00
Parser: move parser utils into their own module

Move code unrelated to runtime evaluation out of GHC.Runtime.Eval

- - - - -
76be0e32 by Sylvain Henry at 2020-12-11T22:42:48-05:00
Move SizedSeq into ghc-boot

- - - - -
3a16d764 by Sylvain Henry at 2020-12-11T22:42:48-05:00
ghci: don't compile unneeded modules

- - - - -
2895fa60 by Sylvain Henry at 2020-12-11T22:42:48-05:00
ghci: reuse Arch from ghc-boot

- - - - -
480a38d4 by Sylvain Henry at 2020-12-11T22:43:30-05:00
rts: don't use siginterrupt (#19019)

- - - - -
4af6126d by Sylvain Henry at 2020-12-11T22:44:11-05:00
Use static array in zeroCount

- - - - -
5bd71bfd by Sebastian Graf at 2020-12-12T04:45:09-05:00
DmdAnal: Annotate top-level function bindings with demands (#18894)

It's useful to annotate a non-exported top-level function like `g` in

```hs
module Lib (h) where

g :: Int -> Int -> (Int,Int)
g m 1 = (m, 0)
g m n = (2 * m, 2 `div` n)
{-# NOINLINE g #-}

h :: Int -> Int
h 1 = 0
h m
  | odd m     = snd (g m 2)
  | otherwise = uncurry (+) (g 2 m)
```

with its demand `UCU(CS(P(1P(U),SP(U))`, which tells us that whenever `g` was
called, the second component of the returned pair was evaluated strictly.

Since #18903 we do so for local functions, where we can see all calls.
For top-level functions, we can assume that all *exported* functions are
demanded according to `topDmd` and thus get sound demands for
non-exported top-level functions.

The demand on `g` is crucial information for Nested CPR, which may the
go on and unbox `g` for the second pair component. That is true even if
that pair component may diverge, as is the case for the call site `g 13
0`, which throws a div-by-zero exception.

In `T18894b`, you can even see the new demand annotation enabling us to
eta-expand a function that we wouldn't be able to eta-expand without
Call Arity.

We only track bindings of function type in order not to risk huge compile-time
regressions, see `isInterestingTopLevelFn`.

There was a CoreLint check that rejected strict demand annotations on
recursive or top-level bindings, which seems completely unjustified.
All the cases I investigated were fine, so I removed it.

Fixes #18894.

- - - - -
3aae036e by Sebastian Graf at 2020-12-12T04:45:09-05:00
Demand: Simplify `CU(U)` to `U` (#19005)

Both sub-demands encode the same information.
This is a trivial change and already affects a few regression tests
(e.g. `T5075`), so no separate regression test is necessary.

- - - - -
c6477639 by Adam Sandberg Ericsson at 2020-12-12T04:45:48-05:00
hadrian: correctly copy the docs dir into the bindist #18669

- - - - -
e033dd05 by Adam Sandberg Ericsson at 2020-12-12T10:52:19+00:00
mkDocs: support hadrian bindists #18973

- - - - -
78580ba3 by John Ericson at 2020-12-13T07:14:50-05:00
Remove old .travis.yml

- - - - -
c696bb2f by Cale Gibbard at 2020-12-14T13:37:09-05:00
Implement type applications in patterns

The haddock submodule is also updated so that it understands the changes
to patterns.

- - - - -
7e9debd4 by Ben Gamari at 2020-12-14T13:37:09-05:00
Optimise nullary type constructor usage

During the compilation of programs GHC very frequently deals with
the `Type` type, which is a synonym of `TYPE 'LiftedRep`. This patch
teaches GHC to avoid expanding the `Type` synonym (and other nullary
type synonyms) during type comparisons, saving a good amount of work.
This optimisation is described in `Note [Comparing nullary type
synonyms]`.

To maximize the impact of this optimisation, we introduce a few
special-cases to reduce `TYPE 'LiftedRep` to `Type`. See
`Note [Prefer Type over TYPE 'LiftedPtrRep]`.

Closes #17958.

Metric Decrease:
   T18698b
   T1969
   T12227
   T12545
   T12707
   T14683
   T3064
   T5631
   T5642
   T9020
   T9630
   T9872a
   T13035
   haddock.Cabal
   haddock.base

- - - - -
92377c27 by Ben Gamari at 2020-12-14T13:41:58-05:00
Revert "Optimise nullary type constructor usage"

This was inadvertently merged.

This reverts commit 7e9debd4ceb068effe8ac81892d2cabcb8f55850.

- - - - -
d0e8c10d by Sylvain Henry at 2020-12-14T19:45:13+01:00
Move Unit related fields from DynFlags to HscEnv

The unit database cache, the home unit and the unit state were stored in
DynFlags while they ought to be stored in the compiler session state
(HscEnv). This patch fixes this.

It introduces a new UnitEnv type that should be used in the future to
handle separate unit environments (especially host vs target units).

Related to #17957

Bump haddock submodule

- - - - -
af855ac1 by Andreas Klebinger at 2020-12-14T15:22:13-05:00
Optimize dumping of consecutive whitespace.

The naive way of putting out n characters of indent would be something
like `hPutStr hdl (replicate n ' ')`. However this is quite inefficient
as we allocate an absurd number of strings consisting of simply spaces
as we don't cache them.

To improve on this we now track if we can simply write ascii spaces via
hPutBuf instead. This is the case when running with -ddump-to-file where
we force the encoding to be UTF8.

This avoids both the cost of going through encoding as well as avoiding
allocation churn from all the white space. Instead we simply use hPutBuf
on a preallocated unlifted string.

When dumping stg like this:

> nofib/spectral/simple/Main.hs -fforce-recomp -ddump-stg-final -ddump-to-file -c +RTS -s

Allocations went from 1,778 MB to 1,702MB. About a 4% reduction of
allocation! I did not measure the difference in runtime but expect it
to be similar.

Bumps the haddock submodule since the interface of GHC's Pretty
slightly changed.

-------------------------
Metric Decrease:
    T12227
-------------------------

- - - - -
dad87210 by Ben Gamari at 2020-12-14T15:22:29-05:00
Optimise nullary type constructor usage

During the compilation of programs GHC very frequently deals with
the `Type` type, which is a synonym of `TYPE 'LiftedRep`. This patch
teaches GHC to avoid expanding the `Type` synonym (and other nullary
type synonyms) during type comparisons, saving a good amount of work.
This optimisation is described in `Note [Comparing nullary type
synonyms]`.

To maximize the impact of this optimisation, we introduce a few
special-cases to reduce `TYPE 'LiftedRep` to `Type`. See
`Note [Prefer Type over TYPE 'LiftedPtrRep]`.

Closes #17958.

Metric Decrease:
   T18698b
   T1969
   T12227
   T12545
   T12707
   T14683
   T3064
   T5631
   T5642
   T9020
   T9630
   T9872a
   T13035
   haddock.Cabal
   haddock.base

- - - - -
6c2eb223 by Andrew Martin at 2020-12-14T18:48:51-05:00
Implement BoxedRep proposal

This implements the BoxedRep proposal, refacoring the `RuntimeRep`
hierarchy from:

```haskell
data RuntimeRep = LiftedPtrRep | UnliftedPtrRep | ...
```

to

```haskell
data RuntimeRep = BoxedRep Levity | ...
data Levity = Lifted | Unlifted
```

Closes #17526.

- - - - -


30 changed files:

- .gitlab-ci.yml
- .gitlab/ci.sh
- − .travis.yml
- aclocal.m4
- compiler/GHC.hs
- compiler/GHC/Builtin/Names.hs
- compiler/GHC/Builtin/Types.hs
- compiler/GHC/Builtin/Types.hs-boot
- compiler/GHC/Builtin/Types/Prim.hs
- + compiler/GHC/Builtin/Types/Prim.hs-boot
- compiler/GHC/Builtin/primops.txt.pp
- compiler/GHC/ByteCode/Asm.hs
- compiler/GHC/ByteCode/Linker.hs
- compiler/GHC/ByteCode/Types.hs
- compiler/GHC/Cmm.hs
- compiler/GHC/Cmm/CLabel.hs
- compiler/GHC/Cmm/CommonBlockElim.hs
- compiler/GHC/Cmm/Dataflow/Label.hs
- compiler/GHC/Cmm/Expr.hs
- compiler/GHC/Cmm/Info/Build.hs
- + compiler/GHC/Cmm/LRegSet.hs
- compiler/GHC/Cmm/Lexer.x
- compiler/GHC/Cmm/Liveness.hs
- compiler/GHC/Cmm/Node.hs
- compiler/GHC/Cmm/Opt.hs
- compiler/GHC/Cmm/Parser.y
- compiler/GHC/Cmm/Parser/Monad.hs
- compiler/GHC/Cmm/Sink.hs
- compiler/GHC/Cmm/Utils.hs
- compiler/GHC/CmmToAsm.hs


The diff was not included because it is too large.


View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/185a01b0a2522b8197710e339b21179267d4245a...6c2eb2232b39ff4720fda0a4a009fb6afbc9dcea

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/185a01b0a2522b8197710e339b21179267d4245a...6c2eb2232b39ff4720fda0a4a009fb6afbc9dcea
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20201214/aaff8449/attachment-0001.html>


More information about the ghc-commits mailing list