[Git][ghc/ghc][wip/par-simpl] 119 commits: Fire RULES in the Specialiser

Matthew Pickering (@mpickering) gitlab at gitlab.haskell.org
Tue Dec 6 12:05:19 UTC 2022



Matthew Pickering pushed to branch wip/par-simpl at Glasgow Haskell Compiler / GHC


Commits:
f9f17b68 by Simon Peyton Jones at 2022-11-10T12:20:03+00:00
Fire RULES in the Specialiser

The Specialiser has, for some time, fired class-op RULES in the
specialiser itself: see
   Note [Specialisation modulo dictionary selectors]

This MR beefs it up a bit, so that it fires /all/ RULES in the
specialiser, not just class-op rules.  See
   Note [Fire rules in the specialiser]
The result is a bit more specialisation; see test
   simplCore/should_compile/T21851_2

This pushed me into a bit of refactoring.  I made a new data type,
GHC.Core.Rules.RuleEnv, which combines
  - the several sources of rules (local, home-package, external)
  - the orphan-module dependencies

in a single record for `getRules` to consult.  That drove a bunch of
follow-on refactoring, including allowing me to remove
cr_visible_orphan_mods from the CoreReader data type.

I moved some of the RuleBase/RuleEnv stuff into GHC.Core.Rule.

The reorganisation in the Simplifier improves compile times a bit
(geom mean -0.1%), but T9961 is an outlier

Metric Decrease:
    T9961

- - - - -
2b3d0bee by Simon Peyton Jones at 2022-11-10T12:21:13+00:00
Make indexError work better

The problem here is described at some length in
Note [Boxity for bottoming functions] and
Note [Reboxed crud for bottoming calls] in GHC.Core.Opt.DmdAnal.

This patch adds a SPECIALISE pragma for indexError, which
makes it much less vulnerable to the problem described in
these Notes.

(This came up in another line of work, where a small change made
indexError do reboxing (in nofib/spectral/simple/table_sort)
that didn't happen before my change.  I've opened #22404
to document the fragility.)
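
As a rough illustration of what a SPECIALISE pragma on an overloaded,
always-failing function looks like, here is a minimal sketch. The names
`rangeError` and `offsetInRange` are hypothetical stand-ins and this is
not the definition of indexError from base; it only shows the shape of
the pragma.

```haskell
module IndexErrorSketch where

-- Hypothetical stand-in for an overloaded error reporter that never
-- returns.  This is NOT the real indexError from base.
rangeError :: Show a => (a, a) -> a -> String -> b
rangeError bounds i what =
  errorWithoutStackTrace
    (what ++ " index " ++ show i ++ " out of range " ++ show bounds)
-- Ask GHC for a copy specialised to Int, so that Int call sites can use
-- it instead of the overloaded version.
{-# SPECIALISE rangeError :: (Int, Int) -> Int -> String -> b #-}

-- Hypothetical caller: the failing branch calls rangeError at type Int.
offsetInRange :: (Int, Int) -> Int -> Int
offsetInRange bnds@(lo, hi) i
  | lo <= i && i <= hi = i - lo
  | otherwise          = rangeError bnds i "Int"
```

How the specialised copy interacts with boxity analysis is covered by
the Notes cited above, not by this sketch.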

- - - - -
399e921b by Simon Peyton Jones at 2022-11-10T12:21:14+00:00
Fix DsUselessSpecialiseForClassMethodSelector msg

The error message for DsUselessSpecialiseForClassMethodSelector
was just wrong (a typo in some earlier work); trivial fix

- - - - -
dac0682a by Sebastian Graf at 2022-11-10T21:16:01-05:00
WorkWrap: Unboxing unboxed tuples is not always useful (#22388)

See Note [Unboxing through unboxed tuples].

Fixes #22388.

- - - - -
1230c268 by Sebastian Graf at 2022-11-10T21:16:01-05:00
Boxity: Handle argument budget of unboxed tuples correctly (#21737)

Now Budget roughly tracks the combined width of all arguments after unarisation.
See the changes to `Note [Worker argument budgets]`.

Fixes #21737.

- - - - -
2829fd92 by Cheng Shao at 2022-11-11T00:26:54-05:00
autoconf: check getpid getuid raise

This patch adds checks for getpid, getuid and raise in autoconf. These
functions are absent in wasm32-wasi and thus need to be checked.

- - - - -
f5dfd1b4 by Cheng Shao at 2022-11-11T00:26:55-05:00
hadrian: add -Wwarn only for cross-compiling unix

- - - - -
2e6ab453 by Cheng Shao at 2022-11-11T00:26:55-05:00
hadrian: add targetSupportsThreadedRts flag

This patch adds a targetSupportsThreadedRts flag to indicate whether
the target supports the threaded rts at all, different from existing
targetSupportsSMP that checks whether -N is supported by the RTS. All
existing flavours have also been updated accordingly to respect this
flag.

Some targets (e.g. wasm32-wasi) do not support the threaded rts,
therefore this flag is needed for the default flavours to work. It
makes more sense to have proper autoconf logic to check for threading
support, but for the time being, we just set the flag to False iff the
target is wasm32.

- - - - -
8104f6f5 by Cheng Shao at 2022-11-11T00:26:55-05:00
Fix Cmm symbol kind

- - - - -
b2035823 by Norman Ramsey at 2022-11-11T00:26:55-05:00
add the two key graph modules from Martin Erwig's FGL

Martin Erwig's FGL (Functional Graph Library) provides an "inductive"
representation of graphs.  A general graph has labeled nodes and
labeled edges.  The key operation on a graph is to decompose it by
removing one node, together with the edges that connect the node to
the rest of the graph.  There is also an inverse composition
operation.

The decomposition and composition operations make this representation
of graphs exceptionally well suited to implement graph algorithms in
which the graph is continually changing, as alluded to in #21259.

This commit adds `GHC.Data.Graph.Inductive.Graph`, which defines the
interface, and `GHC.Data.Graph.Inductive.PatriciaTree`, which provides
an implementation.  Both modules are taken from `fgl-5.7.0.3` on
Hackage, with these changes:

  - Copyright and license text have been copied into the files
    themselves, not stored separately.

  - Some calls to `error` have been replaced with calls to `panic`.

  - Conditional-compilation support for older versions of GHC,
    `containers`, and `base` has been removed.

- - - - -
3633a5f5 by Norman Ramsey at 2022-11-11T00:26:55-05:00
add new modules for reducibility and WebAssembly translation

- - - - -
df7bfef8 by Cheng Shao at 2022-11-11T00:26:55-05:00
Add support for the wasm32-wasi target tuple

This patch adds the wasm32-wasi tuple support to various places in the
tree: autoconf, hadrian, ghc-boot and also the compiler. The codegen
logic will come in subsequent commits.

- - - - -
32ae62e6 by Cheng Shao at 2022-11-11T00:26:55-05:00
deriveConstants: parse .ll output for wasm32 due to broken nm

This patch makes deriveConstants emit and parse an .ll file when
targeting wasm. It's a necessary workaround for broken llvm-nm on
wasm, which isn't capable of reporting correct constant values when
parsing an object.

- - - - -
07e92c92 by Cheng Shao at 2022-11-11T00:26:55-05:00
rts: workaround cmm's improper variadic ccall breaking wasm32 typechecking

Unlike other targets, wasm requires the function signature of the call
site and callee to strictly match. So in Cmm, when we call a C
function that actually returns a value, we need to add an _unused
local variable to receive it, otherwise a type error awaits.

An even bigger problem is calling variadic functions like barf() and
such. Cmm doesn't support CAPI calling convention yet, so calls to
variadic functions just happen to work in some cases with some
target's ABI. But again, it doesn't work with wasm. Fortunately, the
wasm C ABI lowers varargs to a stack pointer argument, and it can be
passed NULL when no other arguments are expected to be passed. So we
also add the additional unused NULL arguments to those functions, so
as to fix wasm, while not affecting behavior on other targets.

- - - - -
00124d12 by Cheng Shao at 2022-11-11T00:26:55-05:00
testsuite: correct sleep() signature in T5611

In libc, sleep() returns an integer. The ccall type signature should
match the libc definition, otherwise it causes a linker error on wasm.
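
A minimal sketch of a foreign import whose Haskell type matches the
POSIX prototype `unsigned int sleep(unsigned int)`. The module and the
names `c_sleep` and `napZero` are illustrative and are not taken from
the T5611 test itself; the sketch assumes a POSIX libc.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module SleepSignatureSketch where

import Foreign.C.Types (CUInt (..))

-- POSIX declares: unsigned int sleep(unsigned int seconds);
-- The result type is CUInt rather than (), so the declared signature
-- agrees with the callee's.
foreign import ccall safe "unistd.h sleep"
  c_sleep :: CUInt -> IO CUInt

-- Hypothetical helper: sleep for zero seconds and return the number of
-- seconds left unslept.
napZero :: IO CUInt
napZero = c_sleep 0
```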

- - - - -
d72466a9 by Cheng Shao at 2022-11-11T00:26:55-05:00
rts: prefer ffi_type_void over FFI_TYPE_VOID

This patch uses ffi_type_void instead of FFI_TYPE_VOID in the
interpreter code, since the FFI_TYPE_* macros are not available in
libffi-wasm32 yet. The libffi public documentation also only mentions
the lower-case ffi_type_* symbols, so we should prefer the lower-case
API here.

- - - - -
4d36a1d3 by Cheng Shao at 2022-11-11T00:26:55-05:00
rts: don't define RTS_USER_SIGNALS when signal.h is not present

In the rts, we have a RTS_USER_SIGNALS macro, and most signal-related
logic is guarded with RTS_USER_SIGNALS. This patch extends the range
of code guarded with RTS_USER_SIGNALS, and defines RTS_USER_SIGNALS iff
signal.h is actually detected by autoconf. This is required for
wasm32-wasi to work, which lacks signals.

- - - - -
3f1e164f by Cheng Shao at 2022-11-11T00:26:55-05:00
rts: use HAVE_GETPID to guard subprocess related logic

We've previously added detection of getpid() in autoconf. This patch
uses HAVE_GETPID to guard some subprocess related logic in the RTS.
This is required for certain targets like wasm32-wasi, where there
isn't a process model at all.

- - - - -
50bf5e77 by Cheng Shao at 2022-11-11T00:26:55-05:00
rts: IPE.c: don't do mutex stuff when THREADED_RTS is not defined

This patch adds the missing THREADED_RTS CPP guard to mutex logic in
IPE.c.

- - - - -
ed3b3da0 by Cheng Shao at 2022-11-11T00:26:55-05:00
rts: genericRaise: use exit() instead when not HAVE_RAISE

We check existence of raise() in autoconf, and here, if not
HAVE_RAISE, we should use exit() instead in genericRaise.

- - - - -
c0ba1547 by Cheng Shao at 2022-11-11T00:26:55-05:00
rts: checkSuid: don't do it when not HAVE_GETUID

When getuid() is not present, don't do checkSuid since it doesn't make
sense anyway on that target.

- - - - -
d2d6dfd2 by Cheng Shao at 2022-11-11T00:26:55-05:00
rts: wasm32 placeholder linker

This patch adds minimal placeholder linker logic for wasm32, just
enough to unblock compiling rts on wasm32. RTS linker functionality is
not properly implemented yet for wasm32.

- - - - -
65ba3285 by Cheng Shao at 2022-11-11T00:26:55-05:00
rts: RtsStartup: chdir to PWD on wasm32

This patch adds a wasm32-specific behavior to RtsStartup logic. When
the PWD environment variable is present, we chdir() to it first.

The point is to work around an issue in wasi-libc: it's currently not
possible to specify the initial working directory, it always defaults
to / (in the virtual filesystem mapped from some host directory). For
some use cases this is sufficient, but there are some other cases
(e.g. in the testsuite) where the program needs to access files
outside.

- - - - -
65b82542 by Cheng Shao at 2022-11-11T00:26:55-05:00
rts: no timer for wasm32

Due to the lack of threads, on wasm32 there can't be a background
timer that periodically resets the context switch flag. This patch
disables timer for wasm32, and also makes the scheduler default to -C0
on wasm32 to avoid starving threads.

- - - - -
e007586f by Cheng Shao at 2022-11-11T00:26:55-05:00
rts: RtsSymbols: empty RTS_POSIX_ONLY_SYMBOLS for wasm32

The default RTS_POSIX_ONLY_SYMBOLS doesn't make sense on wasm32.

- - - - -
0e33f667 by Cheng Shao at 2022-11-11T00:26:55-05:00
rts: Schedule: no FORKPROCESS_PRIMOP_SUPPORTED on wasm32

On wasm32 there isn't a process model at all, so no
FORKPROCESS_PRIMOP_SUPPORTED.

- - - - -
88bbdb31 by Cheng Shao at 2022-11-11T00:26:55-05:00
rts: LibffiAdjustor: adapt to ffi_alloc_prep_closure interface for wasm32

libffi-wasm32 only supports non-standard libffi closure api via
ffi_alloc_prep_closure(). This patch implements
ffi_alloc_prep_closure() via standard libffi closure api on other
targets, and uses it to implement adjustor functionality.

- - - - -
15138746 by Cheng Shao at 2022-11-11T00:26:55-05:00
rts: don't return memory to OS on wasm32

This patch makes the storage manager not return any memory on wasm32.
The detailed reason is described in Note [Megablock allocator on
wasm].

- - - - -
631af3cc by Cheng Shao at 2022-11-11T00:26:55-05:00
rts: make flushExec a no-op on wasm32

This patch makes flushExec a no-op on wasm32, since there's no such
thing as executable memory on wasm32 in the first place.

- - - - -
654a3d46 by Cheng Shao at 2022-11-11T00:26:55-05:00
rts: RtsStartup: don't call resetTerminalSettings, freeThreadingResources on wasm32

This patch prevents resetTerminalSettings and freeThreadingResources
from being called on wasm32, since there is no TTY or threading on wasm32
at all.

- - - - -
f271e7ca by Cheng Shao at 2022-11-11T00:26:55-05:00
rts: OSThreads.h: stub types for wasm32

This patch defines stub Condition/Mutex/OSThreadId/ThreadLocalKey
types for wasm32, just enough to unblock compiling RTS. Any
threading-related functionality has been patched to be disabled on
wasm32.

- - - - -
a6ac67b0 by Cheng Shao at 2022-11-11T00:26:55-05:00
Add register mapping for wasm32

This patch adds register mapping logic for wasm32. See Note [Register
mapping on WebAssembly] in wasm32 NCG for more description.

- - - - -
d7b33982 by Cheng Shao at 2022-11-11T00:26:55-05:00
rts: wasm32 specific logic

This patch adds the rest of wasm32 specific logic in rts.

- - - - -
7f59b0f3 by Cheng Shao at 2022-11-11T00:26:55-05:00
base: fall back to using monotonic clock to emulate cputime on wasm32

On wasm32, we have to fall back to using monotonic clock to emulate
cputime, since there's no native support for cputime as a clock id.

- - - - -
5fcbae0b by Cheng Shao at 2022-11-11T00:26:55-05:00
base: more autoconf checks for wasm32

This patch adds more autoconf checks to base, since those functions
and headers may exist on other POSIX systems but don't exist on
wasm32.

- - - - -
00a9359f by Cheng Shao at 2022-11-11T00:26:55-05:00
base: avoid using unsupported posix functionality on wasm32

This base patch avoids using unsupported posix functionality on
wasm32.

- - - - -
34b8f611 by Cheng Shao at 2022-11-11T00:26:55-05:00
autoconf: set CrossCompiling=YES in cross bindist configure

This patch fixes the bindist autoconf logic to properly set
CrossCompiling=YES when it's a cross GHC bindist.

- - - - -
5ebeaa45 by Cheng Shao at 2022-11-11T00:26:55-05:00
compiler: add util functions for UniqFM and UniqMap

This patch adds addToUFM_L (backed by insertLookupWithKey),
addToUniqMap_L and intersectUniqMap_C. These UniqFM/UniqMap util
functions are used by the wasm32 NCG.
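
For readers unfamiliar with `insertLookupWithKey`, here is a small
sketch of the underlying `containers` function on a plain IntMap. The
helper name `insertReturningOld` is hypothetical; the exact behaviour
and signature of addToUFM_L are not reproduced here.

```haskell
module InsertLookupSketch where

import qualified Data.IntMap.Strict as IM

-- Insert a value and, in the same traversal, report what was stored
-- under the key before.  The combining function receives the key, the
-- new value and the old value; this sketch simply keeps the new value.
insertReturningOld :: Int -> a -> IM.IntMap a -> (Maybe a, IM.IntMap a)
insertReturningOld = IM.insertLookupWithKey (\_key new _old -> new)
```

The point of such a helper is that the lookup and the insert share one
pass over the map instead of two.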

- - - - -
177c56c1 by Cheng Shao at 2022-11-11T00:26:55-05:00
driver: avoid -Wl,--no-as-needed for wasm32

The driver used to pass -Wl,--no-as-needed for LLD linking. This is
actually only supported for ELF targets, and must be avoided when
linking for wasm32.

- - - - -
06f01c74 by Cheng Shao at 2022-11-11T00:26:55-05:00
compiler: allow big arith for wasm32

This patch enables Cmm big arithmetic on wasm32, since 64-bit
arithmetic can be efficiently lowered to wasm32 opcodes.

- - - - -
df6bb112 by Cheng Shao at 2022-11-11T00:26:55-05:00
driver: pass -Wa,--no-type-check for wasm32 when runAsPhase

This patch passes -Wa,--no-type-check for wasm32 when compiling
assembly. See the added note for more detailed explanation.

- - - - -
c1fe4ab6 by Cheng Shao at 2022-11-11T00:26:55-05:00
compiler: enforce cmm switch planning for wasm32

This patch forcibly enables Cmm switch planning for wasm32, since
otherwise the switch tables we generate may exceed the br_table
maximum allowed size.

- - - - -
a8adc71e by Cheng Shao at 2022-11-11T00:26:55-05:00
compiler: annotate CmmFileEmbed with blob length

This patch adds the blob length field to CmmFileEmbed. The wasm32 NCG
needs to know the precise size of each data segment.

- - - - -
36340328 by Cheng Shao at 2022-11-11T00:26:55-05:00
compiler: wasm32 NCG

This patch adds the wasm32 NCG.

- - - - -
435f42ea by Cheng Shao at 2022-11-11T00:26:55-05:00
ci: add wasm32-wasi release bindist job

- - - - -
d8262fdc by Cheng Shao at 2022-11-11T00:26:55-05:00
ci: add a stronger test for cross bindists

This commit adds a simple GHC API program that parses and reprints the
original hello world program used for basic testing of cross bindists.
Before there's full cross-compilation support in the test suite
driver, this provides better coverage than the original test.

- - - - -
8e6ae882 by Cheng Shao at 2022-11-11T00:26:55-05:00
CODEOWNERS: add wasm-specific maintainers

- - - - -
707d5651 by Zubin Duggal at 2022-11-11T00:27:31-05:00
Clarify that LLVM upper bound is non-inclusive during configure (#22411)

- - - - -
430eccef by Ben Gamari at 2022-11-11T13:16:45-05:00
rts: Check for program_invocation_short_name via autoconf

Instead of assuming support on all Linuxes.

- - - - -
6dab0046 by Matthew Pickering at 2022-11-11T13:17:22-05:00
driver: Fix -fdefer-diagnostics flag

The `withDeferredDiagnostics` wrapper wasn't doing anything because the
session it was modifying wasn't used in hsc_env. Therefore the fix is
simple, just push the `getSession` call into the scope of
`withDeferredDiagnostics`.
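
The shape of this kind of bug can be sketched with a toy model in which
the "session" is just a mutable reference to a logging action. All
names below (`Logger`, `withDeferredSketch`, `ineffective`, `effective`,
`demo`) are hypothetical; none of this is the GHC driver code.

```haskell
module DeferSketch where

import Data.IORef (IORef, modifyIORef, newIORef, readIORef, writeIORef)

type Logger = String -> IO ()

-- Temporarily swap the session's logger for one that buffers messages.
withDeferredSketch :: IORef Logger -> IORef [String] -> IO a -> IO a
withDeferredSketch session buffer action = do
  original <- readIORef session
  writeIORef session (\msg -> modifyIORef buffer (msg :))
  result <- action
  writeIORef session original
  pure result

-- Ineffective shape: the logger is read before the wrapper runs, so
-- the action still uses the original logger.
ineffective :: IORef Logger -> IORef [String] -> IO ()
ineffective session buffer = do
  logger <- readIORef session
  withDeferredSketch session buffer (logger "warning")

-- Effective shape: the read happens inside the wrapper's scope.
effective :: IORef Logger -> IORef [String] -> IO ()
effective session buffer =
  withDeferredSketch session buffer $ do
    logger <- readIORef session
    logger "warning"

-- Returns (messages logged immediately, messages deferred).
demo :: (IORef Logger -> IORef [String] -> IO ()) -> IO ([String], [String])
demo variant = do
  immediate <- newIORef []
  buffer <- newIORef []
  session <- newIORef (\msg -> modifyIORef immediate (msg :))
  variant session buffer
  (,) <$> readIORef immediate <*> readIORef buffer
```

In this model `demo ineffective` records the message as logged
immediately, while `demo effective` records it as deferred.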

Fixes #22391

- - - - -
d0c691b6 by Simon Peyton Jones at 2022-11-11T13:18:07-05:00
Add a fast path for data constructor workers

See Note [Fast path for data constructors] in
GHC.Core.Opt.Simplify.Iteration

This bypasses lots of expensive logic, in the special case of
applications of data constructors.  It is a surprisingly worthwhile
improvement, as you can see in the figures below.

Metrics: compile_time/bytes allocated
------------------------------------------------
          CoOpt_Read(normal)   -2.0%
    CoOpt_Singletons(normal)   -2.0%
    ManyConstructors(normal)   -1.3%
              T10421(normal)   -1.9% GOOD
             T10421a(normal)   -1.5%
              T10858(normal)   -1.6%
              T11545(normal)   -1.7%
              T12234(optasm)   -1.3%
              T12425(optasm)   -1.9% GOOD
              T13035(normal)   -1.0% GOOD
              T13056(optasm)   -1.8%
              T13253(normal)   -3.3% GOOD
              T15164(normal)   -1.7%
              T15304(normal)   -3.4%
              T15630(normal)   -2.8%
              T16577(normal)   -4.3% GOOD
              T17096(normal)   -1.1%
              T17516(normal)   -3.1%
              T18282(normal)   -1.9%
              T18304(normal)   -1.2%
             T18698a(normal)   -1.2% GOOD
             T18698b(normal)   -1.5% GOOD
              T18923(normal)   -1.3%
               T1969(normal)   -1.3% GOOD
              T19695(normal)   -4.4% GOOD
             T21839c(normal)   -2.7% GOOD
             T21839r(normal)   -2.7% GOOD
               T4801(normal)   -3.8% GOOD
               T5642(normal)   -3.1% GOOD
               T6048(optasm)   -2.5% GOOD
               T9020(optasm)   -2.7% GOOD
               T9630(normal)   -2.1% GOOD
               T9961(normal)  -11.7% GOOD
               WWRec(normal)   -1.0%

                   geo. mean   -1.1%
                   minimum    -11.7%
                   maximum     +0.1%

Metric Decrease:
    T10421
    T12425
    T13035
    T13253
    T16577
    T18698a
    T18698b
    T1969
    T19695
    T21839c
    T21839r
    T4801
    T5642
    T6048
    T9020
    T9630
    T9961

- - - - -
3c37d30b by Krzysztof Gogolewski at 2022-11-11T19:18:39+01:00
Use a more efficient printer for code generation (#21853)

The changes in `GHC.Utils.Outputable` are the bulk of the patch
and drive the rest.
The types `HLine` and `HDoc` in Outputable can be used instead of `SDoc`
and support printing directly to a handle with `bPutHDoc`.
See Note [SDoc versus HDoc] and Note [HLine versus HDoc].

The classes `IsLine` and `IsDoc` are used to make the existing code polymorphic
over `HLine`/`HDoc` and `SDoc`. This is done for X86, PPC, AArch64, DWARF
and dependencies (printing module names, labels etc.).
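
To illustrate the general idea of writing printing code once against a
class and instantiating it at two document types, here is a toy sketch.
The class `LineLike` and the types `Rendered` and `Direct` are
hypothetical stand-ins; they are much simpler than `IsLine`, `SDoc` and
`HLine` in GHC.Utils.Outputable.

```haskell
module LineClassSketch where

import System.IO (Handle, hPutStr)

-- A tiny class of single-line documents.
class LineLike doc where
  text  :: String -> doc
  (<+>) :: doc -> doc -> doc   -- join two pieces with a space

-- Stand-in for a document that is built up and rendered later.
newtype Rendered = Rendered { renderString :: String }

instance LineLike Rendered where
  text = Rendered
  Rendered a <+> Rendered b = Rendered (a ++ " " ++ b)

-- Stand-in for a document that writes straight to a handle.
newtype Direct = Direct { writeTo :: Handle -> IO () }

instance LineLike Direct where
  text s = Direct (\h -> hPutStr h s)
  Direct a <+> Direct b = Direct (\h -> a h >> hPutStr h " " >> b h)

-- Printing code written once, usable at either type.
globalDirective :: LineLike doc => String -> doc
globalDirective label = text ".globl" <+> text label
```

Here `renderString (globalDirective "main")` builds a string, while
`writeTo (globalDirective "main") stdout` emits the same text directly.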

Co-authored-by: Alexis King <lexi.lambda at gmail.com>

Metric Decrease:
    CoOpt_Read
    ManyAlternatives
    ManyConstructors
    T10421
    T12425
    T12707
    T13035
    T13056
    T13253
    T13379
    T18140
    T18282
    T18698a
    T18698b
    T1969
    T20049
    T21839c
    T21839r
    T3064
    T3294
    T4801
    T5321FD
    T5321Fun
    T5631
    T6048
    T783
    T9198
    T9233

- - - - -
6b92b47f by Matthew Craven at 2022-11-11T18:32:14-05:00
Weaken wrinkle 1 of Note [Scrutinee Constant Folding]

Fixes #22375.

Co-authored-by:  Simon Peyton Jones <simon.peytonjones at gmail.com>

- - - - -
154c70f6 by Simon Peyton Jones at 2022-11-11T23:40:10+00:00
Fix fragile RULE setup in GHC.Float

In testing my type-vs-constraint patch I found that the handling
of Natural literals was very fragile -- and I somehow tripped that
fragility in my work.

So this patch fixes the fragility.
See Note [realToFrac natural-to-float]

This made a big (9%) difference in one existing test in
perf/should_run/T10359

Metric Decrease:
    T10359

- - - - -
778c6adc by Simon Peyton Jones at 2022-11-11T23:40:10+00:00
Type vs Constraint: finally nailed

This big patch addresses the rats-nest of issues that have plagued
us for years, about the relationship between Type and Constraint.
See #11715/#21623.

The main payload of the patch is:
* To introduce CONSTRAINT :: RuntimeRep -> Type
* To make TYPE and CONSTRAINT distinct throughout the compiler

Two overview Notes in GHC.Builtin.Types.Prim

* Note [TYPE and CONSTRAINT]

* Note [Type and Constraint are not apart]
  This is the main complication.

The specifics

* New primitive types (GHC.Builtin.Types.Prim)
  - CONSTRAINT
  - ctArrowTyCon (=>)
  - tcArrowTyCon (-=>)
  - ccArrowTyCon (==>)
  - funTyCon     FUN     -- Not new
  See Note [Function type constructors and FunTy]
  and Note [TYPE and CONSTRAINT]

* GHC.Builtin.Types:
  - New type Constraint = CONSTRAINT LiftedRep
  - I also stopped nonEmptyTyCon being built-in; it only needs to be wired-in

* Exploit the fact that Type and Constraint are distinct throughout GHC
  - Get rid of tcView in favour of coreView.
  - Many tcXX functions become XX functions.
    e.g. tcGetCastedTyVar --> getCastedTyVar

* Kill off Note [ForAllTy and typechecker equality], in (old)
  GHC.Tc.Solver.Canonical.  It said that typechecker-equality should ignore
  the specified/inferred distinction when comparing two ForAllTys.  But
  that was only weakly supported and (worse) implies that we need a separate
  typechecker equality, different from core equality. No no no.

* GHC.Core.TyCon: kill off FunTyCon in data TyCon.  There was no need for it,
  and anyway now we have four of them!

* GHC.Core.TyCo.Rep: add two FunTyFlags to FunCo
  See Note [FunCo] in that module.

* GHC.Core.Type.  Lots and lots of changes driven by adding CONSTRAINT.
  The key new function is sORTKind_maybe; most other changes are built
  on top of that.

  See also `funTyConAppTy_maybe` and `tyConAppFun_maybe`.

* Fix a longstanding bug in GHC.Core.Type.typeKind, and Core Lint, in
  kinding ForAllTys.  See new rules (FORALL1) and (FORALL2) in GHC.Core.Type.
  (The bug was that before, (forall (cv::t1 ~# t2). blah), where
  blah::TYPE IntRep, would get kind (TYPE IntRep), but it should be
  (TYPE LiftedRep).)  See Note [Kinding rules for types] in GHC.Core.Type.

* GHC.Core.TyCo.Compare is a new module in which we do eqType and cmpType.
  Of course, no tcEqType any more.

* GHC.Core.TyCo.FVs. I moved some free-var-like function into this module:
  tyConsOfType, visVarsOfType, and occCheckExpand.  Refactoring only.

* GHC.Builtin.Types.  Completely re-engineer boxingDataCon_maybe to
  have one for each /RuntimeRep/, rather than one for each /Type/.
  This dramatically widens the range of types we can auto-box.
  See Note [Boxing constructors] in GHC.Builtin.Types
  The boxing types themselves are declared in library ghc-prim:GHC.Types.

  GHC.Core.Make.  Re-engineer the treatment of "big" tuples (mkBigCoreVarTup
  etc), so that it auto-boxes unboxed values and (crucially)
  types of kind Constraint. That allows the desugaring for arrows to work;
  it gathers up free variables (including dictionaries) into tuples.
  See  Note [Big tuples] in GHC.Core.Make.

  There is still work to do here: #22336. But things are better than
  before.

* GHC.Core.Make.  We need two absent-error Ids, aBSENT_ERROR_ID for types of
  kind Type, and aBSENT_CONSTRAINT_ERROR_ID for values of kind Constraint.
  Ditto noInlineId vs noInlineConstraintId in GHC.Types.Id.Make;
  see Note [inlineId magic].

* GHC.Core.TyCo.Rep. Completely refactor the NthCo coercion.  It is now called
  SelCo, and its fields are much more descriptive than the single Int we used to
  have.  A great improvement.  See Note [SelCo] in GHC.Core.TyCo.Rep.

* GHC.Core.RoughMap.roughMatchTyConName.  Collapse TYPE and CONSTRAINT to
  a single TyCon, so that the rough-map does not distinguish them.

* GHC.Core.DataCon
  - Mainly just improve documentation

* Some significant renamings:
  GHC.Core.Multiplicity: Many -->  ManyTy (easier to grep for)
                         One  -->  OneTy
  GHC.Core.TyCo.Rep TyCoBinder      -->   GHC.Core.Var.PiTyBinder
  GHC.Core.Var      TyCoVarBinder   -->   ForAllTyBinder
                    AnonArgFlag     -->   FunTyFlag
                    ArgFlag         -->   ForAllTyFlag
  GHC.Core.TyCon    TyConTyCoBinder --> TyConPiTyBinder
  Many functions are renamed in consequence
  e.g. isInvisibleArgFlag becomes isInvisibleForAllTyFlag, etc

* I refactored FunTyFlag (was AnonArgFlag) into a simple, flat data type
    data FunTyFlag
      = FTF_T_T           -- (->)  Type -> Type
      | FTF_T_C           -- (-=>) Type -> Constraint
      | FTF_C_T           -- (=>)  Constraint -> Type
      | FTF_C_C           -- (==>) Constraint -> Constraint

* GHC.Tc.Errors.Ppr.  Some significant refactoring in the TypeEqMisMatch case
  of pprMismatchMsg.

* I made the tyConUnique field of TyCon strict, because I
  saw code with lots of silly eval's.  That revealed that
  GHC.Settings.Constants.mAX_SUM_SIZE can only be 63, because
  we pack the sum tag into a 6-bit field.  (Lurking bug squashed.)
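
The flat FunTyFlag type listed earlier in this message lends itself to
simple total functions. The sketch below repeats the four constructors
and adds a few helpers; the helper names (`mkFlag`, `argKind`,
`resKind`, `arrowSymbol`) and the `TypeOrConstraint` type as written
here are illustrative assumptions, not copied from the GHC sources.

```haskell
module FunTyFlagSketch where

-- Is something a type or a constraint?
data TypeOrConstraint = TypeLike | ConstraintLike
  deriving (Eq, Show)

data FunTyFlag
  = FTF_T_T           -- (->)  Type -> Type
  | FTF_T_C           -- (-=>) Type -> Constraint
  | FTF_C_T           -- (=>)  Constraint -> Type
  | FTF_C_C           -- (==>) Constraint -> Constraint
  deriving (Eq, Show)

-- Build a flag from what the argument and the result are.
mkFlag :: TypeOrConstraint -> TypeOrConstraint -> FunTyFlag
mkFlag TypeLike       TypeLike       = FTF_T_T
mkFlag TypeLike       ConstraintLike = FTF_T_C
mkFlag ConstraintLike TypeLike       = FTF_C_T
mkFlag ConstraintLike ConstraintLike = FTF_C_C

-- Recover the argument side of a flag.
argKind :: FunTyFlag -> TypeOrConstraint
argKind FTF_T_T = TypeLike
argKind FTF_T_C = TypeLike
argKind FTF_C_T = ConstraintLike
argKind FTF_C_C = ConstraintLike

-- Recover the result side of a flag.
resKind :: FunTyFlag -> TypeOrConstraint
resKind FTF_T_T = TypeLike
resKind FTF_T_C = ConstraintLike
resKind FTF_C_T = TypeLike
resKind FTF_C_C = ConstraintLike

-- The arrow notation used for each flag in the listing above.
arrowSymbol :: FunTyFlag -> String
arrowSymbol FTF_T_T = "->"
arrowSymbol FTF_T_C = "-=>"
arrowSymbol FTF_C_T = "=>"
arrowSymbol FTF_C_C = "==>"
```

With a flat enumeration every case is spelled out, so a missing
alternative shows up as an incomplete-pattern warning.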

Fixes
* #21530

Updates haddock submodule slightly.

Performance changes
~~~~~~~~~~~~~~~~~~~
I was worried that compile times would get worse, but after
some careful profiling we are down to a geometric mean 0.1%
increase in allocation (in perf/compiler).  That seems fine.

There is a big runtime improvement in T10359

Metric Decrease:
    LargeRecord
    MultiLayerModulesTH_OneShot
    T13386
    T13719
Metric Increase:
    T8095

- - - - -
360f5fec by Simon Peyton Jones at 2022-11-11T23:40:11+00:00
Indent closing "#-}" to silence HLint

- - - - -
e160cf47 by Krzysztof Gogolewski at 2022-11-12T08:05:28-05:00
Fix merge conflict in T18355.stderr

Fixes #22446

- - - - -
294f9073 by Simon Peyton Jones at 2022-11-12T23:14:13+00:00
Fix a trivial typo in dataConNonlinearType

Fixes #22416

- - - - -
268a3ce9 by Ben Gamari at 2022-11-14T09:36:57-05:00
eventlog: Ensure that IPE output contains actual info table pointers

The refactoring in 866c736e introduced a rather subtle change in the
semantics of the IPE eventlog output, changing the eventlog field from
encoding info table pointers to "TNTC pointers" (which point to entry
code when tables-next-to-code is enabled). Fix this.

Fixes #22452.

- - - - -
d91db679 by Matthew Pickering at 2022-11-14T16:48:10-05:00
testsuite: Add tests for T22347

These are fixed in recent versions but might as well add regression
tests.

See #22347

- - - - -
8f6c576b by Matthew Pickering at 2022-11-14T16:48:45-05:00
testsuite: Improve output from tests which have failing pre_cmd

There are two changes:

* If a pre_cmd fails, then don't attempt to run the test.
* If a pre_cmd fails, then print the stdout and stderr from running that
  command (which hopefully has a nice error message).

For example:

```
=====> 1 of 1 [0, 0, 0]
*** framework failure for test-defaulting-plugin(normal) pre_cmd failed: 2
** pre_cmd was "$MAKE -s --no-print-directory -C defaulting-plugin package.test-defaulting-plugin TOP={top}".
stdout:
stderr:
DefaultLifted.hs:19:13: error: [GHC-76037]
    Not in scope: type constructor or class ‘Typ’
    Suggested fix:
      Perhaps use one of these:
        ‘Type’ (imported from GHC.Tc.Utils.TcType),
        data constructor ‘Type’ (imported from GHC.Plugins)
   |
19 | instance Eq Typ where
   |             ^^^
make: *** [Makefile:17: package.test-defaulting-plugin] Error 1

Performance Metrics (test environment: local):
```

Fixes #22329

- - - - -
2b7d5ccc by Madeline Haraj at 2022-11-14T22:44:17+00:00
Implement UNPACK support for sum types.

This is based on osa's unpack_sums PR from ages past.

The meat of the patch is implemented in dataConArgUnpackSum
and described in Note [UNPACK for sum types].
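
From the user's side, the feature concerns declarations like the one
sketched below, where a strict field of a sum type carries an UNPACK
pragma. The types `Shape` and `Labelled` are hypothetical examples;
whether and how the field is actually unpacked depends on the compiler
version and optimisation settings, as described in the Note.

```haskell
module UnpackSumSketch where

-- A sum type: more than one constructor.
data Shape
  = Circle !Double
  | Rect !Double !Double

-- Request that the Shape field be stored unpacked in Labelled.
data Labelled = Labelled {-# UNPACK #-} !Shape !Int

-- Ordinary pattern matching is unaffected by the pragma.
area :: Labelled -> Double
area (Labelled (Circle r) _) = pi * r * r
area (Labelled (Rect w h) _) = w * h
```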

- - - - -
78f7ecb0 by Andreas Klebinger at 2022-11-14T22:20:29-05:00
Expand on the need to clone local binders.

Fixes #22402.

- - - - -
65ce43cc by Krzysztof Gogolewski at 2022-11-14T22:21:05-05:00
Fix :i Constraint printing "type Constraint = Constraint"

Since Constraint became a synonym for CONSTRAINT 'LiftedRep,
we need the same code for handling printing as for the synonym
Type = TYPE 'LiftedRep.
This addresses the same bug as #18594, so I'm reusing the test.

- - - - -
94549f8f by ARATA Mizuki at 2022-11-15T21:36:03-05:00
configure: Don't check for an unsupported version of LLVM

The upper bound is not inclusive.

Fixes #22449

- - - - -
02d3511b by Bodigrim at 2022-11-15T21:36:41-05:00
Fix capitalization in haddock for TestEquality

- - - - -
08bf2881 by Cheng Shao at 2022-11-16T09:16:29+00:00
base: make Foreign.Marshal.Pool use RTS internal arena for allocation

`Foreign.Marshal.Pool` used to call `malloc` once for each allocation
request. Each `Pool` maintained a list of allocated pointers, and
traversed the list to `free` each one of those pointers. The extra O(n)
overhead is apparently bad for a `Pool` that serves a lot of small
allocation requests.

This patch uses the RTS internal arena to implement `Pool`, with these
benefits:

- Gets rid of the extra O(n) overhead.
- The RTS arena is simply a bump allocator backed by the block
  allocator, so each allocation request is likely faster than a libc
  `malloc` call.
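
For context, this is the kind of client code that makes many small
allocations from one pool. It is a minimal sketch using the public
`Foreign.Marshal.Pool` API; the function `sumViaPool` is a hypothetical
example and says nothing about how the pool is implemented internally.

```haskell
module PoolUsageSketch where

import Foreign.C.Types (CInt)
import Foreign.Marshal.Pool (pooledNew, withPool)
import Foreign.Storable (peek)

-- Allocate one small cell per element from a single pool, read the
-- cells back, and let withPool release everything at the end.
sumViaPool :: [CInt] -> IO CInt
sumViaPool xs = withPool $ \pool -> do
  ptrs <- mapM (pooledNew pool) xs
  vals <- mapM peek ptrs
  pure (sum vals)
```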

Closes #14762 #18338.

- - - - -
37cfe3c0 by Krzysztof Gogolewski at 2022-11-16T14:50:06-05:00
Misc cleanup

* Replace catMaybes . map f with mapMaybe f
* Use concatFS to concatenate multiple FastStrings
* Fix documentation of -exclude-module
* Cleanup getIgnoreCount in GHCi.UI
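
The first item is the standard `Data.Maybe` rewrite. A small
self-contained sketch, with hypothetical function names, showing that
the two forms agree:

```haskell
module MapMaybeSketch where

import Data.Maybe (catMaybes, mapMaybe)
import Text.Read (readMaybe)

-- Before: build a list of Maybe values, then filter out the Nothings.
parseIntsOld :: [String] -> [Int]
parseIntsOld = catMaybes . map readMaybe

-- After: one traversal that keeps only the successful results.
parseIntsNew :: [String] -> [Int]
parseIntsNew = mapMaybe readMaybe
```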

- - - - -
b0ac3813 by Lawton Nichols at 2022-11-19T03:22:14-05:00
Give better errors for code corrupted by Unicode smart quotes (#21843)

Previously, we emitted a generic and potentially confusing error during lexical
analysis on programs containing smart quotes (“/”/‘/’). This commit adds
smart quote-aware lexer errors.
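
As a rough illustration of the detection involved, the sketch below
finds the four smart-quote code points (U+201C, U+201D, U+2018, U+2019)
in a line of text. The functions `isSmartQuote` and `smartQuoteColumns`
are hypothetical and are not GHC's lexer code.

```haskell
module SmartQuoteSketch where

-- The four code points mentioned above, written as decimal escapes:
-- U+201C, U+201D, U+2018, U+2019.
isSmartQuote :: Char -> Bool
isSmartQuote c = c `elem` "\8220\8221\8216\8217"

-- Report the 1-based column and character of every smart quote.
smartQuoteColumns :: String -> [(Int, Char)]
smartQuoteColumns line =
  [ (col, c) | (col, c) <- zip [1 ..] line, isSmartQuote c ]
```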

- - - - -
cb8430f8 by Sebastian Graf at 2022-11-19T03:22:49-05:00
Make OpaqueNo* tests less noisy to unrelated changes

- - - - -
b1a8af69 by Sebastian Graf at 2022-11-19T03:22:49-05:00
Simplifier: Consider `seq` as a `BoringCtxt` (#22317)

See `Note [Seq is boring]` for the rationale.

Fixes #22317.

- - - - -
9fd11585 by Sebastian Graf at 2022-11-19T03:22:49-05:00
Make T21839c's ghc/max threshold more forgiving

- - - - -
4b6251ab by Simon Peyton Jones at 2022-11-19T03:23:24-05:00
Be more careful when reporting unbound RULE binders

See Note [Variables unbound on the LHS] in GHC.HsToCore.Binds.

Fixes #22471.

- - - - -
e8f2b80d by Peter Trommler at 2022-11-19T03:23:59-05:00
PPC NCG: Fix generating assembler code

Fixes #22479

- - - - -
f2f9ef07 by Bodigrim at 2022-11-20T18:39:30-05:00
Extend documentation for Data.IORef

- - - - -
ef511b23 by Simon Peyton Jones at 2022-11-20T18:40:05-05:00
Buglet in GHC.Tc.Module.checkBootTyCon

This lurking bug used the wrong function to compare two
types in GHC.Tc.Module.checkBootTyCon

It's hard to trigger the bug, which only came up during
!9343, so there's no regression test in this MR.

- - - - -
451aeac3 by Bodigrim at 2022-11-20T18:40:44-05:00
Add since pragmas for c_interruptible_open and hostIsThreaded

- - - - -
8d6aaa49 by Duncan Coutts at 2022-11-22T02:06:16-05:00
Introduce CapIOManager as the per-cap I/O manager state

Rather than each I/O manager adding things into the Capability structure
ad-hoc, we should have a common CapIOManager iomgr member of the
Capability structure, with a common interface to initialise etc.

The content of the CapIOManager struct will be defined differently for
each I/O manager implementation. Eventually we should be able to have
the CapIOManager be opaque to the rest of the RTS, and known just to the
I/O manager implementation. We plan for that by making the Capability
contain a pointer to the CapIOManager rather than containing the
structure directly.

Initially just move the Unix threaded I/O manager's control FD.

- - - - -
8901285e by Duncan Coutts at 2022-11-22T02:06:17-05:00
Add hook markCapabilityIOManager

To allow I/O managers to have GC roots in the Capability, within the
CapIOManager structure.

Not yet used in this patch.

- - - - -
5cf709c5 by Duncan Coutts at 2022-11-22T02:06:17-05:00
Move APPEND_TO_BLOCKED_QUEUE from cmm to C

The I/O and delay blocking primitives for the non-threaded way
currently access the blocked_queue and sleeping_queue directly.

We want to move where those queues are to make their ownership clearer:
to have them clearly belong to the I/O manager impls rather than to the
scheduler. Ultimately we will want to change their representation too.

It's inconvenient to do that if these queues are accessed directly from
cmm code. So as a first step, replace the APPEND_TO_BLOCKED_QUEUE with a
C version appendToIOBlockedQueue(), and replace the open-coded
sleeping_queue insertion with insertIntoSleepingQueue().
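
The two queue operations named above can be sketched in plain C as follows. This is only a rough model: the `Thread` and `Sleeper` types, the `sketch_*` names, the use of NULL as the end-of-queue marker, and the assumption that the sleeping queue is kept ordered by wake-up time are all illustrative, not taken from the RTS sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal thread object; a real TSO has many more fields. */
typedef struct Thread_ {
    int id;
    struct Thread_ *link;   /* next thread in whichever queue holds us */
} Thread;

/* Head/tail pair, in the spirit of blocked_queue_{hd,tl}. */
typedef struct {
    Thread *hd;
    Thread *tl;
} BlockedQueue;

/* Append tso at the tail of the queue in constant time. */
void sketch_append_blocked(BlockedQueue *q, Thread *tso)
{
    assert(q != NULL && tso != NULL);
    tso->link = NULL;
    if (q->hd == NULL) {
        q->hd = tso;          /* queue was empty */
    } else {
        q->tl->link = tso;    /* hook onto the old tail */
    }
    q->tl = tso;
}

/* Hypothetical sleeping thread with an absolute wake-up time. */
typedef struct Sleeper_ {
    long wake_time;
    struct Sleeper_ *link;
} Sleeper;

/* Insert s so that the queue stays sorted by ascending wake_time. */
void sketch_insert_sleeping(Sleeper **queue, Sleeper *s)
{
    assert(queue != NULL && s != NULL);
    Sleeper **p = queue;
    while (*p != NULL && (*p)->wake_time <= s->wake_time) {
        p = &(*p)->link;      /* skip entries that wake no later than s */
    }
    s->link = *p;
    *p = s;
}
```

Once callers go through functions like these, the representation behind them can change without touching cmm code.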

- - - - -
ced9acdb by Duncan Coutts at 2022-11-22T02:06:17-05:00
Move {blocked,sleeping}_queue from scheduler global vars to CapIOManager

The blocked_queue_{hd,tl} and the sleeping_queue are currently
cooperatively managed between the scheduler and (some but not all of)
the non-threaded I/O manager implementations.

They lived as global vars with the scheduler, but are poked by I/O
primops and the I/O manager backends.

This patch is a step on the path towards making the management of I/O or
timer blocking belong to the I/O managers and not the scheduler.

Specifically, this patch moves the {blocked,sleeping}_queue from being
global vars in the scheduler to being members of the CapIOManager struct
within each Capability. They are not yet exclusively used by the I/O
managers: they are still poked from a couple other places, notably in
the scheduler before calling awaitEvent.

- - - - -
0f68919e by Duncan Coutts at 2022-11-22T02:06:17-05:00
Remove the now-unused markScheduler

The global vars {blocked,sleeping}_queue are now in the Capability and
so get marked there via markCapabilityIOManager.

- - - - -
39a91f60 by Duncan Coutts at 2022-11-22T02:06:17-05:00
Move macros for checking for pending IO or timers

from Schedule.h to Schedule.c and IOManager.h

This is just moving, the next step will be to rejig them slightly.

For the non-threaded RTS the scheduler needs to be able to test for
there being pending I/O operations or pending timers. The implementation
of these tests should really be considered to be part of the I/O
managers and not part of the scheduler.

- - - - -
664b034b by Duncan Coutts at 2022-11-22T02:06:17-05:00
Replace EMPTY_{BLOCKED,SLEEPING}_QUEUE macros by function

These are the macros originally from Schedule.h, previously moved to
IOManager.h, and now replaced with a single inline function
anyPendingTimeoutsOrIO(). We can use a single function since the two
macros were always checked together.

Note that since anyPendingTimeoutsOrIO is defined for all IO manager
cases, including threaded, we do not need to guard its use by cpp
 #if !defined(THREADED_RTS)
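
A minimal sketch of folding two emptiness checks into one inline predicate, under stated assumptions: the struct layouts are trimmed stand-ins, NULL stands in for whatever end-of-queue marker the RTS uses, and `sketch_any_pending` is a hypothetical name rather than the real anyPendingTimeoutsOrIO.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal thread object. */
typedef struct Thread_ {
    int id;
} Thread;

/* Trimmed stand-in holding the queues that moved into CapIOManager. */
typedef struct {
    Thread *blocked_queue_hd;
    Thread *blocked_queue_tl;
    Thread *sleeping_queue;
} CapIOManager;

typedef struct {
    CapIOManager *iomgr;
} Capability;

/* One predicate in place of two macros that were always tested together:
 * true if any thread is blocked on I/O or waiting on a timer. */
static inline bool sketch_any_pending(const Capability *cap)
{
    assert(cap != NULL && cap->iomgr != NULL);
    return cap->iomgr->blocked_queue_hd != NULL
        || cap->iomgr->sleeping_queue   != NULL;
}
```

Since the function exists in every configuration, a call site can use it unconditionally instead of wrapping the check in a preprocessor guard.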

- - - - -
32946220 by Duncan Coutts at 2022-11-22T02:06:17-05:00
Expand emptyThreadQueues inline for clarity

It was not really adding anything. The name no longer meant anything
since those I/O and timeout queues do not belong to the scheduler.

In one of the two places it was used, the comments already had to
explain what it did, whereas now the code matches the comment nicely.

- - - - -
9943baf9 by Duncan Coutts at 2022-11-22T02:06:17-05:00
Move the awaitEvent declaration into IOManager.h

And add or adjust comments at the use sites of awaitEvent.

- - - - -
054dcc9d by Duncan Coutts at 2022-11-22T02:06:17-05:00
Pass the Capability *cap explicitly to awaitEvent

It is currently only used in the non-threaded RTS so it works to use
MainCapability, but it's a bit nicer to pass the cap anyway. It's
certainly shorter.

- - - - -
667fe5a4 by Duncan Coutts at 2022-11-22T02:06:17-05:00
Pass the Capability *cap explicitly to appendToIOBlockedQueue

And to insertIntoSleepingQueue. Again, it's a bit cleaner and simpler
though not strictly necessary given that these primops are currently
only used in the non-threaded RTS.

- - - - -
7181b074 by Duncan Coutts at 2022-11-22T02:06:17-05:00
Review feedback: improve one of the TODO comments

The one about the nonsense (const False) test on WinIO for there being any IO
or timers pending, leading to unnecessary complication later in the
scheduler.

- - - - -
e5b68183 by Andreas Klebinger at 2022-11-22T02:06:52-05:00
Optimize getLevity.

Avoid the intermediate data structures allocated by splitTyConApp.
This avoids ~0.5% of allocations for a build using -O2.

Fixes #22254

- - - - -
de5fb348 by Andreas Klebinger at 2022-11-22T02:07:28-05:00
hadrian: Set TNTC when running testsuite.

- - - - -
9d61c182 by Oleg Grenrus at 2022-11-22T15:59:34-05:00
Add unsafePtrEquality# restricted to UnliftedTypes

- - - - -
e817c871 by Jonathan Dowland at 2022-11-22T16:00:14-05:00
utils/unlit: adjust parser to match Report spec

The Haskell 2010 Report says that, for Latex-style Literate format,
"Program code begins on the first line following a line that begins
\begin{code}". (This is unchanged from the 98 Report)

However the unlit.c implementation only matches a line that contains
"\begin{code}" and nothing else. One consequence of this is that one
cannot suffix Latex options to the code environment. I.e., this does
not work:

\begin{code}[label=foo,caption=Foo Code]

Adjust the matcher to conform to the specification from the Report.
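
The difference between the two matching rules can be sketched as below. This is not the unlit.c code: the function names are hypothetical, and the sketch assumes the line has already had its trailing newline stripped.

```c
#include <assert.h>
#include <string.h>

#define BEGINCODE "\\begin{code}"

/* Behaviour as described above for the old matcher:
 * the line must be \begin{code} and nothing else. */
int sketch_exact_match(const char *line)
{
    assert(line != NULL);
    return strcmp(line, BEGINCODE) == 0;
}

/* Report wording: the line only has to begin with \begin{code}. */
int sketch_prefix_match(const char *line)
{
    assert(line != NULL);
    return strncmp(line, BEGINCODE, strlen(BEGINCODE)) == 0;
}
```

With the prefix rule, `\begin{code}[label=foo,caption=Foo Code]` starts a code block, and so does `\begin{code}%`.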

The Haskell Wiki currently recommends suffixing a '%' to \begin{code}
in order to deliberately hide a code block from Haskell. This is bad
advice, as it's relying on an implementation quirk rather than specified
behaviour. Nonetheless, some people have tried to use it, cf.
<https://mail.haskell.org/pipermail/haskell-cafe/2009-September/066780.html>

An alternative solution is to define a separate, equivalent Latex
environment to "code", that is functionally identical in Latex but
ignored by unlit. This should not be a burden: users are required to
manually define the code environment anyway, as it is not provided
by the Latex verbatim or lstlistings packages usually used for
presenting code in documents.

Fixes #3549.

- - - - -
0b7fef11 by Teo Camarasu at 2022-11-23T12:44:33-05:00
Fix eventlog all option

Previously it didn't enable/disable nonmoving_gc and ticky event types

Fixes #21813

- - - - -
04d0618c by Arnaud Spiwack at 2022-11-23T12:45:14-05:00
Expand Note [Linear types] with the stance on linting linearity

Per the discussion on #22123

- - - - -
e1538516 by Lawton Nichols at 2022-11-23T12:45:55-05:00
Add documentation on custom Prelude modules (#22228)

Specifically, custom Prelude modules that are named `Prelude`.

- - - - -
b5c71454 by Sylvain Henry at 2022-11-23T12:46:35-05:00
Don't let configure perform trivial substitutions (#21846)

Hadrian now performs substitutions, especially to generate .cabal files
from .cabal.in files. Two benefits:

1. We won't have to re-configure when we modify thing.cabal.in. Hadrian
   will take care of this for us.

2. It paves the way to allow the same package to be configured
   differently by Hadrian in the same session. This will be useful to
   fix #19174: we want to build a stage2 cross-compiler for the host
   platform and a stage1 compiler for the cross target platform in the
   same Hadrian session.

- - - - -
99aca26b by nineonine at 2022-11-23T12:47:11-05:00
CApiFFI: add ConstPtr for encoding const-qualified pointer return types (#22043)

Previously, when using the `capi` calling convention in foreign declarations,
the code generator failed to handle const-qualified pointer return types.
This resulted in the C toolchain emitting a `-Wincompatible-pointer-types-discards-qualifiers`
warning.

The `Foreign.C.Types.ConstPtr` newtype was introduced to handle these cases:
special treatment was put in place to generate an appropriately qualified C
wrapper that no longer triggers the above-mentioned warning.
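
To make the C side of the problem concrete, here is a small sketch. The function names are hypothetical and the wrappers are hand-written; they are not what GHC actually generates. The unqualified wrapper is left in a comment so that the example compiles cleanly.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical C API being imported: returns a pointer to const data. */
const char *get_version(void)
{
    return "1.0";
}

/* A wrapper with an unqualified return type drops the const qualifier
 * in its return statement, which is what the compiler warns about:
 *
 *     char *wrapper_unqualified(void) { return get_version(); }
 */

/* A wrapper whose return type keeps the qualifier has nothing to drop. */
const char *wrapper_const(void)
{
    return get_version();
}
```

Clang reports the commented-out form under the warning name quoted above; GCC reports the same situation as `-Wdiscarded-qualifiers`.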

Fixes #22043

- - - - -
040bfdc3 by M Farkas-Dyck at 2022-11-23T21:59:03-05:00
Scrub some no-warning pragmas.

- - - - -
178c1fd8 by Vladislav Zavialov at 2022-11-23T21:59:39-05:00
Check if the SDoc starts with a single quote (#22488)

This patch fixes pretty-printing of character literals
inside promoted lists and tuples.

When we pretty-print a promoted list or tuple whose first element
starts with a single quote, we want to add a space between the opening
bracket and the element:

	'[True]    -- ok
	'[ 'True]  -- ok
	'['True]   -- not ok

If we don't add the space, we accidentally produce a character
literal '['.

Before this patch, pprSpaceIfPromotedTyCon inspected the type as an AST
and tried to guess if it would be rendered with a single quote. However,
it missed the case when the inner type was itself a character literal:

	'[ 'x']  -- ok
	'['x']   -- not ok

Instead of adding this particular case, I opted for a more future-proof
solution: check the SDoc directly. This way we can detect if the single
quote is actually there instead of trying to predict it from the AST.
The new function is called spaceIfSingleQuote.
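
The idea of inspecting the rendered text rather than the AST can be sketched on plain strings. This is only an analogy in C: GHC works on SDoc values, and `sketch_promoted_list` is a hypothetical helper that handles a single-element list.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Render a promoted one-element list into out. A space is inserted after
 * the opening bracket only when the already-rendered element starts with
 * a single quote, so that '[' is never produced by accident. */
void sketch_promoted_list(const char *elem, char *out, size_t out_size)
{
    assert(elem != NULL && out != NULL && out_size > 0);
    const char *gap = (elem[0] == '\'') ? " " : "";
    snprintf(out, out_size, "'[%s%s]", gap, elem);
}
```

Because the check looks at the first rendered character, a character literal such as 'x' is handled by the same rule as a ticked constructor such as 'True, with no extra case.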

- - - - -
11627c42 by Matthew Pickering at 2022-11-23T22:00:15-05:00
notes: Fix references to HPT space leak note

Updating this note was missed when updating the HPT to the HUG.

Fixes #22477

- - - - -
86ff1523 by Andrei Borzenkov at 2022-11-24T17:24:51-05:00
Convert diagnostics in GHC.Rename.Expr to proper TcRnMessage (#20115)

Problem: avoid usage of TcRnMessageUnknown

Solution:
The following `TcRnMessage` messages have been introduced:
  TcRnNoRebindableSyntaxRecordDot
  TcRnNoFieldPunsRecordDot
  TcRnIllegalStaticExpression
  TcRnIllegalStaticFormInSplice
  TcRnListComprehensionDuplicateBinding
  TcRnEmptyStmtsGroup
  TcRnLastStmtNotExpr
  TcRnUnexpectedStatementInContext
  TcRnIllegalTupleSection
  TcRnIllegalImplicitParameterBindings
  TcRnSectionWithoutParentheses

Co-authored-by: sheaf <sam.derbyshire at gmail.com>

- - - - -
d198a19a by Cheng Shao at 2022-11-24T17:25:29-05:00
rts: fix missing Arena.h symbols in RtsSymbols.c

It was an unfortunate oversight in !8961 and broke devel2 builds.

- - - - -
5943e739 by Bodigrim at 2022-11-25T04:38:28-05:00
Assorted fixes to avoid Data.List.{head,tail}

- - - - -
1f1b99b8 by sheaf at 2022-11-25T04:38:28-05:00
Review suggestions for assorted fixes to avoid Data.List.{head,tail}

- - - - -
13d627bb by Vladislav Zavialov at 2022-11-25T04:39:04-05:00
Print unticked promoted data constructors (#20531)

Before this patch, GHC unconditionally printed ticks before promoted
data constructors:

	ghci> type T = True  -- unticked (user-written)
	ghci> :kind! T
	T :: Bool
	= 'True              -- ticked (compiler output)

After this patch, GHC prints ticks only when necessary:

	ghci> type F = False    -- unticked (user-written)
	ghci> :kind! F
	F :: Bool
	= False                 -- unticked (compiler output)

	ghci> data False        -- introduce ambiguity
	ghci> :kind! F
	F :: Bool
	= 'False                -- ticked by necessity (compiler output)

The old behavior can be enabled by -fprint-redundant-promotion-ticks.

Summary of changes:
* Rename PrintUnqualified to NamePprCtx
* Add QueryPromotionTick to it
* Consult the GlobalRdrEnv to decide whether to print a tick (see mkPromTick)
* Introduce -fprint-redundant-promotion-ticks

Co-authored-by: Artyom Kuznetsov <hi at wzrd.ht>

- - - - -
d10dc6bd by Simon Peyton Jones at 2022-11-25T22:31:27+00:00
Fix decomposition of TyConApps

Ticket #22331 showed that we were being too eager to decompose
a Wanted TyConApp, leading to incompleteness in the solver.

To understand all this I ended up doing a substantial rewrite
of the old Note [Decomposing equalities], now reborn as
Note [Decomposing TyConApp equalities]. Plus rewrites of other
related Notes.

The actual fix is very minor and actually simplifies the code: in
`can_decompose` in `GHC.Tc.Solver.Canonical.canTyConApp`, we now call
`noMatchableIrreds`.  A closely related refactor: we stop trying to
use the same "no matchable givens" function here as in
`matchClassInst`.  Instead split into two much simpler functions.

- - - - -
2da5c38a by Will Hawkins at 2022-11-26T04:05:04-05:00
Redirect output of musttail attribute test

Compilation output from test for support of musttail attribute leaked to
the console.

- - - - -
0eb1c331 by Cheng Shao at 2022-11-28T08:55:53+00:00
Move hs_mulIntMayOflo cbits to ghc-prim

It's only used by wasm NCG at the moment, but ghc-prim is a more
reasonable place for hosting out-of-line primops. Also, we only need a
single version of hs_mulIntMayOflo.

- - - - -
36b53a9d by Cheng Shao at 2022-11-28T09:05:57+00:00
compiler: generate ccalls for clz/ctz/popcnt in wasm NCG

We used to generate a single wasm clz/ctz/popcnt opcode, but it's
wrong when it comes to subwords, so might as well generate ccalls for
them. See #22470 for details.
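
As a generic illustration of why a full-width bit-counting instruction can give the wrong answer for a subword (the precise failure in the wasm NCG is in #22470, not reproduced here), consider counting leading zeros of an 8-bit value with a 32-bit operation. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Portable count-leading-zeros over a 32-bit word; returns 32 for 0. */
unsigned sketch_clz32(uint32_t x)
{
    unsigned n = 0;
    for (uint32_t bit = UINT32_C(1) << 31; bit != 0 && (x & bit) == 0; bit >>= 1) {
        n++;
    }
    return n;
}

/* Correct 8-bit result: discount the 24 high bits that are not part of
 * the subword. Using sketch_clz32 directly would over-count by 24. */
unsigned sketch_clz8(uint8_t x)
{
    return sketch_clz32((uint32_t)x) - 24;
}
```

For the 8-bit value 1, the 32-bit count is 31 while the 8-bit answer is 7, so a single full-width opcode needs a width-dependent adjustment.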

- - - - -
d4134e92 by Cheng Shao at 2022-11-28T23:48:14-05:00
compiler: remove unused MO_U_MulMayOflo

We actually only emit MO_S_MulMayOflo and never emit MO_U_MulMayOflo anywhere.

- - - - -
8d15eadc by Apoorv Ingle at 2022-11-29T03:09:31-05:00
Killing cc_fundeps, streamlining kind equality orientation, and type equality processing order

Fixes: #21703
Associated to #19415

This change
* Flips the orientation of the generated kind equality coercion in canEqLHSHetero;
* Removes `cc_fundeps` in CDictCan as the check was incomplete;
* Changes `canDecomposableTyConAppOk` to ensure we process kind equalities before type equalities and avoid a call to `canEqLHSHetero` while processing wanted TyConApp equalities;
* Adds 2 new tests for validating the change
   - testsuites/typecheck/should_compile/T21703.hs and
   - testsuites/typecheck/should_fail/T19415b.hs (a simpler version of T19415.hs)
* Misc: Due to the change in the equality direction some error messages now have flipped type mismatch errors
* Changes in Notes:
  - Note [Fundeps with instances, and equality orientation] supersedes Note [Fundeps with instances]
  - Added Note [Kind Equality Orientation] to visualize the kind flipping
  - Added Note [Decomposing Dependent TyCons and Processing Wanted Equalities]

- - - - -
646969d4 by Krzysztof Gogolewski at 2022-11-29T03:10:13-05:00
Change printing of sized literals to match the proposal

Literals in Core were printed as e.g. 0xFF#16 :: Int16#.
The proposal 451 now specifies syntax 0xFF#Int16.
This change affects the Core printer only - more to be done later.

Part of #21422.

- - - - -
02e282ec by Simon Peyton Jones at 2022-11-29T03:10:48-05:00
Be a bit more selective about floating bottoming expressions

This MR arranges to float a bottoming expression to the top
only if it escapes a value lambda.

See #22494 and Note [Floating to the top] in SetLevels.

This has a generally beneficial effect in nofib

+-------------------------------++----------+
|                               ||tsv (rel) |
+===============================++==========+
|           imaginary/paraffins ||   -0.93% |
|                imaginary/rfib ||   -0.05% |
|                      real/fem ||   -0.03% |
|                    real/fluid ||   -0.01% |
|                   real/fulsom ||   +0.05% |
|                   real/gamteb ||   -0.27% |
|                       real/gg ||   -0.10% |
|                   real/hidden ||   -0.01% |
|                      real/hpg ||   -0.03% |
|                      real/scs ||  -11.13% |
|         shootout/k-nucleotide ||   -0.01% |
|               shootout/n-body ||   -0.08% |
|   shootout/reverse-complement ||   -0.00% |
|        shootout/spectral-norm ||   -0.02% |
|             spectral/fibheaps ||   -0.20% |
|           spectral/hartel/fft ||   -1.04% |
|         spectral/hartel/solid ||   +0.33% |
|     spectral/hartel/wave4main ||   -0.35% |
|                 spectral/mate ||   +0.76% |
+===============================++==========+
|                     geom mean ||   -0.12% |

The effect on compile time is generally slightly beneficial

Metrics: compile_time/bytes allocated
----------------------------------------------
MultiLayerModulesTH_OneShot(normal)  +0.3%
                  PmSeriesG(normal)  -0.2%
                  PmSeriesT(normal)  -0.1%
                     T10421(normal)  -0.1%
                    T10421a(normal)  -0.1%
                     T10858(normal)  -0.1%
                     T11276(normal)  -0.1%
                    T11303b(normal)  -0.2%
                     T11545(normal)  -0.1%
                     T11822(normal)  -0.1%
                     T12150(optasm)  -0.1%
                     T12234(optasm)  -0.3%
                     T13035(normal)  -0.2%
                     T16190(normal)  -0.1%
                     T16875(normal)  -0.4%
                    T17836b(normal)  -0.2%
                     T17977(normal)  -0.2%
                    T17977b(normal)  -0.2%
                     T18140(normal)  -0.1%
                     T18282(normal)  -0.1%
                     T18304(normal)  -0.2%
                    T18698a(normal)  -0.1%
                     T18923(normal)  -0.1%
                     T20049(normal)  -0.1%
                    T21839r(normal)  -0.1%
                      T5837(normal)  -0.4%
                      T6048(optasm)  +3.2% BAD
                      T9198(normal)  -0.2%
                      T9630(normal)  -0.1%
       TcPlugin_RewritePerf(normal)  -0.4%
             hard_hole_fits(normal)  -0.1%

                          geo. mean  -0.0%
                          minimum    -0.4%
                          maximum    +3.2%

The T6048 outlier is hard to pin down, but it may be the effect of
reading in more interface file definitions. It's a small program for
which compile time is very short, so I'm not bothered about it.

Metric Increase:
    T6048

- - - - -
ab23dc5e by Ben Gamari at 2022-11-29T03:11:25-05:00
testsuite: Mark unpack_sums_6 as fragile due to #22504

This test is explicitly dependent upon runtime, which is generally not
appropriate given that the testsuite is run in parallel and generally
saturates the CPU.

- - - - -
def47dd3 by Ben Gamari at 2022-11-29T03:11:25-05:00
testsuite: Don't use grep -q in unpack_sums_7

`grep -q` closes stdin as soon as it finds the pattern it is looking
for, resulting in #22484.

- - - - -
3b7adbfc by Matthew Pickering at 2022-11-29T12:00:38+00:00
Wip: parr

- - - - -
ba177155 by Matthew Pickering at 2022-12-06T12:04:52+00:00
wip

- - - - -
8fedd354 by Matthew Pickering at 2022-12-06T12:04:52+00:00
Perf experiments

@simonpj I have pushed a branch which has access to four configurations. Could you please have a look and see if you think there is another approach which doesn't compromise performance for the serial case?

The test I ran was `T9233`.

| config | allocs |
| ------ | ------ |
| normal | 718,212,096 |
| fresh unique in subst_id_bndr | 814,082,368  |
| fresh uniques after floating (substExpr) | 838,098,280 |
| fresh uniques after floating (insert let) | 929,485,912 |

The two relevant places to look are in `subst_id_bndr` and `simplLazyBind` where there are calls to `uniqifyFloats_lazy` and `uniqifyFloats_strict`.

- - - - -


30 changed files:

- .gitlab-ci.yml
- .gitlab/ci.sh
- + .gitlab/hello.hs
- CODEOWNERS
- compiler/CodeGen.Platform.h
- compiler/GHC.hs
- compiler/GHC/Builtin/Names.hs
- compiler/GHC/Builtin/PrimOps.hs
- compiler/GHC/Builtin/Types.hs
- compiler/GHC/Builtin/Types.hs-boot
- compiler/GHC/Builtin/Types/Literals.hs
- compiler/GHC/Builtin/Types/Prim.hs
- − compiler/GHC/Builtin/Types/Prim.hs-boot
- compiler/GHC/Builtin/Uniques.hs
- compiler/GHC/Builtin/primops.txt.pp
- compiler/GHC/ByteCode/Types.hs
- compiler/GHC/Cmm.hs
- compiler/GHC/Cmm/CLabel.hs
- compiler/GHC/Cmm/CLabel.hs-boot
- compiler/GHC/Cmm/ContFlowOpt.hs
- compiler/GHC/Cmm/DebugBlock.hs
- compiler/GHC/Cmm/MachOp.hs
- compiler/GHC/Cmm/Node.hs
- compiler/GHC/Cmm/Parser.y
- compiler/GHC/Cmm/ProcPoint.hs
- + compiler/GHC/Cmm/Reducibility.hs
- compiler/GHC/Cmm/Reg.hs
- compiler/GHC/Cmm/Utils.hs
- compiler/GHC/CmmToAsm.hs
- compiler/GHC/CmmToAsm/AArch64.hs


The diff was not included because it is too large.


View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/bea1328fe8202fb87c6702da3e07cfb3e13195f7...8fedd354e6a34649f6504f2641a5856720ac4415


