[Git][ghc/ghc][wip/T24150] 250 commits: hadrian/bindist: Eliminate extraneous `dirname` invocation

Teo Camarasu (@teo) gitlab at gitlab.haskell.org
Wed May 15 09:27:16 UTC 2024



Teo Camarasu pushed to branch wip/T24150 at Glasgow Haskell Compiler / GHC


Commits:
60023351 by Ben Gamari at 2024-03-19T22:33:10-04:00
hadrian/bindist: Eliminate extraneous `dirname` invocation

Previously we would call `dirname` twice per installed library file.
We now instead reuse this result. This helps appreciably on Windows, where
processes are quite expensive.

- - - - -
616ac300 by Ben Gamari at 2024-03-19T22:33:10-04:00
hadrian: Package mingw toolchain in expected location

This fixes #24525, a regression due to 41cbaf44a6ab5eb9fa676d65d32df8377898dc89.
Specifically, GHC expects to find the mingw32 toolchain in the binary distribution
root. However, after this patch it was packaged in the `lib/` directory.

- - - - -
de9daade by Ben Gamari at 2024-03-19T22:33:11-04:00
gitlab/rel_eng: More upload.sh tweaks

- - - - -
1dfe12db by Ben Gamari at 2024-03-19T22:33:11-04:00
rel_eng: Drop dead prepare_docs codepath

- - - - -
dd2d748b by Ben Gamari at 2024-03-19T22:33:11-04:00
rel_env/recompress_all: unxz before recompressing

Previously we would rather compress the xz *again*, before in addition
compressing it with the desired scheme.

Fixes #24545.

- - - - -
9d936c57 by Ben Gamari at 2024-03-19T22:33:11-04:00
mk-ghcup-metadata: Fix directory of testsuite tarball

As reported in #24546, the `dlTest` artifact should be extracted into
the `testsuite` directory.

- - - - -
6d398066 by Ben Gamari at 2024-03-19T22:33:11-04:00
ghcup-metadata: Don't populate dlOutput unless necessary

ghcup can apparently infer the output name of an artifact from its URL.
Consequently, we should only include the `dlOutput` field when it would
differ from the filename of `dlUri`.

Fixes #24547.

- - - - -
576f8b7e by Zubin Duggal at 2024-03-19T22:33:46-04:00
Revert "Apply shellcheck suggestion to SUBST_TOOLDIR"

This reverts commit c82770f57977a2b5add6e1378f234f8dd6153392.

The shellcheck suggestion is spurious and results in SUBST_TOOLDIR being a
no-op. `set` sets positional arguments for bash, but we want to set the variable
given as the first autoconf argument.

Fixes #24542

Metric decreases because the paths in the settings file are now shorter,
so we allocate less when we read the settings file.

-------------------------
Metric Decrease:
    T12425
    T13035
    T9198
-------------------------

- - - - -
cdfe6e01 by Fendor at 2024-03-19T22:34:22-04:00
Compact serialisation of IfaceAppArgs

In #24563, we identified that IfaceAppArgs serialisation tags each
cons cell element with a discriminator byte. These bytes add up
quickly, blowing up interface files considerably when
'-fwrite-if-simplified-core' is enabled.

We compact the serialisation by writing out the length of
'IfaceAppArgs', followed by serialising the elements directly without
any discriminator byte.

This improvement can decrease the size of some interface files by up
to 35%.

- - - - -
97a2bb1c by Simon Peyton Jones at 2024-03-20T17:11:29+00:00
Expand untyped splices in tcPolyExprCheck

Fixes #24559

- - - - -
5f275176 by Alan Zimmerman at 2024-03-20T22:44:12-04:00
EPA: Clean up Exactprint helper functions a bit

- Introduce a helper lens to compose on `EpAnn a` vs `a` versions
- Rename some prime versions of functions back to non-prime
  They were renamed during the rework

- - - - -
da2a10ce by Vladislav Zavialov at 2024-03-20T22:44:48-04:00
Type operators in promoteOccName (#24570)

Type operators differ from term operators in that they are lexically
classified as (type) constructors, not as (type) variables.

Prior to this change, promoteOccName did not account for this
difference, causing a scoping issue that affected RequiredTypeArguments.

  type (!@#) = Bool
  f = idee (!@#)      -- Not in scope: ‘!@#’  (BUG)

Now we have a special case in promoteOccName to account for this.

- - - - -
247fc0fa by Preetham Gujjula at 2024-03-21T10:19:18-04:00
docs: Remove mention of non-existent Ord instance for Complex

The documentation for Data.Complex says that the Ord instance for Complex Float
is deficient, but there is no Ord instance for Complex a. The Eq instance for
Complex Float is similarly deficient, so we use that as an example instead.

- - - - -
6fafc51e by Andrei Borzenkov at 2024-03-21T10:19:54-04:00
Fix TH handling in `pat_to_type_pat` function (#24571)

There was missing case for `SplicePat` in `pat_to_type_at` function,
hence patterns with splicing that checked against `forall->` doesn't work
properly because they fall into the "illegal pattern" case.

Code example that is now accepted:

  g :: forall a -> ()
  g $([p| a |]) = ()

- - - - -
52072f8e by Sylvain Henry at 2024-03-21T21:01:59-04:00
Type-check default declarations before deriving clauses (#24566)

See added Note and #24566. Default declarations must be type-checked
before deriving clauses.

- - - - -
7dfdf3d9 by Sylvain Henry at 2024-03-21T21:02:40-04:00
Lexer: small perf changes

- Use unsafeChr because we know our values to be valid
- Remove some unnecessary use of `ord` (return Word8 values directly)

- - - - -
864922ef by Sylvain Henry at 2024-03-21T21:02:40-04:00
JS: fix some comments

- - - - -
3e0b2b1f by Sebastian Graf at 2024-03-21T21:03:16-04:00
Simplifier: Re-do dependency analysis in abstractFloats (#24551)

In #24551, we abstracted a string literal binding over a type variable,
triggering a CoreLint error when that binding floated to top-level.

The solution implemented in this patch fixes this by re-doing dependency
analysis on a simplified recursive let binding that is about to be type
abstracted, in order to find the minimal set of type variables to abstract over.
See wrinkle (AB5) of Note [Floating and type abstraction] for more details.

Fixes #24551

- - - - -
8a8ac65a by Matthew Craven at 2024-03-23T00:20:52-04:00
Improve toInteger @Word32 on 64-bit platforms

On 64-bit platforms, every Word32 fits in an Int, so we can
convert to Int# without having to perform the overflow check
integerFromWord# uses internally.

- - - - -
0c48f2b9 by Apoorv Ingle at 2024-03-23T00:21:28-04:00
Fix for #24552 (see testcase T24552)

Fixes for a bug in desugaring pattern synonyms matches, introduced
while working on  on expanding `do`-blocks in #18324

The `matchWrapper` unecessarily (and incorrectly) filtered out the
default wild patterns in a match. Now the wild pattern alternative is
simply ignored by the pm check as its origin is `Generated`.
The current code now matches the expected semantics according to the language spec.

- - - - -
b72705e9 by Simon Peyton Jones at 2024-03-23T00:22:04-04:00
Print more info about kinds in error messages

This fixes #24553, where GHC unhelpfully said

  error: [GHC-83865]
    • Expected kind ‘* -> * -> *’, but ‘Foo’ has kind ‘* -> * -> *’

See Note [Showing invisible bits of types in error messages]

- - - - -
8f7cfc7e by Tristan Cacqueray at 2024-03-23T00:22:44-04:00
docs: remove the don't use float hint

This hint is outdated, ``Complex Float`` are now specialised,
and the heap space suggestion needs more nuance so it should
be explained in the unboxed/storable array documentation.

- - - - -
5bd8ed53 by Andreas Klebinger at 2024-03-23T16:18:33-04:00
NCG: Fix a bug in jump shortcutting.

When checking if a jump has more than one destination account for the
possibility of some jumps not being representable by a BlockId.

We do so by having isJumpishInstr return a `Maybe BlockId` where Nothing
represents non-BlockId jump destinations.

Fixes #24507

- - - - -
8d67f247 by Ben Gamari at 2024-03-23T16:19:09-04:00
docs: Drop old release notes, add for 9.12.1

- - - - -
7db8c992 by Cheng Shao at 2024-03-25T13:45:46-04:00
rts: fix clang compilation on aarch64

This patch fixes function prototypes in ARMOutlineAtomicsSymbols.h
which causes "error: address argument to atomic operation must be a
pointer to _Atomic type" when compiling with clang on aarch64.

- - - - -
237194ce by Sylvain Henry at 2024-03-25T13:46:27-04:00
Lexer: fix imports for Alex 3.5.1 (#24583)

- - - - -
810660b7 by Cheng Shao at 2024-03-25T22:19:16-04:00
libffi-tarballs: bump libffi-tarballs submodule to libffi 3.4.6

This commit bumps the libffi-tarballs submodule to libffi 3.4.6, which
includes numerous upstream libffi fixes, especially
https://github.com/libffi/libffi/issues/760.

- - - - -
d2ba41e8 by Alan Zimmerman at 2024-03-25T22:19:51-04:00
EPA: do not duplicate comments in signature RHS

- - - - -
32a8103f by Rodrigo Mesquita at 2024-03-26T21:16:12-04:00
configure: Use LDFLAGS when trying linkers

A user may configure `LDFLAGS` but not `LD`. When choosing a linker, we
will prefer `ldd`, then `ld.gold`, then `ld.bfd` -- however, we have to
check for a working linker. If either of these fail, we try the next in
line.

However, we were not considering the `$LDFLAGS` when checking if these
linkers worked. So we would pick a linker that does not support the
current $LDFLAGS and fail further down the line when we used that linker
with those flags.

Fixes #24565, where `LDFLAGS=-Wl,-z,pack-relative-relocs` is not
supported by `ld.gold` but that was being picked still.

- - - - -
bf65a7c3 by Rodrigo Mesquita at 2024-03-26T21:16:48-04:00
bindist: Clean xattrs of bin and lib at configure time

For issue #21506, we started cleaning the extended attributes of
binaries and libraries from the bindist *after* they were installed to
workaround notarisation (#17418), as part of `make install`.

However, the `ghc-toolchain` binary that is now shipped with the bindist
must be run at `./configure` time. Since we only cleaned the xattributes
of the binaries and libs after they were installed, in some situations
users would be unable to run `ghc-toolchain` from the bindist, failing
at configure time (#24554).

In this commit we move the xattr cleaning logic to the configure script.

Fixes #24554

- - - - -
cfeb70d3 by Rodrigo Mesquita at 2024-03-26T21:17:24-04:00
Revert "NCG: Fix a bug in jump shortcutting."

This reverts commit 5bd8ed53dcefe10b72acb5729789e19ceb22df66.

Fixes #24586

- - - - -
13223f6d by Serge S. Gulin at 2024-03-27T07:28:51-04:00
JS: `h$rts_isProfiled` is removed from `profiling` and left its version at
`rts/js/config.js`

- - - - -
0acfe391 by Alan Zimmerman at 2024-03-27T07:29:27-04:00
EPA: Do not extend declaration range for trailine zero len semi

The lexer inserts virtual semicolons having zero width.
Do not use them to extend the list span of items in a list.

- - - - -
cd0fb82f by Alan Zimmerman at 2024-03-27T19:33:08+00:00
EPA: Fix FamDecl range

The span was incorrect if opt_datafam_kind_sig was empty

- - - - -
f8f384a8 by Ben Gamari at 2024-03-29T01:23:03-04:00
Fix type of _get_osfhandle foreign import

Fixes #24601.

- - - - -
00d3ecf0 by Alan Zimmerman at 2024-03-29T12:19:10+00:00
EPA: Extend StringLiteral range to include trailing commas

This goes slightly against the exact printing philosophy where
trailing decorations should be in an annotation, but the
practicalities of adding it to the WarningTxt environment, and the
problems caused by deviating do not make a more principles approach
worthwhile.

- - - - -
efab3649 by brandon s allbery kf8nh at 2024-03-31T20:04:01-04:00
clarify Note [Preproccesing invocations]

- - - - -
c8a4c050 by Ben Gamari at 2024-04-02T12:50:35-04:00
rts: Fix TSAN_ENABLED CPP guard

This should be `#if defined(TSAN_ENABLED)`, not `#if TSAN_ENABLED`,
lest we suffer warnings.

- - - - -
e91dad93 by Cheng Shao at 2024-04-02T12:50:35-04:00
rts: fix errors when compiling with TSAN

This commit fixes rts compilation errors when compiling with TSAN:

- xxx_FENCE macros are redefined and trigger CPP warnings.
- Use SIZEOF_W. WORD_SIZE_IN_BITS is provided by MachDeps.h which
  Cmm.h doesn't include by default.

- - - - -
a9ab9455 by Cheng Shao at 2024-04-02T12:50:35-04:00
rts: fix clang-specific errors when compiling with TSAN

This commit fixes clang-specific rts compilation errors when compiling
with TSAN:

- clang doesn't have -Wtsan flag
- Fix prototype of ghc_tsan_* helper functions
- __tsan_atomic_* functions aren't clang built-ins and
  sanitizer/tsan_interface_atomic.h needs to be included
- On macOS, TSAN runtime library is
  libclang_rt.tsan_osx_dynamic.dylib, not libtsan. -fsanitize-thread
  as a link-time flag will take care of linking the TSAN runtime
  library anyway so remove tsan as an rts extra library

- - - - -
865bd717 by Cheng Shao at 2024-04-02T12:50:35-04:00
compiler: fix github link to __tsan_memory_order in a comment

- - - - -
07cb627c by Cheng Shao at 2024-04-02T12:50:35-04:00
ci: improve TSAN CI jobs

- Run TSAN jobs with +thread_sanitizer_cmm which enables Cmm
  instrumentation as well.
- Run TSAN jobs in deb12 which ships gcc-12, a reasonably recent gcc
  that @bgamari confirms he's using in #GHC:matrix.org. Ideally we
  should be using latest clang release for latest improvements in
  sanitizers, though that's left as future work.
- Mark TSAN jobs as manual+allow_failure in validate pipelines. The
  purpose is to demonstrate that we have indeed at least fixed
  building of TSAN mode in CI without blocking the patch to land, and
  once merged other people can begin playing with TSAN using their own
  dev setups and feature branches.

- - - - -
a1c18c7b by Andrei Borzenkov at 2024-04-02T12:51:11-04:00
Merge tc_infer_hs_type and tc_hs_type into one function using ExpType philosophy (#24299, #23639)

This patch implements refactoring which is a prerequisite to
updating kind checking of type patterns. This is a huge simplification
of the main worker that checks kind of HsType.

It also fixes the issues caused by previous code duplication, e.g.
that we didn't add module finalizers from splices in inference mode.

- - - - -
817e8936 by Rodrigo Mesquita at 2024-04-02T20:13:05-04:00
th: Hide the Language.Haskell.TH.Lib.Internal module from haddock

Fixes #24562

- - - - -
b36ee57b by Sylvain Henry at 2024-04-02T20:13:46-04:00
JS: reenable h$appendToHsString optimization (#24495)

The optimization introducing h$appendToHsString wasn't kicking in
anymore (while it did in 9.8.1) because of the changes introduced in #23270 (7e0c8b3bab30).
This patch reenables the optimization by matching on case-expression, as
done in Cmm for unpackCString# standard thunks.

The test is also T24495 added in the next commits (two commits for ease
of backporting to 9.8).

- - - - -
527616e9 by Sylvain Henry at 2024-04-02T20:13:46-04:00
JS: fix h$appendToHsString implementation (#24495)

h$appendToHsString needs to wrap its argument in an updatable thunk
to behave like unpackAppendCString#. Otherwise if a SingleEntry thunk is
passed, it is stored as-is in a CONS cell, making the resulting list
impossible to deepseq (forcing the thunk doesn't update the contents of
the CONS cell)!

The added test checks that the optimization kicks in and that
h$appendToHsString works as intended.

Fix #24495

- - - - -
faa30b41 by Simon Peyton Jones at 2024-04-02T20:14:22-04:00
Deal with duplicate tyvars in type declarations

GHC was outright crashing before this fix: #24604

- - - - -
e0b0c717 by Simon Peyton Jones at 2024-04-02T20:14:58-04:00
Try using MCoercion in exprIsConApp_maybe

This is just a simple refactor that makes exprIsConApp_maybe
a little bit more direct, simple, and efficient.

Metrics: compile_time/bytes allocated
    geo. mean                                          -0.1%
    minimum                                            -2.0%
    maximum                                            -0.0%

Not a big gain, but worthwhile given that the code is, if anything,
easier to grok.

- - - - -
15f4d867 by Duncan Coutts at 2024-04-03T01:27:17-04:00
Initial ./configure support for selecting I/O managers

In this patch we just define new CPP vars, but don't yet use them
or replace the existing approach. That will follow.

The intention here is that every I/O manager can be enabled/disabled at
GHC build time (subject to some constraints). More than one I/O manager
can be enabled to be built. At least one I/O manager supporting the
non-threaded RTS must be enabled as well as at least one supporting the
non-threaded RTS. The I/O managers enabled here will become the choices
available at runtime at RTS startup (in later patches). The choice can
be made with RTS flags. There are separate sets of choices for the
threaded and non-threaded RTS ways, because most I/O managers are
specific to these ways. Furthermore we must establish a default I/O
manager for the threaded and non-threaded RTS.

Most I/O managers are platform-specific so there are checks to ensure
each one can be enabled on the platform. Such checks are also where (in
future) any system dependencies (e.g. libraries) can be checked.

The output is a set of CPP flags (in the mk/config.h file), with one
flag per named I/O manager:
* IOMGR_BUILD_<name>                : which ones should be built (some)
* IOMGR_DEFAULT_NON_THREADED_<name> : which one is default (exactly one)
* IOMGR_DEFAULT_THREADED_<name>     : which one is default (exactly one)

and a set of derived flags in IOManager.h

* IOMGR_ENABLED_<name>              : enabled for the current RTS way

Note that IOMGR_BUILD_<name> just says that an I/O manager will be
built for _some_ RTS way (i.e. threaded or non-threaded). The derived
flags IOMGR_ENABLED_<name> in IOManager.h say if each I/O manager is
enabled in the "current" RTS way. These are the ones that can be used
for conditional compilation of the I/O manager code.

Co-authored-by: Pi Delport <pi at well-typed.com>

- - - - -
85b0f87a by Duncan Coutts at 2024-04-03T01:27:17-04:00
Change the handling of the RTS flag --io-manager=

Now instead of it being just used on Windows to select between the WinIO
vs the MIO or Win32-legacy I/O managers, it is now used on all platforms
for selecting the I/O manager to use.

Right now it remains the case that there is only an actual choice on
Windows, but that will change later.

Document the --io-manager flag in the user guide.

This change is also reflected in the RTS flags types in the base
library. Deprecate the export of IoSubSystem from GHC.RTS.Flags with a
message to import it from GHC.IO.Subsystem.

The way the 'IoSubSystem' is detected also changes. Instead of looking
at the RTS flag, there is now a C bool global var in the RTS which gets
set on startup when the I/O manager is selected. This bool var says
whether the selected I/O manager classifies as "native" on Windows,
which in practice means the WinIO I/O manager has been selected.

Similarly, the is_io_mng_native_p RTS helper function is re-implemented
in terms of the selected I/O manager, rather than based on the RTS
flags.

We do however remove the ./configure --native-io-manager flag because
we're bringing the WinIO/MIO/Win32-legacy choice under the new general
scheme for selecting I/O managers, and that new scheme involves no
./configure time user choices, just runtime RTS flag choices.

- - - - -
1a8f020f by Duncan Coutts at 2024-04-03T01:27:17-04:00
Convert {init,stop,exit}IOManager to switch style

Rather than ad-hoc cpp conitionals on THREADED_RTS and mingw32_HOST_OS,
we use a style where we switch on the I/O manager impl, with cases for
each I/O manager impl.

- - - - -
a5bad3d2 by Duncan Coutts at 2024-04-03T01:27:17-04:00
Split up the CapIOManager content by I/O manager

Using the new IOMGR_ENABLED_<name> CPP defines.

- - - - -
1d36e609 by Duncan Coutts at 2024-04-03T01:27:17-04:00
Convert initIOManagerAfterFork and wakeupIOManager to switch style

- - - - -
c2f26f36 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Move most of waitRead#/Write# from cmm to C

Moves it into the IOManager.c where we can follow the new pattern of
switching on the selected I/O manager.

- - - - -
457705a8 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Move most of the delay# impl from cmm to C

Moves it into the IOManager.c where we can follow the new pattern of
switching on the selected I/O manager.

Uses a new IOManager API: syncDelay, following the naming convention of
sync* for thread-synchronous I/O & timer/delay operations.

As part of porting from cmm to C, we maintain the rule that the
why_blocked gets accessed using load acquire and store release atomic
memory operations. There was one exception to this rule: in the delay#
primop cmm code on posix (not win32), the why_blocked was being updated
using a store relaxed, not a store release. I've no idea why. In this
convesion I'm playing it safe here and using store release consistently.

- - - - -
e93058e0 by Duncan Coutts at 2024-04-03T01:27:18-04:00
insertIntoSleepingQueue is no longer public

No longer defined in IOManager.h, just a private function in
IOManager.c. Since it is no longer called from cmm code, just from
syncDelay. It ought to get moved further into the select() I/O manager
impl, rather than living in IOManager.c.

On the other hand appendToIOBlockedQueue is still called from cmm code
in the win32-legacy I/O manager primops async{Read,Write}#, and it is
also used by the select() I/O manager. Update the CPP and comments to
reflect this.

- - - - -
60ce9910 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Move anyPendingTimeoutsOrIO impl from .h to .c

The implementation is eventually going to need to use more private
things, which will drag in unwanted includes into IOManager.h, so it's
better to move the impl out of the header file and into the .c file, at
the slight cost of it no longer being inline.

At the same time, change to the "switch (iomgr_type)" style.

- - - - -
f70b8108 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Take a simpler approach to gcc warnings in IOManager.c

We have lots of functions with conditional implementations for
different I/O managers. Some functions, for some I/O managers,
naturally have implementations that do nothing or barf. When only one
such I/O manager is enabled then the whole function implementation will
have an implementation that does nothing or barfs. This then results in
warnings from gcc that parameters are unused, or that the function
should be marked with attribute noreturn (since barf does not return).
The USED_IF_THREADS trick for fine-grained warning supression is fine
for just two cases, but an equivalent here would need
USED_IF_THE_ONLY_ENABLED_IOMGR_IS_X_OR_Y which would have combinitorial
blowup. So we take a coarse grained approach and simply disable these
two warnings for the whole file.

So we use a GCC pragma, with its handy push/pop support:

 #pragma GCC diagnostic push
 #pragma GCC diagnostic ignored "-Wsuggest-attribute=noreturn"
 #pragma GCC diagnostic ignored "-Wunused-parameter"

...

 #pragma GCC diagnostic pop

- - - - -
b48805b9 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Add a new trace class for the iomanager

It makes sense now for it to be separate from the scheduler class of
tracers.

Enabled with +RTS -Do. Document the -Do debug flag in the user guide.

- - - - -
f0c1f862 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Have the throwTo impl go via (new) IOManager APIs

rather than directly operating on the IO manager's data structures.

Specifically, when thowing an async exception to a thread that is
blocked waiting for I/O or waiting for a timer, then we want to cancel
that I/O waiting or cancel the timer. Currently this is done directly in
removeFromQueues() in RaiseAsync.c. We want it to go via proper APIs
both for modularity but also to let us support multiple I/O managers.

So add sync{IO,Delay}Cancel, which is the cancellation for the
corresponding sync{IO,Delay}. The implementations of these use the usual
"switch (iomgr_type)" style.

- - - - -
4f9e9c4e by Duncan Coutts at 2024-04-03T01:27:18-04:00
Move awaitEvent into a proper IOManager API

and have the scheduler use it.

Previously the scheduler calls awaitEvent directly, and awaitEvent is
implemented directly in the RTS I/O managers (select, win32). This
relies on the old scheme where there's a single active I/O manager for
each platform and RTS way.

We want to move that to go via an API in IOManager.{h,c} which can then
call out to the active I/O manager.

Also take the opportunity to split awaitEvent into two. The existing
awaitEvent has a bool wait parameter, to say if the call should be
blocking or non-blocking. We split this into two separate functions:
pollCompletedTimeoutsOrIO and awaitCompletedTimeoutsOrIO. We split them
for a few reasons: they have different post-conditions (specifically the
await version is supposed to guarantee that there are threads runnable
when it completes). Secondly, it is also anticipated that in future I/O
managers the implementations of the two cases will be simpler if they
are separated.

- - - - -
5ad4b30f by Duncan Coutts at 2024-04-03T01:27:18-04:00
Rename awaitEvent in select and win32 I/O managers

These are now just called from IOManager.c and are the per-I/O manager
backend impls (whereas previously awaitEvent was the entry point).

Follow the new naming convention in the IOManager.{h,c} of
awaitCompletedTimeoutsOrIO, with the I/O manager's name as a suffix:
so awaitCompletedTimeoutsOrIO{Select,Win32}.

- - - - -
d30c6bc6 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Tidy up a couple things in Select.{h,c}

Use the standard #include {Begin,End}Private.h style rather than
RTS_PRIVATE on individual decls.

And conditionally build the code for the select I/O manager based on
the new CPP IOMGR_ENABLED_SELECT rather than on THREADED_RTS.

- - - - -
4161f516 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Add an IOManager API for scavenging TSO blocked_info

When the GC scavenges a TSO it needs to scavenge the tso->blocked_info
but the blocked_info is a big union and what lives there depends on the
two->why_blocked, which for I/O-related reasons is something that in
principle is the responsibility of the I/O manager and not the GC. So
the right thing to do is for the GC to ask the I/O manager to sscavenge
the blocked_info if it encounters any I/O-related why_blocked reasons.

So we add scavengeTSOIOManager in IOManager.{h,c} with the usual style.

Now as it happens, right now, there is no special scavenging to do, so
the implementation of scavengeTSOIOManager is a fancy no-op. That's
because the select I/O manager uses only the fd and target members,
which are not GC pointers, and the win32-legacy I/O manager _ought_ to
be using GC-managed heap objects for the StgAsyncIOResult but it is
actually usingthe C heap, so again no GC pointers. If the win32-legacy
were doing this more sensibly, then scavengeTSOIOManager would be the
right place to do the GC magic.

Future I/O managers will need GC heap objects in the tso->blocked_info
and will make use of this functionality.

- - - - -
94a87d21 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Add I/O manager API notifyIOManagerCapabilitiesChanged

Used in setNumCapabilities.

It only does anything for MIO on Posix.

Previously it always invoked Haskell code, but that code only did
anything on non-Windows (and non-JS), and only threaded. That currently
effectively means the MIO I/O manager on Posix.

So now it only invokes it for the MIO Posix case.

- - - - -
3be6d591 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Select an I/O manager early in RTS startup

We need to select the I/O manager to use during startup before the
per-cap I/O manager initialisation.

- - - - -
aaa294d0 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Make struct CapIOManager be fully opaque

Provide an opaque (forward) definition in Capability.h (since the cap
contains a *CapIOManager) and then only provide a full definition in
a new file IOManagerInternals.h. This new file is only supposed to be
included by the IOManager implementation, not by its users. So that
means IOManager.c and individual I/O manager implementations.

The posix/Signals.c still needs direct access, but that should be
eliminated. Anything that needs direct access either needs to be clearly
part of an I/O manager (e.g. the sleect() one) or go via a proper API.

- - - - -
877a2a80 by Duncan Coutts at 2024-04-03T01:27:18-04:00
The select() I/O manager does have some global initialisation

It's just to make sure an exception CAF is a GC root.

- - - - -
9c51473b by Duncan Coutts at 2024-04-03T01:27:18-04:00
Add tracing for the main I/O manager actions

Using the new tracer class.

Note: The unconditional definition of showIOManager should be
compatible with the debugTrace change in 7c7d1f6.

Co-authored-by: Pi Delport <pi at well-typed.com>

- - - - -
c7d3e3a3 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Include the default I/O manager in the +RTS --info output

Document the extra +RTS --info output in the user guide

- - - - -
8023bad4 by Duncan Coutts at 2024-04-03T01:27:18-04:00
waitRead# / waitWrite# do not work for win32-legacy I/O manager

Previously it was unclear that they did not work because the code path
was shared with other I/O managers (in particular select()).

Following the code carefully shows that what actually happens is that
the calling thread would block forever: the thread will be put into the
blocked queue, but no other action is scheduled that will ever result in
it getting unblocked.

It's better to just fail loudly in case anyone accidentally calls it,
also it's less confusing code.

- - - - -
83a74d20 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Conditionally ignore some GCC warnings

Some GCC versions don't know about some warnings, and they complain
that we're ignoring unknown warnings. So we try to ignore the warning
based on the GCC version.

- - - - -
1adc6fa4 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Accept changes to base-exports

All the changes are in fact not changes at all.

Previously, the IoSubSystem data type was defined in GHC.RTS.Flags and
exported from both GHC.RTS.Flags and GHC.IO.SubSystem. Now, the data
type is defined in GHC.IO.SubSystem and still exported from both
modules.

Therefore, the same exports and same instances are still available from
both modules. But the base-exports records only the defining module, and
so it looks like a change when it is fully compatible.

Related: we do add a deprecation to the export of the type via
GHC.RTS.Flags, telling people to use the export from GHC.IO.SubSystem.

Also the sort order for some unrelated Show instances changed. No idea
why.

The same changes apply in the other versions, with a few more changes
due to sort order weirdness.

- - - - -
8d950968 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Accept metric decrease in T12227

I can't think of any good reason that anything in this MR should have
changed the number of allocations, up or down.

(Yes this is an empty commit.)

Metric Decrease:
    T12227

- - - - -
e869605e by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Several improvements to the handling of coercions

* Make `mkSymCo` and `mkInstCo` smarter
  Fixes #23642

* Fix return role of `SelCo` in the coercion optimiser.
  Fixes #23617

* Make the coercion optimiser `opt_trans_rule` work better for newtypes
  Fixes #23619

- - - - -
1efd0714 by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
FloatOut: improve floating for join point

See the new Note [Floating join point bindings].

* Completely get rid of the complicated join_ceiling nonsense, which
  I have never understood.

* Do not float join points at all, except perhaps to top level.

* Some refactoring around wantToFloat, to treat Rec and NonRec more
  uniformly

- - - - -
9c00154d by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Improve eta-expansion through call stacks

See Note [Eta expanding through CallStacks] in GHC.Core.Opt.Arity

This is a one-line change, that fixes an inconsistency
-               || isCallStackPredTy ty
+               || isCallStackPredTy ty || isCallStackTy ty

- - - - -
95a9a172 by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Spelling, layout, pretty-printing only

- - - - -
bdf1660f by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Improve exprIsConApp_maybe a little

Eliminate a redundant case at birth.  This sometimes reduces
Simplifier iterations.

See Note [Case elim in exprIsConApp_maybe].

- - - - -
609cd32c by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Inline GHC.HsToCore.Pmc.Solver.Types.trvVarInfo

When exploring compile-time regressions after meddling with the Simplifier, I
discovered that GHC.HsToCore.Pmc.Solver.Types.trvVarInfo was very delicately
balanced.  It's a small, heavily used, overloaded function and it's important
that it inlines. By a fluke it was before, but at various times in my journey it
stopped doing so.  So I just added an INLINE pragma to it; no sense in depending
on a delicately-balanced fluke.

- - - - -
ae24c9bc by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Slight improvement in WorkWrap

Ensure that WorkWrap preserves lambda binders, in case of join points.  Sadly I
have forgotten why I made this change (it was while I was doing a lot of
meddling in the Simplifier, but
  * it does no harm,
  * it is slightly more efficient, and
  * presumably it made something better!

Anyway I have kept it in a separate commit.

- - - - -
e9297181 by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Use named record fields for the CastIt { ... } data constructor

This is a pure refactor

- - - - -
b4581e23 by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Remove a long-commented-out line

Pure refactoring

- - - - -
e026bdf2 by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Simplifier improvements

This MR started as: allow the simplifer to do more in one pass,
arising from places I could see the simplifier taking two iterations
where one would do.  But it turned into a larger project, because
these changes unexpectedly made inlining blow up, especially join
points in deeply-nested cases.

The main changes are below.  There are also many new or rewritten Notes.

Avoiding simplifying repeatedly
~~~~~~~~~~~~~~~
See Note [Avoiding simplifying repeatedly]

* The SimplEnv now has a seInlineDepth field, which says how deep
  in unfoldings we are.  See Note [Inline depth] in Simplify.Env.
  Currently used only for the next point: avoiding repeatedly
  simplifying coercions.

* Avoid repeatedly simplifying coercions.
  see Note [Avoid re-simplifying coercions] in Simplify.Iteration
  As you'll see from the Note, this makes use of the seInlineDepth.

* Allow Simplify.Iteration.simplAuxBind to inline used-once things.
  This is another part of Note [Post-inline for single-use things], and
  is really good for reducing simplifier iterations in situations like
      case K e of { K x -> blah }
  wher x is used once in blah.

* Make GHC.Core.SimpleOpt.exprIsConApp_maybe do some simple case
  elimination.  Note [Case elim in exprIsConApp_maybe]

* Improve the case-merge transformation:
  - Move the main code to `GHC.Core.Utils.mergeCaseAlts`, to join `filterAlts`
    and friends.  See Note [Merge Nested Cases] in GHC.Core.Utils.
  - Add a new case for `tagToEnum#`; see wrinkle (MC3).
  - Add a new case to look through join points: see wrinkle (MC4)

postInlineUnconditionally
~~~~~~~~~~~~~~~~~~~~~~~~~
* Allow Simplify.Utils.postInlineUnconditionally to inline variables
  that are used exactly once. See Note [Post-inline for single-use things].

* Do not postInlineUnconditionally join point, ever.
  Doing so does not reduce allocation, which is the main point,
  and with join points that are used a lot it can bloat code.
  See point (1) of Note [Duplicating join points] in
  GHC.Core.Opt.Simplify.Iteration.

* Do not postInlineUnconditionally a strict (demanded) binding.
  It will not allocate a thunk (it'll turn into a case instead)
  so again the main point of inlining it doesn't hold.  Better
  to check per-call-site.

* Improve occurrence analyis for bottoming function calls, to help
  postInlineUnconditionally.  See Note [Bottoming function calls]
  in GHC.Core.Opt.OccurAnal

Inlining generally
~~~~~~~~~~~~~~~~~~
* In GHC.Core.Opt.Simplify.Utils.interestingCallContext,
  use RhsCtxt NonRecursive (not BoringCtxt) for a plain-seq case.
  See Note [Seq is boring]  Also, wrinkle (SB1), inline in that
  `seq` context only for INLINE functions (UnfWhen guidance).

* In GHC.Core.Opt.Simplify.Utils.interestingArg,
  - return ValueArg for OtherCon [c1,c2, ...], but
  - return NonTrivArg for OtherCon []
  This makes a function a little less likely to inline if all we
  know is that the argument is evaluated, but nothing else.

* isConLikeUnfolding is no longer true for OtherCon {}.
  This propagates to exprIsConLike.  Con-like-ness has /positive/
  information.

Join points
~~~~~~~~~~~
* Be very careful about inlining join points.
  See these two long Notes
    Note [Duplicating join points] in GHC.Core.Opt.Simplify.Iteration
    Note [Inlining join points] in GHC.Core.Opt.Simplify.Inline

* When making join points, don't do so if the join point is so small
  it will immediately be inlined; check uncondInlineJoin.

* In GHC.Core.Opt.Simplify.Inline.tryUnfolding, improve the inlining
  heuristics for join points. In general we /do not/ want to inline
  join points /even if they are small/.  See Note [Duplicating join points]
  GHC.Core.Opt.Simplify.Iteration.

  But sometimes we do: see Note [Inlining join points] in
  GHC.Core.Opt.Simplify.Inline; and the new `isBetterUnfoldingThan` function.

* Do not add an unfolding to a join point at birth.  This is a tricky one
  and has a long Note [Do not add unfoldings to join points at birth]
  It shows up in two places
  - In `mkDupableAlt` do not add an inlining
  - (trickier) In `simplLetUnfolding` don't add an unfolding for a
    fresh join point
  I am not fully satisifed with this, but it works and is well documented.

* In GHC.Core.Unfold.sizeExpr, make jumps small, so that we don't penalise
  having a non-inlined join point.

Performance changes
~~~~~~~~~~~~~~~~~~~
* Binary sizes fall by around 2.6%, according to nofib.

* Compile times improve slightly. Here are the figures over 1%.

  I investiate the biggest differnce in T18304. It's a very small module, just
  a few hundred nodes. The large percentage difffence is due to a single
  function that didn't quite inline before, and does now, making code size a
  bit bigger.  I decided gains outweighed the losses.

    Metrics: compile_time/bytes allocated (changes over +/- 1%)
    ------------------------------------------------
           CoOpt_Singletons(normal)   -9.2% GOOD
                LargeRecord(normal)  -23.5% GOOD
MultiComponentModulesRecomp(normal)   +1.2%
MultiLayerModulesTH_OneShot(normal)   +4.1%  BAD
                  PmSeriesS(normal)   -3.8%
                  PmSeriesV(normal)   -1.5%
                     T11195(normal)   -1.3%
                     T12227(normal)  -20.4% GOOD
                     T12545(normal)   -3.2%
                     T12707(normal)   -2.1% GOOD
                     T13253(normal)   -1.2%
                 T13253-spj(normal)   +8.1%  BAD
                     T13386(normal)   -3.1% GOOD
                     T14766(normal)   -2.6% GOOD
                     T15164(normal)   -1.4%
                     T15304(normal)   +1.2%
                     T15630(normal)   -8.2%
                    T15630a(normal)          NEW
                     T15703(normal)  -14.7% GOOD
                     T16577(normal)   -2.3% GOOD
                     T17516(normal)  -39.7% GOOD
                     T18140(normal)   +1.2%
                     T18223(normal)  -17.1% GOOD
                     T18282(normal)   -5.0% GOOD
                     T18304(normal)  +10.8%  BAD
                     T18923(normal)   -2.9% GOOD
                      T1969(normal)   +1.0%
                     T19695(normal)   -1.5%
                     T20049(normal)  -12.7% GOOD
                    T21839c(normal)   -4.1% GOOD
                      T3064(normal)   -1.5%
                      T3294(normal)   +1.2%  BAD
                      T4801(normal)   +1.2%
                      T5030(normal)  -15.2% GOOD
                   T5321Fun(normal)   -2.2% GOOD
                      T6048(optasm)  -16.8% GOOD
                       T783(normal)   -1.2%
                      T8095(normal)   -6.0% GOOD
                      T9630(normal)   -4.7% GOOD
                      T9961(normal)   +1.9%  BAD
                      WWRec(normal)   -1.4%
        info_table_map_perf(normal)   -1.3%
                 parsing001(normal)   +1.5%

                          geo. mean   -2.0%
                          minimum    -39.7%
                          maximum    +10.8%

* Runtimes generally improve. In the testsuite perf/should_run gives:
   Metrics: runtime/bytes allocated
   ------------------------------------------
             Conversions(normal)   -0.3%
                 T13536a(optasm)  -41.7% GOOD
                   T4830(normal)   -0.1%
           haddock.Cabal(normal)   -0.1%
            haddock.base(normal)   -0.1%
        haddock.compiler(normal)   -0.1%

                       geo. mean   -0.8%
                       minimum    -41.7%
                       maximum     +0.0%

* For runtime, nofib is a better test.  The news is mostly good.
  Here are the number more than +/- 0.1%:

    # bytes allocated
    ==========================++==========
       imaginary/digits-of-e1 ||  -14.40%
       imaginary/digits-of-e2 ||   -4.41%
          imaginary/paraffins ||   -0.17%
               imaginary/rfib ||   -0.15%
       imaginary/wheel-sieve2 ||   -0.10%
                real/compress ||   -0.47%
                   real/fluid ||   -0.10%
                  real/fulsom ||   +0.14%
                  real/gamteb ||   -1.47%
                      real/gg ||   -0.20%
                   real/infer ||   +0.24%
                     real/pic ||   -0.23%
                  real/prolog ||   -0.36%
                     real/scs ||   -0.46%
                 real/smallpt ||   +4.03%
        shootout/k-nucleotide ||  -20.23%
              shootout/n-body ||   -0.42%
       shootout/spectral-norm ||   -0.13%
              spectral/boyer2 ||   -3.80%
         spectral/constraints ||   -0.27%
          spectral/hartel/ida ||   -0.82%
                spectral/mate ||  -20.34%
                spectral/para ||   +0.46%
             spectral/rewrite ||   +1.30%
              spectral/sphere ||   -0.14%
    ==========================++==========
                    geom mean ||   -0.59%

    real/smallpt has a huge nest of local definitions, and I
    could not pin down a reason for a regression.  But there are
    three big wins!

Metric Decrease:
    CoOpt_Singletons
    LargeRecord
    T12227
    T12707
    T13386
    T13536a
    T14766
    T15703
    T16577
    T17516
    T18223
    T18282
    T18923
    T21839c
    T20049
    T5321Fun
    T5030
    T6048
    T8095
    T9630
    T783
Metric Increase:
    MultiLayerModulesTH_OneShot
    T13253-spj
    T18304
    T18698a
    T9961
    T3294

- - - - -
27db3c5e by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Testsuite message changes from simplifier improvements

- - - - -
271a7812 by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Account for bottoming functions in OccurAnal

This fixes #24582, a small but long-standing bug

- - - - -
0fde229f by Ben Gamari at 2024-04-04T07:04:58-04:00
testsuite: Introduce template-haskell-exports test

- - - - -
0c4a9686 by Luite Stegeman at 2024-04-04T07:05:39-04:00
Update correct counter in bumpTickyAllocd

- - - - -
5f085d3a by Fendor at 2024-04-04T14:47:33-04:00
Replace `SizedSeq` with `FlatBag` for flattened structure

LinkedLists are notoriously memory ineffiecient when all we do is
traversing a structure.
As 'UnlinkedBCO' has been identified as a data structure that impacts
the overall memory usage of GHCi sessions, we avoid linked lists and
prefer flattened structure for storing.

We introduce a new memory efficient representation of sequential
elements that has special support for the cases:

* Empty
* Singleton
* Tuple Elements

This improves sharing in the 'Empty' case and avoids the overhead of
'Array' until its constant overhead is justified.

- - - - -
82cfe10c by Fendor at 2024-04-04T14:47:33-04:00
Compact FlatBag array representation

`Array` contains three additional `Word`'s we do not need in `FlatBag`. Move
`FlatBag` to `SmallArray`.

Expand the API of SmallArray by `sizeofSmallArray` and add common
traversal functions, such as `mapSmallArray` and `foldMapSmallArray`.
Additionally, allow users to force the elements of a `SmallArray`
via `rnfSmallArray`.

- - - - -
36a75b80 by Andrei Borzenkov at 2024-04-04T14:48:10-04:00
Change how invisible patterns represented in  haskell syntax and TH AST (#24557)

Before this patch:
  data ArgPat p
    = InvisPat (LHsType p)
    | VisPat (LPat p)

With this patch:
  data Pat p
    = ...
    | InvisPat (LHsType p)
    ...

And the same transformation in the TH land. The rest of the
changes is just updating code to handle new AST and writing tests
to check if it is possible to create invalid states using TH.

Metric Increase:
    MultiLayerModulesTH_OneShot

- - - - -
28009fbc by Matthew Pickering at 2024-04-04T14:48:46-04:00
Fix off by one error in seekBinNoExpand and seekBin

- - - - -
9b9e031b by Ben Gamari at 2024-04-04T21:30:08-04:00
compiler: Allow more types in GHCForeignImportPrim

For many, many years `GHCForeignImportPrim` has suffered from the rather
restrictive limitation of not allowing any non-trivial types in arguments
or results. This limitation was justified by the code generator allegely
barfing in the presence of such types.

However, this restriction appears to originate well before the NCG
rewrite and the new NCG does not appear to have any trouble with such
types (see the added `T24598` test). Lift this restriction.

Fixes #24598.

- - - - -
1324b862 by Alan Zimmerman at 2024-04-04T21:30:44-04:00
EPA: Use EpaLocation not SrcSpan in ForeignDecls

This allows us to update them for makeDeltaAst in ghc-exactprint

- - - - -
19883a23 by Alan Zimmerman at 2024-04-05T16:58:17-04:00
EPA: Use EpaLocation for RecFieldsDotDot

So we can update it to a delta position in makeDeltaAst if needed.

- - - - -
e8724327 by Matthew Pickering at 2024-04-05T16:58:53-04:00
Remove accidentally committed test.hs

- - - - -
88cb3e10 by Fendor at 2024-04-08T09:03:34-04:00
Avoid UArray when indexing is not required

`UnlinkedBCO`'s can occur many times in the heap. Each `UnlinkedBCO`
references two `UArray`'s but never indexes them. They are only needed
to encode the elements into a `ByteArray#`. The three words for
the lower bound, upper bound and number of elements are essentially
unused, thus we replace `UArray` with a wrapper around `ByteArray#`.
This saves us up to three words for each `UnlinkedBCO`.

Further, to avoid re-allocating these words for `ResolvedBCO`, we repeat
the procedure for `ResolvedBCO` and add custom `Binary` and `Show` instances.

For example, agda's repl session has around 360_000 UnlinkedBCO's,
so avoiding these three words is already saving us around 8MB residency.

- - - - -
f2cc1107 by Fendor at 2024-04-08T09:04:11-04:00
Never UNPACK `FastMutInt` for counting z-encoded `FastString`s

In `FastStringTable`, we count the number of z-encoded FastStrings
that exist in a GHC session.
We used to UNPACK the counters to not waste memory, but live retainer
analysis showed that we allocate a lot of `FastMutInt`s, retained by
`mkFastZString`.

We lazily compute the `FastZString`, only incrementing the counter when the `FastZString` is
forced.
The function `mkFastStringWith` calls `mkZFastString` and boxes the
`FastMutInt`, leading to the following core:

    mkFastStringWith
      = \ mk_fs _  ->
             = case stringTable of
                { FastStringTable _ n_zencs segments# _ ->
                    ...
                         case ((mk_fs (I# ...) (FastMutInt n_zencs))
                            `cast` <Co:2> :: ...)
                            ...

Marking this field as `NOUNPACK` avoids this reboxing, eliminating the
allocation of a fresh `FastMutInt` on every `FastString` allocation.

- - - - -
c6def949 by Matthew Pickering at 2024-04-08T16:06:51-04:00
Force in_multi to avoid retaining entire hsc_env

- - - - -
fbb91a63 by Fendor at 2024-04-08T16:06:51-04:00
Eliminate name thunk in declaration fingerprinting

Thunk analysis showed that we have about 100_000 thunks (in agda and
`-fwrite-simplified-core`) pointing to the name of the name decl.
Forcing this thunk fixes this issue.

The thunk created here is retained by the thunk created by forkM, it is
better to eagerly force this because the result (a `Name`) is already
retained indirectly via the `IfaceDecl`.

- - - - -
3b7b0c1c by Alan Zimmerman at 2024-04-08T16:07:27-04:00
EPA: Use EpaLocation in WarningTxt

This allows us to use an EpDelta if needed when using makeDeltaAst.

- - - - -
12b997df by Alan Zimmerman at 2024-04-08T16:07:27-04:00
EPA: Move DeltaPos and EpaLocation' into GHC.Types.SrcLoc

This allows us to use a NoCommentsLocation for the possibly trailing
comma location in a StringLiteral.
This in turn allows us to correctly roundtrip via makeDeltaAst.

- - - - -
868c8a78 by Fendor at 2024-04-09T08:51:50-04:00
Prefer packed representation for CompiledByteCode

As there are many 'CompiledByteCode' objects alive during a GHCi
session, representing its element in a more packed manner improves space
behaviour at a minimal cost.

When running GHCi on the agda codebase, we find around 380 live
'CompiledByteCode' objects. Packing their respective 'UnlinkedByteCode'
can save quite some pointers.

- - - - -
be3bddde by Alan Zimmerman at 2024-04-09T08:52:26-04:00
EPA: Capture all comments in a ClassDecl

Hopefully the final fix needed for #24533

- - - - -
3d0806fc by Jade at 2024-04-10T05:39:53-04:00
Validate -main-is flag using parseIdentifier

Fixes #24368

- - - - -
dd530bb7 by Rodrigo Mesquita at 2024-04-10T05:40:29-04:00
rts: free error message before returning

Fixes a memory leak in rts/linker/PEi386.c

- - - - -
e008a19a by Alexis King at 2024-04-10T05:40:29-04:00
linker: Avoid linear search when looking up Haskell symbols via dlsym

See the primary Note [Looking up symbols in the relevant objects] for a
more in-depth explanation.

When dynamically loading a Haskell symbol (typical when running a splice or
GHCi expression), before this commit we would search for the symbol in
all dynamic libraries that were loaded. However, this could be very
inefficient when too many packages are loaded (which can happen if there are
many package dependencies) because the time to lookup the would be
linear in the number of packages loaded.

This commit drastically improves symbol loading performance by
introducing a mapping from units to the handles of corresponding loaded
dlls. These handles are returned by dlopen when we load a dll, and can
then be used to look up in a specific dynamic library.

Looking up a given Name is now much more precise because we can get
lookup its unit in the mapping and lookup the symbol solely in the
handles of the dynamic libraries loaded for that unit.

In one measurement, the wait time before the expression was executed
went from +-38 seconds down to +-2s.

This commit also includes Note [Symbols may not be found in pkgs_loaded],
explaining the fallback to the old behaviour in case no dll can be found
in the unit mapping for a given Name.

Fixes #23415

Co-authored-by: Rodrigo Mesquita (@alt-romes)

- - - - -
dcfaa190 by Rodrigo Mesquita at 2024-04-10T05:40:29-04:00
rts: Make addDLL a wrapper around loadNativeObj

Rewrite the implementation of `addDLL` as a wrapper around the more
principled `loadNativeObj` rts linker function. The latter should be
preferred while the former is preserved for backwards compatibility.

`loadNativeObj` was previously only available on ELF platforms, so this
commit further refactors the rts linker to transform loadNativeObj_ELF
into loadNativeObj_POSIX, which is available in ELF and MachO platforms.

The refactor made it possible to remove the `dl_mutex` mutex in favour
of always using `linker_mutex` (rather than a combination of both).

Lastly, we implement `loadNativeObj` for Windows too.

- - - - -
12931698 by Rodrigo Mesquita at 2024-04-10T05:40:29-04:00
Use symbol cache in internal interpreter too

This commit makes the symbol cache that was used by the external
interpreter available for the internal interpreter too.

This follows from the analysis in #23415 that suggests the internal
interpreter could benefit from this cache too, and that there is no good
reason not to have the cache for it too. It also makes it a bit more
uniform to have the symbol cache range over both the internal and
external interpreter.

This commit also refactors the cache into a function which is used by
both `lookupSymbol` and also by `lookupSymbolInDLL`, extending the
caching logic to `lookupSymbolInDLL` too.

- - - - -
dccd3ea1 by Ben Gamari at 2024-04-10T05:40:29-04:00
testsuite: Add test for lookupSymbolInNativeObj

- - - - -
1b1a92bd by Alan Zimmerman at 2024-04-10T05:41:05-04:00
EPA: Remove unnecessary XRec in CompleteMatchSig

The XRec for [LIdP pass] is not needed for exact printing, remove it.

- - - - -
6e18ce2b by Ben Gamari at 2024-04-12T08:16:09-04:00
users-guide: Clarify language extension documentation

Over the years the users guide's language extension documentation has
gone through quite a few refactorings. In the process some of the
descriptions have been rendered non-sensical. For instance, the
description of `NoImplicitPrelude` actually describes the semantics of
`ImplicitPrelude`.

To fix this we:

 * ensure that all extensions are named in their "positive" sense (e.g.
   `ImplicitPrelude` rather than `NoImplicitPrelude`).
 * rework the documentation to avoid flag-oriented wording
   like "enable" and "disable"
 * ensure that the polarity of the documentation is consistent with
   reality.

Fixes #23895.

- - - - -
a933aff3 by Zubin Duggal at 2024-04-12T08:16:45-04:00
driver: Make `checkHomeUnitsClosed` faster

The implementation of `checkHomeUnitsClosed` was traversing every single path
in the unit dependency graph - this grows exponentially and quickly grows to be
infeasible on larger unit dependency graphs.

Instead we replace this with a faster implementation which follows from the
specificiation of the closure property - there is a closure error if there are
units which are both are both (transitively) depended upon by home units and
(transitively) depend on home units, but are not themselves home units.

To compute the set of units required for closure, we first compute the closure
of the unit dependency graph, then the transpose of this closure, and find all
units that are reachable from the home units in the transpose of the closure.

- - - - -
23c3e624 by Andreas Klebinger at 2024-04-12T08:17:21-04:00
RTS: Emit warning when -M < -H

Fixes #24487

- - - - -
d23afb8c by Ben Gamari at 2024-04-12T08:17:56-04:00
testsuite: Add broken test for CApiFFI with -fprefer-bytecode

See #24634.

- - - - -
a4bb3a51 by Ben Gamari at 2024-04-12T08:18:32-04:00
base: Deprecate GHC.Pack

As proposed in #21461.

Closes #21540.

- - - - -
55eb8c98 by Ben Gamari at 2024-04-12T08:19:08-04:00
ghc-internal: Fix mentions of ghc-internal in deprecation warnings

Closes #24609.

- - - - -
b0fbd181 by Ben Gamari at 2024-04-12T08:19:44-04:00
rts: Implement set_initial_registers for AArch64

Fixes #23680.

- - - - -
14c9ec62 by Ben Gamari at 2024-04-12T08:20:20-04:00
ghcup-metadata: Use Debian 9 binaries on Ubuntu 16, 17

Closes #24646.

- - - - -
35a1621e by Ben Gamari at 2024-04-12T08:20:55-04:00
Bump unix submodule to 2.8.5.1

Closes #24640.

- - - - -
a1c24df0 by Finley McIlwaine at 2024-04-12T08:21:31-04:00
Correct default -funfolding-use-threshold in docs

- - - - -
0255d03c by Oleg Grenrus at 2024-04-12T08:22:07-04:00
FastString is a __Modified__ UTF-8

- - - - -
c3489547 by Matthew Pickering at 2024-04-12T13:13:44-04:00
rts: Improve tracing message when nursery is resized

It is sometimes more useful to know how much bigger or smaller the
nursery got when it is resized.

In particular I am trying to investigate situations where we end up with
fragmentation due to the nursery (#24577)

- - - - -
5e4f4ba8 by Simon Peyton Jones at 2024-04-12T13:14:20-04:00
Don't generate wrappers for `type data` constructors with StrictData

Previously, the logic for checking if a data constructor needs a wrapper or not
would take into account whether the constructor's fields have explicit
strictness (e.g., `data T = MkT !Int`), but the logic would _not_ take into
account whether `StrictData` was enabled. This meant that something like `type
data T = MkT Int` would incorrectly generate a wrapper for `MkT` if
`StrictData` was enabled, leading to the horrible errors seen in #24620. To fix
this, we disable generating wrappers for `type data` constructors altogether.

Fixes #24620.

Co-authored-by: Ryan Scott <ryan.gl.scott at gmail.com>

- - - - -
dbdf1995 by Alex Mason at 2024-04-15T15:28:26+10:00
Implements MO_S_Mul2 and MO_U_Mul2 using the  UMULH, UMULL and SMULH instructions for AArch64

Also adds a test for MO_S_Mul2

- - - - -
42bd0407 by Teo Camarasu at 2024-04-16T20:06:39-04:00
Make template-haskell a stage1 package

Promoting template-haskell from a stage0 to a stage1 package means that
we can much more easily refactor template-haskell.

We implement this by duplicating the in-tree `template-haskell`.
A new `template-haskell-next` library is autogenerated to mirror `template-haskell`
`stage1:ghc` to depend on the new interface of the library including the
`Binary` instances without adding an explicit dependency on `template-haskell`.

This is controlled by the `bootstrap-th` cabal flag

When building `template-haskell` modules as part of this vendoring we do
not have access to quote syntax, so we cannot use variable quote
notation (`'Just`). So we either replace these with hand-written `Name`s
or hide the code behind CPP.

We can remove the `th_hack` from hadrian, which was required when
building stage0 packages using the in-tree `template-haskell` library.

For more details see Note [Bootstrapping Template Haskell].

Resolves #23536

Co-Authored-By: Sebastian Graf <sgraf1337 at gmail.com>
Co-Authored-By: Matthew Craven <5086-clyring at users.noreply.gitlab.haskell.org>

- - - - -
3d973e47 by Ben Gamari at 2024-04-16T20:07:15-04:00
Bump parsec submodule to 3.1.17.0

- - - - -
9d38bfa0 by Simon Peyton Jones at 2024-04-16T20:07:51-04:00
Clone CoVars in CorePrep

This MR addresses #24463.  It's all explained in the new

   Note [Cloning CoVars and TyVars]

- - - - -
0fe2b410 by Andreas Klebinger at 2024-04-16T20:08:27-04:00
NCG: Fix a bug where we errounously removed a required jump instruction.

Add a new method to the Instruction class to check if we can eliminate a
jump in favour of fallthrough control flow.

Fixes #24507

- - - - -
9f99126a by Teo Camarasu at 2024-04-16T20:09:04-04:00
Fix documentation preview from doc-tarball job

- Include all the .html files and assets in the job artefacts
- Include all the .pdf files in the job artefacts
- Mark the artefact as an "exposed" artefact meaning it turns up in the
  UI.

Resolves #24651

- - - - -
3a0642ea by Ben Gamari at 2024-04-16T20:09:39-04:00
rts: Ignore EINTR while polling in timerfd itimer implementation

While the RTS does attempt to mask signals, it may be that a foreign
library unmasks them. This previously caused benign warnings which we
now ignore.

See #24610.

- - - - -
9a53cd3f by Alan Zimmerman at 2024-04-16T20:10:15-04:00
EPA: Add additional comments field to AnnsModule

This is used in exact printing to store comments coming after the
`where` keyword but before any comments allocated to imports or decls.

It is used in ghc-exactprint, see
https://github.com/alanz/ghc-exactprint/commit/44bbed311fd8f0d053053fef195bf47c17d34fa7

- - - - -
e5c43259 by Bryan Richter at 2024-04-16T20:10:51-04:00
Remove unrunnable FreeBSD CI jobs

FreeBSD runner supply is inelastic. Currently there is only one, and
it's unavailable because of a hardware issue.

- - - - -
914eb49a by Ben Gamari at 2024-04-16T20:11:27-04:00
rel-eng: Fix mktemp usage in recompress-all

We need a temporary directory, not a file.

- - - - -
f30e4984 by Teo Camarasu at 2024-04-16T20:12:03-04:00
Fix ghc API link in docs/index.html

This was missing part of the unit ID meaning it would 404.

Resolves #24674

- - - - -
d7a3d6b5 by Ben Gamari at 2024-04-16T20:12:39-04:00
template-haskell: Declare TH.Lib.Internal as not-home

Rather than `hide`.

Closes #24659.

- - - - -
5eaa46e7 by Matthew Pickering at 2024-04-19T02:14:55-04:00
testsuite: Rename isCross() predicate to needsTargetWrapper()

isCross() was a misnamed because it assumed that all cross targets would
provide a target wrapper, but the two most common cross targets
(javascript, wasm) don't need a target wrapper.

Therefore we rename this predicate to `needsTargetWrapper()` so
situations in the testsuite where we can check whether running
executables requires a target wrapper or not.

- - - - -
55a9d699 by Simon Peyton Jones at 2024-04-19T02:15:32-04:00
Do not float HNFs out of lambdas

This MR adjusts SetLevels so that it is less eager to float a
HNF (lambda or constructor application) out of a lambda, unless
it gets to top level.

Data suggests that this change is a small net win:
 * nofib bytes-allocated falls by -0.09% (but a couple go up)
 * perf/should_compile bytes-allocated falls by -0.5%
 * perf/should_run bytes-allocated falls by -0.1%
See !12410 for more detail.

When fiddling elsewhere, I also found that this patch had a huge
positive effect on the (very delicate) test
  perf/should_run/T21839r
But that improvement doesn't show up in this MR by itself.

Metric Decrease:
    MultiLayerModulesRecomp
    T15703
    parsing001

- - - - -
f0701585 by Alan Zimmerman at 2024-04-19T02:16:08-04:00
EPA: Fix comments in mkListSyntaxTy0

Also extend the test to confirm.

Addresses #24669, 1 of 4

- - - - -
b01c01d4 by Serge S. Gulin at 2024-04-19T02:16:51-04:00
JS: set image `x86_64-linux-deb11-emsdk-closure` for build

- - - - -
c90c6039 by Alan Zimmerman at 2024-04-19T02:17:27-04:00
EPA: Provide correct span for PatBind

And remove unused parameter in checkPatBind

Contributes to #24669

- - - - -
26036f96 by Alan Zimmerman at 2024-04-19T13:11:08-04:00
EPA: Fix span for PatBuilderAppType

Include the location of the prefix @ in the span for InVisPat.

Also removes unnecessary annotations from HsTP.

Contributes to #24669

- - - - -
dba03aab by Matthew Craven at 2024-04-19T13:11:44-04:00
testsuite: Give the pre_cmd for mhu-perf more time

- - - - -
d31fbf6c by Krzysztof Gogolewski at 2024-04-19T21:04:09-04:00
Fix quantification order for a `op` b and a %m -> b

Fixes #23764

Implements https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0640-tyop-quantification-order.rst

Updates haddock submodule.

- - - - -
385cd1c4 by Sebastian Graf at 2024-04-19T21:04:45-04:00
Make `seq#` a magic Id and inline it in CorePrep (#24124)

We can save much code and explanation in Tag Inference and StgToCmm by making
`seq#` a known-key Magic Id in `GHC.Internal.IO` and inline this definition in
CorePrep. See the updated `Note [seq# magic]`.
I also implemented a new `Note [Flatten case-bind]` to get better code for
otherwise nested case scrutinees.

I renamed the contructors of `ArgInfo` to use an `AI` prefix in order to
resolve the clash between `type CpeApp = CoreExpr` and the data constructor of
`ArgInfo`, as well as fixed typos in `Note [CorePrep invariants]`.

Fixes #24252 and #24124.

- - - - -
275e41a9 by Jade at 2024-04-20T11:10:40-04:00
Put the newline after errors instead of before them

This mainly has consequences for GHCi but also slightly alters how the
output of GHC on the commandline looks.

Fixes: #22499

- - - - -
dd339c7a by Teo Camarasu at 2024-04-20T11:11:16-04:00
Remove unecessary stage0 packages

Historically quite a few packages had to be stage0 as they depended on
`template-haskell` and that was stage0. In #23536 we made it so that was
no longer the case. This allows us to remove a bunch of packages from
this list.

A few still remain. A new version of `Win32` is required by
`semaphore-compat`. Including `Win32` in the stage0 set requires also
including `filepath` because otherwise Hadrian's dependency logic gets
confused. Once our boot compiler has a newer version of `Win32` all of
these will be able to be dropped.

Resolves #24652

- - - - -
2f8e3a25 by Alan Zimmerman at 2024-04-20T11:11:52-04:00
EPA: Avoid duplicated comments in splice decls

Contributes to #24669

- - - - -
c70b9ddb by Serge S. Gulin at 2024-04-21T16:33:43+03:00
JS: fix typos and namings (fixes #24602)

You may noted that I've also changed term of

```
, global "h$vt_double" ||= toJExpr IntV
```

See "IntV"

and

```
  WaitReadOp  -> \[] [fd] -> pure $ PRPrimCall $ returnS (app
"h$waidRead" [fd])
```

See "h$waidRead"

- - - - -
3db54f9b by Serge S. Gulin at 2024-04-21T16:33:43+03:00
JS: trivial checks for variable presence (fixes #24602)

- - - - -
777f108f by Serge S. Gulin at 2024-04-21T16:33:43+03:00
JS: fs module imported twice (by emscripten and by ghc-internal). ghc-internal import wrapped
in a closure to prevent conflict with emscripten (fixes #24602)

Better solution is to use some JavaScript module system like AMD, CommonJS or even UMD. It will be investigated at other issues.
At first glance we should try UMD (See https://github.com/umdjs/umd)

- - - - -
a45a5712 by Serge S. Gulin at 2024-04-21T16:33:43+03:00
JS: thread.js requires h$fds and h$fdReady to be declared for static code analysis, minimal
code copied from GHCJS (fixes #24602)

I've just copied some old pieces of GHCJS from publicly available sources (See https://github.com/Taneb/shims/blob/a6dd0202dcdb86ad63201495b8b5d9763483eb35/src/io.js#L607).
Also I didn't put details to h$fds. I took minimal and left only its object initialization: `var h$fds = {};`

- - - - -
ad90bf12 by Serge S. Gulin at 2024-04-21T16:33:43+03:00
JS: heap and stack overflows reporting defined as js hard failure (fixes #24602)

These errors were treated as a hard failure for browser application. The fix is trivial: just throw error.

- - - - -
5962fa52 by Serge S. Gulin at 2024-04-21T16:33:44+03:00
JS: Stubs for code without actual implementation detected by Google Closure Compiler (fixes #24602)

These errors were fixed just by introducing stubbed functions with throw for further implementation.

- - - - -
a0694298 by Serge S. Gulin at 2024-04-21T16:34:07+03:00
JS: Add externs to linker (fixes #24602)

After enabling jsdoc and built-in google closure compiler types I was needed to deal with the following:

1. Define NodeJS-environment types. I've just copied minimal set of externs from semi-official repo (see https://github.com/externs/nodejs/blob/6c6882c73efcdceecf42e7ba11f1e3e5c9c041f0/v8/nodejs.js#L8).
2. Define Emscripten-environment types: `HEAP8`. Emscripten already provides some externs in our code but it supposed to be run in some module system. And its definitions do not work well in plain bundle.
3. We have some functions which purpose is to add to functions some contextual information via function properties. These functions should be marked as `modifies` to let google closure compiler remove calls if these functions are not used actually by call graph. Such functions are: `h$o`, `h$sti`, `h$init_closure`, `h$setObjInfo`.
4. STG primitives such as registries and stuff from `GHC.StgToJS`. `dXX` properties were already present at externs generator function but they are started from `7`, not from `1`. This message is related: `// fixme does closure compiler bite us here?`

- - - - -
e58bb29f by Serge S. Gulin at 2024-04-21T16:34:07+03:00
JS: added both tests: for size and for correctness (fixes #24602)

By some reason MacOS builds add to stderr messages like:

    Ignoring unexpected archive entry:
    __.SYMDEF
    ...

However I left stderr to `/dev/null` for compatibility with linux CI builds.

- - - - -
909f3a9c by Serge S. Gulin at 2024-04-21T16:34:07+03:00
JS: Disable js linker warning for empty symbol table to make js tests running consistent across environments

- - - - -
83eb10da by Serge S. Gulin at 2024-04-21T16:34:07+03:00
JS: Add special preprocessor for js files due of needing to keep jsdoc comments (fixes #24602)

Our js files have defined google closure compiler types at jsdoc entries but these jsdoc entries are removed by cpp preprocessor. I considered that reusing them in javascript-backend would be a nice thing. Right now haskell processor uses `-traditional` option to deal with comments and `//` operators.
But now there are following compiler options: `-C` and `-CC`.
You can read about them at GCC (see https://gcc.gnu.org/onlinedocs/gcc/Preprocessor-Options.html#index-CC) and CLang (see https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-CC).
It seems that `-CC` works better for javascript jsdoc than `-traditional`.
At least it leaves `/* ... */` comments w/o changes.

- - - - -
e1cf8dc2 by brandon s allbery kf8nh at 2024-04-22T03:48:26-04:00
fix link in CODEOWNERS

It seems that our local Gitlab no longer has documentation for the
`CODEOWNERS` file, but the master documentation still does. Use
that instead.

- - - - -
593f4e04 by Fendor at 2024-04-23T10:19:14-04:00
Add performance regression test for '-fwrite-simplified-core'

- - - - -
1ba39b05 by Fendor at 2024-04-23T10:19:14-04:00
Typecheck corebindings lazily during bytecode generation

This delays typechecking the corebindings until the bytecode generation
happens.

We also avoid allocating a thunk that is retained by `unsafeInterleaveIO`.
In general, we shouldn't retain values of the hydrated `Type`, as not evaluating
the bytecode object keeps it alive.

It is better if we retain the unhydrated `IfaceType`.

See Note [Hydrating Modules]

- - - - -
e916fc92 by Alan Zimmerman at 2024-04-23T10:19:50-04:00
EPA: Keep comments in a CaseAlt match

The comments now live in the surrounding location, not inside the
Match. Make sure we keep them.

Closes #24707

- - - - -
d2b17f32 by Cheng Shao at 2024-04-23T15:01:22-04:00
driver: force merge objects when building dynamic objects

This patch forces the driver to always merge objects when building
dynamic objects even when ar -L is supported. It is an oversight of
!8887: original rationale of that patch is favoring the relatively
cheap ar -L operation over object merging when ar -L is supported,
which makes sense but only if we are building static objects! Omitting
check for whether we are building dynamic objects will result in
broken .so files with undefined reference errors at executable link
time when building GHC with llvm-ar. Fixes #22210.

- - - - -
209d09f5 by Julian Ospald at 2024-04-23T15:02:03-04:00
Allow non-absolute values for bootstrap GHC variable

Fixes #24682

- - - - -
3fff0977 by Matthew Pickering at 2024-04-23T15:02:38-04:00
Don't depend on registerPackage function in Cabal

More recent versions of Cabal modify the behaviour of libAbiHash which
breaks our usage of registerPackage.

It is simpler to inline the part of registerPackage that we need and
avoid any additional dependency and complication using the higher-level
function introduces.

- - - - -
c62dc317 by Cheng Shao at 2024-04-25T01:32:02-04:00
ghc-bignum: remove obsolete ln script

This commit removes an obsolete ln script in ghc-bignum/gmp. See
060251c24ad160264ae8553efecbb8bed2f06360 for its original intention,
but it's been obsolete for a long time, especially since the removal
of the make build system. Hence the house cleaning.

- - - - -
6399d52b by Cheng Shao at 2024-04-25T01:32:02-04:00
ghc-bignum: update gmp to 6.3.0

This patch bumps the gmp-tarballs submodule and updates gmp to 6.3.0.
The tarball format is now xz, and gmpsrc.patch has been patched into
the tarball so hadrian no longer needs to deal with patching logic
when building in-tree GMP.

- - - - -
65b4b92f by Cheng Shao at 2024-04-25T01:32:02-04:00
hadrian: remove obsolete Patch logic

This commit removes obsolete Patch logic from hadrian, given we no
longer need to patch the gmp tarball when building in-tree GMP.

- - - - -
71f28958 by Cheng Shao at 2024-04-25T01:32:02-04:00
autoconf: remove obsolete patch detection

This commit removes obsolete deletection logic of the patch command
from autoconf scripts, given we no longer need to patch anything in
the GHC build process.

- - - - -
daeda834 by Sylvain Henry at 2024-04-25T01:32:43-04:00
JS: correctly handle RUBBISH literals (#24664)

- - - - -
8a06ddf6 by Matthew Pickering at 2024-04-25T11:16:16-04:00
Linearise ghc-internal and base build

This is achieved by requesting the final package database for
ghc-internal, which mandates it is fully built as a dependency of
configuring the `base` package. This is at the expense of cross-package
parrallelism between ghc-internal and the base package.

Fixes #24436

- - - - -
94da9365 by Andrei Borzenkov at 2024-04-25T11:16:54-04:00
Fix tuple puns renaming (24702)

Move tuple renaming short cutter from `isBuiltInOcc_maybe` to `isPunOcc_maybe`, so we consider incoming module.

I also fixed some hidden bugs that raised after the change was done.

- - - - -
fa03b1fb by Fendor at 2024-04-26T18:03:13-04:00
Refactor the Binary serialisation interface

The goal is simplifiy adding deduplication tables to `ModIface`
interface serialisation.

We identify two main points of interest that make this difficult:

1. UserData hardcodes what `Binary` instances can have deduplication
   tables. Moreover, it heavily uses partial functions.
2. GHC.Iface.Binary hardcodes the deduplication tables for 'Name' and
   'FastString', making it difficult to add more deduplication.

Instead of having a single `UserData` record with fields for all the
types that can have deduplication tables, we allow to provide custom
serialisers for any `Typeable`.
These are wrapped in existentials and stored in a `Map` indexed by their
respective `TypeRep`.
The `Binary` instance of the type to deduplicate still needs to
explicitly look up the decoder via `findUserDataReader` and
`findUserDataWriter`, which is no worse than the status-quo.

`Map` was chosen as microbenchmarks indicate it is the fastest for a
small number of keys (< 10).

To generalise the deduplication table serialisation mechanism, we
introduce the types `ReaderTable` and `WriterTable` which provide a
simple interface that is sufficient to implement a general purpose
deduplication mechanism for `writeBinIface` and `readBinIface`.

This allows us to provide a list of deduplication tables for
serialisation that can be extended more easily, for example for
`IfaceTyCon`, see the issue https://gitlab.haskell.org/ghc/ghc/-/issues/24540
for more motivation.

In addition to this refactoring, we split `UserData` into `ReaderUserData`
and `WriterUserData`, to avoid partial functions and reduce overall
memory usage, as we need fewer mutable variables.

Bump haddock submodule to accomodate for `UserData` split.

-------------------------
Metric Increase:
    MultiLayerModulesTH_Make
    MultiLayerModulesRecomp
    T21839c
-------------------------

- - - - -
bac57298 by Fendor at 2024-04-26T18:03:13-04:00
Split `BinHandle` into `ReadBinHandle` and `WriteBinHandle`

A `BinHandle` contains too much information for reading data.
For example, it needs to keep a `FastMutInt` and a `IORef BinData`,
when the non-mutable variants would suffice.

Additionally, this change has the benefit that anyone can immediately
tell whether the `BinHandle` is used for reading or writing.

Bump haddock submodule BinHandle split.

- - - - -
4d6394dd by Simon Peyton Jones at 2024-04-26T18:03:49-04:00
Fix missing escaping-kind check in tcPatSynSig

Note [Escaping kind in type signatures] explains how we deal
with escaping kinds in type signatures, e.g.
    f :: forall r (a :: TYPE r). a
where the kind of the body is (TYPE r), but `r` is not in
scope outside the forall-type.

I had missed this subtlety in tcPatSynSig, leading to #24686.
This MR fixes it; and a similar bug in tc_top_lhs_type. (The
latter is tested by T24686a.)

- - - - -
981c2c2c by Alan Zimmerman at 2024-04-26T18:04:25-04:00
EPA: check-exact: check that the roundtrip reproduces the source

Closes #24670

- - - - -
a8616747 by Andrew Lelechenko at 2024-04-26T18:05:01-04:00
Document that setEnv is not thread-safe

- - - - -
1e41de83 by Bryan Richter at 2024-04-26T18:05:37-04:00
CI: Work around frequent Signal 9 errors

- - - - -
a6d5f9da by Naïm Favier at 2024-04-27T17:52:40-04:00
ghc-internal: add MonadFix instance for (,)

Closes https://gitlab.haskell.org/ghc/ghc/-/issues/24288, implements CLC
proposal https://github.com/haskell/core-libraries-committee/issues/238.

Adds a MonadFix instance for tuples, permitting value recursion in the
"native" writer monad and bringing consistency with the existing
instance for transformers's WriterT (and, to a lesser extent, for Solo).

- - - - -
64feadcd by Rodrigo Mesquita at 2024-04-27T17:53:16-04:00
bindist: Fix xattr cleaning

The original fix (725343aa) was incorrect because it used the shell
bracket syntax which is the quoting syntax in autoconf, making the test
for existence be incorrect and therefore `xattr` was never run.

Fixes #24554

- - - - -
e2094df3 by damhiya at 2024-04-28T23:52:00+09:00
Make read accepts binary integer formats

CLC proposal : https://github.com/haskell/core-libraries-committee/issues/177

- - - - -
1c2fd963 by Alan Zimmerman at 2024-04-29T23:17:00-04:00
EPA: Preserve comments in Match Pats

Closes #24708
Closes #24715
Closes #24734

- - - - -
4189d17e by Sylvain Henry at 2024-04-29T23:17:42-04:00
LLVM: better unreachable default destination in Switch (#24717)

See added note.

Co-authored-by: Siddharth Bhat <siddu.druid at gmail.com>

- - - - -
a3725c88 by Cheng Shao at 2024-04-29T23:18:20-04:00
ci: enable wasm jobs for MRs with wasm label

This patch enables wasm jobs for MRs with wasm label. Previously the
wasm label didn't actually have any effect on the CI pipeline, and
full-ci needed to be applied to run wasm jobs which was a waste of
runners when working on the wasm backend, hence the fix here.

- - - - -
702f7964 by Matthew Pickering at 2024-04-29T23:18:56-04:00
Make interface files and object files depend on inplace .conf file

A potential fix for #24737

- - - - -
728af21e by Cheng Shao at 2024-04-30T05:30:23-04:00
utils: remove obsolete vagrant scripts

Vagrantfile has long been removed in !5288. This commit further
removes the obsolete vagrant scripts in the tree.

- - - - -
36f2c342 by Cheng Shao at 2024-04-30T05:31:00-04:00
Update autoconf scripts

Scripts taken from autoconf 948ae97ca5703224bd3eada06b7a69f40dd15a02

- - - - -
ecbf22a6 by Ben Gamari at 2024-04-30T05:31:36-04:00
ghcup-metadata: Drop output_name field

This is entirely redundant to the filename of the URL. There is no
compelling reason to name the downloaded file differently from its
source.

- - - - -
c56d728e by Zubin Duggal at 2024-04-30T22:45:09-04:00
testsuite: Handle exceptions in framework_fail when testdir is not initialised

When `framework_fail` is called before initialising testdir, it would fail with
an exception reporting the testdir not being initialised instead of the actual failure.

Ensure we report the actual reason for the failure instead of failing in this way.

One way this can manifest is when trying to run a test that doesn't exist using `--only`

- - - - -
d5bea4d6 by Alan Zimmerman at 2024-04-30T22:45:45-04:00
EPA: Fix range for GADT decl with sig only

Closes #24714

- - - - -
4d78c53c by Sylvain Henry at 2024-05-01T17:23:06-04:00
Fix TH dependencies (#22229)

Add a dependency between Syntax and Internal (via module reexport).

- - - - -
37e38db4 by Sylvain Henry at 2024-05-01T17:23:06-04:00
Bump haddock submodule

- - - - -
ca13075c by Sylvain Henry at 2024-05-01T17:23:47-04:00
JS: cleanup to prepare for #24743

- - - - -
40026ac3 by Alan Zimmerman at 2024-05-01T22:45:07-04:00
EPA: Preserve comments for PrefixCon

Preserve comments in

    fun (Con {- c1 -} a b)
        = undefined

Closes #24736

- - - - -
92134789 by Hécate Moonlight at 2024-05-01T22:45:42-04:00
Correct `@since` metadata in HpcFlags

It was introduced in base-4.20, not 4.22.
Fix #24721

- - - - -
a580722e by Cheng Shao at 2024-05-02T08:18:45-04:00
testsuite: fix req_target_smp predicate

- - - - -
ac9c5f84 by Andreas Klebinger at 2024-05-02T08:18:45-04:00
STM: Remove (unused)coarse grained locking.

The STM code had a coarse grained locking mode guarded by #defines that was unused.
This commit removes the code.

- - - - -
917ef81b by Andreas Klebinger at 2024-05-02T08:18:45-04:00
STM: Be more optimistic when validating in-flight transactions.

* Don't lock tvars when performing non-committal validation.
* If we encounter a locked tvar don't consider it a failure.

This means in-flight validation will only fail if committing at the
moment of validation is *guaranteed* to fail.

This prevents in-flight validation from failing spuriously if it happens in
parallel on multiple threads or parallel to thread comitting.

- - - - -
167a56a0 by Alan Zimmerman at 2024-05-02T08:19:22-04:00
EPA: fix span for empty \case(s)

In
    instance SDecide Nat where
      SZero %~ (SSucc _) = Disproved (\case)

Ensure the span for the HsLam covers the full construct.

Closes #24748

- - - - -
9bae34d8 by doyougnu at 2024-05-02T15:41:08-04:00
testsuite: expand size testing infrastructure

- closes #24191
- adds windows_skip, wasm_skip, wasm_arch, find_so, _find_so
- path_from_ghcPkg, collect_size_ghc_pkg, collect_object_size, find_non_inplace functions to testsuite
- adds on_windows and req_dynamic_ghc predicate to testsuite

The design is to not make the testsuite too smart and simply offload to
ghc-pkg for locations of object files and directories.

- - - - -
b85b1199 by Sylvain Henry at 2024-05-02T15:41:49-04:00
GHCi: support inlining breakpoints (#24712)

When a breakpoint is inlined, its context may change (e.g. tyvars in
scope). We must take this into account and not used the breakpoint tick
index as its sole identifier. Each instance of a breakpoint (even with
the same tick index) now gets a different "info" index.

We also need to distinguish modules:
- tick module: module with the break array (tick counters, status, etc.)
- info module: module having the CgBreakInfo (info at occurrence site)

- - - - -
649c24b9 by Oleg Grenrus at 2024-05-03T20:45:42-04:00
Expose constructors of SNat, SChar and SSymbol in ghc-internal

- - - - -
d603f199 by Mikolaj Konarski at 2024-05-03T20:46:19-04:00
Add DCoVarSet to PluginProv (!12037)

- - - - -
ba480026 by Serge S. Gulin at 2024-05-03T20:47:01-04:00
JS: Enable more efficient packing of string data (fixes #24706)

- - - - -
be1e60ee by Simon Peyton Jones at 2024-05-03T20:47:37-04:00
Track in-scope variables in ruleCheckProgram

This small patch fixes #24726, by tracking in-scope variables
properly in -drule-check.  Not hard to do!

- - - - -
58408c77 by Simon Peyton Jones at 2024-05-03T20:47:37-04:00
Add a couple more HasCallStack constraints in SimpleOpt

Just for debugging, no effect on normal code

- - - - -
70e245e8 by Simon Peyton Jones at 2024-05-03T20:47:37-04:00
Add comments to Prep.hs

This documentation patch fixes a TODO left over from !12364

- - - - -
e5687186 by Simon Peyton Jones at 2024-05-03T20:47:37-04:00
Use HasDebugCallStack, rather than HasCallStack

- - - - -
631cefec by Cheng Shao at 2024-05-03T20:48:17-04:00
driver: always merge objects when possible

This patch makes the driver always merge objects with `ld -r` when
possible, and only fall back to calling `ar -L` when merge objects
command is unavailable. This completely reverts !8887 and !12313,
given more fixes in Cabal seems to be needed to avoid breaking certain
configurations and the maintainence cost is exceeding the behefits in
this case :/

- - - - -
1dacb506 by Ben Gamari at 2024-05-03T20:48:53-04:00
Bump time submodule to 1.14

As requested in #24528.

-------------------------
Metric Decrease:
    ghc_bignum_so
    rts_so
Metric Increase:
    cabal_syntax_dir
    rts_so
    time_dir
    time_so
-------------------------

- - - - -
4941b90e by Ben Gamari at 2024-05-03T20:48:53-04:00
Bump terminfo submodule to current master

- - - - -
43d48b44 by Cheng Shao at 2024-05-03T20:49:30-04:00
wasm: use scheduler.postTask() for context switch when available

This patch makes use of scheduler.postTask() for JSFFI context switch
when it's available. It's a more principled approach than our
MessageChannel based setImmediate() implementation, and it's available
in latest version of Chromium based browsers.

- - - - -
08207501 by Cheng Shao at 2024-05-03T20:50:08-04:00
testsuite: give pre_cmd for mhu-perf 5x time

- - - - -
bf3d4db0 by Alan Zimmerman at 2024-05-03T20:50:43-04:00
EPA: Preserve comments for pattern synonym sig

Closes #24749

- - - - -
c49493f2 by Matthew Pickering at 2024-05-04T06:02:57-04:00
tests: Widen acceptance window for dir and so size tests

These are testing things which are sometimes out the control of a GHC
developer. Therefore we shouldn't fail CI if something about these
dependencies change because we can't do anything about it.

It is still useful to have these statistics for visualisation in grafana
though.

Ticket #24759

- - - - -
9562808d by Matthew Pickering at 2024-05-04T06:02:57-04:00
Disable rts_so test

It has already manifested large fluctuations and destabilising CI

Fixes #24762

- - - - -
fc24c5cf by Ryan Scott at 2024-05-04T06:03:33-04:00
unboxedSum{Type,Data}Name: Use GHC.Types as the module

Unboxed sum constructors are now defined in the `GHC.Types` module, so if you
manually quote an unboxed sum (e.g., `''Sum2#`), you will get a `Name` like:

```hs
GHC.Types.Sum2#
```

The `unboxedSumTypeName` function in `template-haskell`, however, mistakenly
believes that unboxed sum constructors are defined in `GHC.Prim`, so
`unboxedSumTypeName 2` would return an entirely different `Name`:

```hs
GHC.Prim.(#|#)
```

This is a problem for Template Haskell users, as it means that they can't be
sure which `Name` is the correct one. (Similarly for `unboxedSumDataName`.)

This patch fixes the implementations of `unboxedSum{Type,Data}Name` to use
`GHC.Types` as the module. For consistency with `unboxedTupleTypeName`, the
`unboxedSumTypeName` function now uses the non-punned syntax for unboxed sums
(`Sum<N>#`) as the `OccName`.

Fixes #24750.

- - - - -
7eab4e01 by Alan Zimmerman at 2024-05-04T16:14:55+01:00
EPA: Widen stmtslist to include last semicolon

Closes #24754

- - - - -
06f7db40 by Teo Camarasu at 2024-05-05T00:19:38-04:00
doc: Fix type error in hs_try_putmvar example
- - - - -
af000532 by Moritz Schuler at 2024-05-05T06:30:58-04:00
Fix parsing of module names in CLI arguments
  closes issue #24732

- - - - -
da74e9c9 by Ben Gamari at 2024-05-05T06:31:34-04:00
ghc-platform: Add Setup.hs

The Hadrian bootstrapping script relies upon `Setup.hs` to drive its
build.

Addresses #24761.

- - - - -
35d34fde by Alan Zimmerman at 2024-05-05T12:52:40-04:00
EPA: preserve comments in class and data decls

Fix checkTyClHdr which was discarding comments.

Closes #24755

- - - - -
03c5dfbf by Simon Peyton Jones at 2024-05-05T12:53:15-04:00
Fix a float-out error

Ticket #24768 showed that the Simplifier was accidentally destroying
a join point.  It turned out to be that we were sending a bottoming
join point to the top, accidentally abstracting over /other/ join
points.

Easily fixed.

- - - - -
adba68e7 by John Ericson at 2024-05-05T19:35:56-04:00
Substitute bindist files with Hadrian not configure

The `ghc-toolchain` overhaul will eventually replace all this stuff with
something much more cleaned up, but I think it is still worth making
this sort of cleanup in the meantime so other untanglings and dead code
cleaning can procede.

I was able to delete a fair amount of dead code doing this too.

`LLVMTarget_CPP` is renamed to / merged with `LLVMTarget` because it
wasn't actually turned into a valid CPP identifier. (Original to
1345c7cc42c45e63ab1726a8fd24a7e4d4222467, actually.)

Progress on #23966

Co-Authored-By: Sylvain Henry <hsyl20 at gmail.com>

- - - - -
18f4ff84 by Alan Zimmerman at 2024-05-05T19:36:32-04:00
EPA: fix mkHsOpTyPV duplicating comments

Closes #24753

- - - - -
a19201d4 by Matthew Craven at 2024-05-06T19:54:29-04:00
Add test cases for #24664

...since none are present in the original MR !12463 fixing this issue.

- - - - -
46328a49 by Alan Zimmerman at 2024-05-06T19:55:05-04:00
EPA: preserve comments in data decls

Closes #24771

- - - - -
3b51995c by Andrei Borzenkov at 2024-05-07T14:39:40-04:00
Rename Solo# data constructor to MkSolo# (#24673)

- data Solo# a = (# a #)
+ data Solo# a = MkSolo# a

And `(# foo #)` syntax now becomes just a syntactic
sugar for `MkSolo# a`.

- - - - -
4d59abf2 by Arsen Arsenović at 2024-05-07T14:40:24-04:00
Add the cmm_cpp_is_gcc predicate to the testsuite

A future C-- test called T24474-cmm-override-g0 relies on the
GCC-specific behaviour of -g3 implying -dD, which, in turn, leads to it
emitting #defines past the preprocessing stage.  Clang, at least, does
not do this, so the test would fail if ran on Clang.

As the behaviour here being tested is ``-optCmmP-g3'' undoing effects of
the workaround we apply as a fix for bug #24474, and the workaround was
for GCC-specific behaviour, the test needs to be marked as fragile on
other compilers.

- - - - -
25b0b404 by Arsen Arsenović at 2024-05-07T14:40:24-04:00
Split out the C-- preprocessor, and make it pass -g0

Previously, C-- was processed with the C preprocessor program.  This
means that it inherited flags passed via -optc.  A flag that is somewhat
often passed through -optc is -g.  At certain -g levels (>=2), GCC
starts emitting defines *after* preprocessing, for the purposes of
debug info generation.  This is not useful for the C-- compiler, and, in
fact, causes lexer errors.  We can suppress this effect (safely, if
supported) via -g0.

As a workaround, in older versions of GCC (<=10), GCC only emitted
defines if a certain set of -g*3 flags was passed.  Newer versions check
the debug level.  For the former, we filter out those -g*3 flags and,
for the latter, we specify -g0 on top of that.

As a compatible and effective solution, this change adds a C--
preprocessor distinct from the C compiler and preprocessor, but that
keeps its flags.  The command line produced for C-- preprocessing now
looks like:

  $pgmCmmP $optCs_without_g3 $g0_if_supported $optCmmP

Closes: https://gitlab.haskell.org/ghc/ghc/-/issues/24474

- - - - -
9b4129a5 by Andreas Klebinger at 2024-05-08T13:24:20-04:00
-fprof-late: Only insert cost centres on functions/non-workfree cafs.

They are usually useless and doing so for data values comes with
a large compile time/code size overhead.

Fixes #24103

- - - - -
259b63d3 by Sebastian Graf at 2024-05-08T13:24:57-04:00
Simplifier: Preserve OccInfo on DataAlt fields when case binder is dead (#24770)

See the adjusted `Note [DataAlt occ info]`.
This change also has a positive repercussion on
`Note [Combine case alts: awkward corner]`.

Fixes #24770.

We now try not to call `dataConRepStrictness` in `adjustFieldsIdInfo` when all
fields are lazy anyway, leading to a 2% ghc/alloc decrease in T9675.

Metric Decrease:
    T9675

- - - - -
31b28cdb by Sebastian Graf at 2024-05-08T13:24:57-04:00
Kill seqRule, discard dead seq# in Prep (#24334)

Discarding seq#s in Core land via `seqRule` was problematic; see #24334.
So instead we discard certain dead, discardable seq#s in Prep now.
See the updated `Note [seq# magic]`.

This fixes the symptoms of #24334.

- - - - -
b2682534 by Rodrigo Mesquita at 2024-05-10T01:47:51-04:00
Document NcgImpl methods

Fixes #19914

- - - - -
4d3acbcf by Zejun Wu at 2024-05-10T01:48:28-04:00
Make renamer to be more flexible with parens in the LHS of the rules

We used to reject LHS like `(f a) b` in RULES and requires it to be written as
`f a b`. It will be handy to allow both as the expression may be more
readable with extra parens in some cases when infix operator is involved.
Espceially when TemplateHaskell is used, extra parens may be added out of
user's control and result in "valid" rules being rejected and there
are not always ways to workaround it.

Fixes #24621

- - - - -
ab840ce6 by Ben Gamari at 2024-05-10T01:49:04-04:00
IPE: Eliminate dependency on Read

Instead of encoding the closure type as decimal string we now simply
represent it as an integer, eliminating the need for `Read` in
`GHC.Internal.InfoProv.Types.peekInfoProv`.

Closes #24504.

-------------------------
Metric Decrease:
    T24602_perf_size
    size_hello_artifact
-------------------------

- - - - -
a9979f55 by Cheng Shao at 2024-05-10T01:49:43-04:00
testsuite: fix testwsdeque with recent clang

This patch fixes compilation of testwsdeque.c with recent versions of
clang, which will fail with the error below:

```
testwsdeque.c:95:33: error:
     warning: format specifies type 'long' but the argument has type 'void *' [-Wformat]
       95 |         barf("FAIL: %ld %d %d", p, n, val);
          |                     ~~~         ^

testwsdeque.c:95:39: error:
     warning: format specifies type 'int' but the argument has type 'StgWord' (aka 'unsigned long') [-Wformat]
       95 |         barf("FAIL: %ld %d %d", p, n, val);
          |                            ~~         ^~~
          |                            %lu

testwsdeque.c:133:42: error:
     error: incompatible function pointer types passing 'void (void *)' to parameter of type 'OSThreadProc *' (aka 'void *(*)(void *)') [-Wincompatible-function-pointer-types]
      133 |         createOSThread(&ids[n], "thief", thief, (void*)(StgWord)n);
          |                                          ^~~~~

/workspace/ghc/_build/stage1/lib/../lib/x86_64-linux-ghc-9.11.20240502/rts-1.0.2/include/rts/OSThreads.h:193:51: error:
     note: passing argument to parameter 'startProc' here
      193 |                                     OSThreadProc *startProc, void *param);
          |                                                   ^

2 warnings and 1 error generated.
```

- - - - -
c2b33fc9 by Rodrigo Mesquita at 2024-05-10T01:50:20-04:00
Rename pre-processor invocation args

Small clean up. Uses proper names for the various groups of arguments
that make up the pre-processor invocation.

- - - - -
2b1af08b by Cheng Shao at 2024-05-10T01:50:55-04:00
ghc-heap: fix typo in ghc-heap cbits

- - - - -
fc2d6de1 by Jade at 2024-05-10T21:07:16-04:00
Improve performance of Data.List.sort(By)

This patch improves the algorithm to sort lists in base.
It does so using two strategies:

1) Use a four-way-merge instead of the 'default' two-way-merge.
This is able to save comparisons and allocations.

2) Use `(>) a b` over `compare a b == GT` and allow inlining and specialization.
This mainly benefits types with a fast (>).

Note that this *may* break instances with a *malformed* Ord instance
where `a > b` is *not* equal to `compare a b == GT`.

CLC proposal: https://github.com/haskell/core-libraries-committee/issues/236

Fixes #24280

-------------------------
Metric Decrease:
    MultiLayerModulesTH_Make
    T10421
    T13719
    T15164
    T18698a
    T18698b
    T1969
    T9872a
    T9961
    T18730
    WWRec
    T12425
    T15703
-------------------------

- - - - -
1012e8aa by Matthew Pickering at 2024-05-10T21:07:52-04:00
Revert "ghcup-metadata: Drop output_name field"

This reverts commit ecbf22a6ac397a791204590f94c0afa82e29e79f.

This breaks the ghcup metadata generation on the nightly jobs.

- - - - -
daff1e30 by Jannis at 2024-05-12T13:38:35-04:00
Division by constants optimization

- - - - -
413217ba by Andreas Klebinger at 2024-05-12T13:39:11-04:00
Tidy: Add flag to expose unfoldings if they take dictionary arguments.

Add the flag `-fexpose-overloaded-unfoldings` to be able to control this
behaviour.

For ghc's boot libraries file size grew by less than 1% when it was
enabled. However I refrained from enabling it by default for now.

I've also added a section on specialization more broadly to the users
guide.

-------------------------
Metric Decrease:
    MultiLayerModulesTH_OneShot
Metric Increase:
    T12425
    T13386
    hard_hole_fits
-------------------------

- - - - -
c5d89412 by Zubin Duggal at 2024-05-13T22:19:53-04:00
Don't store a GlobalRdrEnv in `mi_globals` for GHCi.

GHCi only needs the `mi_globals` field for modules imported with
:module +*SomeModule.

It uses this field to make the top level environment in `SomeModule` available
to the repl.

By default, only the first target in the command line parameters is
"star" loaded into GHCi. Other modules have to be manually "star" loaded
into the repl.

Storing the top level GlobalRdrEnv for each module is very wasteful, especially
given that we will most likely never need most of these environments.

Instead we store only the information needed to reconstruct the top level environment
in a module, which is the `IfaceTopEnv` data structure, consisting of all import statements
as well as all top level symbols defined in the module (not taking export lists into account)

When a particular module is "star-loaded" into GHCi (as the first commandline target, or via
an explicit `:module +*SomeModule`, we reconstruct the top level environment on demand using
the `IfaceTopEnv`.

- - - - -
d65bf4a2 by Fendor at 2024-05-13T22:20:30-04:00
Add perf regression test for `-fwrite-if-simplified-core`

- - - - -
2c0f8ddb by Andrei Borzenkov at 2024-05-13T22:21:07-04:00
Improve pattern to type pattern transformation (23739)

`pat_to_type_pat` function now can handle more patterns:
  - TuplePat
  - ListPat
  - LitPat
  - NPat
  - ConPat

Allowing these new constructors in type patterns significantly
increases possible shapes of type patterns without `type` keyword.

This patch also changes how lookups in `lookupOccRnConstr` are
performed, because we need to fall back into
types when we didn't find a constructor on data level to perform
`ConPat` to type transformation properly.

- - - - -
be514bb4 by Cheng Shao at 2024-05-13T22:21:43-04:00
hadrian: fix hadrian building with ghc-9.10.1

- - - - -
ad38e954 by Cheng Shao at 2024-05-13T22:21:43-04:00
linters: fix lint-whitespace compilation with ghc-9.10.1

- - - - -
9f14caa5 by Cheng Shao at 2024-05-15T09:27:07+00:00
rts: do not prefetch mark_closure bdescr in non-moving gc when ASSERTS_ENABLED

This commit fixes a small an oversight in !12148: the prefetch logic
in non-moving GC may trap in debug RTS because it calls Bdescr() for
mark_closure which may be a static one. It's fine in non-debug RTS
because even invalid bdescr addresses are prefetched, they will not
cause segfaults, so this commit implements the most straightforward
fix: don't prefetch mark_closure bdescr when assertions are enabled.

- - - - -
79813509 by Teo Camarasu at 2024-05-15T09:27:07+00:00
rts: Allocate non-moving segments with megablocks

Non-moving segments are 8 blocks long and need to be aligned.
Previously we serviced allocations by grabbing 15 blocks, finding
an aligned 8 block group in it and returning the rest.
This proved to lead to high levels of fragmentation as a de-allocating a segment
caused an 8 block gap to form, and this could not be reused for allocation.

This patch introduces a segment allocator based around using entire
megablocks to service segment allocations in bulk.

When there are no free segments, we grab an entire megablock and fill it
with aligned segments. As the megablock is free, we can easily guarantee
alignment. Any unused segments are placed on a free list.

It only makes sense to free segments in bulk when all of the segments in
a megablock are freeable. After sweeping, we grab the free list, sort it,
and find all groups of segments where they cover the megablock and free
them.
This introduces a period of time when free segments are not available to
the mutator, but the risk that this would lead to excessive allocation
is low. Right after sweep, we should have an abundance of partially full
segments, and this pruning step is relatively quick.

In implementing this we drop the logic that kept NONMOVING_MAX_FREE
segments on the free list.

See Note [Segment allocation strategy]

Resolves #24150

-------------------------
Metric Decrease:
    T18223
-------------------------

- - - - -


30 changed files:

- .gitignore
- .gitlab-ci.yml
- .gitlab/ci.sh
- .gitlab/generate-ci/gen_ci.hs
- .gitlab/jobs.yaml
- .gitlab/rel_eng/mk-ghcup-metadata/mk_ghcup_metadata.py
- .gitlab/rel_eng/recompress-all
- .gitlab/rel_eng/upload.sh
- CODEOWNERS
- compiler/GHC.hs
- compiler/GHC/Builtin/Names.hs
- compiler/GHC/Builtin/Names/TH.hs
- compiler/GHC/Builtin/PrimOps.hs
- compiler/GHC/Builtin/Types.hs
- compiler/GHC/Builtin/primops.txt.pp
- compiler/GHC/ByteCode/Asm.hs
- compiler/GHC/ByteCode/Instr.hs
- compiler/GHC/ByteCode/Linker.hs
- compiler/GHC/ByteCode/Types.hs
- compiler/GHC/Cmm.hs
- compiler/GHC/Cmm/Config.hs
- compiler/GHC/Cmm/MachOp.hs
- compiler/GHC/Cmm/Opt.hs
- compiler/GHC/Cmm/Pipeline.hs
- compiler/GHC/Cmm/Sink.hs
- compiler/GHC/Cmm/ThreadSanitizer.hs
- compiler/GHC/CmmToAsm/AArch64.hs
- compiler/GHC/CmmToAsm/AArch64/CodeGen.hs
- compiler/GHC/CmmToAsm/AArch64/Instr.hs
- compiler/GHC/CmmToAsm/AArch64/Ppr.hs


The diff was not included because it is too large.


View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/f89868f8e954311ae3aeded02db8bbe54ada84a2...79813509c60f0b3f83902de3caee139012f356f4

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/f89868f8e954311ae3aeded02db8bbe54ada84a2...79813509c60f0b3f83902de3caee139012f356f4
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240515/68175bf5/attachment-0001.html>


More information about the ghc-commits mailing list