[Git][ghc/ghc][wip/t24277] 189 commits: add -fprof-late-overloaded and -fprof-late-overloaded-calls
Finley McIlwaine (@FinleyMcIlwaine)
gitlab at gitlab.haskell.org
Tue Apr 9 15:19:39 UTC 2024
Finley McIlwaine pushed to branch wip/t24277 at Glasgow Haskell Compiler / GHC
Commits:
61bb5ff6 by Finley McIlwaine at 2024-03-04T09:01:40-08:00
add -fprof-late-overloaded and -fprof-late-overloaded-calls
* Refactor late cost centre insertion for extensibility
* Add two more late cost centre insertion methods that add SCCs to overloaded
top level bindings and call sites with dictionary arguments.
* Some tests for the basic functionality of the new insertion methods
Resolves: #24500
- - - - -
82ccb801 by Andreas Klebinger at 2024-03-04T19:59:14-05:00
x86-ncg: Fix fma codegen when arguments are globals
Fix a bug in the x86 ncg where results would be wrong when the desired output
register and one of the input registers were the same global.
Also adds a tiny optimization to make use of the memory addressing
support when convenient.
Fixes #24496
- - - - -
18ad1077 by Matthew Pickering at 2024-03-05T14:22:31-05:00
rel_eng: Update hackage docs upload scripts
This adds the upload of ghc-internal and ghc-experimental to our scripts
which upload packages to hackage.
- - - - -
bf47c9ba by Matthew Pickering at 2024-03-05T14:22:31-05:00
docs: Remove stray module comment from GHC.Profiling.Eras
- - - - -
37d9b340 by Matthew Pickering at 2024-03-05T14:22:31-05:00
Fix ghc-internal cabal file
The file mentioned some artifacts relating to the base library. I have
renamed these to the new ghc-internal variants.
- - - - -
23f2a478 by Matthew Pickering at 2024-03-05T14:22:31-05:00
Fix haddock source links and hyperlinked source
There were a few issues with the hackage links:
1. We were using the package id rather than the package name for the
package links. This is fixed by now allowing the template to mention
%pkg% or %pkgid% and substituting both appropriately.
2. The `--haddock-base-url` flag is renamed to `--haddock-for-hackage`
as the new base link works on a local or remote hackage server.
3. The "src" path included too much, so cross-package source
links were broken as the template was getting double expanded.
Fixes #24086
- - - - -
2fa336a9 by Ben Gamari at 2024-03-05T14:23:07-05:00
filepath: Bump submodule to 1.5.2.0
- - - - -
31217944 by Ben Gamari at 2024-03-05T14:23:07-05:00
os-string: Bump submodule to 2.0.2
- - - - -
4074a3f2 by Matthew Pickering at 2024-03-05T21:44:35-05:00
base: Reflect new era profiling RTS flags in GHC.RTS.Flags
* -he profiling mode
* -he profiling selector
* --automatic-era-increment
CLC proposal #254 - https://github.com/haskell/core-libraries-committee/issues/254
- - - - -
a8c0e31b by Sylvain Henry at 2024-03-05T21:45:14-05:00
JS: faster implementation for some numeric primitives (#23597)
Use faster implementations for the following primitives in the JS
backend by not using JavaScript's BigInt:
- plusInt64
- minusInt64
- minusWord64
- timesWord64
- timesInt64
Co-authored-by: Josh Meredith <joshmeredith2008 at gmail.com>
- - - - -
21e3f325 by Cheng Shao at 2024-03-05T21:45:52-05:00
rts: add -xr option to control two step allocator reserved space size
This patch adds a -xr RTS option to control the size of virtual memory
address space reserved by the two step allocator on a 64-bit platform,
see added documentation for explanation. Closes #24498.
- - - - -
dedcf102 by Cheng Shao at 2024-03-06T13:39:04-05:00
rts: expose HeapAlloc.h as public header
This commit exposes HeapAlloc.h as a public header. The intention is
to expose HEAP_ALLOCED/HEAP_ALLOCED_GC, so they can be used in
assertions in other public headers, and they may also be useful for
user code.
- - - - -
d19441d7 by Cheng Shao at 2024-03-06T13:39:04-05:00
rts: assert pointer is indeed heap allocated in Bdescr()
This commit adds an assertion to Bdescr() to assert the pointer is
indeed heap allocated. This is useful to rule out RTS bugs that
attempt to access non-existent block descriptor of a static closure, #24492
being one such example.
- - - - -
9a656a04 by Ben Gamari at 2024-03-06T13:39:39-05:00
ghc-experimental: Add dummy dependencies to work around #23942
This is a temporary measure to improve CI reliability until a proper
solution is developed.
Works around #23942.
- - - - -
1e84b924 by Simon Peyton Jones at 2024-03-06T13:39:39-05:00
Three compile perf improvements with deep nesting
These changes were all triggered by #24471.
1. Make GHC.Core.Opt.SetLevels.lvlMFE behave better when there are
many free variables. See Note [Large free-variable sets].
2. Make GHC.Core.Opt.Arity.floatIn a bit lazier in its Cost argument.
This benefits the common case where the ArityType turns out to
be nullary. See Note [Care with nested expressions]
3. Make GHC.CoreToStg.Prep.cpeArg behave for deeply-nested
expressions. See Note [Eta expansion of arguments in CorePrep]
wrinkle (EA2).
Compile times go down by up to 4.5%, and much more in artificial
cases. (Geo mean of compiler/perf changes is -0.4%.)
Metric Decrease:
CoOpt_Read
T10421
T12425
- - - - -
c4b13113 by Hécate Moonlight at 2024-03-06T13:40:17-05:00
Use "module" instead of "library" when applicable in base haddocks
- - - - -
9cd9efb4 by Vladislav Zavialov at 2024-03-07T13:01:54+03:00
Rephrase error message to say "visible arguments" (#24318)
* Main change: make the error message generated by mkFunTysMsg more
accurate by changing "value arguments" to "visible arguments".
* Refactor: define a new type synonym VisArity and use it instead of
Arity in a few places.
It might be the case that there are other places in the compiler that should
talk about visible arguments rather than value arguments, but I haven't
tried to find them all, focusing only on the error message reported in
the ticket.
- - - - -
d523a6a7 by Ben Gamari at 2024-03-07T19:40:45-05:00
Bump array submodule
- - - - -
7e55003c by Ben Gamari at 2024-03-07T19:40:45-05:00
Bump stm submodule
- - - - -
32d337ef by Ben Gamari at 2024-03-07T19:40:45-05:00
Introduce exception context
Here we introduce the `ExceptionContext` type and `ExceptionAnnotation`
class, allowing dynamically-typed user-defined annotations to be
attached to exceptions.
CLC Proposal: https://github.com/haskell/core-libraries-committee/issues/199
GHC Proposal: https://github.com/ghc-proposals/ghc-proposals/pull/330
- - - - -
39f3d922 by Ben Gamari at 2024-03-07T19:40:46-05:00
testsuite/interface-stability: Update documentation
- - - - -
fdea7ada by Ben Gamari at 2024-03-07T19:40:46-05:00
ghc-internal: comment formatting
- - - - -
4fba42ef by Ben Gamari at 2024-03-07T19:40:46-05:00
compiler: Default and warn ExceptionContext constraints
- - - - -
3886a205 by Ben Gamari at 2024-03-07T19:40:46-05:00
base: Introduce exception backtraces
Here we introduce the `Backtraces` type and associated machinery for
attaching these via `ExceptionContext`. This has a few compile-time
regressions (`T15703` and `T9872d`) due to the additional dependencies
in the exception machinery.
As well, there is a surprisingly large regression in the
`size_hello_artifact` test. This appears to be due to various `Integer` and
`Read` bits now being reachable at link-time. I believe it should be
possible to avoid this but I have accepted the change for now to get the
feature merged.
CLC Proposal: https://github.com/haskell/core-libraries-committee/issues/199
GHC Proposal: https://github.com/ghc-proposals/ghc-proposals/pull/330
Metric Increase:
T15703
T9872d
size_hello_artifact
- - - - -
18c5409f by Ben Gamari at 2024-03-07T19:40:46-05:00
users guide: Release notes for exception backtrace work
- - - - -
f849c5fc by Ben Gamari at 2024-03-07T19:40:46-05:00
compiler: Don't show ExceptionContext of GhcExceptions
Most GhcExceptions are user-facing errors and therefore the
ExceptionContext has little value. Ideally we would enable
it in the DEBUG compiler but I am leaving this for future work.
- - - - -
dc646e6f by Sylvain Henry at 2024-03-07T19:40:46-05:00
Disable T9930fail for the JS target (cf #19174)
- - - - -
bfc09760 by Alan Zimmerman at 2024-03-07T19:41:22-05:00
Update showAstData to honour blanking of AnnParen
Also tweak rendering of SrcSpan to remove extra blank line.
- - - - -
50454a29 by Ben Gamari at 2024-03-08T03:32:42-05:00
ghc-internal: Eliminate GHC.Internal.Data.Kind
This was simply reexporting things from `ghc-prim`. Instead reexport
these directly from `Data.Kind`. Also add build ordering dependency to
work around #23942.
- - - - -
38a4b6ab by Ben Gamari at 2024-03-08T03:33:18-05:00
rts: Fix SET_HDR initialization of retainer set
This fixes a regression in retainer set profiling introduced by
b0293f78cb6acf2540389e22bdda420d0ab874da. Prior to that commit
the heap traversal word would be initialized by `SET_HDR` using
`LDV_RECORD_CREATE`. However, the commit added a `doingLDVProfiling`
check in `LDV_RECORD_CREATE`, meaning that this initialization no longer
happened.
Given that this initialization was awkwardly indirect anyway, I have
fixed this by explicitly initializing the heap traversal word to
`NULL` in `SET_PROF_HDR`. This is equivalent to the previous behavior,
but much more direct.
Fixes #24513.
- - - - -
2859a637 by Ben Gamari at 2024-03-08T18:26:47-05:00
base: Use strerror_r instead of strerror
As noted by #24344, `strerror` is not necessarily thread-safe.
Thankfully, POSIX.1-2001 has long offered `strerror_r`, which is
safe to use.
Fixes #24344.
CLC discussion: https://github.com/haskell/core-libraries-committee/issues/249
- - - - -
edb9bf77 by Jade at 2024-03-09T03:39:38-05:00
Error messages: Improve Error messages for Data constructors in type signatures.
This patch improves the error messages from invalid type signatures by
trying to guess what the user did and suggesting an appropriate fix.
Partially fixes: #17879
- - - - -
cfb197e3 by Patrick at 2024-03-09T03:40:15-05:00
HieAst: add module name #24493
The main purpose of this is to tuck the module name `xxx` in `module xxx where` into the hieAst.
It should fix #24493.
The following have been done:
1. Renamed and updated `tcg_doc_hdr :: Maybe (LHsDoc GhcRn)` to `tcg_hdr_info :: (Maybe (LHsDoc GhcRn), Maybe (XRec GhcRn ModuleName))`
to store the located module name information.
2. Updated `RenamedSource` and `RenamedStuff` with the extra `Maybe (XRec GhcRn ModuleName)` located module name information.
3. Added test `testsuite/tests/hiefile/should_compile/T24493.hs` to ensure the module name is added, and updated several relevant tests.
4. Accompanying haddock submodule test update MR: https://gitlab.haskell.org/ghc/haddock/-/merge_requests/53
- - - - -
2341d81e by Vaibhav Sagar at 2024-03-09T03:40:54-05:00
GHC.Utils.Binary: fix a couple of typos
- - - - -
5580e1bd by Ben Gamari at 2024-03-09T03:41:30-05:00
rts: Drop .wasm suffix from .prof file names
This replicates the behavior on Windows, where `Hi.exe` will produce
profiling output named `Hi.prof` instead of `Hi.exe.prof`.
While in the area I also fixed the extension-stripping logic, which
incorrectly rewrote `Hi.exefoo` to `Hi.foo`.
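
The suffix-stripping behaviour described above can be sketched in Haskell. This is only an illustration: the RTS code itself is C, and `stripExeSuffix` and `profName` are hypothetical names, not functions from the patch. The key point is that a suffix is dropped only when it ends the file name, so `Hi.exefoo` is left alone.

```haskell
import Data.List (isSuffixOf)

-- | Drop the first matching suffix, but only when the path actually ends
-- with it. (Hypothetical helper, not the RTS implementation.)
stripExeSuffix :: [String] -> FilePath -> FilePath
stripExeSuffix suffixes path =
  case [s | s <- suffixes, s `isSuffixOf` path] of
    (s : _) -> take (length path - length s) path
    []      -> path

-- | Name of the profiling output for a given executable name.
profName :: FilePath -> FilePath
profName exe = stripExeSuffix [".exe", ".wasm"] exe ++ ".prof"

main :: IO ()
main = mapM_ (putStrLn . profName) ["Hi.exe", "Hi.wasm", "Hi.exefoo", "Hi"]
```

With this sketch, `Hi.exe` and `Hi.wasm` both map to `Hi.prof`, while `Hi.exefoo` maps to `Hi.exefoo.prof`.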
Closes #24515.
- - - - -
259495ee by Cheng Shao at 2024-03-09T03:41:30-05:00
testsuite: drop exe extension from .hp & .prof filenames
See #24515 for details.
- - - - -
c477a8d2 by Ben Gamari at 2024-03-09T03:42:05-05:00
rts/linker: Enable GOT support on all platforms
There is nothing platform-dependent about our GOT implementation and
GOT support is needed by `T24171` on i386.
- - - - -
2e592857 by Vladislav Zavialov at 2024-03-09T03:42:41-05:00
Drop outdated comment on TcRnIllformedTypePattern
This should have been done in 0f0c53a501b but I missed it.
- - - - -
c554b4da by Ben Gamari at 2024-03-09T09:39:20-05:00
rts/CloneStack: Bounds check array write
- - - - -
15c590a5 by Ben Gamari at 2024-03-09T09:39:20-05:00
rts/CloneStack: Don't expose helper functions in header
- - - - -
e831ce31 by Ben Gamari at 2024-03-09T09:39:20-05:00
base: Move internals of GHC.InfoProv into GHC.InfoProv.Types
Such that we can add new helpers into GHC.InfoProv.Types without
breakage.
- - - - -
6948e24d by Ben Gamari at 2024-03-09T09:39:20-05:00
rts: Lazily decode IPE tables
Previously we would eagerly allocate `InfoTableEnt`s for each
info table registered in the info table provenance map. However, this
costs considerable memory and initialization time. Instead we now
lazily decode these tables. This allows us to use one-third the memory
*and* opens the door to taking advantage of sharing opportunities within
a module.
This required considerable reworking since lookupIPE now must be passed
its result buffer.
- - - - -
9204a04e by Ben Gamari at 2024-03-09T09:39:20-05:00
rts/IPE: Don't expose helper in header
- - - - -
308926ff by Ben Gamari at 2024-03-09T09:39:20-05:00
rts/IPE: Share module_name within a Node
This allows us to shave a 64-bit word off of the packed IPE entry size.
- - - - -
bebdea05 by Ben Gamari at 2024-03-09T09:39:20-05:00
IPE: Expose unit ID in InfoTableProv
Here we add the unit ID to the info table provenance structure.
- - - - -
6519c9ad by Ben Gamari at 2024-03-09T09:39:35-05:00
rts: Refactor GHC.Stack.CloneStack.decode
Don't allocate a Ptr constructor per frame.
- - - - -
ed0b69dc by Ben Gamari at 2024-03-09T09:39:35-05:00
base: Do not expose whereFrom# from GHC.Exts
- - - - -
2b1faea9 by Vladislav Zavialov at 2024-03-09T17:38:21-05:00
docs: Update info on TypeAbstractions
* Mention TypeAbstractions in 9.10.1-notes.rst
* Set the status to "Experimental".
* Add a "Since: GHC 9.x" comment to each section.
- - - - -
f8b88918 by Ben Gamari at 2024-03-09T21:21:46-05:00
ci-images: Bump Alpine image to bootstrap with 9.8.2
- - - - -
705e6927 by Ben Gamari at 2024-03-09T21:21:46-05:00
testsuite: Mark T24171 as fragile due to #24512
I will fix this but not in time for 9.10.1-alpha1
- - - - -
c74196e1 by Ben Gamari at 2024-03-09T21:21:46-05:00
testsuite: Mark linker_unload_native as fragile
In particular this fails on platforms without `dlinfo`. I plan to
address this but not before 9.10.1-alpha1.
- - - - -
f4d87f7a by Ben Gamari at 2024-03-09T21:21:46-05:00
configure: Bump version to 9.10
- - - - -
88df9a5f by Ben Gamari at 2024-03-09T21:21:46-05:00
Bump transformers submodule to 0.6.1.1
- - - - -
8176d5e8 by Ben Gamari at 2024-03-09T21:21:46-05:00
testsuite: Increase ulimit for T18623
1 MByte was just too tight and failed intermittently on some platforms
(e.g. CentOS 7). Bumping the limit to 8 MByte should provide sufficient
headroom.
Fixes #23139.
- - - - -
c74b38a3 by Ben Gamari at 2024-03-09T21:21:46-05:00
base: Bump version to 4.20.0.0
- - - - -
b2937fc3 by Ben Gamari at 2024-03-09T21:21:46-05:00
ghc-internal: Set initial version at 9.1001.0
This provides PVP compliance while maintaining a clear correspondence
between GHC releases and `ghc-internal` versions.
- - - - -
4ae7d868 by Ben Gamari at 2024-03-09T21:21:46-05:00
ghc-prim: Bump version to 0.11.0
- - - - -
50798dc6 by Ben Gamari at 2024-03-09T21:21:46-05:00
template-haskell: Bump version to 2.22.0.0
- - - - -
8564f976 by Ben Gamari at 2024-03-09T21:21:46-05:00
base-exports: Accommodate spurious whitespace changes in 32-bit output
It appears that this was
- - - - -
9d4f0e98 by Ben Gamari at 2024-03-09T21:21:46-05:00
users-guide: Move exception backtrace relnotes to 9.10
This was previously mistakenly added to the GHC 9.8 release notes.
- - - - -
145eae60 by Ben Gamari at 2024-03-09T21:21:46-05:00
gitlab/rel_eng: Fix name of Rocky8 artifact
- - - - -
39c2a630 by Ben Gamari at 2024-03-09T21:21:46-05:00
gitlab/rel_eng: Fix path of generate_jobs_metadata
- - - - -
aed034de by Ben Gamari at 2024-03-09T21:21:46-05:00
gitlab/upload: Rework recompression
The old `combine` approach was quite fragile due to use of filename
globbing. Moreover, it didn't parallelize well. This refactoring
makes the goal more obvious, parallelizes better, and is more robust.
- - - - -
dc207d06 by Ben Gamari at 2024-03-10T08:56:08-04:00
configure: Bump GHC version to 9.11
Bumps haddock submodule.
- - - - -
8b2513e8 by Ben Gamari at 2024-03-11T01:20:03-04:00
rts/linker: Don't unload code when profiling is enabled
The heap census may contain references (e.g. `Counter.identity`) to
static data which must be available when the census is reported at the
end of execution.
Fixes #24512.
- - - - -
7810b4c3 by Ben Gamari at 2024-03-11T01:20:03-04:00
rts/linker: Don't unload native objects when dlinfo isn't available
To do so is unsafe as we have no way of identifying references to
symbols provided by the object.
Fixes #24513. Fixes #23993.
- - - - -
0590764c by Ben Gamari at 2024-03-11T01:20:39-04:00
rel_eng/upload: Purge both $rel_name/ and $ver/
This is necessary for prereleases, where GHCup accesses the release via
`$ver/`
- - - - -
b85a4631 by Brandon Chinn at 2024-03-12T19:25:56-04:00
Remove duplicate code normalising slashes
- - - - -
c91946f9 by Brandon Chinn at 2024-03-12T19:25:56-04:00
Simplify regexes with raw strings
- - - - -
1a5f53c6 by Brandon Chinn at 2024-03-12T19:25:57-04:00
Don't normalize backslashes in characters
- - - - -
7ea971d3 by Andrei Borzenkov at 2024-03-12T19:26:32-04:00
Fix compiler crash caused by implicit RHS quantification in type synonyms (#24470)
- - - - -
39f3ac3e by Cheng Shao at 2024-03-12T19:27:11-04:00
Revert "compiler: make genSym use C-based atomic increment on non-JS 32-bit platforms"
This reverts commit 615eb855416ce536e02ed935ecc5a6f25519ae16. It was
originally intended to fix #24449, but it was merely sweeping the bug
under the rug. 3836a110577b5c9343915fd96c1b2c64217e0082 has properly
fixed the fragile test, and we no longer need the C version of genSym.
Furthermore, the C implementation causes trouble when compiling with
clang that targets i386 due to alignment warning and libatomic linking
issue, so it makes sense to revert it.
- - - - -
e6bfb85c by Cheng Shao at 2024-03-12T19:27:11-04:00
compiler: fix out-of-bound memory access of genSym on 32-bit
This commit fixes an unnoticed out-of-bound memory access of genSym on
32-bit. ghc_unique_inc is 32-bit sized/aligned on 32-bit platforms,
but we mistakenly treat it as a Word64 pointer in genSym, and
therefore will accidentally load 2 garbage higher bytes, or with a
small but non-zero chance, overwrite something else in the data
section, depending on how the linker places the data segments. This
regression was introduced in !11802 and fixed here.
- - - - -
77171cd1 by Ben Orchard at 2024-03-14T09:00:40-04:00
Note mutability of array and address access primops
Without an understanding of immutable vs. mutable memory, the index
primop family have a potentially non-intuitive type signature:
    indexOffAddr# :: Addr# -> Int# -> a
    readOffAddr#  :: Addr# -> Int# -> State# d -> (# State# d, a #)
indexOffAddr# might seem like a free generality improvement, which it
certainly is not!
This change adds a brief note on mutability expectations for most
index/read/write access primops.
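
As a small illustration of the distinction (a sketch, not part of the patch): the `index` primops are pure and so are only appropriate for memory that never changes, such as a string literal, whereas the `read` primops thread a `State#` token so that reads are ordered with respect to writes. The helper name `readIntAt` below is hypothetical.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import Foreign.Marshal.Alloc (alloca)
import Foreign.Storable (poke)
import GHC.Exts
import GHC.IO (IO (..))

-- Pure indexing into immutable memory: the literal's bytes never change,
-- so no State# token is needed.
secondChar :: Char
secondChar = C# (indexCharOffAddr# "hello"# 1#)

-- Reading mutable memory: the State# token sequences this read relative
-- to surrounding writes.
readIntAt :: Ptr Int -> IO Int
readIntAt (Ptr addr) = IO $ \s ->
  case readIntOffAddr# addr 0# s of
    (# s', n #) -> (# s', I# n #)

main :: IO ()
main = do
  print secondChar
  alloca $ \p -> do
    poke p (41 :: Int)
    readIntAt p >>= print
    poke p 42
    readIntAt p >>= print
```

Using `indexIntOffAddr#` on the `alloca`'d buffer instead would give the compiler licence to float or share the read, which is why it is not a free generalisation.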
- - - - -
7da7f8f6 by Alan Zimmerman at 2024-03-14T09:01:15-04:00
EPA: Fix regression discarding comments in contexts
Closes #24533
- - - - -
73be65ab by Fendor at 2024-03-19T01:42:53-04:00
Fix sharing of 'IfaceTyConInfo' during core to iface type translation
During heap analysis, we noticed that during generation of
'mi_extra_decls' we have lots of duplicates for the instances:
* `IfaceTyConInfo NotPromoted IfaceNormalTyCon`
* `IfaceTyConInfo IsPromoted IfaceNormalTyCon`
which should be shared instead of duplicated. This duplication increased
the number of live bytes by around 200MB while loading the agda codebase
into GHCi.
These instances are created during `CoreToIface` translation, in
particular `toIfaceTyCon`.
The generated core looks like:
    toIfaceTyCon
      = \ tc_sjJw ->
          case $wtoIfaceTyCon tc_sjJw of
          { (# ww_sjJz, ww1_sjNL, ww2_sjNM #) ->
          IfaceTyCon ww_sjJz (IfaceTyConInfo ww1_sjNL ww2_sjNM)
          }
which rebuilds the `IfaceTyConInfo` on every call and so prevents the sharing from working properly.
Adding explicit sharing, with NOINLINE annotations, changes the core to:
    toIfaceTyCon
      = \ tc_sjJq ->
          case $wtoIfaceTyCon tc_sjJq of { (# ww_sjNB, ww1_sjNC #) ->
          IfaceTyCon ww_sjNB ww1_sjNC
          }
which looks much more like sharing is happening.
We confirmed via ghc-debug that all duplications were eliminated and the
number of live bytes is noticeably reduced.
- - - - -
bd8209eb by Alan Zimmerman at 2024-03-19T01:43:28-04:00
EPA: Address more 9.10.1-alpha1 regressions from recent changes
Closes #24533
Hopefully for good this time
- - - - -
31bf85ee by Fendor at 2024-03-19T14:48:08-04:00
Escape multiple arguments in the settings file
Uses responseFile syntax.
The issue arises when GHC is installed on windows into a location that
has a space, for example the user name is 'Fake User'.
Consequently, the $topdir will also contain a space.
When we resolve the top dir in the string `-I$topdir/mingw/include`,
then `words` will turn this single argument into `-I/C/Users/Fake` and
`User/.../mingw/include` which trips up the flag argument parser of
various tools such as gcc or clang.
We avoid this by escaping the $topdir before replacing it in
`initSettings`.
Additionally, we allow escaping spaces and quotation marks in
arguments in the `settings` file.
Add regression test case to count the number of options after variable
expansion and argument escaping took place.
Additionally, we check that escaped spaces and double quotation marks are
correctly parsed.
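
The failure mode and the fix can be sketched as follows. This is a simplified illustration under stated assumptions: `escapeArg` and `splitArgs` are hypothetical helpers using plain backslash escapes, not the response-file parser GHC actually uses.

```haskell
-- | Backslash-escape spaces, double quotes and backslashes.
-- (Hypothetical helper for illustration.)
escapeArg :: String -> String
escapeArg = concatMap esc
  where
    esc c
      | c `elem` " \"\\" = ['\\', c]
      | otherwise        = [c]

-- | Split on unescaped spaces, resolving backslash escapes as we go.
splitArgs :: String -> [String]
splitArgs = go []
  where
    go acc []                = emit acc []
    go acc ('\\' : c : rest) = go (c : acc) rest
    go acc (' ' : rest)      = emit acc (go [] rest)
    go acc (c : rest)        = go (c : acc) rest

    emit []  xs = xs
    emit acc xs = reverse acc : xs

main :: IO ()
main = do
  let topdir = "/C/Users/Fake User"
  -- Naive expansion followed by `words` breaks the flag in two.
  print (words ("-I" ++ topdir ++ "/mingw/include"))
  -- Escaping $topdir before substitution keeps it as one argument.
  print (splitArgs ("-I" ++ escapeArg topdir ++ "/mingw/include"))
```

The first line printed shows the two broken pieces that confuse gcc or clang; the second shows the single intact `-I` argument.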
- - - - -
f45f700e by Matthew Pickering at 2024-03-19T14:48:44-04:00
Read global package database from settings file
Before this patch, the global package database was always assumed to be
in libdir </> package.conf.d.
This causes issues in GHC's build system because there are sometimes
situations where the package database you need to use is not located in
the same place as the settings file.
* The stage1 compiler needs to use stage1 libraries, so we should set
"Global Package DB" for the stage1 compiler to the stage1 package
database.
* Stage 2 cross compilers need to use stage2 libraries, so likewise, we
should set the package database path to `_build/stage2/lib/`
* The normal situation is where the stage2 compiler uses stage1
libraries. Then everything lines up.
* When installing we have rearranged everything so that the settings
file and package database line up properly, so then everything should
continue to work as before. In this case we set the relative package
db path to `package.conf.d`, so it resolves the same as before.
* ghc-pkg needs to be modified as well to look in the settings file for
the package database rather than assuming the global package database
location relative to the lib folder.
* Cabal/cabal-install will work correctly because they query the global
package database using `--print-global-package-db`.
A reasonable question is why not generate the "right" settings files in
the right places in GHC's build system. In order to do this you would
need to engineer wrappers for all executables to point to a specific
libdir. There are also situations where the same package db is used by
two different compilers with two different settings files (think stage2
cross compiler and stage3 compiler).
In short, this 10 line patch allows for some reasonable simplifications
in Hadrian at very little cost to anything else.
Fixes #24502
- - - - -
4c8f1794 by Matthew Pickering at 2024-03-19T14:48:44-04:00
hadrian: Remove stage1 testsuite wrappers logic
Now instead of producing wrappers which pass the global package database
argument to ghc and ghc-pkg, we write the location of the correct
package database into the settings file so you can just use the intree
compiler directly.
- - - - -
da0d8ba5 by Matthew Craven at 2024-03-19T14:49:20-04:00
Remove unused ghc-internal module "GHC.Internal.Constants"
- - - - -
b56d2761 by Matthew Craven at 2024-03-19T14:49:20-04:00
CorePrep: Rework lowering of BigNat# literals
Don't use bigNatFromWord#, because that's terrible:
* We shouldn't have to traverse a linked list at run-time
to build a BigNat# literal. That's just silly!
* The static List object we have to create is much larger
than the actual BigNat#'s contents, bloating code size.
* We have to read the corresponding interface file,
which causes un-tracked implicit dependencies. (#23942)
Instead, encode them into the appropriate platform-dependent
sequence of bytes, and generate code that copies these bytes
at run-time from an Addr# literal into a new ByteArray#.
A ByteArray# literal would be the correct thing to generate,
but these are not yet supported; see also #17747.
Somewhat surprisingly, this change results in a slight
reduction in compiler allocations, averaging around 0.5%
on ghc's compiler performance tests, including when compiling
programs that contain no bignum literals to begin with.
The specific cause of this has not been investigated.
Since this lowering no longer reads the interface file for
GHC.Num.BigNat, the reasoning in Note [Depend on GHC.Num.Integer]
is obsoleted. But the story of un-tracked built-in dependencies
remains complex, and Note [Tracking dependencies on primitives]
now exists to explain this complexity.
Additionally, many empty imports have been modified to refer to
this new note and comply with its guidance. Several empty imports
necessary for other reasons have also been given brief explanations.
Metric Decrease:
MultiLayerModulesTH_OneShot
- - - - -
349ea330 by Fendor at 2024-03-19T14:50:00-04:00
Eliminate thunk in 'IfaceTyCon'
Heap analysis showed that `IfaceTyCon` retains a thunk to
`IfaceTyConInfo`, defeating the sharing of the most common instances of
`IfaceTyConInfo`.
We make sure the indirection is removed by adding bang patterns to
`IfaceTyCon`.
Experimental results on the agda code base, where the `mi_extra_decls`
were read from disk:
Before this change, we observe around 8654045 instances of:
`IfaceTyCon[Name,THUNK_1_0]`
But these thunks almost exclusively point to a shared value!
Forcing the thunk a little bit more leads to `ghc-debug` reporting:
`IfaceTyCon[Name:Name,IfaceTyConInfo]`
and a noticeable reduction of live bytes (on agda ~10%).
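
A minimal sketch of the effect of strict fields, using stand-in types (`Info`, `LazyTyCon` and `StrictTyCon` are hypothetical, not the real `IfaceTyCon` definitions): with a lazy field the constructor happily stores an unevaluated thunk, while a strict field forces its argument at construction time, so no thunk can be retained.

```haskell
import Control.Exception (ErrorCall (..), evaluate, try)

-- Stand-in for IfaceTyConInfo (hypothetical, for illustration only).
data Info = Info Bool Int

data LazyTyCon   = LazyTyCon String Info     -- fields may hold thunks
data StrictTyCon = StrictTyCon !String !Info -- fields forced on construction

-- | Does evaluating the value to weak head normal form run the `error`
-- hidden in one of its fields?
throwsWhenBuilt :: a -> IO Bool
throwsWhenBuilt x = do
  r <- try (evaluate x)
  pure $ case r of
    Left (ErrorCall _) -> True
    Right _            -> False

main :: IO ()
main = do
  lazy   <- throwsWhenBuilt (LazyTyCon "T" (error "thunk"))
  strict <- throwsWhenBuilt (StrictTyCon "T" (error "thunk"))
  putStrLn ("lazy field forced at construction:   " ++ show lazy)
  putStrLn ("strict field forced at construction: " ++ show strict)
```

The lazy constructor reports `False` (the thunk survives inside the value), the strict one reports `True`.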
- - - - -
594bee0b by Krzysztof Gogolewski at 2024-03-19T14:50:36-04:00
Minor misc cleanups
- GHC.HsToCore.Foreign.JavaScript: remove dropRuntimeRepArgs;
boxed tuples don't take RuntimeRep args
- GHC.HsToCore.Foreign.Call: avoid partial pattern matching
- GHC.Stg.Unarise: strengthen the assertion; we can assert that
non-rubbish literals are unary rather than just non-void
- GHC.Tc.Gen.HsType: make sure the fsLit "literal" rule fires
- users_guide/using-warnings.rst: remove -Wforall-identifier,
now deprecated and does nothing
- users_guide/using.rst: fix formatting
- andy_cherry/test.T: remove expect_broken_for(23272...), 23272 is fixed
The rest are simple cleanups.
- - - - -
cf55a54b by Ben Gamari at 2024-03-19T14:51:12-04:00
mk/relpath: Fix quoting
Previously there were two instances in this script which lacked proper
quoting. This resulted in `relpath` invocations in the binary
distribution Makefile producing incorrect results on Windows, leading to
confusing failures from `sed` and the production of empty package
registrations.
Fixes #24538.
- - - - -
5ff88389 by Bryan Richter at 2024-03-19T14:51:48-04:00
testsuite: Disable T21336a on wasm
- - - - -
60023351 by Ben Gamari at 2024-03-19T22:33:10-04:00
hadrian/bindist: Eliminate extraneous `dirname` invocation
Previously we would call `dirname` twice per installed library file.
We now instead reuse this result. This helps appreciably on Windows, where
processes are quite expensive.
- - - - -
616ac300 by Ben Gamari at 2024-03-19T22:33:10-04:00
hadrian: Package mingw toolchain in expected location
This fixes #24525, a regression due to 41cbaf44a6ab5eb9fa676d65d32df8377898dc89.
Specifically, GHC expects to find the mingw32 toolchain in the binary distribution
root. However, after this patch it was packaged in the `lib/` directory.
- - - - -
de9daade by Ben Gamari at 2024-03-19T22:33:11-04:00
gitlab/rel_eng: More upload.sh tweaks
- - - - -
1dfe12db by Ben Gamari at 2024-03-19T22:33:11-04:00
rel_eng: Drop dead prepare_docs codepath
- - - - -
dd2d748b by Ben Gamari at 2024-03-19T22:33:11-04:00
rel_env/recompress_all: unxz before recompressing
Previously we would instead compress the xz archive *again*, and then
additionally compress it with the desired scheme.
Fixes #24545.
- - - - -
9d936c57 by Ben Gamari at 2024-03-19T22:33:11-04:00
mk-ghcup-metadata: Fix directory of testsuite tarball
As reported in #24546, the `dlTest` artifact should be extracted into
the `testsuite` directory.
- - - - -
6d398066 by Ben Gamari at 2024-03-19T22:33:11-04:00
ghcup-metadata: Don't populate dlOutput unless necessary
ghcup can apparently infer the output name of an artifact from its URL.
Consequently, we should only include the `dlOutput` field when it would
differ from the filename of `dlUri`.
Fixes #24547.
- - - - -
576f8b7e by Zubin Duggal at 2024-03-19T22:33:46-04:00
Revert "Apply shellcheck suggestion to SUBST_TOOLDIR"
This reverts commit c82770f57977a2b5add6e1378f234f8dd6153392.
The shellcheck suggestion is spurious and results in SUBST_TOOLDIR being a
no-op. `set` sets positional arguments for bash, but we want to set the variable
given as the first autoconf argument.
Fixes #24542
Metric decreases because the paths in the settings file are now shorter,
so we allocate less when we read the settings file.
-------------------------
Metric Decrease:
T12425
T13035
T9198
-------------------------
- - - - -
cdfe6e01 by Fendor at 2024-03-19T22:34:22-04:00
Compact serialisation of IfaceAppArgs
In #24563, we identified that IfaceAppArgs serialisation tags each
cons cell element with a discriminator byte. These bytes add up
quickly, blowing up interface files considerably when
'-fwrite-if-simplified-core' is enabled.
We compact the serialisation by writing out the length of
'IfaceAppArgs', followed by serialising the elements directly without
any discriminator byte.
This improvement can decrease the size of some interface files by up
to 35%.
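
The size difference between the two encodings can be sketched with a toy model. This is an assumption-laden illustration: elements are reduced to single bytes, the length prefix is a single byte, and none of the names below come from GHC's `Binary` machinery.

```haskell
import Data.Word (Word8)

-- | Cons-cell encoding: every element is preceded by a tag byte (1),
-- and the list is terminated by a nil tag (0).
encodeTagged :: [Word8] -> [Word8]
encodeTagged = foldr (\x rest -> 1 : x : rest) [0]

decodeTagged :: [Word8] -> [Word8]
decodeTagged (1 : x : rest) = x : decodeTagged rest
decodeTagged _              = []

-- | Length-prefixed encoding: one length byte, then the raw elements.
-- (Toy model: assumes fewer than 256 elements.)
encodeCounted :: [Word8] -> [Word8]
encodeCounted xs = fromIntegral (length xs) : xs

decodeCounted :: [Word8] -> [Word8]
decodeCounted []       = []
decodeCounted (n : xs) = take (fromIntegral n) xs

main :: IO ()
main = do
  let args = [10, 20, 30, 40]
  print (length (encodeTagged args), length (encodeCounted args))
  print (decodeTagged (encodeTagged args) == args)
  print (decodeCounted (encodeCounted args) == args)
```

For n elements the tagged form costs 2n + 1 bytes in this model and the counted form n + 1, which is where the savings come from when argument lists are numerous.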
- - - - -
97a2bb1c by Simon Peyton Jones at 2024-03-20T17:11:29+00:00
Expand untyped splices in tcPolyExprCheck
Fixes #24559
- - - - -
5f275176 by Alan Zimmerman at 2024-03-20T22:44:12-04:00
EPA: Clean up Exactprint helper functions a bit
- Introduce a helper lens to compose on `EpAnn a` vs `a` versions
- Rename some prime versions of functions back to non-prime
They were renamed during the rework
- - - - -
da2a10ce by Vladislav Zavialov at 2024-03-20T22:44:48-04:00
Type operators in promoteOccName (#24570)
Type operators differ from term operators in that they are lexically
classified as (type) constructors, not as (type) variables.
Prior to this change, promoteOccName did not account for this
difference, causing a scoping issue that affected RequiredTypeArguments.
type (!@#) = Bool
f = idee (!@#) -- Not in scope: ‘!@#’ (BUG)
Now we have a special case in promoteOccName to account for this.
- - - - -
247fc0fa by Preetham Gujjula at 2024-03-21T10:19:18-04:00
docs: Remove mention of non-existent Ord instance for Complex
The documentation for Data.Complex says that the Ord instance for Complex Float
is deficient, but there is no Ord instance for Complex a. The Eq instance for
Complex Float is similarly deficient, so we use that as an example instead.
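
One way to see such a deficiency (an illustrative reading, assuming the inherited problem in question is NaN handling): equality on `Complex Float` compares components with `Float`'s `==`, so a value with a NaN component is not equal to itself.

```haskell
import Data.Complex (Complex ((:+)))

nan :: Float
nan = 0 / 0

-- A complex number whose real part is NaN.
z :: Complex Float
z = nan :+ 0

main :: IO ()
main = do
  print (z == z)  -- reflexivity fails because NaN /= NaN
  print (z /= z)
```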
- - - - -
6fafc51e by Andrei Borzenkov at 2024-03-21T10:19:54-04:00
Fix TH handling in `pat_to_type_pat` function (#24571)
There was a missing case for `SplicePat` in the `pat_to_type_pat` function,
hence patterns with splices that were checked against `forall->` didn't work
properly because they fell into the "illegal pattern" case.
Code example that is now accepted:
g :: forall a -> ()
g $([p| a |]) = ()
- - - - -
52072f8e by Sylvain Henry at 2024-03-21T21:01:59-04:00
Type-check default declarations before deriving clauses (#24566)
See added Note and #24566. Default declarations must be type-checked
before deriving clauses.
- - - - -
7dfdf3d9 by Sylvain Henry at 2024-03-21T21:02:40-04:00
Lexer: small perf changes
- Use unsafeChr because we know our values to be valid
- Remove some unnecessary use of `ord` (return Word8 values directly)
- - - - -
864922ef by Sylvain Henry at 2024-03-21T21:02:40-04:00
JS: fix some comments
- - - - -
3e0b2b1f by Sebastian Graf at 2024-03-21T21:03:16-04:00
Simplifier: Re-do dependency analysis in abstractFloats (#24551)
In #24551, we abstracted a string literal binding over a type variable,
triggering a CoreLint error when that binding floated to top-level.
The solution implemented in this patch fixes this by re-doing dependency
analysis on a simplified recursive let binding that is about to be type
abstracted, in order to find the minimal set of type variables to abstract over.
See wrinkle (AB5) of Note [Floating and type abstraction] for more details.
Fixes #24551
- - - - -
8a8ac65a by Matthew Craven at 2024-03-23T00:20:52-04:00
Improve toInteger @Word32 on 64-bit platforms
On 64-bit platforms, every Word32 fits in an Int, so we can
convert to Int# without having to perform the overflow check
integerFromWord# uses internally.
- - - - -
0c48f2b9 by Apoorv Ingle at 2024-03-23T00:21:28-04:00
Fix for #24552 (see testcase T24552)
Fixes a bug in desugaring pattern synonym matches, introduced
while working on expanding `do`-blocks in #18324.
The `matchWrapper` unnecessarily (and incorrectly) filtered out the
default wild patterns in a match. Now the wild pattern alternative is
simply ignored by the pm check as its origin is `Generated`.
The current code now matches the expected semantics according to the language spec.
- - - - -
b72705e9 by Simon Peyton Jones at 2024-03-23T00:22:04-04:00
Print more info about kinds in error messages
This fixes #24553, where GHC unhelpfully said
error: [GHC-83865]
• Expected kind ‘* -> * -> *’, but ‘Foo’ has kind ‘* -> * -> *’
See Note [Showing invisible bits of types in error messages]
- - - - -
8f7cfc7e by Tristan Cacqueray at 2024-03-23T00:22:44-04:00
docs: remove the don't use float hint
This hint is outdated, ``Complex Float`` are now specialised,
and the heap space suggestion needs more nuance so it should
be explained in the unboxed/storable array documentation.
- - - - -
5bd8ed53 by Andreas Klebinger at 2024-03-23T16:18:33-04:00
NCG: Fix a bug in jump shortcutting.
When checking if a jump has more than one destination, account for the
possibility of some jumps not being representable by a BlockId.
We do so by having isJumpishInstr return a `Maybe BlockId` where Nothing
represents non-BlockId jump destinations.
Fixes #24507
- - - - -
8d67f247 by Ben Gamari at 2024-03-23T16:19:09-04:00
docs: Drop old release notes, add for 9.12.1
- - - - -
7db8c992 by Cheng Shao at 2024-03-25T13:45:46-04:00
rts: fix clang compilation on aarch64
This patch fixes function prototypes in ARMOutlineAtomicsSymbols.h
which caused "error: address argument to atomic operation must be a
pointer to _Atomic type" when compiling with clang on aarch64.
- - - - -
237194ce by Sylvain Henry at 2024-03-25T13:46:27-04:00
Lexer: fix imports for Alex 3.5.1 (#24583)
- - - - -
810660b7 by Cheng Shao at 2024-03-25T22:19:16-04:00
libffi-tarballs: bump libffi-tarballs submodule to libffi 3.4.6
This commit bumps the libffi-tarballs submodule to libffi 3.4.6, which
includes numerous upstream libffi fixes, especially
https://github.com/libffi/libffi/issues/760.
- - - - -
d2ba41e8 by Alan Zimmerman at 2024-03-25T22:19:51-04:00
EPA: do not duplicate comments in signature RHS
- - - - -
32a8103f by Rodrigo Mesquita at 2024-03-26T21:16:12-04:00
configure: Use LDFLAGS when trying linkers
A user may configure `LDFLAGS` but not `LD`. When choosing a linker, we
will prefer `lld`, then `ld.gold`, then `ld.bfd` -- however, we have to
check for a working linker. If any of these fails, we try the next in
line.
However, we were not considering the `$LDFLAGS` when checking if these
linkers worked. So we would pick a linker that does not support the
current $LDFLAGS and fail further down the line when we used that linker
with those flags.
Fixes #24565, where `LDFLAGS=-Wl,-z,pack-relative-relocs` is not
supported by `ld.gold` but that was being picked still.
- - - - -
bf65a7c3 by Rodrigo Mesquita at 2024-03-26T21:16:48-04:00
bindist: Clean xattrs of bin and lib at configure time
For issue #21506, we started cleaning the extended attributes of
binaries and libraries from the bindist *after* they were installed to
work around notarisation (#17418), as part of `make install`.
However, the `ghc-toolchain` binary that is now shipped with the bindist
must be run at `./configure` time. Since we only cleaned the xattributes
of the binaries and libs after they were installed, in some situations
users would be unable to run `ghc-toolchain` from the bindist, failing
at configure time (#24554).
In this commit we move the xattr cleaning logic to the configure script.
Fixes #24554
- - - - -
cfeb70d3 by Rodrigo Mesquita at 2024-03-26T21:17:24-04:00
Revert "NCG: Fix a bug in jump shortcutting."
This reverts commit 5bd8ed53dcefe10b72acb5729789e19ceb22df66.
Fixes #24586
- - - - -
13223f6d by Serge S. Gulin at 2024-03-27T07:28:51-04:00
JS: remove `h$rts_isProfiled` from `profiling`, keeping the version in
`rts/js/config.js`
- - - - -
0acfe391 by Alan Zimmerman at 2024-03-27T07:29:27-04:00
EPA: Do not extend declaration range for trailing zero len semi
The lexer inserts virtual semicolons having zero width.
Do not use them to extend the list span of items in a list.
- - - - -
cd0fb82f by Alan Zimmerman at 2024-03-27T19:33:08+00:00
EPA: Fix FamDecl range
The span was incorrect if opt_datafam_kind_sig was empty
- - - - -
f8f384a8 by Ben Gamari at 2024-03-29T01:23:03-04:00
Fix type of _get_osfhandle foreign import
Fixes #24601.
- - - - -
00d3ecf0 by Alan Zimmerman at 2024-03-29T12:19:10+00:00
EPA: Extend StringLiteral range to include trailing commas
This goes slightly against the exact printing philosophy where
trailing decorations should be in an annotation, but the
practicalities of adding it to the WarningTxt environment, and the
problems caused by deviating do not make a more principled approach
worthwhile.
- - - - -
efab3649 by brandon s allbery kf8nh at 2024-03-31T20:04:01-04:00
clarify Note [Preprocessing invocations]
- - - - -
c8a4c050 by Ben Gamari at 2024-04-02T12:50:35-04:00
rts: Fix TSAN_ENABLED CPP guard
This should be `#if defined(TSAN_ENABLED)`, not `#if TSAN_ENABLED`,
lest we suffer warnings.
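The difference between the two guard forms can be sketched as follows. This is a minimal illustration, not the RTS code: `DEMO_FLAG` is a hypothetical macro standing in for TSAN_ENABLED and is deliberately left undefined.

```c
/* DEMO_FLAG is a hypothetical stand-in for TSAN_ENABLED.
 * It is intentionally NOT defined in this file. */

/* Returns 1 if DEMO_FLAG is defined at compile time, 0 otherwise.
 * `defined()` is well-formed whether or not the macro exists.
 * By contrast, `#if DEMO_FLAG` on an undefined macro quietly evaluates
 * to 0 and draws a warning when compiling with -Wundef. */
static int demo_flag_enabled(void)
{
#if defined(DEMO_FLAG)
    return 1;
#else
    return 0;
#endif
}
```

Calling `demo_flag_enabled()` here returns 0; building with `-DDEMO_FLAG` would flip it to 1 without touching the guard.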
- - - - -
e91dad93 by Cheng Shao at 2024-04-02T12:50:35-04:00
rts: fix errors when compiling with TSAN
This commit fixes rts compilation errors when compiling with TSAN:
- xxx_FENCE macros are redefined and trigger CPP warnings.
- Use SIZEOF_W. WORD_SIZE_IN_BITS is provided by MachDeps.h which
Cmm.h doesn't include by default.
- - - - -
a9ab9455 by Cheng Shao at 2024-04-02T12:50:35-04:00
rts: fix clang-specific errors when compiling with TSAN
This commit fixes clang-specific rts compilation errors when compiling
with TSAN:
- clang doesn't have -Wtsan flag
- Fix prototype of ghc_tsan_* helper functions
- __tsan_atomic_* functions aren't clang built-ins and
sanitizer/tsan_interface_atomic.h needs to be included
- On macOS, TSAN runtime library is
libclang_rt.tsan_osx_dynamic.dylib, not libtsan. -fsanitize=thread
as a link-time flag will take care of linking the TSAN runtime
library anyway so remove tsan as an rts extra library
- - - - -
865bd717 by Cheng Shao at 2024-04-02T12:50:35-04:00
compiler: fix github link to __tsan_memory_order in a comment
- - - - -
07cb627c by Cheng Shao at 2024-04-02T12:50:35-04:00
ci: improve TSAN CI jobs
- Run TSAN jobs with +thread_sanitizer_cmm which enables Cmm
instrumentation as well.
- Run TSAN jobs in deb12 which ships gcc-12, a reasonably recent gcc
that @bgamari confirms he's using in #GHC:matrix.org. Ideally we
should be using latest clang release for latest improvements in
sanitizers, though that's left as future work.
- Mark TSAN jobs as manual+allow_failure in validate pipelines. The
purpose is to demonstrate that we have indeed at least fixed
building of TSAN mode in CI without blocking the patch to land, and
once merged other people can begin playing with TSAN using their own
dev setups and feature branches.
- - - - -
a1c18c7b by Andrei Borzenkov at 2024-04-02T12:51:11-04:00
Merge tc_infer_hs_type and tc_hs_type into one function using ExpType philosophy (#24299, #23639)
This patch implements refactoring which is a prerequisite to
updating kind checking of type patterns. This is a huge simplification
of the main worker that checks kind of HsType.
It also fixes the issues caused by previous code duplication, e.g.
that we didn't add module finalizers from splices in inference mode.
- - - - -
817e8936 by Rodrigo Mesquita at 2024-04-02T20:13:05-04:00
th: Hide the Language.Haskell.TH.Lib.Internal module from haddock
Fixes #24562
- - - - -
b36ee57b by Sylvain Henry at 2024-04-02T20:13:46-04:00
JS: reenable h$appendToHsString optimization (#24495)
The optimization introducing h$appendToHsString wasn't kicking in
anymore (while it did in 9.8.1) because of the changes introduced in #23270 (7e0c8b3bab30).
This patch reenables the optimization by matching on case-expression, as
done in Cmm for unpackCString# standard thunks.
The test is also T24495 added in the next commits (two commits for ease
of backporting to 9.8).
- - - - -
527616e9 by Sylvain Henry at 2024-04-02T20:13:46-04:00
JS: fix h$appendToHsString implementation (#24495)
h$appendToHsString needs to wrap its argument in an updatable thunk
to behave like unpackAppendCString#. Otherwise if a SingleEntry thunk is
passed, it is stored as-is in a CONS cell, making the resulting list
impossible to deepseq (forcing the thunk doesn't update the contents of
the CONS cell)!
The added test checks that the optimization kicks in and that
h$appendToHsString works as intended.
Fix #24495
- - - - -
faa30b41 by Simon Peyton Jones at 2024-04-02T20:14:22-04:00
Deal with duplicate tyvars in type declarations
GHC was outright crashing before this fix: #24604
- - - - -
e0b0c717 by Simon Peyton Jones at 2024-04-02T20:14:58-04:00
Try using MCoercion in exprIsConApp_maybe
This is just a simple refactor that makes exprIsConApp_maybe
a little bit more direct, simple, and efficient.
Metrics: compile_time/bytes allocated
geo. mean -0.1%
minimum -2.0%
maximum -0.0%
Not a big gain, but worthwhile given that the code is, if anything,
easier to grok.
- - - - -
15f4d867 by Duncan Coutts at 2024-04-03T01:27:17-04:00
Initial ./configure support for selecting I/O managers
In this patch we just define new CPP vars, but don't yet use them
or replace the existing approach. That will follow.
The intention here is that every I/O manager can be enabled/disabled at
GHC build time (subject to some constraints). More than one I/O manager
can be enabled to be built. At least one I/O manager supporting the
non-threaded RTS must be enabled as well as at least one supporting the
threaded RTS. The I/O managers enabled here will become the choices
available at runtime at RTS startup (in later patches). The choice can
be made with RTS flags. There are separate sets of choices for the
threaded and non-threaded RTS ways, because most I/O managers are
specific to these ways. Furthermore we must establish a default I/O
manager for the threaded and non-threaded RTS.
Most I/O managers are platform-specific so there are checks to ensure
each one can be enabled on the platform. Such checks are also where (in
future) any system dependencies (e.g. libraries) can be checked.
The output is a set of CPP flags (in the mk/config.h file), with one
flag per named I/O manager:
* IOMGR_BUILD_<name> : which ones should be built (some)
* IOMGR_DEFAULT_NON_THREADED_<name> : which one is default (exactly one)
* IOMGR_DEFAULT_THREADED_<name> : which one is default (exactly one)
and a set of derived flags in IOManager.h
* IOMGR_ENABLED_<name> : enabled for the current RTS way
Note that IOMGR_BUILD_<name> just says that an I/O manager will be
built for _some_ RTS way (i.e. threaded or non-threaded). The derived
flags IOMGR_ENABLED_<name> in IOManager.h say if each I/O manager is
enabled in the "current" RTS way. These are the ones that can be used
for conditional compilation of the I/O manager code.
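The relationship between the build flags and the derived per-way flags might look roughly like the sketch below. All `DEMO_` names are hypothetical, and the assumption that the select manager serves only the non-threaded way while MIO serves only the threaded way is made purely for illustration.

```c
/* Hypothetical stand-ins for what ./configure would write to a config
 * header: both managers are built for *some* RTS way. */
#define DEMO_IOMGR_BUILD_SELECT 1
#define DEMO_IOMGR_BUILD_MIO 1

/* DEMO_THREADED_RTS is left undefined, so the "current" way in this
 * translation unit is the non-threaded one. */

/* Derived flags: enabled only if built AND applicable to the current way.
 * Assumption for this sketch: select is non-threaded only, MIO is
 * threaded only. */
#if defined(DEMO_IOMGR_BUILD_SELECT) && !defined(DEMO_THREADED_RTS)
#define DEMO_IOMGR_ENABLED_SELECT 1
#endif
#if defined(DEMO_IOMGR_BUILD_MIO) && defined(DEMO_THREADED_RTS)
#define DEMO_IOMGR_ENABLED_MIO 1
#endif

/* Code is then compiled conditionally on the derived flags only. */
static int demo_select_enabled(void)
{
#if defined(DEMO_IOMGR_ENABLED_SELECT)
    return 1;
#else
    return 0;
#endif
}

static int demo_mio_enabled(void)
{
#if defined(DEMO_IOMGR_ENABLED_MIO)
    return 1;
#else
    return 0;
#endif
}
```

In this non-threaded configuration `demo_select_enabled()` returns 1 and `demo_mio_enabled()` returns 0, even though both managers are marked as built.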
Co-authored-by: Pi Delport <pi at well-typed.com>
- - - - -
85b0f87a by Duncan Coutts at 2024-04-03T01:27:17-04:00
Change the handling of the RTS flag --io-manager=
Instead of being used only on Windows to select between the WinIO
and the MIO or Win32-legacy I/O managers, it is now used on all platforms
for selecting the I/O manager to use.
Right now it remains the case that there is only an actual choice on
Windows, but that will change later.
Document the --io-manager flag in the user guide.
This change is also reflected in the RTS flags types in the base
library. Deprecate the export of IoSubSystem from GHC.RTS.Flags with a
message to import it from GHC.IO.SubSystem.
The way the 'IoSubSystem' is detected also changes. Instead of looking
at the RTS flag, there is now a C bool global var in the RTS which gets
set on startup when the I/O manager is selected. This bool var says
whether the selected I/O manager classifies as "native" on Windows,
which in practice means the WinIO I/O manager has been selected.
Similarly, the is_io_mng_native_p RTS helper function is re-implemented
in terms of the selected I/O manager, rather than based on the RTS
flags.
We do however remove the ./configure --native-io-manager flag because
we're bringing the WinIO/MIO/Win32-legacy choice under the new general
scheme for selecting I/O managers, and that new scheme involves no
./configure time user choices, just runtime RTS flag choices.
- - - - -
1a8f020f by Duncan Coutts at 2024-04-03T01:27:17-04:00
Convert {init,stop,exit}IOManager to switch style
Rather than ad-hoc cpp conditionals on THREADED_RTS and mingw32_HOST_OS,
we use a style where we switch on the I/O manager impl, with cases for
each I/O manager impl.
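As a rough sketch of the switch style (hypothetical names throughout; the real RTS switches on a global rather than taking a parameter, and its identifiers may differ):

```c
/* Hypothetical enumeration of I/O manager implementations. */
typedef enum {
    DEMO_IOMGR_SELECT,
    DEMO_IOMGR_MIO,
    DEMO_IOMGR_WINIO
} DemoIOManagerType;

/* One entry point per operation, with one case per I/O manager,
 * instead of nested #if blocks on the RTS way and host OS.
 * The returned strings merely stand in for the per-manager work. */
static const char *demo_init_io_manager(DemoIOManagerType type)
{
    switch (type) {
    case DEMO_IOMGR_SELECT:
        return "select: init";
    case DEMO_IOMGR_MIO:
        return "mio: init";
    case DEMO_IOMGR_WINIO:
        return "winio: init";
    default:
        /* An unknown manager is a programming error in this sketch. */
        return "unknown I/O manager";
    }
}
```

Adding a manager then means adding an enumerator and one case per operation, rather than threading another condition through existing CPP blocks.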
- - - - -
a5bad3d2 by Duncan Coutts at 2024-04-03T01:27:17-04:00
Split up the CapIOManager content by I/O manager
Using the new IOMGR_ENABLED_<name> CPP defines.
- - - - -
1d36e609 by Duncan Coutts at 2024-04-03T01:27:17-04:00
Convert initIOManagerAfterFork and wakeupIOManager to switch style
- - - - -
c2f26f36 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Move most of waitRead#/Write# from cmm to C
Moves it into the IOManager.c where we can follow the new pattern of
switching on the selected I/O manager.
- - - - -
457705a8 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Move most of the delay# impl from cmm to C
Moves it into the IOManager.c where we can follow the new pattern of
switching on the selected I/O manager.
Uses a new IOManager API: syncDelay, following the naming convention of
sync* for thread-synchronous I/O & timer/delay operations.
As part of porting from cmm to C, we maintain the rule that the
why_blocked gets accessed using load acquire and store release atomic
memory operations. There was one exception to this rule: in the delay#
primop cmm code on posix (not win32), the why_blocked was being updated
using a store relaxed, not a store release. I've no idea why. In this
conversion I'm playing it safe here and using store release consistently.
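The load-acquire / store-release rule can be sketched with C11 atomics. The struct and constants below are hypothetical stand-ins, not the RTS's TSO layout or its own atomic macros.

```c
#include <stdatomic.h>

/* Hypothetical blocking reasons. */
#define DEMO_NOT_BLOCKED      0u
#define DEMO_BLOCKED_ON_DELAY 1u

/* Hypothetical, heavily simplified thread state object. */
typedef struct {
    _Atomic unsigned int why_blocked;
    long wake_time;   /* ordinary field, published via why_blocked */
} DemoTSO;

/* Writer: fill in the payload first, then publish with a release store,
 * so a reader that observes the new why_blocked also sees wake_time. */
static void demo_block_on_delay(DemoTSO *tso, long wake_time)
{
    tso->wake_time = wake_time;
    atomic_store_explicit(&tso->why_blocked, DEMO_BLOCKED_ON_DELAY,
                          memory_order_release);
}

/* Reader: an acquire load pairs with the release store above. */
static unsigned int demo_why_blocked(DemoTSO *tso)
{
    return atomic_load_explicit(&tso->why_blocked, memory_order_acquire);
}
```

With `memory_order_relaxed` on the store, nothing would order the `wake_time` write before the `why_blocked` update as seen from another thread, which is why release is the conservative choice.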
- - - - -
e93058e0 by Duncan Coutts at 2024-04-03T01:27:18-04:00
insertIntoSleepingQueue is no longer public
No longer defined in IOManager.h, just a private function in
IOManager.c. Since it is no longer called from cmm code, just from
syncDelay. It ought to get moved further into the select() I/O manager
impl, rather than living in IOManager.c.
On the other hand appendToIOBlockedQueue is still called from cmm code
in the win32-legacy I/O manager primops async{Read,Write}#, and it is
also used by the select() I/O manager. Update the CPP and comments to
reflect this.
- - - - -
60ce9910 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Move anyPendingTimeoutsOrIO impl from .h to .c
The implementation is eventually going to need to use more private
things, which will drag in unwanted includes into IOManager.h, so it's
better to move the impl out of the header file and into the .c file, at
the slight cost of it no longer being inline.
At the same time, change to the "switch (iomgr_type)" style.
- - - - -
f70b8108 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Take a simpler approach to gcc warnings in IOManager.c
We have lots of functions with conditional implementations for
different I/O managers. Some functions, for some I/O managers,
naturally have implementations that do nothing or barf. When only one
such I/O manager is enabled then the whole function implementation will
have an implementation that does nothing or barfs. This then results in
warnings from gcc that parameters are unused, or that the function
should be marked with attribute noreturn (since barf does not return).
The USED_IF_THREADS trick for fine-grained warning suppression is fine
for just two cases, but an equivalent here would need
USED_IF_THE_ONLY_ENABLED_IOMGR_IS_X_OR_Y which would have combinatorial
blowup. So we take a coarse grained approach and simply disable these
two warnings for the whole file.
So we use a GCC pragma, with its handy push/pop support:
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wsuggest-attribute=noreturn"
#pragma GCC diagnostic ignored "-Wunused-parameter"
...
#pragma GCC diagnostic pop
- - - - -
b48805b9 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Add a new trace class for the iomanager
It makes sense now for it to be separate from the scheduler class of
tracers.
Enabled with +RTS -Do. Document the -Do debug flag in the user guide.
- - - - -
f0c1f862 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Have the throwTo impl go via (new) IOManager APIs
rather than directly operating on the IO manager's data structures.
Specifically, when throwing an async exception to a thread that is
blocked waiting for I/O or waiting for a timer, then we want to cancel
that I/O waiting or cancel the timer. Currently this is done directly in
removeFromQueues() in RaiseAsync.c. We want it to go via proper APIs
both for modularity but also to let us support multiple I/O managers.
So add sync{IO,Delay}Cancel, which is the cancellation for the
corresponding sync{IO,Delay}. The implementations of these use the usual
"switch (iomgr_type)" style.
- - - - -
4f9e9c4e by Duncan Coutts at 2024-04-03T01:27:18-04:00
Move awaitEvent into a proper IOManager API
and have the scheduler use it.
Previously the scheduler calls awaitEvent directly, and awaitEvent is
implemented directly in the RTS I/O managers (select, win32). This
relies on the old scheme where there's a single active I/O manager for
each platform and RTS way.
We want to move that to go via an API in IOManager.{h,c} which can then
call out to the active I/O manager.
Also take the opportunity to split awaitEvent into two. The existing
awaitEvent has a bool wait parameter, to say if the call should be
blocking or non-blocking. We split this into two separate functions:
pollCompletedTimeoutsOrIO and awaitCompletedTimeoutsOrIO. We split them
for a few reasons: they have different post-conditions (specifically the
await version is supposed to guarantee that there are threads runnable
when it completes). Secondly, it is also anticipated that in future I/O
managers the implementations of the two cases will be simpler if they
are separated.
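The split can be pictured with a toy model. Everything below is hypothetical: the state struct, the function names, and the way "waiting" is simulated are illustrative only.

```c
#include <stdbool.h>

/* Toy model of I/O manager state (hypothetical). */
typedef struct {
    int in_flight;   /* operations still waiting on the OS          */
    int completed;   /* operations finished but not yet collected   */
    int runnable;    /* threads made runnable by collected results  */
} DemoIOState;

/* Non-blocking variant: collect whatever has already completed.
 * No guarantee that any thread is runnable afterwards. */
static void demo_poll_completed(DemoIOState *s)
{
    s->runnable += s->completed;
    s->completed = 0;
}

/* Blocking variant: keep waiting until at least one thread is runnable.
 * Returns false only if there was nothing left to wait for. */
static bool demo_await_completed(DemoIOState *s)
{
    demo_poll_completed(s);
    while (s->runnable == 0 && s->in_flight > 0) {
        /* Stand-in for blocking until one operation finishes. */
        s->in_flight--;
        s->completed++;
        demo_poll_completed(s);
    }
    return s->runnable > 0;
}
```

The differing post-conditions show up directly: after the poll variant `runnable` may still be 0, while the await variant only returns with `runnable == 0` when nothing was in flight.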
- - - - -
5ad4b30f by Duncan Coutts at 2024-04-03T01:27:18-04:00
Rename awaitEvent in select and win32 I/O managers
These are now just called from IOManager.c and are the per-I/O manager
backend impls (whereas previously awaitEvent was the entry point).
Follow the new naming convention in the IOManager.{h,c} of
awaitCompletedTimeoutsOrIO, with the I/O manager's name as a suffix:
so awaitCompletedTimeoutsOrIO{Select,Win32}.
- - - - -
d30c6bc6 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Tidy up a couple things in Select.{h,c}
Use the standard #include {Begin,End}Private.h style rather than
RTS_PRIVATE on individual decls.
And conditionally build the code for the select I/O manager based on
the new CPP IOMGR_ENABLED_SELECT rather than on THREADED_RTS.
- - - - -
4161f516 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Add an IOManager API for scavenging TSO blocked_info
When the GC scavenges a TSO it needs to scavenge the tso->blocked_info
but the blocked_info is a big union and what lives there depends on the
tso->why_blocked, which for I/O-related reasons is something that in
principle is the responsibility of the I/O manager and not the GC. So
the right thing to do is for the GC to ask the I/O manager to scavenge
the blocked_info if it encounters any I/O-related why_blocked reasons.
So we add scavengeTSOIOManager in IOManager.{h,c} with the usual style.
Now as it happens, right now, there is no special scavenging to do, so
the implementation of scavengeTSOIOManager is a fancy no-op. That's
because the select I/O manager uses only the fd and target members,
which are not GC pointers, and the win32-legacy I/O manager _ought_ to
be using GC-managed heap objects for the StgAsyncIOResult but it is
actually using the C heap, so again no GC pointers. If the win32-legacy
were doing this more sensibly, then scavengeTSOIOManager would be the
right place to do the GC magic.
Future I/O managers will need GC heap objects in the tso->blocked_info
and will make use of this functionality.
- - - - -
94a87d21 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Add I/O manager API notifyIOManagerCapabilitiesChanged
Used in setNumCapabilities.
It only does anything for MIO on Posix.
Previously it always invoked Haskell code, but that code only did
anything on non-Windows (and non-JS), and only threaded. That currently
effectively means the MIO I/O manager on Posix.
So now it only invokes it for the MIO Posix case.
- - - - -
3be6d591 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Select an I/O manager early in RTS startup
We need to select the I/O manager to use during startup before the
per-cap I/O manager initialisation.
- - - - -
aaa294d0 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Make struct CapIOManager be fully opaque
Provide an opaque (forward) definition in Capability.h (since the cap
contains a *CapIOManager) and then only provide a full definition in
a new file IOManagerInternals.h. This new file is only supposed to be
included by the IOManager implementation, not by its users. So that
means IOManager.c and individual I/O manager implementations.
The posix/Signals.c still needs direct access, but that should be
eliminated. Anything that needs direct access either needs to be clearly
part of an I/O manager (e.g. the select() one) or go via a proper API.
- - - - -
877a2a80 by Duncan Coutts at 2024-04-03T01:27:18-04:00
The select() I/O manager does have some global initialisation
It's just to make sure an exception CAF is a GC root.
- - - - -
9c51473b by Duncan Coutts at 2024-04-03T01:27:18-04:00
Add tracing for the main I/O manager actions
Using the new tracer class.
Note: The unconditional definition of showIOManager should be
compatible with the debugTrace change in 7c7d1f6.
Co-authored-by: Pi Delport <pi at well-typed.com>
- - - - -
c7d3e3a3 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Include the default I/O manager in the +RTS --info output
Document the extra +RTS --info output in the user guide
- - - - -
8023bad4 by Duncan Coutts at 2024-04-03T01:27:18-04:00
waitRead# / waitWrite# do not work for win32-legacy I/O manager
Previously it was unclear that they did not work because the code path
was shared with other I/O managers (in particular select()).
Following the code carefully shows that what actually happens is that
the calling thread would block forever: the thread will be put into the
blocked queue, but no other action is scheduled that will ever result in
it getting unblocked.
It's better to just fail loudly in case anyone accidentally calls it,
also it's less confusing code.
- - - - -
83a74d20 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Conditionally ignore some GCC warnings
Some GCC versions don't know about some warnings, and they complain
that we're ignoring unknown warnings. So we try to ignore the warning
based on the GCC version.
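One way such a version gate could look is sketched below. The `DEMO_` macro names and the 4.6 threshold are assumptions for illustration; this message does not state which warnings or which versions the actual change checks.

```c
#include <stdlib.h>

/* True if version (hmaj.hmin) is at least (wmaj.wmin).
 * Usable both in #if and in ordinary C expressions. */
#define DEMO_VERSION_AT_LEAST(hmaj, hmin, wmaj, wmin) \
    ((hmaj) > (wmaj) || ((hmaj) == (wmaj) && (hmin) >= (wmin)))

/* Only ask GCC to ignore the warning when the compiler is real GCC
 * (clang also defines __GNUC__) and recent enough to know it.
 * The 4.6 threshold is an assumption made for this sketch. */
#if defined(__GNUC__) && !defined(__clang__) && \
    DEMO_VERSION_AT_LEAST(__GNUC__, __GNUC_MINOR__, 4, 6)
#define DEMO_IGNORING_NORETURN_SUGGESTION 1
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wsuggest-attribute=noreturn"
#endif

/* A function whose only implementation aborts, which is the kind of
 * code that would otherwise draw the noreturn suggestion. */
static void demo_unsupported_operation(int fd)
{
    (void)fd;
    abort();
}

#if defined(DEMO_IGNORING_NORETURN_SUGGESTION)
#pragma GCC diagnostic pop
#endif
```

On compilers that fail the gate the pragmas are never seen, so there is nothing for them to report as an unknown warning.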
- - - - -
1adc6fa4 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Accept changes to base-exports
All the changes are in fact not changes at all.
Previously, the IoSubSystem data type was defined in GHC.RTS.Flags and
exported from both GHC.RTS.Flags and GHC.IO.SubSystem. Now, the data
type is defined in GHC.IO.SubSystem and still exported from both
modules.
Therefore, the same exports and same instances are still available from
both modules. But the base-exports records only the defining module, and
so it looks like a change when it is fully compatible.
Related: we do add a deprecation to the export of the type via
GHC.RTS.Flags, telling people to use the export from GHC.IO.SubSystem.
Also the sort order for some unrelated Show instances changed. No idea
why.
The same changes apply in the other versions, with a few more changes
due to sort order weirdness.
- - - - -
8d950968 by Duncan Coutts at 2024-04-03T01:27:18-04:00
Accept metric decrease in T12227
I can't think of any good reason that anything in this MR should have
changed the number of allocations, up or down.
(Yes this is an empty commit.)
Metric Decrease:
T12227
- - - - -
e869605e by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Several improvements to the handling of coercions
* Make `mkSymCo` and `mkInstCo` smarter
Fixes #23642
* Fix return role of `SelCo` in the coercion optimiser.
Fixes #23617
* Make the coercion optimiser `opt_trans_rule` work better for newtypes
Fixes #23619
- - - - -
1efd0714 by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
FloatOut: improve floating for join point
See the new Note [Floating join point bindings].
* Completely get rid of the complicated join_ceiling nonsense, which
I have never understood.
* Do not float join points at all, except perhaps to top level.
* Some refactoring around wantToFloat, to treat Rec and NonRec more
uniformly
- - - - -
9c00154d by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Improve eta-expansion through call stacks
See Note [Eta expanding through CallStacks] in GHC.Core.Opt.Arity
This is a one-line change, that fixes an inconsistency
- || isCallStackPredTy ty
+ || isCallStackPredTy ty || isCallStackTy ty
- - - - -
95a9a172 by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Spelling, layout, pretty-printing only
- - - - -
bdf1660f by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Improve exprIsConApp_maybe a little
Eliminate a redundant case at birth. This sometimes reduces
Simplifier iterations.
See Note [Case elim in exprIsConApp_maybe].
- - - - -
609cd32c by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Inline GHC.HsToCore.Pmc.Solver.Types.trvVarInfo
When exploring compile-time regressions after meddling with the Simplifier, I
discovered that GHC.HsToCore.Pmc.Solver.Types.trvVarInfo was very delicately
balanced. It's a small, heavily used, overloaded function and it's important
that it inlines. By a fluke it was before, but at various times in my journey it
stopped doing so. So I just added an INLINE pragma to it; no sense in depending
on a delicately-balanced fluke.
- - - - -
ae24c9bc by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Slight improvement in WorkWrap
Ensure that WorkWrap preserves lambda binders, in case of join points. Sadly I
have forgotten why I made this change (it was while I was doing a lot of
meddling in the Simplifier), but
* it does no harm,
* it is slightly more efficient, and
* presumably it made something better!
Anyway I have kept it in a separate commit.
- - - - -
e9297181 by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Use named record fields for the CastIt { ... } data constructor
This is a pure refactor
- - - - -
b4581e23 by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Remove a long-commented-out line
Pure refactoring
- - - - -
e026bdf2 by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Simplifier improvements
This MR started as: allow the simplifier to do more in one pass,
arising from places I could see the simplifier taking two iterations
where one would do. But it turned into a larger project, because
these changes unexpectedly made inlining blow up, especially join
points in deeply-nested cases.
The main changes are below. There are also many new or rewritten Notes.
Avoiding simplifying repeatedly
~~~~~~~~~~~~~~~
See Note [Avoiding simplifying repeatedly]
* The SimplEnv now has a seInlineDepth field, which says how deep
in unfoldings we are. See Note [Inline depth] in Simplify.Env.
Currently used only for the next point: avoiding repeatedly
simplifying coercions.
* Avoid repeatedly simplifying coercions.
see Note [Avoid re-simplifying coercions] in Simplify.Iteration
As you'll see from the Note, this makes use of the seInlineDepth.
* Allow Simplify.Iteration.simplAuxBind to inline used-once things.
This is another part of Note [Post-inline for single-use things], and
is really good for reducing simplifier iterations in situations like
case K e of { K x -> blah }
where x is used once in blah.
* Make GHC.Core.SimpleOpt.exprIsConApp_maybe do some simple case
elimination. Note [Case elim in exprIsConApp_maybe]
* Improve the case-merge transformation:
- Move the main code to `GHC.Core.Utils.mergeCaseAlts`, to join `filterAlts`
and friends. See Note [Merge Nested Cases] in GHC.Core.Utils.
- Add a new case for `tagToEnum#`; see wrinkle (MC3).
- Add a new case to look through join points: see wrinkle (MC4)
postInlineUnconditionally
~~~~~~~~~~~~~~~~~~~~~~~~~
* Allow Simplify.Utils.postInlineUnconditionally to inline variables
that are used exactly once. See Note [Post-inline for single-use things].
* Do not postInlineUnconditionally a join point, ever.
Doing so does not reduce allocation, which is the main point,
and with join points that are used a lot it can bloat code.
See point (1) of Note [Duplicating join points] in
GHC.Core.Opt.Simplify.Iteration.
* Do not postInlineUnconditionally a strict (demanded) binding.
It will not allocate a thunk (it'll turn into a case instead)
so again the main point of inlining it doesn't hold. Better
to check per-call-site.
* Improve occurrence analysis for bottoming function calls, to help
postInlineUnconditionally. See Note [Bottoming function calls]
in GHC.Core.Opt.OccurAnal
Inlining generally
~~~~~~~~~~~~~~~~~~
* In GHC.Core.Opt.Simplify.Utils.interestingCallContext,
use RhsCtxt NonRecursive (not BoringCtxt) for a plain-seq case.
See Note [Seq is boring]. Also, wrinkle (SB1), inline in that
`seq` context only for INLINE functions (UnfWhen guidance).
* In GHC.Core.Opt.Simplify.Utils.interestingArg,
- return ValueArg for OtherCon [c1,c2, ...], but
- return NonTrivArg for OtherCon []
This makes a function a little less likely to inline if all we
know is that the argument is evaluated, but nothing else.
* isConLikeUnfolding is no longer true for OtherCon {}.
This propagates to exprIsConLike. Con-like-ness has /positive/
information.
Join points
~~~~~~~~~~~
* Be very careful about inlining join points.
See these two long Notes
Note [Duplicating join points] in GHC.Core.Opt.Simplify.Iteration
Note [Inlining join points] in GHC.Core.Opt.Simplify.Inline
* When making join points, don't do so if the join point is so small
it will immediately be inlined; check uncondInlineJoin.
* In GHC.Core.Opt.Simplify.Inline.tryUnfolding, improve the inlining
heuristics for join points. In general we /do not/ want to inline
join points /even if they are small/. See Note [Duplicating join points]
in GHC.Core.Opt.Simplify.Iteration.
But sometimes we do: see Note [Inlining join points] in
GHC.Core.Opt.Simplify.Inline; and the new `isBetterUnfoldingThan` function.
* Do not add an unfolding to a join point at birth. This is a tricky one
and has a long Note [Do not add unfoldings to join points at birth]
It shows up in two places
- In `mkDupableAlt` do not add an inlining
- (trickier) In `simplLetUnfolding` don't add an unfolding for a
fresh join point
I am not fully satisfied with this, but it works and is well documented.
* In GHC.Core.Unfold.sizeExpr, make jumps small, so that we don't penalise
having a non-inlined join point.
Performance changes
~~~~~~~~~~~~~~~~~~~
* Binary sizes fall by around 2.6%, according to nofib.
* Compile times improve slightly. Here are the figures over 1%.
I investigated the biggest difference in T18304. It's a very small module, just
a few hundred nodes. The large percentage difference is due to a single
function that didn't quite inline before, and does now, making code size a
bit bigger. I decided gains outweighed the losses.
Metrics: compile_time/bytes allocated (changes over +/- 1%)
------------------------------------------------
CoOpt_Singletons(normal)              -9.2% GOOD
LargeRecord(normal)                  -23.5% GOOD
MultiComponentModulesRecomp(normal)   +1.2%
MultiLayerModulesTH_OneShot(normal)   +4.1% BAD
PmSeriesS(normal)                     -3.8%
PmSeriesV(normal)                     -1.5%
T11195(normal)                        -1.3%
T12227(normal)                       -20.4% GOOD
T12545(normal)                        -3.2%
T12707(normal)                        -2.1% GOOD
T13253(normal)                        -1.2%
T13253-spj(normal)                    +8.1% BAD
T13386(normal)                        -3.1% GOOD
T14766(normal)                        -2.6% GOOD
T15164(normal)                        -1.4%
T15304(normal)                        +1.2%
T15630(normal)                        -8.2%
T15630a(normal)                         NEW
T15703(normal)                       -14.7% GOOD
T16577(normal)                        -2.3% GOOD
T17516(normal)                       -39.7% GOOD
T18140(normal)                        +1.2%
T18223(normal)                       -17.1% GOOD
T18282(normal)                        -5.0% GOOD
T18304(normal)                       +10.8% BAD
T18923(normal)                        -2.9% GOOD
T1969(normal)                         +1.0%
T19695(normal)                        -1.5%
T20049(normal)                       -12.7% GOOD
T21839c(normal)                       -4.1% GOOD
T3064(normal)                         -1.5%
T3294(normal)                         +1.2% BAD
T4801(normal)                         +1.2%
T5030(normal)                        -15.2% GOOD
T5321Fun(normal)                      -2.2% GOOD
T6048(optasm)                        -16.8% GOOD
T783(normal)                          -1.2%
T8095(normal)                         -6.0% GOOD
T9630(normal)                         -4.7% GOOD
T9961(normal)                         +1.9% BAD
WWRec(normal)                         -1.4%
info_table_map_perf(normal)           -1.3%
parsing001(normal)                    +1.5%
geo. mean                             -2.0%
minimum                              -39.7%
maximum                              +10.8%
* Runtimes generally improve. In the testsuite perf/should_run gives:
Metrics: runtime/bytes allocated
------------------------------------------
Conversions(normal)         -0.3%
T13536a(optasm)            -41.7% GOOD
T4830(normal)               -0.1%
haddock.Cabal(normal)       -0.1%
haddock.base(normal)        -0.1%
haddock.compiler(normal)    -0.1%
geo. mean                   -0.8%
minimum                    -41.7%
maximum                     +0.0%
* For runtime, nofib is a better test. The news is mostly good.
Here are the numbers that changed by more than +/- 0.1%:
# bytes allocated
==========================++==========
imaginary/digits-of-e1    ||   -14.40%
imaginary/digits-of-e2    ||    -4.41%
imaginary/paraffins       ||    -0.17%
imaginary/rfib            ||    -0.15%
imaginary/wheel-sieve2    ||    -0.10%
real/compress             ||    -0.47%
real/fluid                ||    -0.10%
real/fulsom               ||    +0.14%
real/gamteb               ||    -1.47%
real/gg                   ||    -0.20%
real/infer                ||    +0.24%
real/pic                  ||    -0.23%
real/prolog               ||    -0.36%
real/scs                  ||    -0.46%
real/smallpt              ||    +4.03%
shootout/k-nucleotide     ||   -20.23%
shootout/n-body           ||    -0.42%
shootout/spectral-norm    ||    -0.13%
spectral/boyer2           ||    -3.80%
spectral/constraints      ||    -0.27%
spectral/hartel/ida       ||    -0.82%
spectral/mate             ||   -20.34%
spectral/para             ||    +0.46%
spectral/rewrite          ||    +1.30%
spectral/sphere           ||    -0.14%
==========================++==========
geom mean                 ||    -0.59%
real/smallpt has a huge nest of local definitions, and I
could not pin down a reason for a regression. But there are
three big wins!
Metric Decrease:
CoOpt_Singletons
LargeRecord
T12227
T12707
T13386
T13536a
T14766
T15703
T16577
T17516
T18223
T18282
T18923
T21839c
T20049
T5321Fun
T5030
T6048
T8095
T9630
T783
Metric Increase:
MultiLayerModulesTH_OneShot
T13253-spj
T18304
T18698a
T9961
T3294
- - - - -
27db3c5e by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Testsuite message changes from simplifier improvements
- - - - -
271a7812 by Simon Peyton Jones at 2024-04-03T01:27:55-04:00
Account for bottoming functions in OccurAnal
This fixes #24582, a small but long-standing bug
- - - - -
0fde229f by Ben Gamari at 2024-04-04T07:04:58-04:00
testsuite: Introduce template-haskell-exports test
- - - - -
0c4a9686 by Luite Stegeman at 2024-04-04T07:05:39-04:00
Update correct counter in bumpTickyAllocd
- - - - -
5f085d3a by Fendor at 2024-04-04T14:47:33-04:00
Replace `SizedSeq` with `FlatBag` for flattened structure
Linked lists are notoriously memory inefficient when all we do is
traverse a structure.
As 'UnlinkedBCO' has been identified as a data structure that impacts
the overall memory usage of GHCi sessions, we avoid linked lists and
prefer flattened structure for storing.
We introduce a new memory efficient representation of sequential
elements that has special support for the cases:
* Empty
* Singleton
* Tuple Elements
This improves sharing in the 'Empty' case and avoids the overhead of
'Array' until its constant overhead is justified.
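A minimal sketch of such a representation follows. It is illustrative only: the constructor and function names (`ArrayFlatBag`, `flatBagFromList`, `flatBagToList`, `sizeFlatBag`) are assumptions and are not taken from the patch, and it uses `Data.Array` from the `array` package as the backing store for the general case.

```haskell
import Data.Array (Array, elems, listArray)

-- Illustrative sketch; names are hypothetical, not GHC's actual definitions.
data FlatBag a
  = EmptyFlatBag                -- shared by every empty bag
  | UnitFlatBag a               -- exactly one element, no array header
  | TupleFlatBag a a            -- exactly two elements, no array header
  | ArrayFlatBag (Array Int a)  -- three or more elements

-- Pick the cheapest constructor for the given number of elements.
flatBagFromList :: [a] -> FlatBag a
flatBagFromList []     = EmptyFlatBag
flatBagFromList [x]    = UnitFlatBag x
flatBagFromList [x, y] = TupleFlatBag x y
flatBagFromList xs     = ArrayFlatBag (listArray (0, length xs - 1) xs)

-- Recover the elements in their original order.
flatBagToList :: FlatBag a -> [a]
flatBagToList EmptyFlatBag       = []
flatBagToList (UnitFlatBag x)    = [x]
flatBagToList (TupleFlatBag x y) = [x, y]
flatBagToList (ArrayFlatBag arr) = elems arr

sizeFlatBag :: FlatBag a -> Int
sizeFlatBag = length . flatBagToList
```

The small cases pay only for their own fields; the array's bounds and size bookkeeping are incurred only once there are enough elements to justify them.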
- - - - -
82cfe10c by Fendor at 2024-04-04T14:47:33-04:00
Compact FlatBag array representation
`Array` contains three additional `Word`'s we do not need in `FlatBag`. Move
`FlatBag` to `SmallArray`.
Expand the API of SmallArray by `sizeofSmallArray` and add common
traversal functions, such as `mapSmallArray` and `foldMapSmallArray`.
Additionally, allow users to force the elements of a `SmallArray`
via `rnfSmallArray`.
- - - - -
36a75b80 by Andrei Borzenkov at 2024-04-04T14:48:10-04:00
Change how invisible patterns are represented in Haskell syntax and TH AST (#24557)
Before this patch:
data ArgPat p
= InvisPat (LHsType p)
| VisPat (LPat p)
With this patch:
data Pat p
= ...
| InvisPat (LHsType p)
...
And the same transformation in the TH land. The rest of the
changes is just updating code to handle new AST and writing tests
to check if it is possible to create invalid states using TH.
Metric Increase:
MultiLayerModulesTH_OneShot
- - - - -
28009fbc by Matthew Pickering at 2024-04-04T14:48:46-04:00
Fix off by one error in seekBinNoExpand and seekBin
- - - - -
9b9e031b by Ben Gamari at 2024-04-04T21:30:08-04:00
compiler: Allow more types in GHCForeignImportPrim
For many, many years `GHCForeignImportPrim` has suffered from the rather
restrictive limitation of not allowing any non-trivial types in arguments
or results. This limitation was justified by the code generator allegedly
barfing in the presence of such types.
However, this restriction appears to originate well before the NCG
rewrite and the new NCG does not appear to have any trouble with such
types (see the added `T24598` test). Lift this restriction.
Fixes #24598.
- - - - -
1324b862 by Alan Zimmerman at 2024-04-04T21:30:44-04:00
EPA: Use EpaLocation not SrcSpan in ForeignDecls
This allows us to update them for makeDeltaAst in ghc-exactprint
- - - - -
19883a23 by Alan Zimmerman at 2024-04-05T16:58:17-04:00
EPA: Use EpaLocation for RecFieldsDotDot
So we can update it to a delta position in makeDeltaAst if needed.
- - - - -
e8724327 by Matthew Pickering at 2024-04-05T16:58:53-04:00
Remove accidentally committed test.hs
- - - - -
88cb3e10 by Fendor at 2024-04-08T09:03:34-04:00
Avoid UArray when indexing is not required
`UnlinkedBCO`'s can occur many times in the heap. Each `UnlinkedBCO`
references two `UArray`'s but never indexes them. They are only needed
to encode the elements into a `ByteArray#`. The three words for
the lower bound, upper bound and number of elements are essentially
unused, thus we replace `UArray` with a wrapper around `ByteArray#`.
This saves us up to three words for each `UnlinkedBCO`.
Further, to avoid re-allocating these words for `ResolvedBCO`, we repeat
the procedure for `ResolvedBCO` and add custom `Binary` and `Show` instances.
For example, agda's repl session has around 360_000 UnlinkedBCO's,
so avoiding these three words is already saving us around 8MB residency.
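The residency figure can be checked with a little arithmetic. The sketch below assumes a 64-bit platform with 8-byte machine words; the function names are hypothetical.

```haskell
-- Assumption: 64-bit platform, so one machine word is 8 bytes.
wordBytes :: Int
wordBytes = 8

-- Bytes saved when each of nObjects heap objects shrinks by wordsPerObject words.
savedBytes :: Int -> Int -> Int
savedBytes nObjects wordsPerObject = nObjects * wordsPerObject * wordBytes

-- Figures from the commit message: 360_000 UnlinkedBCOs, three words each.
agdaSaving :: Int
agdaSaving = savedBytes 360000 3  -- 8640000 bytes
```

That is 8.64 million bytes, consistent with the "around 8MB" quoted above.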
- - - - -
f2cc1107 by Fendor at 2024-04-08T09:04:11-04:00
Never UNPACK `FastMutInt` for counting z-encoded `FastString`s
In `FastStringTable`, we count the number of z-encoded FastStrings
that exist in a GHC session.
We used to UNPACK the counters to not waste memory, but live retainer
analysis showed that we allocate a lot of `FastMutInt`s, retained by
`mkFastZString`.
We lazily compute the `FastZString`, only incrementing the counter when the `FastZString` is
forced.
The function `mkFastStringWith` calls `mkZFastString` and boxes the
`FastMutInt`, leading to the following core:
mkFastStringWith
= \ mk_fs _ ->
= case stringTable of
{ FastStringTable _ n_zencs segments# _ ->
...
case ((mk_fs (I# ...) (FastMutInt n_zencs))
`cast` <Co:2> :: ...)
...
Marking this field as `NOUNPACK` avoids this reboxing, eliminating the
allocation of a fresh `FastMutInt` on every `FastString` allocation.
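The shape of the fix can be sketched outside GHC as follows. All types and names here (`Counter`, `Table`, `withTable`) are hypothetical stand-ins: `Counter` plays the role of `FastMutInt` and `Table` the role of `FastStringTable`.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for FastMutInt: a boxed mutable counter.
data Counter = Counter !(IORef Int)

-- Hypothetical table type. NOUNPACK asks GHC to store a pointer to the
-- existing Counter box rather than unpacking its contents into Table.
data Table = Table
  { tableSize  :: !Int
  , tableZEncs :: {-# NOUNPACK #-} !Counter
  }

newTable :: Int -> IO Table
newTable n = Table n . Counter <$> newIORef 0

bump :: Counter -> IO ()
bump (Counter ref) = modifyIORef' ref (+ 1)

readCounter :: Counter -> IO Int
readCounter (Counter ref) = readIORef ref

-- The callback is handed the Counter box stored in the table as-is.
withTable :: Table -> (Int -> Counter -> IO a) -> IO a
withTable t k = k (tableSize t) (tableZEncs t)
```

If the field were unpacked instead, passing it to a callback that expects a boxed `Counter` would require building a new box at the call site, which is the reboxing the commit describes.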
- - - - -
c6def949 by Matthew Pickering at 2024-04-08T16:06:51-04:00
Force in_multi to avoid retaining entire hsc_env
- - - - -
fbb91a63 by Fendor at 2024-04-08T16:06:51-04:00
Eliminate name thunk in declaration fingerprinting
Thunk analysis showed that we have about 100_000 thunks (in agda and
`-fwrite-simplified-core`) pointing to the name of the name decl.
Forcing this thunk fixes this issue.
The thunk created here is retained by the thunk created by forkM, it is
better to eagerly force this because the result (a `Name`) is already
retained indirectly via the `IfaceDecl`.
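The general pattern, forcing a cheap value before setting up deferred work, can be sketched like this. The `Decl` type and both functions are hypothetical, and `unsafeInterleaveIO` merely stands in for the deferral done by forkM.

```haskell
import Control.Exception (evaluate)
import System.IO.Unsafe (unsafeInterleaveIO)

-- Hypothetical, simplified declaration type.
data Decl = Decl { declName :: String, declBody :: String }

-- Lazy variant: `name` may remain an unevaluated thunk referring to `d`
-- until something demands it.
fingerprintLazy :: Decl -> IO (String, Int)
fingerprintLazy d = do
  let name = declName d
  size <- unsafeInterleaveIO (pure (length (declBody d)))  -- deferred work
  pure (name, size)

-- Eager variant: `evaluate` forces the name to weak head normal form
-- before the deferred computation is set up.
fingerprintEager :: Decl -> IO (String, Int)
fingerprintEager d = do
  name <- evaluate (declName d)
  size <- unsafeInterleaveIO (pure (length (declBody d)))
  pure (name, size)
```

Both variants return the same result; they differ only in when the name is evaluated, and therefore in what can stay alive in the meantime.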
- - - - -
3b7b0c1c by Alan Zimmerman at 2024-04-08T16:07:27-04:00
EPA: Use EpaLocation in WarningTxt
This allows us to use an EpDelta if needed when using makeDeltaAst.
- - - - -
12b997df by Alan Zimmerman at 2024-04-08T16:07:27-04:00
EPA: Move DeltaPos and EpaLocation' into GHC.Types.SrcLoc
This allows us to use a NoCommentsLocation for the possibly trailing
comma location in a StringLiteral.
This in turn allows us to correctly roundtrip via makeDeltaAst.
- - - - -
18ade573 by Finley McIlwaine at 2024-04-08T16:05:14-07:00
base: Add CostCentreId, currentCallStackIds, ccsToIds, ccId
Add functions for getting the IDs of cost centres to the interface of
`GHC.Stack`, `GHC.Stack.CCS`, and `GHC.Exts`. Also add an opaque type for cost
centre IDs, `CostCentreId`, with appropriate instances.
Implements CLC proposal 235.
Resolves #24277
- - - - -
30 changed files:
- .gitlab-ci.yml
- .gitlab/generate-ci/gen_ci.hs
- .gitlab/jobs.yaml
- .gitlab/rel_eng/default.nix
- .gitlab/rel_eng/fetch-gitlab-artifacts/fetch_gitlab.py
- .gitlab/rel_eng/mk-ghcup-metadata/README.mkd
- .gitlab/rel_eng/mk-ghcup-metadata/mk_ghcup_metadata.py
- + .gitlab/rel_eng/recompress-all
- .gitlab/rel_eng/upload.sh
- .gitlab/rel_eng/upload_ghc_libs.py
- compiler/GHC.hs
- compiler/GHC/Builtin/Names.hs
- compiler/GHC/Builtin/Names/TH.hs
- compiler/GHC/Builtin/PrimOps.hs-boot
- compiler/GHC/Builtin/Types/Prim.hs
- compiler/GHC/Builtin/primops.txt.pp
- compiler/GHC/ByteCode/Asm.hs
- compiler/GHC/ByteCode/Linker.hs
- compiler/GHC/ByteCode/Types.hs
- compiler/GHC/Cmm/Dominators.hs
- compiler/GHC/Cmm/ThreadSanitizer.hs
- compiler/GHC/CmmToAsm/X86/CodeGen.hs
- compiler/GHC/Core.hs
- compiler/GHC/Core/Coercion.hs
- compiler/GHC/Core/Coercion/Opt.hs
- compiler/GHC/Core/LateCC.hs
- + compiler/GHC/Core/LateCC/OverloadedCalls.hs
- + compiler/GHC/Core/LateCC/TopLevelBinds.hs
- + compiler/GHC/Core/LateCC/Types.hs
- + compiler/GHC/Core/LateCC/Utils.hs
The diff was not included because it is too large.
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/f0f9cb82b786a7007604a6abcc3b603f56458393...18ade57396eaa7e6bf0b4c053c25e6c932e034ce