[Git][ghc/ghc][wip/andreask/9.10-backports] 20 commits: Hadrian: use / when making filepaths absolute

Tue Feb 11 12:58:32 UTC 2025

Andreas Klebinger pushed to branch wip/andreask/9.10-backports at Glasgow Haskell Compiler / GHC


Commits:
1b3dc557 by sheaf at 2025-02-06T14:50:22+01:00
Hadrian: use / when making filepaths absolute

In Hadrian, we are careful to use -/- rather than </>, in order to use
/ instead of \ in filepaths. However, this gets ruined by the use of
makeAbsolute from System.Directory, which, on Windows, changes back
forward slashes to backslashes.

(cherry picked from commit f813c8d70e41f0d4663d894db2fee593c71a9772)

- - - - -
ec264a6a by Andreas Klebinger at 2025-02-07T12:22:26+01:00
Revert "AArch64: Implement switch/jump tables (#19912)"

This reverts commit 9b4326cf4180eb70f7474aa143b46a49cb69306f.

- - - - -
19e75b80 by Ben Gamari at 2025-02-07T13:09:55+01:00
rts/linker: Fix out-of-bounds mapping logic

Previously the structure of `mmapInRegion` concealed a subtle bug
concerning handling of `mmap` returning mappings below the beginning of
the desired region. Specifically, we would reset `p = result + bytes`
and then again reset `p = region->start` before looping around for
another iteration. This resulted in an infinite loop on FreeBSD.

Fixes #25492.

(cherry picked from commit 292ed74ea908b64490e91346b890cbebdcde37d0)

- - - - -
2bfa796f by Ben Gamari at 2025-02-07T13:10:04+01:00
rts/linker: Clarify debug output

(cherry picked from commit 20912f5bac6fe4146172accc1849d9b762eb45e3)

- - - - -
26c4086e by Ben Gamari at 2025-02-07T13:10:50+01:00
hadrian: Mitigate mktexfmt race

At least some versions of Texlive's `mktexfmt` utility cannot be invoked
concurrently in their initial run since they fail to handle failure of
`mkdir` due to racing. Specifically, we see

```
| Run Xelatex: users_guide.tex => /tmp/extra-dir-9616886274866
| Run Xelatex: Haddock.tex => /tmp/extra-dir-9616886274869
This is XeTeX, Version 3.14159265-2.6-0.999992 (TeX Live 2020) (preloaded format=xelatex)
 restricted \write18 enabled.
kpathsea: Running mktexfmt xelatex.fmt
mktexfmt: mktexfmt is using the following fmtutil.cnf files (in precedence order):
mktexfmt:   /usr/share/texlive/texmf-dist/web2c/fmtutil.cnf
mktexfmt: mktexfmt is using the following fmtutil.cnf file for writing changes:
mktexfmt:   /builds/ghc/ghc/tmp-home/.texlive2020/texmf-config/web2c/fmtutil.cnf
/usr/bin/mktexfmt: mkdir(/builds/ghc/ghc/tmp-home/.texlive2020/texmf-var/web2c/) failed for tree /builds/ghc/ghc/tmp-home/.texlive2020/texmf-var/web2c: File exists at /usr/share/texlive/tlpkg/TeXLive/TLUtils.pm line 937.
I can't find the format file `xelatex.fmt'!
```

That is two `mktexfmt` invocations (for the user's guide and haddock
builds) attempted to create `$HOME/texlive2020/texmf-var/web2c` and
raced. One of the two `mkdir`'s consequently failed, bringing down the
entire build.

We avoid this by ensuring that the first `xelatex` invocation is always
performed serially.

Fixes #25564.

(cherry picked from commit 41dae5b86955094aa4c5647f63f1f52f1a8a5519)

- - - - -
fb9948f2 by Ben Gamari at 2025-02-07T13:11:04+01:00
rts/CheckUnload: Reset old_objects if unload is skipped

Previously `checkUnload` failed to reset `old_objects` when it decided
not to unload (e.g. due to heap profiling being enabled).

Fixes #24935.

(cherry picked from commit 9efbc51f99118e8f9c3abf2bcb6dc3295893ded6)

- - - - -
6f89c0b7 by Ben Gamari at 2025-02-07T13:12:01+01:00
rts(setNumCapabilities): Assert that n_caps < MAX_N_CAPS

It was noticed in #25560 that this would previously be allowed,
resulting in a segfault.

I will add a proper exception in `base` in a future commit.

(cherry picked from commit f08a72eb484193934c56e34366b277f4d7247a24)

- - - - -
b0bdc3d9 by Ben Gamari at 2025-02-07T13:12:08+01:00
ghc-internal: Fix inconsistent FFI import types

The foreign imports of `enabled_capabilities` and
`getNumberOfProcessors` were declared as `CInt` whereas they are defined
as `uint32_t`.

(cherry picked from commit e10d31ad849b5f7c1f052e7c93f7c7aaf85918c9)

- - - - -
859a4300 by Ben Gamari at 2025-02-07T13:12:14+01:00
rts: Mention maximum capability count in users guide

Addresses #25560.

(cherry picked from commit 06265655bfe6b48cde6923a933d81c9889a443a3)

- - - - -
e683e293 by Ben Gamari at 2025-02-07T13:12:22+01:00
rts/Capability: Move induction variable declaration into `for`s

Just a stylistic change.

(cherry picked from commit d488470ba302760cfd2f3515d9338d1d75f84dd5)

- - - - -
47a4ff98 by Ben Gamari at 2025-02-07T13:12:28+01:00
rts: Determine max_n_capabilities at RTS startup

Previously the maximum number of capabilities supported by the RTS was
statically capped at 256. However, this bound is uncomfortably low given
the size of today's machine.

While supporting unbounded, fully-dynamic adjustment would be nice, it
is complex and so instead we do something simpler: Probe the logical
core count at RTS startup and use this as the static bound for the rest
of our execution.

This should avoid users running into the capability limit on large
machines while avoiding wasting memory on a large capabilities array for
most users and keeping complexity at bay.

Addresses #25560.

(cherry picked from commit 71f050b74eaa2fdc2ca5da53f85497ac94ab6a2a)

- - - - -
482c1f86 by Ben Gamari at 2025-02-07T13:12:36+01:00
testsuite: Introduce req_c_rts

As suggested by @hsyl20, this is intended to mark tests that rely on the
behavior of the C RTS.

(cherry picked from commit 1e84b41108d96cb721dd11281105fdf621105a12)

- - - - -
d70eafac by Ben Gamari at 2025-02-07T13:13:57+01:00
testsuite: Add test for #25560

(cherry picked from commit 683115a40fd989a287fa51efe140af9448526098)

- - - - -
fadf2fb3 by Ben Gamari at 2025-02-07T13:14:09+01:00
rts/CheckUnload: Don't prepare to unload if we can't unload

Previously `prepareUnloadCheck` would move the `objects` list to
`old_objects` even when profiling (where we cannot unload). This caused
us to vacate the `objects` list during major GCs, losing track of loaded
objects. Fix this by ensuring that `prepareUnloadCheck` and
`checkUnload` both use the same short-cutting logic.

(cherry picked from commit 34d3e8e69b62b92cc438514f7fb8e37ce639efea)

- - - - -
2d8c8094 by Matthew Pickering at 2025-02-07T13:14:37+01:00
typechecker: Perform type family consistency checks in topological order

Consider a module M importing modules A, B and C.

We can waste a lot of work depending on the order that the modules are
checked for family consistency.

Consider that C imports A and B. When compiling C we must have already
checked A and B for consistency, therefore if C is processed first then
A and B will not need to be checked for consistency again.

If A and B are compared first, then the consistency checks will be
performed against (wasted as we already performed them for C).

At the moment the order which modules are checked is non-deterministic.

Clearly we should engineer that C is checked before B and A, but by what
scheme?

A simple one is to observe that if a module M is in the transitive
closure of X then the size of the consistent family set of M is less
than or equal to size of the consistent family set of X.

Therefore by sorting the imports by the size of the consistent family
set and processing the largest first, you make sure to process modules
in topological order.

In practice we have observed that this strategy has reduced the amount
of consistency checks performed.

One solution to #25554

(cherry picked from commit 13fe48d40004d9cdf3c73300a18f144bdc5191d9)

- - - - -
f5265c6c by Ben Gamari at 2025-02-07T13:17:08+01:00
rts: Fix incorrect format specifiers in era profiling

Fixes #25581.

(cherry picked from commit 430d965a176d6c9e629d169fa0606923275c8332)

- - - - -
73ed2394 by Cheng Shao at 2025-02-11T13:34:51+01:00
ci: minor nix-in-docker improvements

This patch makes some minor improvements re nix-in-docker logic in the
ci configuration:

- Update `nixos/nix` to the latest version
- Apply $CPUS to `cores`/`max-jobs` to avoid oversubscribing while
  allowing a reasonable degree of parallelism
- Remove redundant `--extra-experimental-features nix-command` in
  later `nix shell` invocations, it's already configured in
  `/etc/nix/nix.conf`

(cherry picked from commit 291388e18b5b61a02a43cc153a642fd67d6127c4)

- - - - -
bebda0d8 by Zubin Duggal at 2025-02-11T13:34:57+01:00
ghcup metadata: output metadata fragment in CI

(cherry picked from commit 52b58a660e735b20961d792d8fa9267f01247a50)
(cherry picked from commit 04433916cdedab80114cbed1dc399ae816bada91)

- - - - -
77af77d5 by Zubin Duggal at 2025-02-11T13:34:57+01:00
ghcup metatdata: use fedora33 for redhat

Redhat 9 doesn't have libtinfo.so.5 anymore

(cherry picked from commit dc86785eb43afd1bd292287c064fb5ad94fe8c7f)
(cherry picked from commit 7c78804e3b25d2c0246cd1f3ce0d4015431e5831)

- - - - -
459075c5 by Zubin Duggal at 2025-02-11T13:34:57+01:00
ghcup metadata: still use centos for redhat <9

(cherry picked from commit 1d72cfb2c1054bc8a399855d5c68443c969d2f66)

- - - - -


28 changed files:

- .gitlab-ci.yml
- .gitlab/rel_eng/mk-ghcup-metadata/mk_ghcup_metadata.py
- compiler/GHC/CmmToAsm/AArch64/CodeGen.hs
- compiler/GHC/CmmToAsm/AArch64/Instr.hs
- compiler/GHC/CmmToAsm/AArch64/Ppr.hs
- compiler/GHC/Tc/Instance/Family.hs
- compiler/GHC/Tc/Module.hs
- docs/users_guide/using-concurrent.rst
- hadrian/src/Hadrian/Utilities.hs
- hadrian/src/Rules/CabalReinstall.hs
- hadrian/src/Rules/Documentation.hs
- hadrian/src/Rules/Test.hs
- hadrian/src/Settings/Builders/Cabal.hs
- hadrian/src/Settings/Builders/Ghc.hs
- libraries/ghc-internal/src/GHC/Internal/Conc/Sync.hs
- rts/Capability.c
- rts/Capability.h
- rts/CheckUnload.c
- rts/Linker.c
- rts/ProfHeap.c
- rts/RtsSymbols.c
- rts/Schedule.c
- rts/include/rts/Config.h
- rts/include/rts/Threads.h
- rts/linker/MMap.c
- testsuite/driver/testlib.py
- + testsuite/tests/rts/T25560.hs
- testsuite/tests/rts/all.T


Changes:

=====================================
.gitlab-ci.yml
=====================================
@@ -289,7 +289,7 @@ lint-author:
     - *drafts-can-fail-lint
 
 lint-ci-config:
-  image: nixos/nix:2.14.1
+  image: nixos/nix:2.25.2
   extends: .lint
   # We don't need history/submodules in this job
   variables:
@@ -297,6 +297,18 @@ lint-ci-config:
     GIT_SUBMODULE_STRATEGY: none
   before_script:
     - echo "experimental-features = nix-command flakes" >> /etc/nix/nix.conf
+    # Note [Nix-in-Docker]
+    # ~~~~~~~~~~~~~~~~~~~~
+    # The nixos/nix default config is max-jobs=1 and cores=$(logical
+    # cores num) which doesn't play nice with our $CPUS convention. We
+    # fix it before invoking any nix build to avoid oversubscribing
+    # while allowing a reasonable degree of parallelism.
+    # FIXME: Disabling build-users-group=nixbld is a workaround for a Nix-in-Docker issue. See
+    # https://gitlab.haskell.org/ghc/head.hackage/-/issues/38#note_560487 for
+    # discussion.
+    - echo "cores = $CPUS" >> /etc/nix/nix.conf
+    - echo "max-jobs = $CPUS" >> /etc/nix/nix.conf
+    - nix run nixpkgs#gnused -- -i -e 's/ nixbld//' /etc/nix/nix.conf
   script:
     - nix run .gitlab/generate-ci#generate-jobs
     # 1 if .gitlab/generate_jobs changed the output of the generated config
@@ -1096,7 +1108,7 @@ project-version:
 
 .ghcup-metadata:
   stage: deploy
-  image: nixos/nix:2.14.1
+  image: nixos/nix:2.25.2
   dependencies: null
   tags:
     - x86_64-linux
@@ -1105,6 +1117,10 @@ project-version:
     GIT_SUBMODULE_STRATEGY: "none"
   before_script:
     - echo "experimental-features = nix-command flakes" >> /etc/nix/nix.conf
+    # FIXME: See Note [Nix-in-Docker]
+    - echo "cores = $CPUS" >> /etc/nix/nix.conf
+    - echo "max-jobs = $CPUS" >> /etc/nix/nix.conf
+    - nix run nixpkgs#gnused -- -i -e 's/ nixbld//' /etc/nix/nix.conf
     - nix-channel --update
     - cat version.sh
     # Calculate the project version
@@ -1157,7 +1173,7 @@ ghcup-metadata-nightly:
       artifacts: false
     - job: project-version
   script:
-    - nix shell --extra-experimental-features nix-command -f .gitlab/rel_eng -c ghcup-metadata --metadata ghcup-0.0.7.yaml --date="$(date -d $CI_PIPELINE_CREATED_AT +%Y-%m-%d)" --pipeline-id="$CI_PIPELINE_ID" --version="$ProjectVersion" > "metadata_test.yaml"
+    - nix shell -f .gitlab/rel_eng -c ghcup-metadata --metadata ghcup-0.0.7.yaml --date="$(date -d $CI_PIPELINE_CREATED_AT +%Y-%m-%d)" --pipeline-id="$CI_PIPELINE_ID" --version="$ProjectVersion" > "metadata_test.yaml"
   rules:
     - if: $NIGHTLY
 
@@ -1195,7 +1211,8 @@ ghcup-metadata-release:
   # No explicit needs for release pipeline as we assume we need everything and everything will pass.
   extends: .ghcup-metadata
   script:
-    - nix shell --extra-experimental-features nix-command -f .gitlab/rel_eng -c ghcup-metadata --release-mode --metadata ghcup-0.0.7.yaml --date="$(date -d $CI_PIPELINE_CREATED_AT +%Y-%m-%d)" --pipeline-id="$CI_PIPELINE_ID" --version="$ProjectVersion" > "metadata_test.yaml"
+    - nix shell -f .gitlab/rel_eng -c ghcup-metadata --release-mode --metadata ghcup-0.0.7.yaml --date="$(date -d $CI_PIPELINE_CREATED_AT +%Y-%m-%d)" --pipeline-id="$CI_PIPELINE_ID" --version="$ProjectVersion" --fragment
+    - nix shell -f .gitlab/rel_eng -c ghcup-metadata --release-mode --metadata ghcup-0.0.7.yaml --date="$(date -d $CI_PIPELINE_CREATED_AT +%Y-%m-%d)" --pipeline-id="$CI_PIPELINE_ID" --version="$ProjectVersion" > "metadata_test.yaml"
   rules:
     - if: '$RELEASE_JOB == "yes"'
 


=====================================
.gitlab/rel_eng/mk-ghcup-metadata/mk_ghcup_metadata.py
=====================================
@@ -232,7 +232,9 @@ def mk_new_yaml(release_mode, version, date, pipeline_type, job_map):
                               , "unknown_versioning" : centos7  }
           , "Linux_Fedora"  : { ">= 33": fedora33
                               , "unknown_versioning": centos7 }
-          , "Linux_RedHat"  : { "unknown_versioning": centos7 }
+          , "Linux_RedHat"  : { "< 9": centos7
+                              , ">= 9": fedora33
+                              , "unknown_versioning": fedora33 }
           , "Linux_UnknownLinux" : { "unknown_versioning": rocky8 }
           , "Darwin" : { "unknown_versioning" : darwin_x86 }
           , "Windows" : { "unknown_versioning" :  windows }


=====================================
compiler/GHC/CmmToAsm/AArch64/CodeGen.hs
=====================================
@@ -23,7 +23,7 @@ import GHC.Cmm.DebugBlock
 import GHC.CmmToAsm.Monad
    ( NatM, getNewRegNat
    , getPicBaseMaybeNat, getPlatform, getConfig
-   , getDebugBlock, getFileId, getNewLabelNat
+   , getDebugBlock, getFileId
    )
 -- import GHC.CmmToAsm.Instr
 import GHC.CmmToAsm.PIC
@@ -50,7 +50,7 @@ import GHC.Types.Unique.Supply
 import GHC.Data.OrdList
 import GHC.Utils.Outputable
 
-import Control.Monad    ( mapAndUnzipM )
+import Control.Monad    ( mapAndUnzipM, foldM )
 import Data.Maybe
 import GHC.Float
 
@@ -210,79 +210,43 @@ annExpr e instr {- debugIsOn -} = ANN (text . show $ e) instr
 -- -----------------------------------------------------------------------------
 -- Generating a table-branch

--- | Generate jump to jump table target
+-- TODO jump tables would be a lot faster, but we'll use bare bones for now.
+-- this is usually done by sticking the jump table ids into an instruction
+-- and then have the @generateJumpTableForInstr@ callback produce the jump
+-- table as a static.
 --
--- The index into the jump table is calulated by evaluating @expr at . The
--- corresponding table entry contains the relative address to jump to (relative
--- to the jump table's first entry / the table's own label).
-genSwitch :: NCGConfig -> CmmExpr -> SwitchTargets -> NatM InstrBlock
-genSwitch config expr targets = do
-  (reg, fmt1, e_code) <- getSomeReg indexExpr
-  let fmt = II64
-  targetReg <- getNewRegNat fmt
-  lbl <- getNewLabelNat
-  dynRef <- cmmMakeDynamicReference config DataReference lbl
-  (tableReg, fmt2, t_code) <- getSomeReg dynRef
-  let code =
-        toOL
-          [ COMMENT (text "indexExpr" <+> (text . show) indexExpr),
-            COMMENT (text "dynRef" <+> (text . show) dynRef)
-          ]
-          `appOL` e_code
-          `appOL` t_code
-          `appOL` toOL
-            [ COMMENT (ftext "Jump table for switch"),
-              -- index to offset into the table (relative to tableReg)
-              annExpr expr (LSL (OpReg (formatToWidth fmt1) reg) (OpReg (formatToWidth fmt1) reg) (OpImm (ImmInt 3))),
-              -- calculate table entry address
-              ADD (OpReg W64 targetReg) (OpReg (formatToWidth fmt1) reg) (OpReg (formatToWidth fmt2) tableReg),
-              -- load table entry (relative offset from tableReg (first entry) to target label)
-              LDR II64 (OpReg W64 targetReg) (OpAddr (AddrRegImm targetReg (ImmInt 0))),
-              -- calculate absolute address of the target label
-              ADD (OpReg W64 targetReg) (OpReg W64 targetReg) (OpReg W64 tableReg),
-              -- prepare jump to target label
-              J_TBL ids (Just lbl) targetReg
-            ]
-  return code
-  where
-    -- See Note [Sub-word subtlety during jump-table indexing] in
-    -- GHC.CmmToAsm.X86.CodeGen for why we must first offset, then widen.
-    indexExpr0 = cmmOffset platform expr offset
-    -- We widen to a native-width register to sanitize the high bits
-    indexExpr =
-      CmmMachOp
-        (MO_UU_Conv expr_w (platformWordWidth platform))
-        [indexExpr0]
-    expr_w = cmmExprWidth platform expr
-    (offset, ids) = switchTargetsToTable targets
-    platform = ncgPlatform config
-
--- | Generate jump table data (if required)
+-- See Ticket 19912
+--
+-- data SwitchTargets =
+--    SwitchTargets
+--        Bool                       -- Signed values
+--        (Integer, Integer)         -- Range
+--        (Maybe Label)              -- Default value
+--        (M.Map Integer Label)      -- The branches
 --
--- The idea is to emit one table entry per case. The entry is the relative
--- address of the block to jump to (relative to the table's first entry /
--- table's own label.) The calculation itself is done by the linker.
-generateJumpTableForInstr ::
-  NCGConfig ->
-  Instr ->
-  Maybe (NatCmmDecl RawCmmStatics Instr)
-generateJumpTableForInstr config (J_TBL ids (Just lbl) _) =
-  let jumpTable =
-        map jumpTableEntryRel ids
-        where
-          jumpTableEntryRel Nothing =
-            CmmStaticLit (CmmInt 0 (ncgWordWidth config))
-          jumpTableEntryRel (Just blockid) =
-            CmmStaticLit
-              ( CmmLabelDiffOff
-                  blockLabel
-                  lbl
-                  0
-                  (ncgWordWidth config)
-              )
-            where
-              blockLabel = blockLbl blockid
-   in Just (CmmData (Section ReadOnlyData lbl) (CmmStaticsRaw lbl jumpTable))
+-- Non Jumptable plan:
+-- xE <- expr
+--
+genSwitch :: CmmExpr -> SwitchTargets -> NatM InstrBlock
+genSwitch expr targets = do -- pprPanic "genSwitch" (ppr expr)
+  (reg, format, code) <- getSomeReg expr
+  let w = formatToWidth format
+  let mkbranch acc (key, bid) = do
+        (keyReg, _format, code) <- getSomeReg (CmmLit (CmmInt key w))
+        return $ code `appOL`
+                 toOL [ CMP (OpReg w reg) (OpReg w keyReg)
+                      , BCOND EQ (TBlock bid)
+                      ] `appOL` acc
+      def_code = case switchTargetsDefault targets of
+        Just bid -> unitOL (B (TBlock bid))
+        Nothing  -> nilOL
+
+  switch_code <- foldM mkbranch nilOL (switchTargetsCases targets)
+  return $ code `appOL` switch_code `appOL` def_code
+
+-- We don't do jump tables for now, see Ticket 19912
+generateJumpTableForInstr :: NCGConfig -> Instr
+  -> Maybe (NatCmmDecl RawCmmStatics Instr)
 generateJumpTableForInstr _ _ = Nothing
 
 -- -----------------------------------------------------------------------------
@@ -314,7 +278,6 @@ stmtToInstrs :: BlockId -- ^ Basic block this statement will start to be placed
 stmtToInstrs bid stmt = do
   -- traceM $ "-- -------------------------- stmtToInstrs -------------------------- --\n"
   --     ++ showSDocUnsafe (ppr stmt)
-  config <- getConfig
   platform <- getPlatform
   case stmt of
     CmmUnsafeForeignCall target result_regs args
@@ -343,7 +306,7 @@ stmtToInstrs bid stmt = do
       CmmCondBranch arg true false _prediction ->
           genCondBranch bid true false arg
 
-      CmmSwitch arg ids -> genSwitch config arg ids
+      CmmSwitch arg ids -> genSwitch arg ids
 
       CmmCall { cml_target = arg } -> genJump arg
 
@@ -388,6 +351,12 @@ getRegisterReg platform (CmmGlobal reg@(GlobalRegUse mid _))
         -- ones which map to a real machine register on this
         -- platform.  Hence if it's not mapped to a registers something
         -- went wrong earlier in the pipeline.
+-- | Convert a BlockId to some CmmStatic data
+-- TODO: Add JumpTable Logic, see Ticket 19912
+-- jumpTableEntry :: NCGConfig -> Maybe BlockId -> CmmStatic
+-- jumpTableEntry config Nothing   = CmmStaticLit (CmmInt 0 (ncgWordWidth config))
+-- jumpTableEntry _ (Just blockid) = CmmStaticLit (CmmLabel blockLabel)
+--     where blockLabel = blockLbl blockid
 
 -- -----------------------------------------------------------------------------
 -- General things for putting together code sequences


=====================================
compiler/GHC/CmmToAsm/AArch64/Instr.hs
=====================================
@@ -27,7 +27,7 @@ import GHC.Types.Unique.Supply
 
 import GHC.Utils.Panic
 
-import Data.Maybe (fromMaybe, catMaybes)
+import Data.Maybe (fromMaybe)
 
 import GHC.Stack
 
@@ -118,7 +118,6 @@ regUsageOfInstr platform instr = case instr of
   ORR dst src1 src2        -> usage (regOp src1 ++ regOp src2, regOp dst)
   -- 4. Branch Instructions ----------------------------------------------------
   J t                      -> usage (regTarget t, [])
-  J_TBL _ _ t              -> usage ([t], [])
   B t                      -> usage (regTarget t, [])
   BCOND _ t                -> usage (regTarget t, [])
   BL t ps                  -> usage (regTarget t ++ ps, callerSavedRegisters)
@@ -265,11 +264,10 @@ patchRegsOfInstr instr env = case instr of
     ORR o1 o2 o3   -> ORR  (patchOp o1) (patchOp o2) (patchOp o3)
 
     -- 4. Branch Instructions --------------------------------------------------
-    J t               -> J (patchTarget t)
-    J_TBL ids mbLbl t -> J_TBL ids mbLbl (env t)
-    B t               -> B (patchTarget t)
-    BL t rs           -> BL (patchTarget t) rs
-    BCOND c t         -> BCOND c (patchTarget t)
+    J t            -> J (patchTarget t)
+    B t            -> B (patchTarget t)
+    BL t rs        -> BL (patchTarget t) rs
+    BCOND c t      -> BCOND c (patchTarget t)
 
     -- 5. Atomic Instructions --------------------------------------------------
     -- 6. Conditional Instructions ---------------------------------------------
@@ -320,7 +318,6 @@ isJumpishInstr instr = case instr of
     CBZ{} -> True
     CBNZ{} -> True
     J{} -> True
-    J_TBL{} -> True
     B{} -> True
     BL{} -> True
     BCOND{} -> True
@@ -334,7 +331,6 @@ jumpDestsOfInstr (ANN _ i) = jumpDestsOfInstr i
 jumpDestsOfInstr (CBZ _ t) = [ id | TBlock id <- [t]]
 jumpDestsOfInstr (CBNZ _ t) = [ id | TBlock id <- [t]]
 jumpDestsOfInstr (J t) = [id | TBlock id <- [t]]
-jumpDestsOfInstr (J_TBL ids _mbLbl _r) = catMaybes ids
 jumpDestsOfInstr (B t) = [id | TBlock id <- [t]]
 jumpDestsOfInstr (BL t _) = [ id | TBlock id <- [t]]
 jumpDestsOfInstr (BCOND _ t) = [ id | TBlock id <- [t]]
@@ -343,11 +339,6 @@ jumpDestsOfInstr _ = []
 canFallthroughTo :: Instr -> BlockId -> Bool
 canFallthroughTo (ANN _ i) bid = canFallthroughTo i bid
 canFallthroughTo (J (TBlock target)) bid = bid == target
-canFallthroughTo (J_TBL targets _ _) bid = all isTargetBid targets
-  where
-    isTargetBid target = case target of
-      Nothing -> True
-      Just target -> target == bid
 canFallthroughTo (B (TBlock target)) bid = bid == target
 canFallthroughTo _ _ = False
 
@@ -361,7 +352,6 @@ patchJumpInstr instr patchF
         CBZ r (TBlock bid) -> CBZ r (TBlock (patchF bid))
         CBNZ r (TBlock bid) -> CBNZ r (TBlock (patchF bid))
         J (TBlock bid) -> J (TBlock (patchF bid))
-        J_TBL ids mbLbl r -> J_TBL (map (fmap patchF) ids) mbLbl r
         B (TBlock bid) -> B (TBlock (patchF bid))
         BL (TBlock bid) ps -> BL (TBlock (patchF bid)) ps
         BCOND c (TBlock bid) -> BCOND c (TBlock (patchF bid))
@@ -525,7 +515,6 @@ allocMoreStack platform slots proc@(CmmProc info lbl live (ListGraph code)) = do
 
       insert_dealloc insn r = case insn of
         J _ -> dealloc ++ (insn : r)
-        J_TBL {} -> dealloc ++ (insn : r)
         ANN _ (J _) -> dealloc ++ (insn : r)
         _other | jumpDestsOfInstr insn /= []
             -> patchJumpInstr insn retarget : r
@@ -654,7 +643,6 @@ data Instr
     | CBNZ Operand Target -- if op /= 0, then branch.
     -- Branching.
     | J Target            -- like B, but only generated from genJump. Used to distinguish genJumps from others.
-    | J_TBL [Maybe BlockId] (Maybe CLabel) Reg -- A jump instruction with data for switch/jump tables
     | B Target            -- unconditional branching b/br. (To a blockid, label or register)
     | BL Target [Reg] -- branch and link (e.g. set x30 to next pc, and branch)
     | BCOND Cond Target   -- branch with condition. b.<cond>
@@ -736,7 +724,6 @@ instrCon i =
       CBZ{} -> "CBZ"
       CBNZ{} -> "CBNZ"
       J{} -> "J"
-      J_TBL {} -> "J_TBL"
       B{} -> "B"
       BL{} -> "BL"
       BCOND{} -> "BCOND"


=====================================
compiler/GHC/CmmToAsm/AArch64/Ppr.hs
=====================================
@@ -426,7 +426,6 @@ pprInstr platform instr = case instr of
 
   -- 4. Branch Instructions ----------------------------------------------------
   J t            -> pprInstr platform (B t)
-  J_TBL _ _ r    -> pprInstr platform (B (TReg r))
   B (TBlock bid) -> line $ text "\tb" <+> pprAsmLabel platform (mkLocalBlockLabel (getUnique bid))
   B (TLabel lbl) -> line $ text "\tb" <+> pprAsmLabel platform lbl
   B (TReg r)     -> line $ text "\tbr" <+> pprReg W64 r


=====================================
compiler/GHC/Tc/Instance/Family.hs
=====================================
@@ -58,6 +58,7 @@ import Data.Function ( on )
 
 import qualified GHC.LanguageExtensions  as LangExt
 import GHC.Unit.Env (unitEnv_hpts)
+import Data.List (sortOn)
 
 {- Note [The type family instance consistency story]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -239,6 +240,49 @@ That situation should be pretty common in practice, there's usually
 a set of utility modules that every module imports directly or indirectly.
 
 This is basically the idea from #13092, comment:14.
+
+Note [Order of type family consistency checks]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Consider a module M which imports modules A, B and C, all defining (open) type
+family instances.
+
+We can waste a lot of work in type family consistency checking depending on the
+order in which the modules are processed.
+
+Suppose for example that C imports A and B. When we compiled C, we will have
+checked A and B for consistency against eachother. This means that, when
+processing the imports of M to check type family instance consistency:
+
+* if C is processed first, then A and B will not need to be checked for
+  consistency against eachother again,
+* if we process A and B before C,then the
+  consistency checks between A and B will be performed again. This is wasted
+  work, as we already performed them for C.
+
+This can make a significant difference. Keeping the nomenclature of the above
+example for illustration, we have observed situations in practice in which the
+compilation time of M goes from 1 second (the "processing A and B first" case)
+down to 80 milliseconds (the "processing C first" case).
+
+Clearly we should engineer that C is checked before B and A, but by what scheme?
+
+A simple one is to observe that if a module M is in the transitive closure of X
+then the size of the consistent family set of M is less than or equal to size
+of the consistent family set of X.
+
+Therefore, by sorting the imports by the size of the consistent family set and
+processing the largest first, we make sure to process modules in topological
+order.
+
+For a particular project, without this change we did 40 million checks and with
+this change we did 22.9 million checks. This is significant as before this change
+type family consistency checks accounted for 26% of total type checker allocations which
+was reduced to 15%.
+
+See tickets #25554 for discussion about this exact issue and #25555 for
+why we still do redundant checks.
+
 -}
 
 -- We don't need to check the current module, this is done in
@@ -267,6 +311,12 @@ checkFamInstConsistency directlyImpMods
                  where
                  deps = dep_finsts . mi_deps . modIface $ mod
 
+             ; debug_consistent_set = map (\x -> (x, length (modConsistent x))) directlyImpMods
+
+             -- Sorting the list by size has the effect of performing a topological sort.
+             -- See Note [Order of type family consistency checks]
+             ; init_consistent_set = reverse (sortOn (length . modConsistent) directlyImpMods)
+
              ; hmiModule     = mi_module . hm_iface
              ; hmiFamInstEnv = extendFamInstEnvList emptyFamInstEnv
                                . md_fam_insts . hm_details
@@ -276,7 +326,8 @@ checkFamInstConsistency directlyImpMods
 
              }
 
-       ; checkMany hpt_fam_insts modConsistent directlyImpMods
+       ; traceTc "init_consistent_set" (ppr debug_consistent_set)
+       ; checkMany hpt_fam_insts modConsistent init_consistent_set
        }
   where
     -- See Note [Checking family instance optimization]
@@ -294,6 +345,11 @@ checkFamInstConsistency directlyImpMods
          -> TcM ()
       go _ _ [] = return ()
       go consistent consistent_set (mod:mods) = do
+        traceTc "checkManySize" (vcat [text "mod:" <+> ppr mod
+                                      , text "m1:" <+> ppr (length to_check_from_mod)
+                                      , text "m2:" <+> ppr (length (to_check_from_consistent))
+                                      , text "product:" <+> ppr (length to_check_from_mod * length to_check_from_consistent)
+                                      ])
         sequence_
           [ check hpt_fam_insts m1 m2
           | m1 <- to_check_from_mod


=====================================
compiler/GHC/Tc/Module.hs
=====================================
@@ -434,7 +434,9 @@ tcRnImports hsc_env import_decls
         ; let { dir_imp_mods = moduleEnvKeys
                              . imp_mods
                              $ imports }
-        ; checkFamInstConsistency dir_imp_mods
+        ; logger <- getLogger
+        ; withTiming logger (text "ConsistencyCheck"<+>brackets (ppr this_mod)) (const ())
+            $ checkFamInstConsistency dir_imp_mods
         ; traceRn "rn1: } checking family instance consistency" empty
 
         ; getGblEnv } }


=====================================
docs/users_guide/using-concurrent.rst
=====================================
@@ -153,6 +153,14 @@ use the RTS :rts-flag:`-N ⟨x⟩` options.
     changed while the program is running by calling
     ``Control.Concurrent.setNumCapabilities``.
 
+
+.. note::
+
+    The maximum number of capabilities supported by the GHC runtime system is
+    determined when at RTS startup to be either 256, the value given by
+    :rts-flag:`-N ⟨x⟩`, or the number of logical CPU cores, whichever is
+    greater.
+
 The following options affect the way the runtime schedules threads on
 CPUs:
 


=====================================
hadrian/src/Hadrian/Utilities.hs
=====================================
@@ -7,7 +7,7 @@ module Hadrian.Utilities (
     quote, yesNo, parseYesNo, zeroOne,
 
     -- * FilePath manipulation
-    unifyPath, (-/-), makeRelativeNoSysLink,
+    unifyPath, (-/-), makeRelativeNoSysLink, makeAbsolute,
 
     -- * Accessing Shake's type-indexed map
     insertExtra, lookupExtra, userSetting,
@@ -56,6 +56,7 @@ import qualified System.Directory.Extra as IO
 import qualified System.Info.Extra      as IO
 import qualified System.IO              as IO
 import System.IO.Error (isPermissionError)
+import qualified System.FilePath.Posix as Posix
 
 -- | Extract a value from a singleton list, or terminate with an error message
 -- if the list does not contain exactly one value.
@@ -135,7 +136,17 @@ zeroOne True  = "1"
 unifyPath :: FilePath -> FilePath
 unifyPath = toStandard . normaliseEx
 
+{- Note [Absolute paths and MSYS]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+When dealing with absolute paths in Hadrian, we opt to always use Unix-style
+forward slashes for separating paths.
+This is because, on Windows, the MSYS toolchain can reliably handle paths such
+as /c/foo, while it occasionally falls over on paths of the form C:\foo.
+-}
+
 -- | Combine paths with a forward slash regardless of platform.
+--
+-- See Note [Absolute paths and MSYS].
 (-/-) :: FilePath -> FilePath -> FilePath
 _  -/- b | isAbsolute b && not (isAbsolute $ tail b) = b
 "" -/- b = b
@@ -145,6 +156,16 @@ a  -/- b
 
 infixr 6 -/-
 
+-- | Like 'System.Directory.makeAbsolute' from @directory@, but always
+-- using forward slashes.
+--
+-- See Note [Absolute paths and MSYS].
+makeAbsolute :: FilePath -> IO FilePath
+makeAbsolute fp = do
+  cwd <- IO.getCurrentDirectory
+  let fp' = cwd -/- fp
+  return $ Posix.normalise fp'
+
 -- | This is like Posix makeRelative, but assumes no sys links in the input
 -- paths. This allows the result to start with possibly many "../"s. Input
 -- paths must both be relative, or be on the same drive


=====================================
hadrian/src/Rules/CabalReinstall.hs
=====================================
@@ -61,10 +61,10 @@ cabalBuildRules = do
             rtsIncludeDir    = ghcBuildDir -/- "lib" -/- distDir -/- rtsDir
                                -/- "include"
 
-        libdir  <- liftIO . IO.makeAbsolute =<< stageLibPath Stage1
-        work_dir <- liftIO $ IO.makeAbsolute $ root -/- "stage-cabal"
+        libdir  <- liftIO . makeAbsolute =<< stageLibPath Stage1
+        work_dir <- liftIO $ makeAbsolute $ root -/- "stage-cabal"
         let outputDir = work_dir -/- "bin"
-        includeDir <- liftIO $ IO.makeAbsolute rtsIncludeDir
+        includeDir <- liftIO $ makeAbsolute rtsIncludeDir
 
         createDirectory outputDir
 
@@ -101,7 +101,7 @@ cabalBuildRules = do
         -- Just symlink these for now
         -- TODO: build these with cabal as well
         forM_ iserv_targets $ \(_bin_pkg,bin_path') -> do
-            bin_path <- liftIO $ IO.makeAbsolute bin_path'
+            bin_path <- liftIO $ makeAbsolute bin_path'
             let orig_filename = takeFileName bin_path
                 output_file = outputDir -/- orig_filename
             liftIO $ do


=====================================
hadrian/src/Rules/Documentation.hs
=====================================
@@ -334,11 +334,27 @@ getPkgDocTarget root path =
 
 -- | Build all PDF documentation
 buildPdfDocumentation :: Rules ()
-buildPdfDocumentation = mapM_ buildSphinxPdf docPaths
+buildPdfDocumentation = do
+    -- Note [Avoiding mktexfmt race]
+    -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    -- We must ensure that the *first* xelatex invocation in the
+    -- build is performed serially (that is, not concurrently with
+    -- any other xelatex invocations) as mktexfmt does not handle
+    -- racing `mkdir` calls gracefully. However, we assume that
+    -- subsequent invocations are safe to run concurrently since the
+    -- initial call will have created the requisite directories (namely
+    -- $HOME/.texlive2020/texmf-var/web2c).
+    --
+    -- Fixes #25564.
+    let maxConcurrentReaders = 1000
+    xelatexMutex <- newResource "xelatex-mutex" maxConcurrentReaders
+    let rs = [(xelatexMutex, 1)]
+
+    mapM_ (buildSphinxPdf rs) docPaths
 
 -- | Compile a Sphinx ReStructured Text package to LaTeX
-buildSphinxPdf :: FilePath -> Rules ()
-buildSphinxPdf path = do
+buildSphinxPdf :: [(Resource, Int)] -> FilePath -> Rules ()
+buildSphinxPdf rs path = do
     root <- buildRootRules
     root -/- pdfRoot -/- path <.> "pdf" %> \file -> do
 
@@ -352,7 +368,8 @@ buildSphinxPdf path = do
             checkSphinxWarnings dir
 
             -- LaTeX "fixed point"
-            build $ target docContext Xelatex [path <.> "tex"] [dir]
+            -- See Note [Avoiding mktexfmt race] above.
+            buildWithResources rs $ target docContext Xelatex [path <.> "tex"] [dir]
             build $ target docContext Xelatex [path <.> "tex"] [dir]
             build $ target docContext Xelatex [path <.> "tex"] [dir]
             build $ target docContext Makeindex [path <.> "idx"] [dir]


=====================================
hadrian/src/Rules/Test.hs
=====================================
@@ -217,7 +217,7 @@ testRules = do
         -- get relative path for the given program in the given stage
         let relative_path_stage s p = programPath =<< programContext s p
         let make_absolute rel_path = do
-              abs_path <- liftIO (IO.makeAbsolute rel_path)
+              abs_path <- liftIO (makeAbsolute rel_path)
               fixAbsolutePathOnWindows abs_path
 
         rel_ghc_pkg     <- relative_path_stage Stage1 ghcPkg


=====================================
hadrian/src/Settings/Builders/Cabal.hs
=====================================
@@ -11,7 +11,6 @@ import Settings.Builders.Common
 import qualified Settings.Builders.Common as S
 import Control.Exception (assert)
 import qualified Data.Set as Set
-import System.Directory
 import Settings.Program (programContext)
 import GHC.Toolchain (ccLinkProgram, tgtCCompilerLink)
 import GHC.Toolchain.Program (prgFlags)


=====================================
hadrian/src/Settings/Builders/Ghc.hs
=====================================
@@ -12,7 +12,6 @@ import Settings.Warnings
 import qualified Context as Context
 import Rules.Libffi (libffiName)
 import qualified Data.Set as Set
-import System.Directory
 import Data.Version.Extra
 
 ghcBuilderArgs :: Args


=====================================
libraries/ghc-internal/src/GHC/Internal/Conc/Sync.hs
=====================================
@@ -1,4 +1,3 @@
-{-# LANGUAGE CPP #-}
 {-# LANGUAGE MagicHash #-}
 {-# LANGUAGE NoImplicitPrelude #-}
 {-# LANGUAGE RankNTypes #-}
@@ -394,13 +393,14 @@ getNumProcessors :: IO Int
 getNumProcessors = fmap fromIntegral c_getNumberOfProcessors
 
 foreign import ccall unsafe "getNumberOfProcessors"
-  c_getNumberOfProcessors :: IO CUInt
+  c_getNumberOfProcessors :: IO Word32
 
 -- | Returns the number of sparks currently in the local spark pool
 numSparks :: IO Int
 numSparks = IO $ \s -> case numSparks# s of (# s', n #) -> (# s', I# n #)
 
-foreign import ccall "&enabled_capabilities" enabled_capabilities :: Ptr CInt
+foreign import ccall "&enabled_capabilities"
+  enabled_capabilities :: Ptr Word32
 
 childHandler :: SomeException -> IO ()
 childHandler err = catch (real_handler err) childHandler


=====================================
rts/Capability.c
=====================================
@@ -16,6 +16,7 @@
  *
  * --------------------------------------------------------------------------*/
 
+#include "rts/Config.h"
 #include "rts/PosixSource.h"
 #include "Rts.h"
 
@@ -40,12 +41,16 @@ Capability MainCapability;
 uint32_t n_capabilities = 0;
 uint32_t enabled_capabilities = 0;
 
+// The size of the `capabilities` array initialized at RTS startup. Referenced
+// by GHC.Internal.Conc.Sync
+uint32_t max_n_capabilities = MAX_N_CAPABILITIES;
+
 // The array of Capabilities.  It's important that when we need
 // to allocate more Capabilities we don't have to move the existing
 // Capabilities, because there may be pointers to them in use
 // (e.g. threads in waitForCapability(), see #8209), so this is
 // an array of Capability* rather than an array of Capability.
-Capability *capabilities[MAX_N_CAPABILITIES];
+Capability **capabilities;
 
 // Holds the Capability which last became free.  This is used so that
 // an in-call has a chance of quickly finding a free Capability.
@@ -344,8 +349,6 @@ initCapability (Capability *cap, uint32_t i)
  * ------------------------------------------------------------------------- */
 void initCapabilities (void)
 {
-    uint32_t i;
-
     /* Declare a couple capability sets representing the process and
        clock domain. Each capability will get added to these capsets. */
     traceCapsetCreate(CAPSET_OSPROCESS_DEFAULT, CapsetTypeOsProcess);
@@ -354,7 +357,7 @@ void initCapabilities (void)
     // Initialise NUMA
     if (!RtsFlags.GcFlags.numa) {
         n_numa_nodes = 1;
-        for (i = 0; i < MAX_NUMA_NODES; i++) {
+        for (uint32_t i = 0; i < MAX_NUMA_NODES; i++) {
             numa_map[i] = 0;
         }
     } else if (RtsFlags.DebugFlags.numa) {
@@ -388,12 +391,30 @@ void initCapabilities (void)
     }
 #endif
 
-    if (RtsFlags.ParFlags.nCapabilities > MAX_N_CAPABILITIES) {
-        errorBelch("warning: this GHC runtime system only supports up to %d capabilities",
-                   MAX_N_CAPABILITIES);
-        RtsFlags.ParFlags.nCapabilities = MAX_N_CAPABILITIES;
+    /*
+     * Note [Capabilities array sizing]
+     * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+     * Determine the size of the capabilities array as the maximum of:
+     *   * the static lower bound, `MAX_N_CAPABILITIES`
+     *   * the logical core count
+     *   * the users's choice of `+RTS -N`
+     * This will serve as the upper bound on the capability count for the rest
+     * of execution. Calls to `setNumCapabilities` exceeding this bound will
+     * issue a warning and otherwise have no effect.
+     *
+     * See #25560.
+     */
+    uint32_t core_count = getNumberOfProcessors();
+    if (core_count > max_n_capabilities) {
+        max_n_capabilities = core_count;
     }
 
+    if (RtsFlags.ParFlags.nCapabilities > max_n_capabilities) {
+        max_n_capabilities = RtsFlags.ParFlags.nCapabilities;
+    }
+
+    capabilities = stgMallocBytes(sizeof(Capability) * max_n_capabilities, "initCapabilities");
+
     n_capabilities = 0;
     moreCapabilities(0, RtsFlags.ParFlags.nCapabilities);
     n_capabilities = RtsFlags.ParFlags.nCapabilities;
@@ -401,6 +422,7 @@ void initCapabilities (void)
 #else /* !THREADED_RTS */
 
     n_capabilities = 1;
+    capabilities = stgMallocBytes(sizeof(Capability), "initCapabilities");
     capabilities[0] = &MainCapability;
 
     initCapability(&MainCapability, 0);
@@ -412,7 +434,7 @@ void initCapabilities (void)
     // There are no free capabilities to begin with.  We will start
     // a worker Task to each Capability, which will quickly put the
     // Capability on the free list when it finds nothing to do.
-    for (i = 0; i < n_numa_nodes; i++) {
+    for (uint32_t i = 0; i < n_numa_nodes; i++) {
         last_free_capability[i] = getCapability(0);
     }
 }


=====================================
rts/Capability.h
=====================================
@@ -266,11 +266,13 @@ INLINE_HEADER void releaseCapability_ (Capability* cap STG_UNUSED,
 // extern Capability MainCapability;
 
 // declared in rts/include/rts/Threads.h:
+// extern uint32_t max_n_capabilities;
 // extern uint32_t n_capabilities;
 // extern uint32_t enabled_capabilities;
 
-// Array of all the capabilities
-extern Capability *capabilities[MAX_N_CAPABILITIES];
+// Array of all the capabilities, of size max_n_capabilities
+// See Note [Capabilities array sizing] in rts/Capability.c.
+extern Capability **capabilities;
 
 INLINE_HEADER Capability *getCapability(uint32_t i)
 {


=====================================
rts/CheckUnload.c
=====================================
@@ -166,7 +166,7 @@ ObjectCode *loaded_objects;
 static OCSectionIndices *global_s_indices = NULL;
 
 // Is it safe for us to unload code?
-static bool safeToUnload(void)
+static bool tryToUnload(void)
 {
     if (RtsFlags.ProfFlags.doHeapProfile != NO_HEAP_PROFILING) {
         // We mustn't unload anything as the heap census may contain
@@ -174,7 +174,8 @@ static bool safeToUnload(void)
         // See #24512.
         return false;
     }
-    return true;
+
+    return global_s_indices != NULL;
 }
 
 static OCSectionIndices *createOCSectionIndices(void)
@@ -432,7 +433,7 @@ static bool markObjectLive(void *data STG_UNUSED, StgWord key, const void *value
 
 void markObjectCode(const void *addr)
 {
-    if (global_s_indices == NULL) {
+    if (!tryToUnload()) {
         return;
     }
 
@@ -450,7 +451,7 @@ void markObjectCode(const void *addr)
 // unloading.
 bool prepareUnloadCheck(void)
 {
-    if (global_s_indices == NULL) {
+    if (!tryToUnload()) {
         return false;
     }
 
@@ -467,48 +468,44 @@ bool prepareUnloadCheck(void)
 
 void checkUnload(void)
 {
-    if (global_s_indices == NULL) {
-        return;
-    } else if (!safeToUnload()) {
-        return;
-    }
-
     // At this point we've marked all dynamically loaded static objects
     // (including their dependencies) during GC, but not the root set of object
     // code (loaded_objects). Mark the roots first, then unload any unmarked
     // objects.
 
-    OCSectionIndices *s_indices = global_s_indices;
-    ASSERT(s_indices->sorted);
+    if (tryToUnload()) {
+        OCSectionIndices *s_indices = global_s_indices;
+        ASSERT(s_indices->sorted);
 
-    // Mark roots
-    for (ObjectCode *oc = loaded_objects; oc != NULL; oc = oc->next_loaded_object) {
-        markObjectLive(NULL, (W_)oc, NULL);
-    }
+        // Mark roots
+        for (ObjectCode *oc = loaded_objects; oc != NULL; oc = oc->next_loaded_object) {
+            markObjectLive(NULL, (W_)oc, NULL);
+        }
 
-    // Free unmarked objects
-    ObjectCode *next = NULL;
-    for (ObjectCode *oc = old_objects; oc != NULL; oc = next) {
-        next = oc->next;
-        ASSERT(oc->status == OBJECT_UNLOADED);
-
-        // Symbols should be removed by unloadObj_.
-        // NB (osa): If this assertion doesn't hold then freeObjectCode below
-        // will corrupt symhash as keys of that table live in ObjectCodes. If
-        // you see a segfault in a hash table operation in linker (in non-debug
-        // RTS) then it's probably because this assertion did not hold.
-        ASSERT(oc->symbols == NULL);
-
-        if (oc->unloadable) {
-            removeOCSectionIndices(s_indices, oc);
-            freeObjectCode(oc);
-            n_unloaded_objects -= 1;
-        } else {
-            // If we don't have enough information to
-            // accurately determine the reachability of
-            // the object then hold onto it.
-            oc->next = objects;
-            objects = oc;
+        // Free unmarked objects
+        ObjectCode *next = NULL;
+        for (ObjectCode *oc = old_objects; oc != NULL; oc = next) {
+            next = oc->next;
+            ASSERT(oc->status == OBJECT_UNLOADED);
+
+            // Symbols should be removed by unloadObj_.
+            // NB (osa): If this assertion doesn't hold then freeObjectCode below
+            // will corrupt symhash as keys of that table live in ObjectCodes. If
+            // you see a segfault in a hash table operation in linker (in non-debug
+            // RTS) then it's probably because this assertion did not hold.
+            ASSERT(oc->symbols == NULL);
+
+            if (oc->unloadable) {
+                removeOCSectionIndices(s_indices, oc);
+                freeObjectCode(oc);
+                n_unloaded_objects -= 1;
+            } else {
+                // If we don't have enough information to
+                // accurately determine the reachability of
+                // the object then hold onto it.
+                oc->next = objects;
+                objects = oc;
+            }
         }
     }
 


=====================================
rts/Linker.c
=====================================
@@ -1076,7 +1076,7 @@ freePreloadObjectFile (ObjectCode *oc)
  */
 void freeObjectCode (ObjectCode *oc)
 {
-    IF_DEBUG(linker, ocDebugBelch(oc, "start\n"));
+    IF_DEBUG(linker, ocDebugBelch(oc, "freeObjectCode: start\n"));
 
     // Run finalizers
     if (oc->type == STATIC_OBJECT &&


=====================================
rts/ProfHeap.c
=====================================
@@ -958,9 +958,9 @@ dumpCensus( Census *census )
                                           count * sizeof(W_));
             break;
         case HEAP_BY_ERA:
-            fprintf(hp_file, "%lu", (StgWord)ctr->identity);
+            fprintf(hp_file, "%" FMT_Word, (StgWord)ctr->identity);
             char str_era[100];
-            sprintf(str_era, "%lu", (StgWord)ctr->identity);
+            sprintf(str_era, "%" FMT_Word, (StgWord)ctr->identity);
             traceHeapProfSampleString(0, str_era, count * sizeof(W_));
             break;
         case HEAP_BY_MOD:


=====================================
rts/RtsSymbols.c
=====================================
@@ -911,6 +911,7 @@ extern char **environ;
       SymI_NeedsDataProto(rts_stop_on_exception)                        \
       SymI_HasProto(stopTimer)                                          \
       SymI_HasProto(n_capabilities)                                     \
+      SymI_HasProto(max_n_capabilities)                                 \
       SymI_HasProto(enabled_capabilities)                               \
       SymI_HasDataProto(stg_traceEventzh)                                   \
       SymI_HasDataProto(stg_traceMarkerzh)                                  \


=====================================
rts/Schedule.c
=====================================
@@ -2276,9 +2276,12 @@ setNumCapabilities (uint32_t new_n_capabilities USED_IF_THREADS)
     } else if (new_n_capabilities <= 0) {
         errorBelch("setNumCapabilities: Capability count must be positive");
         return;
+    } else if (new_n_capabilities > max_n_capabilities) {
+        // See Note [Capabilities array sizing] in rts/Capability.c.
+        errorBelch("setNumCapabilities: Attempt to increase capability count beyond maximum capability count %" PRIu32 "; clamping...\n", max_n_capabilities);
+        new_n_capabilities = max_n_capabilities;
     }
 
-
     debugTrace(DEBUG_sched, "changing the number of Capabilities from %d to %d",
                enabled_capabilities, new_n_capabilities);
 


=====================================
rts/include/rts/Config.h
=====================================
@@ -78,6 +78,10 @@ code.
 #endif
 
 #if defined(THREADED_RTS)
+/*
+ * See Note [Capabilities array sizing] in rts/Capability.c.
+ * Update the note in docs/users_guide/using-concurrent.rst when updating this.
+ */
 #define MAX_N_CAPABILITIES 256
 #else
 #define MAX_N_CAPABILITIES 1


=====================================
rts/include/rts/Threads.h
=====================================
@@ -69,7 +69,7 @@ HsBool rtsSupportsBoundThreads (void);
 // The number of Capabilities.
 // TODO: Ideally we would only provide getNumCapabilities
 // but this is used in compiler/cbits/genSym.c
-extern unsigned int n_capabilities;
+extern uint32_t n_capabilities;
 
 INLINE_HEADER unsigned int getNumCapabilities(void)
 { return RELAXED_LOAD(&n_capabilities); }
@@ -77,6 +77,10 @@ INLINE_HEADER unsigned int getNumCapabilities(void)
 // The number of Capabilities that are not disabled
 extern uint32_t enabled_capabilities;
 
+// The maximum number of Capabilities supported by the RTS.
+// See Note [Capabilities array sizing] in rts/Capability.c.
+extern uint32_t max_n_capabilities;
+
 #if !IN_STG_CODE
 extern Capability MainCapability;
 #endif


=====================================
rts/linker/MMap.c
=====================================
@@ -344,12 +344,7 @@ mmapInRegion (
         if (result == NULL) {
             // The mapping failed
             return NULL;
-        } else if (result < region->start) {
-            // Uh oh, we assume that mmap() will only give us a
-            // an address at or after the requested address.
-            // Try again.
-            p = (uint8_t *) result + bytes;
-        } else if (result < region->end) {
+        } else if (result >= region->start && result < region->end) {
             // Success!
             region->last = (uint8_t *) result + bytes;
             return result;
@@ -357,17 +352,23 @@ mmapInRegion (
             // We failed to find a suitable mapping
             munmap(result, bytes);
             reportMemoryMap();
-            errorBelch("mmapForLinker: failed to mmap() memory below 2Gb; "
+            errorBelch("mmapForLinker: failed to mmap() memory between %p and %p; "
                        "asked for %zu bytes at %p. "
                        "Try specifying an address with +RTS -xm<addr> -RTS",
-                       bytes, p);
+                       region->start, region->end, bytes, p);
             return NULL;
-        }
+        } else if (result < region->start) {
+            // Uh oh, we assume that mmap() will only give us a
+            // an address at or after the requested address.
+            // Try bump forward by a bit and try again.
+            p = (uint8_t *) p + bytes;
+        } else if (result >= region->end) {
+            // mmap() gave us too high an address; wrap around and try again
+            wrapped = true;
+            p = region->start;
+       }
 
-        // mmap() gave us too high an address; wrap around and try again
         munmap(result, bytes);
-        wrapped = true;
-        p = region->start;
     }
 }
 


=====================================
testsuite/driver/testlib.py
=====================================
@@ -275,6 +275,10 @@ def req_bco( name, opts ):
     # JS backend doesn't support ByteCode
     js_skip(name, opts)
 
+def req_c_rts( name, opts ):
+    """ Require the C runtime system (rather than, e.g. the Javascript RTS). """
+    js_skip(name, opts)
+
 def req_rts_linker( name, opts ):
     if not config.have_RTS_linker:
         opts.expect = 'fail'


=====================================
testsuite/tests/rts/T25560.hs
=====================================
@@ -0,0 +1,4 @@
+import GHC.Conc
+
+main :: IO ()
+main = setNumCapabilities 100000


=====================================
testsuite/tests/rts/all.T
=====================================
@@ -616,3 +616,7 @@ test('T23400', [], compile_and_run, ['-with-rtsopts -A8k'])
 test('IOManager', [js_skip, when(arch('wasm32'), skip), when(opsys('mingw32'), skip),
                    pre_cmd('$MAKE -s --no-print-directory IOManager.hs')],
                   compile_and_run, [''])
+
+# N.B. This will likely issue a warning on stderr but we merely care that the
+# program doesn't crash.
+test('T25560', [req_c_rts, ignore_stderr], compile_and_run, [''])
\ No newline at end of file



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/15ca7b2c06f9727b7a8c5ca663c3b6779489daa6...459075c552d7d1f3d06658c5ab0d4fba096aa819

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/15ca7b2c06f9727b7a8c5ca663c3b6779489daa6...459075c552d7d1f3d06658c5ab0d4fba096aa819
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20250211/04cdb0d9/attachment-0001.html>