[Git][ghc/ghc][wip/marge_bot_batch_merge_job] 12 commits: Add missing @since documentation for (!?) function

Marge Bot (@marge-bot) gitlab at gitlab.haskell.org
Wed Dec 11 00:37:31 UTC 2024



Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC


Commits:
a83b7ed6 by Matthew Stephenson at 2024-12-10T14:01:22-05:00
Add missing @since documentation for (!?) function
- - - - -
e745e3a3 by Ben Gamari at 2024-12-10T14:01:59-05:00
compiler: Don't attempt to TSAN-instrument SIMD operations

TSAN only provides instrumentation for 8, 16, 32, and 64-bit memory
loads/stores. Don't attempt to instrument wider operations.

Fixes #25563.

- - - - -
684c0018 by Ben Gamari at 2024-12-10T14:02:35-05:00
gitlab/ci: Don't clobber RUNTEST_ARGS

Previously the logic handling `IGNORE_PERF_FAILURES` clobbered the
user's `RUNTEST_ARGS`. Fix this.

- - - - -
41dae5b8 by Ben Gamari at 2024-12-10T14:03:11-05:00
hadrian: Mitigate mktexfmt race

At least some versions of Texlive's `mktexfmt` utility cannot be invoked
concurrently in their initial run since they fail to handle failure of
`mkdir` due to racing. Specifically, we see

```
| Run Xelatex: users_guide.tex => /tmp/extra-dir-9616886274866
| Run Xelatex: Haddock.tex => /tmp/extra-dir-9616886274869
This is XeTeX, Version 3.14159265-2.6-0.999992 (TeX Live 2020) (preloaded format=xelatex)
 restricted \write18 enabled.
kpathsea: Running mktexfmt xelatex.fmt
mktexfmt: mktexfmt is using the following fmtutil.cnf files (in precedence order):
mktexfmt:   /usr/share/texlive/texmf-dist/web2c/fmtutil.cnf
mktexfmt: mktexfmt is using the following fmtutil.cnf file for writing changes:
mktexfmt:   /builds/ghc/ghc/tmp-home/.texlive2020/texmf-config/web2c/fmtutil.cnf
/usr/bin/mktexfmt: mkdir(/builds/ghc/ghc/tmp-home/.texlive2020/texmf-var/web2c/) failed for tree /builds/ghc/ghc/tmp-home/.texlive2020/texmf-var/web2c: File exists at /usr/share/texlive/tlpkg/TeXLive/TLUtils.pm line 937.
I can't find the format file `xelatex.fmt'!
```

That is two `mktexfmt` invocations (for the user's guide and haddock
builds) attempted to create `$HOME/texlive2020/texmf-var/web2c` and
raced. One of the two `mkdir`'s consequently failed, bringing down the
entire build.

We avoid this by ensuring that the first `xelatex` invocation is always
performed serially.

Fixes #25564.

- - - - -
9efbc51f by Ben Gamari at 2024-12-10T14:03:48-05:00
rts/CheckUnload: Reset old_objects if unload is skipped

Previously `checkUnload` failed to reset `old_objects` when it decided
not to unload (e.g. due to heap profiling being enabled).

Fixes #24935.

- - - - -
733153f3 by Ben Gamari at 2024-12-10T19:37:07-05:00
rts: Annotate BCOs with their Name

This introduces a new bytecode instruction, `BCO_NAME`, to aid in debugging
bytecode execution. This instruction is injected by `mkProtoBCO` and
captures the Haskell name of the BCO. It is then printed by the
disassembler, allowing ready correlation with STG dumps.

- - - - -
775881bb by Ben Gamari at 2024-12-10T19:37:07-05:00
rts(setNumCapabilities): Assert that n_caps < MAX_N_CAPS

It was noticed in #25560 that this would previously be allowed,
resulting in a segfault.

I will add a proper exception in `base` in a future commit.

- - - - -
dabdcc96 by Ben Gamari at 2024-12-10T19:37:07-05:00
ghc-internal: Fix inconsistent FFI import types

The foreign imports of `enabled_capabilities` and
`getNumberOfProcessors` were declared as `CInt` whereas they are defined
as `uint32_t`.

- - - - -
dd484a69 by Ben Gamari at 2024-12-10T19:37:07-05:00
rts: Mention maximum capability count in users guide

Addresses #25560.

- - - - -
957fc6e5 by Ben Gamari at 2024-12-10T19:37:07-05:00
rts/Capability: Move induction variable declaration into `for`s

Just a stylistic change.

- - - - -
da55325e by Ben Gamari at 2024-12-10T19:37:07-05:00
rts: Determine max_n_capabilities at RTS startup

Previously the maximum number of capabilities supported by the RTS was
statically capped at 256. However, this bound is uncomfortably low given
the size of today's machine.

While supporting unbounded, fully-dynamic adjustment would be nice, it
is complex and so instead we do something simpler: Probe the logical
core count at RTS startup and use this as the static bound for the rest
of our execution.

This should avoid users running into the capability limit on large
machines while avoiding wasting memory on a large capabilities array for
most users and keeping complexity at bay.

Addresses #25560.

- - - - -
e129a40b by Ben Gamari at 2024-12-10T19:37:07-05:00
testsuite: Add test for #25560

- - - - -


30 changed files:

- .gitlab/ci.sh
- compiler/GHC/ByteCode/Asm.hs
- compiler/GHC/ByteCode/Instr.hs
- compiler/GHC/Cmm/ThreadSanitizer.hs
- compiler/GHC/Driver/Flags.hs
- compiler/GHC/Driver/Session.hs
- compiler/GHC/StgToByteCode.hs
- compiler/GHC/Unit/Info.hs
- compiler/ghc.cabal.in
- docs/users_guide/debugging.rst
- docs/users_guide/using-concurrent.rst
- hadrian/src/Rules/Documentation.hs
- hadrian/src/Rules/Rts.hs
- libraries/ghc-internal/src/GHC/Internal/Conc/Sync.hs
- libraries/ghc-internal/src/GHC/Internal/List.hs
- rts/Capability.c
- rts/Capability.h
- rts/CheckUnload.c
- rts/Disassembler.c
- rts/Interpreter.c
- rts/RtsSymbols.c
- rts/Schedule.c
- rts/include/rts/Bytecodes.h
- rts/include/rts/Config.h
- rts/include/rts/Threads.h
- rts/rts.cabal
- + testsuite/tests/rts/T25560.hs
- testsuite/tests/rts/all.T
- testsuite/tests/th/T10279.hs
- testsuite/tests/th/T10279.stderr


Changes:

=====================================
.gitlab/ci.sh
=====================================
@@ -954,7 +954,7 @@ if [ "${CI_COMMIT_BRANCH:-}" == "master" ] &&  [ "${CI_PROJECT_PATH:-}" == "ghc/
   fi
 fi
 if [ -n "${IGNORE_PERF_FAILURES:-}" ]; then
-  RUNTEST_ARGS="--ignore-perf-failures=$IGNORE_PERF_FAILURES"
+  RUNTEST_ARGS=( "${RUNTEST_ARGS[@]:-}" "--ignore-perf-failures=$IGNORE_PERF_FAILURES" )
 fi
 
 if [[ -z ${BIGNUM_BACKEND:-} ]]; then BIGNUM_BACKEND=gmp; fi


=====================================
compiler/GHC/ByteCode/Asm.hs
=====================================
@@ -106,7 +106,7 @@ assembleBCOs interp profile proto_bcos tycons top_strs modbreaks spt_entries = d
   bcos'   <- mallocStrings interp bcos
   return CompiledByteCode
     { bc_bcos = bcos'
-    , bc_itbls =  itblenv
+    , bc_itbls = itblenv
     , bc_ffis = concatMap protoBCOFFIs proto_bcos
     , bc_strs = top_strs
     , bc_breaks = modbreaks
@@ -178,11 +178,12 @@ assembleOneBCO interp profile pbco = do
   return ubco'
 
 assembleBCO :: Platform -> ProtoBCO Name -> IO UnlinkedBCO
-assembleBCO platform (ProtoBCO { protoBCOName       = nm
-                             , protoBCOInstrs     = instrs
-                             , protoBCOBitmap     = bitmap
-                             , protoBCOBitmapSize = bsize
-                             , protoBCOArity      = arity }) = do
+assembleBCO platform
+            (ProtoBCO { protoBCOName       = nm
+                      , protoBCOInstrs     = instrs
+                      , protoBCOBitmap     = bitmap
+                      , protoBCOBitmapSize = bsize
+                      , protoBCOArity      = arity }) = do
   -- pass 1: collect up the offsets of the local labels.
   let asm = mapM_ (assembleI platform) instrs
 
@@ -527,6 +528,10 @@ assembleI platform i = case i of
                                                   , SmallOp tickx, SmallOp infox
                                                   , Op np
                                                   ]
+#if MIN_VERSION_rts(1,0,3)
+  BCO_NAME name            -> do np <- lit [BCONPtrStr name]
+                                 emit bci_BCO_NAME [Op np]
+#endif
 
   where
     literal (LitLabel fs _)   = litlabel fs


=====================================
compiler/GHC/ByteCode/Instr.hs
=====================================
@@ -1,4 +1,4 @@
-
+{-# LANGUAGE CPP #-}
 {-# LANGUAGE FlexibleContexts #-}
 {-# LANGUAGE LambdaCase #-}
 {-# OPTIONS_GHC -funbox-strict-fields #-}
@@ -27,6 +27,10 @@ import GHC.Runtime.Heap.Layout ( StgWord )
 import Data.Int
 import Data.Word
 
+#if MIN_VERSION_rts(1,0,3)
+import Data.ByteString (ByteString)
+#endif
+
 import GHC.Stack.CCS (CostCentre)
 
 import GHC.Stg.Syntax
@@ -229,6 +233,22 @@ data BCInstr
                       !Word16                -- breakpoint info index
                       (RemotePtr CostCentre)
 
+#if MIN_VERSION_rts(1,0,3)
+   -- | A "meta"-instruction for recording the name of a BCO for debugging purposes.
+   -- These are ignored by the interpreter but helpfully printed by the disassmbler.
+   | BCO_NAME         !ByteString
+#endif
+
+
+{- Note [BCO_NAME]
+   ~~~~~~~~~~~~~~~
+   The BCO_NAME instruction is a debugging-aid enabled with the -fadd-bco-name flag.
+   When enabled the bytecode assembler will prepend a BCO_NAME instruction to every
+   generated bytecode object capturing the STG name of the binding the BCO implements.
+   This is then printed by the bytecode disassembler, allowing bytecode objects to be
+   readily correlated with their STG and Core source.
+ -}
+
 -- -----------------------------------------------------------------------------
 -- Printing bytecode instructions
 
@@ -383,6 +403,9 @@ instance Outputable BCInstr where
                                <+> text "<tick_module>" <+> ppr tickx
                                <+> text "<info_module>" <+> ppr infox
                                <+> text "<cc>"
+#if MIN_VERSION_rts(1,0,3)
+   ppr (BCO_NAME nm)         = text "BCO_NAME" <+> text (show nm)
+#endif
 
 
 
@@ -487,3 +510,6 @@ bciStackUse SLIDE{}               = 0
 bciStackUse MKAP{}                = 0
 bciStackUse MKPAP{}               = 0
 bciStackUse PACK{}                = 1 -- worst case is PACK 0 words
+#if MIN_VERSION_rts(1,0,3)
+bciStackUse BCO_NAME{}            = 0
+#endif


=====================================
compiler/GHC/Cmm/ThreadSanitizer.hs
=====================================
@@ -209,8 +209,9 @@ tsanTarget fn formals args =
 tsanStore :: Env
           -> CmmType -> CmmExpr
           -> Block CmmNode O O
-tsanStore env ty addr =
-    mkUnsafeCall env ftarget [] [addr]
+tsanStore env ty addr
+  | typeWidth ty < W128 = mkUnsafeCall env ftarget [] [addr]
+  | otherwise           = emptyBlock
   where
     ftarget = tsanTarget fn [] [AddrHint]
     w = widthInBytes (typeWidth ty)
@@ -219,8 +220,9 @@ tsanStore env ty addr =
 tsanLoad :: Env
          -> AlignmentSpec -> CmmType -> CmmExpr
          -> Block CmmNode O O
-tsanLoad env align ty addr =
-    mkUnsafeCall env ftarget [] [addr]
+tsanLoad env align ty addr
+  | typeWidth ty < W128  = mkUnsafeCall env ftarget [] [addr]
+  | otherwise            = emptyBlock
   where
     ftarget = tsanTarget fn [] [AddrHint]
     w = widthInBytes (typeWidth ty)


=====================================
compiler/GHC/Driver/Flags.hs
=====================================
@@ -581,6 +581,7 @@ data GeneralFlag
    | Opt_DoAsmLinting
    | Opt_DoAnnotationLinting
    | Opt_DoBoundsChecking
+   | Opt_AddBcoName
    | Opt_NoLlvmMangler                  -- hidden flag
    | Opt_FastLlvm                       -- hidden flag
    | Opt_NoTypeableBinds


=====================================
compiler/GHC/Driver/Session.hs
=====================================
@@ -2540,6 +2540,7 @@ fFlagsDeps = [
   flagSpec "catch-nonexhaustive-cases"        Opt_CatchNonexhaustiveCases,
   flagSpec "alignment-sanitisation"           Opt_AlignmentSanitisation,
   flagSpec "check-prim-bounds"                Opt_DoBoundsChecking,
+  flagSpec "add-bco-name"                     Opt_AddBcoName,
   flagSpec "num-constant-folding"             Opt_NumConstantFolding,
   flagSpec "core-constant-folding"            Opt_CoreConstantFolding,
   flagSpec "fast-pap-calls"                   Opt_FastPAPCalls,


=====================================
compiler/GHC/StgToByteCode.hs
=====================================
@@ -1,4 +1,4 @@
-
+{-# LANGUAGE CPP                        #-}
 {-# LANGUAGE DeriveFunctor              #-}
 {-# LANGUAGE GeneralizedNewtypeDeriving #-}
 {-# LANGUAGE RecordWildCards            #-}
@@ -81,6 +81,9 @@ import GHC.Unit.Home.ModInfo (lookupHpt)
 import Data.Array
 import Data.Coerce (coerce)
 import Data.ByteString (ByteString)
+#if MIN_VERSION_rts(1,0,3)
+import qualified Data.ByteString.Char8 as BS
+#endif
 import Data.Map (Map)
 import Data.IntMap (IntMap)
 import qualified Data.Map as Map
@@ -236,7 +239,10 @@ ppBCEnv p
 -- Create a BCO and do a spot of peephole optimisation on the insns
 -- at the same time.
 mkProtoBCO
-   :: Platform
+   :: (Outputable name)
+   => Platform
+   -> Bool      -- ^ True <=> label with @BCO_NAME@ instruction
+                -- see Note [BCO_NAME]
    -> name
    -> BCInstrList
    -> Either  [CgStgAlt] (CgStgRhs)
@@ -247,10 +253,10 @@ mkProtoBCO
    -> Bool      -- ^ True <=> is a return point, rather than a function
    -> [FFIInfo]
    -> ProtoBCO name
-mkProtoBCO platform nm instrs_ordlist origin arity bitmap_size bitmap is_ret ffis
+mkProtoBCO platform _add_bco_name nm instrs_ordlist origin arity bitmap_size bitmap is_ret ffis
    = ProtoBCO {
         protoBCOName = nm,
-        protoBCOInstrs = maybe_with_stack_check,
+        protoBCOInstrs = maybe_add_bco_name $ maybe_add_stack_check peep_d,
         protoBCOBitmap = bitmap,
         protoBCOBitmapSize = fromIntegral bitmap_size,
         protoBCOArity = arity,
@@ -258,6 +264,14 @@ mkProtoBCO platform nm instrs_ordlist origin arity bitmap_size bitmap is_ret ffi
         protoBCOFFIs = ffis
       }
      where
+#if MIN_VERSION_rts(1,0,3)
+        maybe_add_bco_name instrs
+          | _add_bco_name = BCO_NAME str : instrs
+          where
+            str = BS.pack $ showSDocOneLine defaultSDocContext (ppr nm)
+#endif
+        maybe_add_bco_name instrs = instrs
+
         -- Overestimate the stack usage (in words) of this BCO,
         -- and if >= iNTERP_STACK_CHECK_THRESH, add an explicit
         -- stack check.  (The interpreter always does a stack check
@@ -265,17 +279,17 @@ mkProtoBCO platform nm instrs_ordlist origin arity bitmap_size bitmap is_ret ffi
         -- BCO anyway, so we only need to add an explicit one in the
         -- (hopefully rare) cases when the (overestimated) stack use
         -- exceeds iNTERP_STACK_CHECK_THRESH.
-        maybe_with_stack_check
-           | is_ret && stack_usage < fromIntegral (pc_AP_STACK_SPLIM (platformConstants platform)) = peep_d
+        maybe_add_stack_check instrs
+           | is_ret && stack_usage < fromIntegral (pc_AP_STACK_SPLIM (platformConstants platform)) = instrs
                 -- don't do stack checks at return points,
                 -- everything is aggregated up to the top BCO
                 -- (which must be a function).
                 -- That is, unless the stack usage is >= AP_STACK_SPLIM,
                 -- see bug #1466.
            | stack_usage >= fromIntegral iNTERP_STACK_CHECK_THRESH
-           = STKCHECK stack_usage : peep_d
+           = STKCHECK stack_usage : instrs
            | otherwise
-           = peep_d     -- the supposedly common case
+           = instrs     -- the supposedly common case
 
         -- We assume that this sum doesn't wrap
         stack_usage = sum (map bciStackUse peep_d)
@@ -308,6 +322,7 @@ schemeTopBind (id, rhs)
   | Just data_con <- isDataConWorkId_maybe id,
     isNullaryRepDataCon data_con = do
     platform <- profilePlatform <$> getProfile
+    add_bco_name <- shouldAddBcoName
         -- Special case for the worker of a nullary data con.
         -- It'll look like this:        Nil = /\a -> Nil a
         -- If we feed it into schemeR, we'll get
@@ -316,7 +331,8 @@ schemeTopBind (id, rhs)
         -- by just re-using the single top-level definition.  So
         -- for the worker itself, we must allocate it directly.
     -- ioToBc (putStrLn $ "top level BCO")
-    emitBc (mkProtoBCO platform (getName id) (toOL [PACK data_con 0, RETURN P])
+    emitBc (mkProtoBCO platform add_bco_name
+                       (getName id) (toOL [PACK data_con 0, RETURN P])
                        (Right rhs) 0 0 [{-no bitmap-}] False{-not alts-})
 
   | otherwise
@@ -358,6 +374,7 @@ schemeR_wrk
     -> BcM (ProtoBCO Name)
 schemeR_wrk fvs nm original_body (args, body)
    = do
+     add_bco_name <- shouldAddBcoName
      profile <- getProfile
      let
          platform  = profilePlatform profile
@@ -379,7 +396,7 @@ schemeR_wrk fvs nm original_body (args, body)
          bitmap = mkBitmap platform bits
      body_code <- schemeER_wrk sum_szsb_args p_init body
 
-     emitBc (mkProtoBCO platform nm body_code (Right original_body)
+     emitBc (mkProtoBCO platform add_bco_name nm body_code (Right original_body)
                  arity bitmap_size bitmap False{-not alts-})
 
 -- | Introduce break instructions for ticked expressions.
@@ -1069,9 +1086,10 @@ doCase d s p scrut bndr alts
            | ubx_tuple_frame    = SLIDE 0 2 `consOL` alt_final0
            | otherwise          = alt_final0
 
+     add_bco_name <- shouldAddBcoName
      let
          alt_bco_name = getName bndr
-         alt_bco = mkProtoBCO platform alt_bco_name alt_final (Left alts)
+         alt_bco = mkProtoBCO platform add_bco_name alt_bco_name alt_final (Left alts)
                        0{-no arity-} bitmap_size bitmap True{-is alts-}
      scrut_code <- schemeE (d + ret_frame_size_b + save_ccs_size_b)
                            (d + ret_frame_size_b + save_ccs_size_b)
@@ -1379,7 +1397,7 @@ Note [unboxed tuple bytecodes and tuple_BCO]
 
 tupleBCO :: Platform -> NativeCallInfo -> [(PrimRep, ByteOff)] -> [FFIInfo] -> ProtoBCO Name
 tupleBCO platform args_info args =
-  mkProtoBCO platform invented_name body_code (Left [])
+  mkProtoBCO platform False invented_name body_code (Left [])
              0{-no arity-} bitmap_size bitmap False{-is alts-}
   where
     {-
@@ -1398,9 +1416,9 @@ tupleBCO platform args_info args =
     body_code = mkSlideW 0 1          -- pop frame header
                 `snocOL` RETURN_TUPLE -- and add it again
 
-primCallBCO ::  Platform -> NativeCallInfo -> [(PrimRep, ByteOff)] -> [FFIInfo] -> ProtoBCO Name
+primCallBCO :: Platform -> NativeCallInfo -> [(PrimRep, ByteOff)] -> [FFIInfo] -> ProtoBCO Name
 primCallBCO platform args_info args =
-  mkProtoBCO platform invented_name body_code (Left [])
+  mkProtoBCO platform False invented_name body_code (Left [])
              0{-no arity-} bitmap_size bitmap False{-is alts-}
   where
     {-
@@ -2337,6 +2355,9 @@ getHscEnv = BcM $ \st -> return (st, bcm_hsc_env st)
 getProfile :: BcM Profile
 getProfile = targetProfile <$> getDynFlags
 
+shouldAddBcoName :: BcM Bool
+shouldAddBcoName = gopt Opt_AddBcoName <$> getDynFlags
+
 emitBc :: ([FFIInfo] -> ProtoBCO Name) -> BcM (ProtoBCO Name)
 emitBc bco
   = BcM $ \st -> return (st{ffis=[]}, bco (ffis st))


=====================================
compiler/GHC/Unit/Info.hs
=====================================
@@ -236,7 +236,7 @@ unitHsLibs namever ways0 p = map (mkDynName . addSuffix . ST.unpack) (unitLibrar
         -- This change elevates the need to add custom hooks
         -- and handling specifically for the `rts` package.
         addSuffix rts@"HSrts"       = rts       ++ (expandTag rts_tag)
-        addSuffix rts@"HSrts-1.0.2" = rts       ++ (expandTag rts_tag)
+        addSuffix rts@"HSrts-1.0.3" = rts       ++ (expandTag rts_tag)
         addSuffix other_lib         = other_lib ++ (expandTag tag)
 
         expandTag t | null t = ""


=====================================
compiler/ghc.cabal.in
=====================================
@@ -130,6 +130,7 @@ Library
                    exceptions == 0.10.*,
                    semaphore-compat,
                    stm,
+                   rts,
                    ghc-boot   == @ProjectVersionMunged@,
                    ghc-heap   == @ProjectVersionMunged@,
                    ghci == @ProjectVersionMunged@


=====================================
docs/users_guide/debugging.rst
=====================================
@@ -1240,8 +1240,7 @@ Other
     :type: dynamic
 
     :since: 9.8.1
-
-    default: enabled
+    :default: enabled
 
     At the moment, ghci disables optimizations, because not all passes
     are compatible with the interpreter.
@@ -1254,3 +1253,16 @@ Other
     expressions.
     Those cannot be stored in breakpoints, so any free variable that refers to
     optimized code will not be inspectable when this flag is enabled.
+
+.. ghc-flag:: -fadd-bco-name
+    :shortdesc: Add ``BCO_NAME`` instructions in generated bytecode.
+    :reverse: -fno-add-bco-name
+    :type: dynamic
+
+    :since: 9.14.1
+
+    Prefix every generated bytecode object with a ``BCO_NAME`` instruction
+    containing the STG name of the binding from which the BCO originated.
+    These are printed by the bytecode disassembler, aiding in correlating
+    bytecode with STG.
+


=====================================
docs/users_guide/using-concurrent.rst
=====================================
@@ -153,6 +153,14 @@ use the RTS :rts-flag:`-N ⟨x⟩` options.
     changed while the program is running by calling
     ``Control.Concurrent.setNumCapabilities``.
 
+
+.. note::
+
+    The maximum number of capabilities supported by the GHC runtime system is
+    determined when at RTS startup to be either 256, the value given by
+    :rts-flag:`-N ⟨x⟩`, or the number of logical CPU cores, whichever is
+    greater.
+
 The following options affect the way the runtime schedules threads on
 CPUs:
 


=====================================
hadrian/src/Rules/Documentation.hs
=====================================
@@ -326,11 +326,27 @@ getPkgDocTarget root path =
 
 -- | Build all PDF documentation
 buildPdfDocumentation :: Rules ()
-buildPdfDocumentation = mapM_ buildSphinxPdf docPaths
+buildPdfDocumentation = do
+    -- Note [Avoiding mktexfmt race]
+    -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    -- We must ensure that the *first* xelatex invocation in the
+    -- build is performed serially (that is, not concurrently with
+    -- any other xelatex invocations) as mktexfmt does not handle
+    -- racing `mkdir` calls gracefully. However, we assume that
+    -- subsequent invocations are safe to run concurrently since the
+    -- initial call will have created the requisite directories (namely
+    -- $HOME/.texlive2020/texmf-var/web2c).
+    --
+    -- Fixes #25564.
+    let maxConcurrentReaders = 1000
+    xelatexMutex <- newResource "xelatex-mutex" maxConcurrentReaders
+    let rs = [(xelatexMutex, 1)]
+
+    mapM_ (buildSphinxPdf rs) docPaths
 
 -- | Compile a Sphinx ReStructured Text package to LaTeX
-buildSphinxPdf :: FilePath -> Rules ()
-buildSphinxPdf path = do
+buildSphinxPdf :: [(Resource, Int)] -> FilePath -> Rules ()
+buildSphinxPdf rs path = do
     root <- buildRootRules
     root -/- pdfRoot -/- path <.> "pdf" %> \file -> do
 
@@ -344,7 +360,8 @@ buildSphinxPdf path = do
             checkSphinxWarnings dir
 
             -- LaTeX "fixed point"
-            build $ target docContext Xelatex [path <.> "tex"] [dir]
+            -- See Note [Avoiding mktexfmt race] above.
+            buildWithResources rs $ target docContext Xelatex [path <.> "tex"] [dir]
             build $ target docContext Xelatex [path <.> "tex"] [dir]
             build $ target docContext Xelatex [path <.> "tex"] [dir]
             build $ target docContext Makeindex [path <.> "idx"] [dir]


=====================================
hadrian/src/Rules/Rts.hs
=====================================
@@ -165,7 +165,7 @@ needRtsSymLinks stage rtsWays
 
 prefix, versionlessPrefix :: String
 versionlessPrefix = "libHSrts"
-prefix = versionlessPrefix ++ "-1.0.2"
+prefix = versionlessPrefix ++ "-1.0.3"
 
 -- removeRtsDummyVersion "a/libHSrts-1.0-ghc1.2.3.4.so"
 --                    == "a/libHSrts-ghc1.2.3.4.so"


=====================================
libraries/ghc-internal/src/GHC/Internal/Conc/Sync.hs
=====================================
@@ -1,4 +1,3 @@
-{-# LANGUAGE CPP #-}
 {-# LANGUAGE MagicHash #-}
 {-# LANGUAGE NoImplicitPrelude #-}
 {-# LANGUAGE RankNTypes #-}
@@ -394,13 +393,14 @@ getNumProcessors :: IO Int
 getNumProcessors = fmap fromIntegral c_getNumberOfProcessors
 
 foreign import ccall unsafe "getNumberOfProcessors"
-  c_getNumberOfProcessors :: IO CUInt
+  c_getNumberOfProcessors :: IO Word32
 
 -- | Returns the number of sparks currently in the local spark pool
 numSparks :: IO Int
 numSparks = IO $ \s -> case numSparks# s of (# s', n #) -> (# s', I# n #)
 
-foreign import ccall "&enabled_capabilities" enabled_capabilities :: Ptr CInt
+foreign import ccall "&enabled_capabilities"
+  enabled_capabilities :: Ptr Word32
 
 childHandler :: SomeException -> IO ()
 childHandler err = catch (real_handler err) childHandler


=====================================
libraries/ghc-internal/src/GHC/Internal/List.hs
=====================================
@@ -1643,6 +1643,8 @@ xs !! n
 --
 -- WARNING: This function takes linear time in the index.
 --
+-- @since base-4.19.0.0
+--
 -- ==== __Examples__
 --
 -- >>> ['a', 'b', 'c'] !? 0


=====================================
rts/Capability.c
=====================================
@@ -16,6 +16,7 @@
  *
  * --------------------------------------------------------------------------*/
 
+#include "rts/Config.h"
 #include "rts/PosixSource.h"
 #include "Rts.h"
 
@@ -40,12 +41,16 @@ Capability MainCapability;
 uint32_t n_capabilities = 0;
 uint32_t enabled_capabilities = 0;
 
+// The size of the `capabilities` array initialized at RTS startup. Referenced
+// by GHC.Internal.Conc.Sync
+uint32_t max_n_capabilities = MAX_N_CAPABILITIES;
+
 // The array of Capabilities.  It's important that when we need
 // to allocate more Capabilities we don't have to move the existing
 // Capabilities, because there may be pointers to them in use
 // (e.g. threads in waitForCapability(), see #8209), so this is
 // an array of Capability* rather than an array of Capability.
-Capability *capabilities[MAX_N_CAPABILITIES];
+Capability **capabilities;
 
 // Holds the Capability which last became free.  This is used so that
 // an in-call has a chance of quickly finding a free Capability.
@@ -344,8 +349,6 @@ initCapability (Capability *cap, uint32_t i)
  * ------------------------------------------------------------------------- */
 void initCapabilities (void)
 {
-    uint32_t i;
-
     /* Declare a couple capability sets representing the process and
        clock domain. Each capability will get added to these capsets. */
     traceCapsetCreate(CAPSET_OSPROCESS_DEFAULT, CapsetTypeOsProcess);
@@ -354,7 +357,7 @@ void initCapabilities (void)
     // Initialise NUMA
     if (!RtsFlags.GcFlags.numa) {
         n_numa_nodes = 1;
-        for (i = 0; i < MAX_NUMA_NODES; i++) {
+        for (uint32_t i = 0; i < MAX_NUMA_NODES; i++) {
             numa_map[i] = 0;
         }
     } else if (RtsFlags.DebugFlags.numa) {
@@ -388,12 +391,30 @@ void initCapabilities (void)
     }
 #endif
 
-    if (RtsFlags.ParFlags.nCapabilities > MAX_N_CAPABILITIES) {
-        errorBelch("warning: this GHC runtime system only supports up to %d capabilities",
-                   MAX_N_CAPABILITIES);
-        RtsFlags.ParFlags.nCapabilities = MAX_N_CAPABILITIES;
+    /*
+     * Note [Capabilities array sizing]
+     * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+     * Determine the size of the capabilities array as the maximum of:
+     *   * the static lower bound, `MAX_N_CAPABILITIES`
+     *   * the logical core count
+     *   * the users's choice of `+RTS -N`
+     * This will serve as the upper bound on the capability count for the rest
+     * of execution. Calls to `setNumCapabilities` exceeding this bound will
+     * issue a warning and otherwise have no effect.
+     *
+     * See #25560.
+     */
+    uint32_t core_count = getNumberOfProcessors();
+    if (core_count > max_n_capabilities) {
+        max_n_capabilities = core_count;
     }
 
+    if (RtsFlags.ParFlags.nCapabilities > max_n_capabilities) {
+        max_n_capabilities = RtsFlags.ParFlags.nCapabilities;
+    }
+
+    capabilities = stgMallocBytes(sizeof(Capability) * max_n_capabilities, "initCapabilities");
+
     n_capabilities = 0;
     moreCapabilities(0, RtsFlags.ParFlags.nCapabilities);
     n_capabilities = RtsFlags.ParFlags.nCapabilities;
@@ -401,6 +422,7 @@ void initCapabilities (void)
 #else /* !THREADED_RTS */
 
     n_capabilities = 1;
+    capabilities = stgMallocBytes(sizeof(Capability), "initCapabilities");
     capabilities[0] = &MainCapability;
 
     initCapability(&MainCapability, 0);
@@ -412,7 +434,7 @@ void initCapabilities (void)
     // There are no free capabilities to begin with.  We will start
     // a worker Task to each Capability, which will quickly put the
     // Capability on the free list when it finds nothing to do.
-    for (i = 0; i < n_numa_nodes; i++) {
+    for (uint32_t i = 0; i < n_numa_nodes; i++) {
         last_free_capability[i] = getCapability(0);
     }
 }


=====================================
rts/Capability.h
=====================================
@@ -270,11 +270,13 @@ INLINE_HEADER void releaseCapability_ (Capability* cap STG_UNUSED,
 // extern Capability MainCapability;
 
 // declared in rts/include/rts/Threads.h:
+// extern uint32_t max_n_capabilities;
 // extern uint32_t n_capabilities;
 // extern uint32_t enabled_capabilities;
 
-// Array of all the capabilities
-extern Capability *capabilities[MAX_N_CAPABILITIES];
+// Array of all the capabilities, of size max_n_capabilities
+// See Note [Capabilities array sizing] in rts/Capability.c.
+extern Capability **capabilities;
 
 INLINE_HEADER Capability *getCapability(uint32_t i)
 {


=====================================
rts/CheckUnload.c
=====================================
@@ -467,48 +467,44 @@ bool prepareUnloadCheck(void)
 
 void checkUnload(void)
 {
-    if (global_s_indices == NULL) {
-        return;
-    } else if (!safeToUnload()) {
-        return;
-    }
-
     // At this point we've marked all dynamically loaded static objects
     // (including their dependencies) during GC, but not the root set of object
     // code (loaded_objects). Mark the roots first, then unload any unmarked
     // objects.
 
-    OCSectionIndices *s_indices = global_s_indices;
-    ASSERT(s_indices->sorted);
+    if (global_s_indices != NULL && safeToUnload()) {
+        OCSectionIndices *s_indices = global_s_indices;
+        ASSERT(s_indices->sorted);
 
-    // Mark roots
-    for (ObjectCode *oc = loaded_objects; oc != NULL; oc = oc->next_loaded_object) {
-        markObjectLive(NULL, (W_)oc, NULL);
-    }
+        // Mark roots
+        for (ObjectCode *oc = loaded_objects; oc != NULL; oc = oc->next_loaded_object) {
+            markObjectLive(NULL, (W_)oc, NULL);
+        }
 
-    // Free unmarked objects
-    ObjectCode *next = NULL;
-    for (ObjectCode *oc = old_objects; oc != NULL; oc = next) {
-        next = oc->next;
-        ASSERT(oc->status == OBJECT_UNLOADED);
-
-        // Symbols should be removed by unloadObj_.
-        // NB (osa): If this assertion doesn't hold then freeObjectCode below
-        // will corrupt symhash as keys of that table live in ObjectCodes. If
-        // you see a segfault in a hash table operation in linker (in non-debug
-        // RTS) then it's probably because this assertion did not hold.
-        ASSERT(oc->symbols == NULL);
-
-        if (oc->unloadable) {
-            removeOCSectionIndices(s_indices, oc);
-            freeObjectCode(oc);
-            n_unloaded_objects -= 1;
-        } else {
-            // If we don't have enough information to
-            // accurately determine the reachability of
-            // the object then hold onto it.
-            oc->next = objects;
-            objects = oc;
+        // Free unmarked objects
+        ObjectCode *next = NULL;
+        for (ObjectCode *oc = old_objects; oc != NULL; oc = next) {
+            next = oc->next;
+            ASSERT(oc->status == OBJECT_UNLOADED);
+
+            // Symbols should be removed by unloadObj_.
+            // NB (osa): If this assertion doesn't hold then freeObjectCode below
+            // will corrupt symhash as keys of that table live in ObjectCodes. If
+            // you see a segfault in a hash table operation in linker (in non-debug
+            // RTS) then it's probably because this assertion did not hold.
+            ASSERT(oc->symbols == NULL);
+
+            if (oc->unloadable) {
+                removeOCSectionIndices(s_indices, oc);
+                freeObjectCode(oc);
+                n_unloaded_objects -= 1;
+            } else {
+                // If we don't have enough information to
+                // accurately determine the reachability of
+                // the object then hold onto it.
+                oc->next = objects;
+                objects = oc;
+            }
         }
     }
 


=====================================
rts/Disassembler.c
=====================================
@@ -452,6 +452,13 @@ disInstr ( StgBCO *bco, int pc )
          debugBelch("RETURN_T\n ");
          break;
 
+      case bci_BCO_NAME: {
+         const char *name = (const char*) literals[instrs[pc]];
+         debugBelch("BCO_NAME    \"%s\"\n ", name);
+         pc += 1;
+         break;
+      }
+
       default:
          barf("disInstr: unknown opcode %u", (unsigned int) instr);
    }
@@ -464,10 +471,9 @@ void disassemble( StgBCO *bco )
    StgWord16*     instrs  = (StgWord16*)(bco->instrs->payload);
    StgMutArrPtrs* ptrs    = bco->ptrs;
    uint32_t       nbcs    = (uint32_t)(bco->instrs->bytes / sizeof(StgWord16));
-   uint32_t       pc      = 1;
+   uint32_t       pc      = 0;
 
    debugBelch("BCO\n" );
-   pc = 0;
    while (pc < nbcs) {
       debugBelch("\t%2d:  ", pc );
       pc = disInstr ( bco, pc );


=====================================
rts/Interpreter.c
=====================================
@@ -2085,6 +2085,10 @@ run_BCO:
             goto do_return_nonpointer;
         }
 
+        case bci_BCO_NAME:
+            bciPtr++;
+            goto nextInsn;
+
         case bci_SWIZZLE: {
             W_ stkoff = BCO_GET_LARGE_ARG;
             StgInt n = BCO_GET_LARGE_ARG;


=====================================
rts/RtsSymbols.c
=====================================
@@ -909,6 +909,7 @@ extern char **environ;
       SymI_NeedsDataProto(rts_stop_on_exception)                        \
       SymI_HasProto(stopTimer)                                          \
       SymI_HasProto(n_capabilities)                                     \
+      SymI_HasProto(max_n_capabilities)                                 \
       SymI_HasProto(enabled_capabilities)                               \
       SymI_HasDataProto(stg_traceEventzh)                                   \
       SymI_HasDataProto(stg_traceMarkerzh)                                  \


=====================================
rts/Schedule.c
=====================================
@@ -2288,9 +2288,12 @@ setNumCapabilities (uint32_t new_n_capabilities USED_IF_THREADS)
     } else if (new_n_capabilities <= 0) {
         errorBelch("setNumCapabilities: Capability count must be positive");
         return;
+    } else if (new_n_capabilities > max_n_capabilities) {
+        // See Note [Capabilities array sizing] in rts/Capability.c.
+        errorBelch("setNumCapabilities: Attempt to increase capability count beyond maximum capability count %" PRIu32 "; clamping...\n", max_n_capabilities);
+        new_n_capabilities = max_n_capabilities;
     }
 
-
     debugTrace(DEBUG_sched, "changing the number of Capabilities from %d to %d",
                enabled_capabilities, new_n_capabilities);
 


=====================================
rts/include/rts/Bytecodes.h
=====================================
@@ -112,6 +112,8 @@
 
 #define bci_PRIMCALL                    87
 
+#define bci_BCO_NAME                    88
+
 /* If you need to go past 255 then you will run into the flags */
 
 /* If you need to go below 0x0100 then you will run into the instructions */


=====================================
rts/include/rts/Config.h
=====================================
@@ -78,6 +78,10 @@ code.
 #endif
 
 #if defined(THREADED_RTS)
+/*
+ * See Note [Capabilities array sizing] in rts/Capability.c.
+ * Update the note in docs/users_guide/using-concurrent.rst when updating this.
+ */
 #define MAX_N_CAPABILITIES 256
 #else
 #define MAX_N_CAPABILITIES 1


=====================================
rts/include/rts/Threads.h
=====================================
@@ -69,7 +69,7 @@ HsBool rtsSupportsBoundThreads (void);
 // The number of Capabilities.
 // TODO: Ideally we would only provide getNumCapabilities
 // but this is used in compiler/cbits/genSym.c
-extern unsigned int n_capabilities;
+extern uint32_t n_capabilities;
 
 INLINE_HEADER unsigned int getNumCapabilities(void)
 { return RELAXED_LOAD(&n_capabilities); }
@@ -77,6 +77,10 @@ INLINE_HEADER unsigned int getNumCapabilities(void)
 // The number of Capabilities that are not disabled
 extern uint32_t enabled_capabilities;
 
+// The maximum number of Capabilities supported by the RTS.
+// See Note [Capabilities array sizing] in rts/Capability.c.
+extern uint32_t max_n_capabilities;
+
 #if !IN_STG_CODE
 extern Capability MainCapability;
 #endif


=====================================
rts/rts.cabal
=====================================
@@ -1,6 +1,6 @@
 cabal-version: 3.0
 name: rts
-version: 1.0.2
+version: 1.0.3
 synopsis: The GHC runtime system
 description:
     The GHC runtime system.


=====================================
testsuite/tests/rts/T25560.hs
=====================================
@@ -0,0 +1,4 @@
+import GHC.Conc
+
+main :: IO ()
+main = setNumCapabilities 100000


=====================================
testsuite/tests/rts/all.T
=====================================
@@ -630,3 +630,7 @@ test('T24142', [req_target_smp], compile_and_run, ['-threaded -with-rtsopts "-N2
 
 test('T25232', [unless(have_profiling(), skip), only_ways(['normal','nonmoving','nonmoving_prof','nonmoving_thr_prof']), extra_ways(['nonmoving', 'nonmoving_prof'] + (['nonmoving_thr_prof'] if have_threaded() else []))], compile_and_run, [''])
 test('T25280', [unless(opsys('linux'),skip),req_process,js_skip], compile_and_run, [''])
+
+# N.B. This will likely issue a warning on stderr but we merely care that the
+# program doesn't crash.
+test('T25560', [normal, ignore_stderr], compile_and_run, [''])


=====================================
testsuite/tests/th/T10279.hs
=====================================
@@ -7,4 +7,4 @@ import Language.Haskell.TH.Syntax
 -- error message doesn't recognize it as a source package ID,
 -- (This is OK,  since it will look obviously wrong when they
 -- try to find the package in their package database.)
-blah = $(conE (Name (mkOccName "Foo") (NameG VarName (mkPkgName "rts-1.0.2") (mkModName "A"))))
+blah = $(conE (Name (mkOccName "Foo") (NameG VarName (mkPkgName "rts-1.0.3") (mkModName "A"))))


=====================================
testsuite/tests/th/T10279.stderr
=====================================
@@ -1,11 +1,11 @@
 
 T10279.hs:10:9: error: [GHC-51294]
     • Failed to load interface for ‘A’.
-      no unit id matching ‘rts-1.0.2’ was found
+      no unit id matching ‘rts-1.0.3’ was found
       (This unit ID looks like the source package ID;
        the real unit ID is ‘rts’)
     • In the untyped splice:
         $(conE
             (Name
                (mkOccName "Foo")
-               (NameG VarName (mkPkgName "rts-1.0.2") (mkModName "A"))))
+               (NameG VarName (mkPkgName "rts-1.0.3") (mkModName "A"))))



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/15b659acccd1321946cd2ead3c7b839f4fa06813...e129a40bd4c4c391040fb16d53bddd00ff280a11

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/15b659acccd1321946cd2ead3c7b839f4fa06813...e129a40bd4c4c391040fb16d53bddd00ff280a11
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20241210/800c7eaf/attachment-0001.html>


More information about the ghc-commits mailing list