[Git][ghc/ghc][wip/marge_bot_batch_merge_job] 11 commits: rts/RtsFlags: Refactor size parsing

Sat Nov 16 17:51:04 UTC 2024


Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC


Commits:
77dad6d0 by Ben Gamari at 2024-11-16T12:50:32-05:00
rts/RtsFlags: Refactor size parsing

This makes a number of improvements mentioned in #20201:

 * fail if the argument cannot be parsed as a number (`-Mturtles`)
 * fail if an unrecognized unit is given (e.g. `-M1x`)

- - - - -
c1408534 by Ben Gamari at 2024-11-16T12:50:32-05:00
testsuite: Add tests for RTS flag parsing error handling

See #20201.

- - - - -
078f6bf6 by Ben Gamari at 2024-11-16T12:50:32-05:00
users guide: Mention language extensions in equality constraints discussion

As suggested in #24127, mention the language extensions necessary for
usage of equality constriants in their documentation.

Closes #24127.

- - - - -
3b120d1c by Ben Gamari at 2024-11-16T12:50:32-05:00
users-guide/9.14.1-notes: Fix list syntax

- - - - -
dd6a7f3c by Ben Gamari at 2024-11-16T12:50:32-05:00
users-guide/debug-info: Fix duplicate flag descriptions

- - - - -
875f5adc by Ben Gamari at 2024-11-16T12:50:32-05:00
users-guide: Fix reference to 9.14.1 release notes

- - - - -
e7af443e by Ben Gamari at 2024-11-16T12:50:33-05:00
Introduce GHC.Tc.Plugin.lookupTHName

This makes it significantly more convenient (and less
GHC-version-dependent) to resolve a template-haskell name into a GHC
Name.

As proposed in #24741.

- - - - -
77a9549f by ARATA Mizuki at 2024-11-16T12:50:36-05:00
x86 NCG SIMD: Lower packFloatX4#, insertFloatX4# and broadcastFloatX4# to SSE1 instructions

Fixes #25441

Co-authored-by: sheaf <sam.derbyshire at gmail.com>

- - - - -
875fbbdb by sheaf at 2024-11-16T12:50:38-05:00
X86 NCG: allow VXOR at scalar floating-point types

The NCG can emit VXOR instructions at scalar floating-point types,
but the pretty-printer would panic instead of emitting the appropriate
VXORPS/VXORPD instructions. This patch rectifies that oversight.

Fixes #25455

- - - - -
1f81821e by Ben Gamari at 2024-11-16T12:50:39-05:00
rts: Fix platform-dependent pointer casts

Previously we had unnecessary (and incorrect) platform-dependent casts
to turn `OSThreadIds`s into a integer. We now just uniformly cast first
to a `uintptr_t` (which is always safe, regardless of whether
`OSThreadId` is a pointer), and then cast to the desired integral type.

This fixes a warning on musl platforms.

- - - - -
35006fac by Ben Gamari at 2024-11-16T12:50:39-05:00
testsuite: Mark encoding004 as broken on FreeBSD

Due to #22003, CP936 fails to roundtrip:
```diff
 == CP936
+Failed to roundtrip given mutant byte at index 891 (251 /= 123 at index 891)
+Failed to roundtrip given mutant byte at index 1605 (197 /= 69 at index 1605)
+Failed to roundtrip given mutant byte at index 2411 (235 /= 107 at index 2411)
+Failed to roundtrip given mutant byte at index 6480 (208 /= 80 at index 6480)
+Failed to roundtrip given mutant byte at index 6482 (210 /= 82 at index 6482)
+Failed to roundtrip given mutant byte at index 6484 (212 /= 84 at index 6484)
+Failed to roundtrip given mutant byte at index 6496 (224 /= 96 at index 6496)
+Failed to roundtrip given mutant byte at index 7243 (203 /= 75 at index 7243)
+Failed to roundtrip given mutant byte at index 7277 (237 /= 109 at index 7277)
+Failed to roundtrip given mutant byte at index 8027 (219 /= 91 at index 8027)
+Failed to roundtrip given mutant byte at index 8801 (225 /= 97 at index 8801)
```

- - - - -


27 changed files:

- compiler/GHC/CmmToAsm/X86/CodeGen.hs
- compiler/GHC/CmmToAsm/X86/Instr.hs
- compiler/GHC/CmmToAsm/X86/Ppr.hs
- compiler/GHC/Tc/Plugin.hs
- docs/users_guide/9.14.1-notes.rst
- docs/users_guide/debug-info.rst
- docs/users_guide/exts/equality_constraints.rst
- docs/users_guide/release-notes.rst
- libraries/base/tests/IO/all.T
- rts/RtsFlags.c
- rts/Task.h
- + testsuite/tests/rts/T20201a.hs
- + testsuite/tests/rts/T20201a.stderr
- + testsuite/tests/rts/T20201b.hs
- + testsuite/tests/rts/T20201b.stderr
- testsuite/tests/rts/all.T
- + testsuite/tests/simd/should_run/T25455.hs
- + testsuite/tests/simd/should_run/T25455.stdout
- testsuite/tests/simd/should_run/all.T
- + testsuite/tests/simd/should_run/simd_insert.hs
- + testsuite/tests/simd/should_run/simd_insert.stdout
- + testsuite/tests/simd/should_run/simd_insert_array.hs
- + testsuite/tests/simd/should_run/simd_insert_array.stdout
- + testsuite/tests/simd/should_run/simd_insert_array_baseline.hs
- + testsuite/tests/simd/should_run/simd_insert_array_baseline.stdout
- + testsuite/tests/simd/should_run/simd_insert_baseline.hs
- + testsuite/tests/simd/should_run/simd_insert_baseline.stdout


Changes:

=====================================
compiler/GHC/CmmToAsm/X86/CodeGen.hs
=====================================
@@ -971,7 +971,6 @@ getRegister' _ _ (CmmMachOp mop []) =
   pprPanic "getRegister(x86): nullary MachOp" (text $ show mop)
 
 getRegister' platform is32Bit (CmmMachOp mop [x]) = do -- unary MachOps
-    sse4_1 <- sse4_1Enabled
     avx    <- avxEnabled
     case mop of
       MO_F_Neg w  -> sse2NegCode w x
@@ -1068,10 +1067,7 @@ getRegister' platform is32Bit (CmmMachOp mop [x]) = do -- unary MachOps
         | avx
         -> vector_float_broadcast_avx l w x
         | otherwise
-        -> case w of
-            W32 | not sse4_1
-              -> sorry "32-bit float broadcast requires -msse4 or -fllvm."
-            _ -> vector_float_broadcast_sse l w x
+        -> vector_float_broadcast_sse l w x
       MO_V_Broadcast l w
         -> vector_int_broadcast l w x
 
@@ -1217,6 +1213,7 @@ getRegister' platform is32Bit (CmmMachOp mop [x]) = do -- unary MachOps
 
         -----------------------
 
+        -- TODO: we could use VBROADCASTSS/SD when AVX2 is available.
         vector_float_broadcast_avx :: Length
                                    -> Width
                                    -> CmmExpr
@@ -1224,11 +1221,8 @@ getRegister' platform is32Bit (CmmMachOp mop [x]) = do -- unary MachOps
         vector_float_broadcast_avx len w expr = do
           (dst, exp) <- getSomeReg expr
           let fmt = VecFormat len (floatScalarFormat w)
-              code = case w of
-                W64 -> unitOL $ VSHUF fmt (ImmInt 0) (OpReg dst) dst dst
-                _   -> toOL [ INSERTPS fmt (ImmInt 0b00_10_0000) (OpReg dst) dst
-                            , VSHUF fmt (ImmInt 0) (OpReg dst) dst dst ]
-          return $ Fixed fmt dst (exp `appOL` code)
+              code = VSHUF fmt (ImmInt 0) (OpReg dst) dst dst
+          return $ Fixed fmt dst (exp `snocOL` code)
 
         vector_float_broadcast_sse :: Length
                                    -> Width
@@ -1237,11 +1231,8 @@ getRegister' platform is32Bit (CmmMachOp mop [x]) = do -- unary MachOps
         vector_float_broadcast_sse len w expr = do
           (dst, exp) <- getSomeReg expr
           let fmt = VecFormat len (floatScalarFormat w)
-              code = case w of
-                W64 -> unitOL $ SHUF fmt (ImmInt 0) (OpReg dst) dst
-                _   -> toOL [ INSERTPS fmt (ImmInt 0b00_10_0000) (OpReg dst) dst
-                            , SHUF fmt (ImmInt 0) (OpReg dst) dst ]
-          return $ Fixed fmt dst (exp `appOL` code)
+              code = SHUF fmt (ImmInt 0) (OpReg dst) dst
+          return $ Fixed fmt dst (exp `snocOL` code)
 
         vector_int_broadcast :: Length
                              -> Width
@@ -1801,9 +1792,12 @@ getRegister' platform _is32Bit (CmmMachOp mop [x, y, z]) = do -- ternary MachOps
         -> genFMA3Code l w var x y z
 
       -- Ternary vector operations
-      MO_VF_Insert l W32  | sse4_1 -> vector_float_insert_sse l x y z
-                          | otherwise
-                          -> sorry "FloatX4# operations require either -msse4 or -fllvm"
+      MO_VF_Insert l W32  | l == 4 -> vector_floatx4_insert_sse sse4_1 x y z
+                          | otherwise ->
+         sorry $ "FloatX" ++ show l ++ "# insert operations require -fllvm"
+           -- SIMD NCG TODO:
+           --
+           --   - add support for FloatX8, FloatX16.
       MO_VF_Insert l W64  -> vector_double_insert avx l x y z
       MO_V_Insert l W64   -> vector_int_insert_sse l W64 x y z
 
@@ -1814,31 +1808,59 @@ getRegister' platform _is32Bit (CmmMachOp mop [x, y, z]) = do -- ternary MachOps
     -- SIMD NCG TODO:
     --
     --   - add support for FloatX8, FloatX16.
-    vector_float_insert_sse :: Length
-                            -> CmmExpr
-                            -> CmmExpr
-                            -> CmmExpr
-                            -> NatM Register
-    -- FloatX4
-    vector_float_insert_sse len at 4 vecExpr valExpr (CmmLit (CmmInt offset _))
-      = do
-      (r, exp)    <- getNonClobberedReg valExpr
-      fn          <- getAnyReg vecExpr
-      let fmt      = VecFormat len FmtFloat
-          imm      = litToImm (CmmInt (offset `shiftL` 4) W32)
-          code dst = exp `appOL`
-                     (fn dst) `snocOL`
-                     (INSERTPS fmt imm (OpReg r) dst)
-       in return $ Any fmt code
-    vector_float_insert_sse len _ _ offset
+    vector_floatx4_insert_sse :: Bool
+                              -> CmmExpr
+                              -> CmmExpr
+                              -> CmmExpr
+                              -> NatM Register
+    vector_floatx4_insert_sse sse4_1 vecExpr valExpr (CmmLit (CmmInt offset _))
+      | sse4_1 = do
+        (r, exp)    <- getNonClobberedReg valExpr
+        fn          <- getAnyReg vecExpr
+        let fmt      = VecFormat 4 FmtFloat
+            imm      = litToImm (CmmInt (offset `shiftL` 4) W32)
+            code dst = exp `appOL`
+                      (fn dst) `snocOL`
+                      (INSERTPS fmt imm (OpReg r) dst)
+         in return $ Any fmt code
+      | otherwise = do -- SSE <= 3
+        (r, exp)    <- getNonClobberedReg valExpr
+        fn          <- getAnyReg vecExpr
+        let fmt      = VecFormat 4 FmtFloat
+        tmp <- getNewRegNat fmt
+        let code dst
+              = case offset of
+                  0 -> exp `appOL`
+                      (fn dst) `snocOL`
+                      -- The following MOV compiles to MOVSS instruction and merges two vectors
+                      (MOV fmt (OpReg r) (OpReg dst))  -- dst <- (r[0],dst[1],dst[2],dst[3])
+                  1 -> exp `appOL`
+                      (fn dst) `snocOL`
+                      (MOVU fmt (OpReg dst) (OpReg tmp)) `snocOL`  -- tmp <- dst
+                      (UNPCKL fmt (OpReg r) dst) `snocOL`          -- dst <- (dst[0],r[0],dst[1],r[1])
+                      (SHUF fmt (ImmInt 0xe4) (OpReg tmp) dst)     -- dst <- (dst[0],dst[1],tmp[2],tmp[3])
+                  2 -> exp `appOL`
+                       (fn dst) `snocOL`
+                       (MOVU fmt (OpReg dst) (OpReg tmp)) `snocOL`  -- tmp <- dst
+                       (MOV fmt (OpReg r) (OpReg tmp)) `snocOL`     -- tmp <- (r[0],tmp[1],tmp[2],tmp[3]) with MOVSS
+                       (SHUF fmt (ImmInt 0xc4) (OpReg tmp) dst)     -- dst <- (dst[0],dst[1],tmp[0],tmp[3])
+                  3 -> exp `appOL`
+                       (fn dst) `snocOL`
+                       (MOVU fmt (OpReg dst) (OpReg tmp)) `snocOL`  -- tmp <- dst
+                       (MOV fmt (OpReg r) (OpReg tmp)) `snocOL`     -- tmp <- (r[0],tmp[1],tmp[2],tmp[3]) with MOVSS
+                       (SHUF fmt (ImmInt 0x24) (OpReg tmp) dst)     -- dst <- (dst[0],dst[1],tmp[2],tmp[0])
+                  _ -> panic "MO_VF_Insert FloatX4: unsupported offset"
+         in return $ Any fmt code
+    vector_floatx4_insert_sse _ _ _ offset
       = pprPanic "Unsupported vector insert operation" $
           vcat
-            [ text "FloatX" <> ppr len <> text "#"
+            [ text "FloatX4#"
             , text "offset:" <+> pdoc platform offset ]
 
+
     -- SIMD NCG TODO:
     --
-    --   - add support for FloatX8, FloatX16.
+    --   - add support for DoubleX4#, DoubleX8#.
     vector_double_insert :: Bool
                          -> Length
                          -> CmmExpr
@@ -1857,6 +1879,7 @@ getRegister' platform _is32Bit (CmmMachOp mop [x, y, z]) = do -- ternary MachOps
                   CmmInt 0 _ -> valExp `appOL`
                                 vecExp `snocOL`
                                 (movu (VecFormat 2 FmtDouble) (OpReg vecReg) (OpReg dst)) `snocOL`
+                                -- The following MOV compiles to MOVSD instruction and merges two vectors
                                 (MOV (VecFormat 2 FmtDouble) (OpReg valReg) (OpReg dst))
                   CmmInt 1 _ -> valExp `appOL`
                                 vecExp `snocOL`


=====================================
compiler/GHC/CmmToAsm/X86/Instr.hs
=====================================
@@ -325,6 +325,7 @@ data Instr
         -- | Move two 32-bit floats from the high part of an xmm register
         -- to the low part of another xmm register.
         | MOVHLPS    Format Reg Reg
+        | UNPCKL     Format Operand Reg
         | PUNPCKLQDQ Format Operand Reg
 
         -- Shift
@@ -524,6 +525,8 @@ regUsageOfInstr platform instr
 
     MOVHLPS    fmt src dst
       -> mkRU [mk fmt src] [mk fmt dst]
+    UNPCKL fmt src dst
+      -> mkRU (use_R fmt src [mk fmt dst]) [mk fmt dst]
     PUNPCKLQDQ fmt src dst
       -> mkRU (use_R fmt src [mk fmt dst]) [mk fmt dst]
 
@@ -765,6 +768,8 @@ patchRegsOfInstr platform instr env
 
     MOVHLPS    fmt src dst
       -> MOVHLPS fmt (env src) (env dst)
+    UNPCKL fmt src dst
+      -> UNPCKL fmt (patchOp src) (env dst)
     PUNPCKLQDQ fmt src dst
       -> PUNPCKLQDQ fmt (patchOp src) (env dst)
 


=====================================
compiler/GHC/CmmToAsm/X86/Ppr.hs
=====================================
@@ -1041,6 +1041,8 @@ pprInstr platform i = case i of
 
    MOVHLPS format from to
      -> pprOpReg (text "movhlps") format (OpReg from) to
+   UNPCKL format src dst
+     -> pprFormatOpReg (text "unpckl") format src dst
    PUNPCKLQDQ format from to
      -> pprOpReg (text "punpcklqdq") format from to
 
@@ -1343,9 +1345,11 @@ pprInstr platform i = case i of
        ]
      where
       mem = case fmt of
+        FF32 -> text "vxorps"
+        FF64 -> text "vxorpd"
         VecFormat _ FmtFloat -> text "vxorps"
         VecFormat _ FmtDouble -> text "vxorpd"
-        _ -> pprPanic "GHC.CmmToAsm.X86.Ppr.pprVxor: elementy type must be Float or Double"
+        _ -> pprPanic "GHC.CmmToAsm.X86.Ppr.pprVxor: element type must be Float or Double"
               (ppr fmt)
 
    pprInsert :: Line doc -> Format -> Imm -> Operand -> Reg -> doc


=====================================
compiler/GHC/Tc/Plugin.hs
=====================================
@@ -15,6 +15,7 @@ module GHC.Tc.Plugin (
         lookupOrig,
 
         -- * Looking up Names in the typechecking environment
+        lookupTHName,
         tcLookupGlobal,
         tcLookupTyCon,
         tcLookupDataCon,
@@ -74,6 +75,7 @@ import GHC.Tc.Utils.Env        ( TcTyThing )
 import GHC.Tc.Types.Evidence   ( CoercionHole, EvTerm(..)
                                , EvExpr, EvBindsVar, EvBind, mkGivenEvBind )
 import GHC.Types.Var           ( EvVar )
+import GHC.Plugins             ( thNameToGhcNameIO )
 
 import GHC.Unit.Module    ( ModuleName, Module )
 import GHC.Types.Name     ( OccName, Name )
@@ -90,6 +92,7 @@ import GHC.Core.InstEnv     ( InstEnvs )
 import GHC.Types.Unique     ( Unique )
 import GHC.Types.PkgQual    ( PkgQual )
 
+import qualified GHC.Internal.TH.Syntax as TH
 
 -- | Perform some IO, typically to interact with an external tool.
 tcPluginIO :: IO a -> TcPluginM a
@@ -108,6 +111,13 @@ findImportedModule mod_name mb_pkg = do
 lookupOrig :: Module -> OccName -> TcPluginM Name
 lookupOrig mod = unsafeTcPluginTcM . IfaceEnv.lookupOrig mod
 
+-- | Resolve a @template-haskell@ 'TH.Name' to a GHC 'Name'.
+--
+-- @since 9.14.1
+lookupTHName :: TH.Name -> TcPluginM (Maybe Name)
+lookupTHName th = do
+    nc <- hsc_NC <$> getTopEnv
+    tcPluginIO $ thNameToGhcNameIO nc th
 
 tcLookupGlobal :: Name -> TcPluginM TyThing
 tcLookupGlobal = unsafeTcPluginTcM . TcM.tcLookupGlobal


=====================================
docs/users_guide/9.14.1-notes.rst
=====================================
@@ -51,7 +51,7 @@ Cmm
 ~~~~~~~~~~~~~~~~~~~~
 
 * The functions `getClosureInfoTbl_maybe`, `getClosureInfoTbl`,
- `getClosurePtrArgs` and `getClosurePtrArgs_maybe` have been added to allow
+  `getClosurePtrArgs` and `getClosurePtrArgs_maybe` have been added to allow
   reading of the relevant Closure attributes without reliance on incomplete
   selectors.
 


=====================================
docs/users_guide/debug-info.rst
=====================================
@@ -395,18 +395,6 @@ to a source location. This lookup table is generated by using the ``-finfo-table
     :implies: :ghc-flag:`-finfo-table-map-with-stack`
     :implies: :ghc-flag:`-finfo-table-map-with-fallback`
 
-.. ghc-flag:: -finfo-table-map-with-stack
-    :shortdesc: Include info tables for ``STACK`` closures in the info table
-                map.
-    :type: dynamic
-    :reverse: -fno-info-table-map-with-stack
-    :category: debugging
-
-    :since: 9.10
-
-    Include info tables for ``STACK`` closures in the info table map. Note that
-    this flag is implied by :ghc-flag:`-finfo-table-map`.
-
 .. ghc-flag:: -fno-info-table-map-with-stack
     :shortdesc: Omit info tables for ``STACK`` closures from the info table
                 map.
@@ -423,6 +411,9 @@ to a source location. This lookup table is generated by using the ``-finfo-table
     ``STACK`` info tables from the info table map and decrease the size of
     executables with info table profiling information.
 
+    Note that :ghc-flag:`-finfo-table-map-with-stack` is implied by
+    :ghc-flag:`-finfo-table-map`.
+
 .. ghc-flag:: -finfo-table-map-with-fallback
     :shortdesc: Include info tables with no source location information in the
                 info table map.


=====================================
docs/users_guide/exts/equality_constraints.rst
=====================================
@@ -6,11 +6,11 @@ Equality constraints and Coercible constraint
 Equality constraints
 --------------------
 
-A type context can include equality constraints of the form ``t1 ~ t2``,
-which denote that the types ``t1`` and ``t2`` need to be the same. In
-the presence of type families, whether two types are equal cannot
-generally be decided locally. Hence, the contexts of function signatures
-may include equality constraints, as in the following example: ::
+When :extension:`TypeOperators` are enabled, a type context can include equality
+constraints of the form ``t1 ~ t2``, which denote that the types ``t1`` and
+``t2`` need to be the same. In the presence of type families, whether two types
+are equal cannot generally be decided locally. Hence, the contexts of function
+signatures may include equality constraints, as in the following example: ::
 
     sumCollects :: (Collects c1, Collects c2, Elem c1 ~ Elem c2) => c1 -> c2 -> c2
 


=====================================
docs/users_guide/release-notes.rst
=====================================
@@ -4,4 +4,4 @@ Release notes
 .. toctree::
    :maxdepth: 1
 
-   9.12.1-notes
+   9.14.1-notes


=====================================
libraries/base/tests/IO/all.T
=====================================
@@ -137,17 +137,20 @@ test('encoding001', [], compile_and_run, [''])
 
 test('encoding002', normal, compile_and_run, [''])
 test('encoding003', normal, compile_and_run, [''])
-test('encoding004', [extra_files(['encoded-data/']), js_broken(22374),
-# wasi-libc doesn't have cp936, see
-# https://gitlab.haskell.org/ghc/wasi-libc/-/blob/main/libc-top-half/musl/src/locale/iconv.c#L38
-# and
-# https://gitlab.haskell.org/ghc/wasi-libc/-/blob/main/libc-top-half/musl/src/locale/codepages.h
-# for locales supported by wasi-libc's iconv implementation
-when(arch('wasm32'), skip),
-# MacOS Sonoma iconv() has a regression that causes this test to fail on the
-# CP936 roundtrip. See the ticket for related issues in other projects.
-when(opsys('darwin'), fragile(24161))
-], compile_and_run, [''])
+test('encoding004',
+     [ extra_files(['encoded-data/']),
+       js_broken(22374),
+       # wasi-libc doesn't have cp936, see
+       # https://gitlab.haskell.org/ghc/wasi-libc/-/blob/main/libc-top-half/musl/src/locale/iconv.c#L38
+       # and
+       # https://gitlab.haskell.org/ghc/wasi-libc/-/blob/main/libc-top-half/musl/src/locale/codepages.h
+       # for locales supported by wasi-libc's iconv implementation
+       when(arch('wasm32'), skip),
+       # MacOS Sonoma iconv() has a regression that causes this test to fail on the
+       # CP936 roundtrip. See the ticket for related issues in other projects.
+       when(opsys('darwin'), fragile(24161)),
+       when(opsys('freebsd'), expect_broken(22003))
+     ], compile_and_run, [''])
 test('encoding005', normal, compile_and_run, [''])
 
 test('environment001', [], makefile_test, ['environment001-test'])


=====================================
rts/RtsFlags.c
=====================================
@@ -2148,7 +2148,6 @@ static void initStatsFile (FILE *f)
 static StgWord64
 decodeSize(const char *flag, uint32_t offset, StgWord64 min, StgWord64 max)
 {
-    char c;
     const char *s;
     StgDouble m;
     StgWord64 val;
@@ -2161,19 +2160,47 @@ decodeSize(const char *flag, uint32_t offset, StgWord64 min, StgWord64 max)
     }
     else
     {
-        m = atof(s);
-        c = s[strlen(s)-1];
-
-        if (c == 't' || c == 'T')
-            m *= (StgWord64)1024*1024*1024*1024;
-        else if (c == 'g' || c == 'G')
-            m *= 1024*1024*1024;
-        else if (c == 'm' || c == 'M')
-            m *= 1024*1024;
-        else if (c == 'k' || c == 'K')
-            m *= 1024;
-        else if (c == 'w' || c == 'W')
-            m *= sizeof(W_);
+        char *end;
+        m = strtod(s, &end);
+
+        if (end == s) {
+            errorBelch("error in RTS option %s: unable to parse number '%s'", flag, s);
+            stg_exit(EXIT_FAILURE);
+        }
+
+        StgWord64 unit;
+        switch (*end) {
+        case 't':
+        case 'T':
+            unit = (StgWord64)1024*1024*1024*1024;
+            break;
+        case 'g':
+        case 'G':
+            unit = 1024*1024*1024;
+            break;
+        case 'm':
+        case 'M':
+            unit = 1024*1024;
+            break;
+        case 'k':
+        case 'K':
+            unit = 1024;
+            break;
+        case 'w':
+        case 'W':
+            unit = sizeof(W_);
+            break;
+        case 'b':
+        case 'B':
+        case '\0':
+            unit = 1;
+            break;
+        default:
+            errorBelch("error in RTS option %s: unknown unit suffix '%c'", flag, *end);
+            stg_exit(EXIT_FAILURE);
+        }
+
+        m *= unit;
     }
 
     val = (StgWord64)m;


=====================================
rts/Task.h
=====================================
@@ -314,18 +314,12 @@ typedef StgWord64 TaskId;
 //
 #if defined(THREADED_RTS)
 INLINE_HEADER TaskId serialiseTaskId (OSThreadId taskID) {
-#if defined(freebsd_HOST_OS) || defined(darwin_HOST_OS)
-    // Here OSThreadId is a pthread_t and pthread_t is a pointer, but within
+    // Here OSThreadId may be a pthread_t and pthread_t is a pointer, but within
     // the process we can still use that pointer value as a unique id.
-    return (TaskId) (size_t) taskID;
-#else
-    // On Windows, Linux and others it's an integral type to start with.
-    return (TaskId) taskID;
-#endif
+    return (TaskId) (uintptr_t) taskID;
 }
 #endif
 
-//
 // Get a serialisable Id for the Task's OS thread
 // Needed mainly for logging since the OSThreadId is an opaque type
 INLINE_HEADER TaskId
@@ -334,7 +328,7 @@ serialisableTaskId (Task *task)
 #if defined(THREADED_RTS)
     return serialiseTaskId(task->id);
 #else
-    return (TaskId) (size_t) task;
+    return (TaskId) (uintptr_t) task;
 #endif
 }
 


=====================================
testsuite/tests/rts/T20201a.hs
=====================================
@@ -0,0 +1 @@
+main = putStrLn "hi"


=====================================
testsuite/tests/rts/T20201a.stderr
=====================================
@@ -0,0 +1 @@
+T20201a: error in RTS option -AturtlesM: unable to parse number 'turtlesM'


=====================================
testsuite/tests/rts/T20201b.hs
=====================================
@@ -0,0 +1,2 @@
+main = putStrLn "hi"
+


=====================================
testsuite/tests/rts/T20201b.stderr
=====================================
@@ -0,0 +1 @@
+T20201b: error in RTS option -A64z: unknown unit suffix 'z'


=====================================
testsuite/tests/rts/all.T
=====================================
@@ -587,6 +587,10 @@ test('decodeMyStack_emptyListForMissingFlag',
   , js_broken(22261) # cloneMyStack# not yet implemented
   ], compile_and_run, [''])
 
+# Tests RTS flag parsing. Skipped on JS as it uses a distinct RTS.
+test('T20201a', [js_skip, exit_code(1)], compile_and_run, ['-with-rtsopts -AturtlesM'])
+test('T20201b', [js_skip, exit_code(1)], compile_and_run, ['-with-rtsopts -A64z'])
+
 test('T22012', [js_skip, extra_ways(['ghci'])], compile_and_run, ['T22012_c.c'])
 
 # Skip for JS platform as the JS RTS is always single threaded


=====================================
testsuite/tests/simd/should_run/T25455.hs
=====================================
@@ -0,0 +1,14 @@
+{-# LANGUAGE MagicHash, UnboxedTuples #-}
+
+module Main where
+
+import GHC.Exts
+
+unpackFloatX4 :: FloatX4# -> (Float, Float, Float, Float)
+unpackFloatX4 v = case unpackFloatX4# v of
+  (# a0, a1, a2, a3 #) -> (F# a0, F# a1, F# a2, F# a3)
+
+main :: IO ()
+main = do
+    let v = packFloatX4# (# 0.0#, 1.0#, 2.0#, 3.0# #)
+    print $ unpackFloatX4 v


=====================================
testsuite/tests/simd/should_run/T25455.stdout
=====================================
@@ -0,0 +1 @@
+(0.0,1.0,2.0,3.0)


=====================================
testsuite/tests/simd/should_run/all.T
=====================================
@@ -12,13 +12,18 @@ setTestOpts(
   , when(unregisterised(), skip)
   , when(arch('wasm32'), skip)
   , js_skip
+  ])
+
+test('simd_insert_baseline', [], compile_and_run, [''])
+test('simd_insert_array_baseline', [], compile_and_run, [''])
 
-  # Ensure we set the CPU features we have available.
-  #
-  # This is especially important with the LLVM backend, as LLVM can otherwise
-  # produce ABI-incompatible code, e.g. when compiling usage of YMM registers
-  # with or without -mavx2.
-  , when(have_cpu_feature('sse4_1'), extra_hc_opts('-msse4'))
+# Ensure we set the CPU features we have available.
+#
+# This is especially important with the LLVM backend, as LLVM can otherwise
+# produce ABI-incompatible code, e.g. when compiling usage of YMM registers
+# with or without -mavx2.
+setTestOpts(
+  [ when(have_cpu_feature('sse4_1'), extra_hc_opts('-msse4'))
   , when(have_cpu_feature('avx'), extra_hc_opts('-mavx'))
   , when(have_cpu_feature('avx2'), extra_hc_opts('-mavx2'))
   , when(have_cpu_feature('avx512f'), extra_hc_opts('-mavx512f'))
@@ -55,6 +60,9 @@ test('simd014',
         # register on non-x86 architectures.
      compile_and_run, ['simd014Cmm.cmm'])
 
+test('simd_insert', [], compile_and_run, [''])
+test('simd_insert_array', [], compile_and_run, [''])
+
 test('T22187', [],compile,[''])
 test('T22187_run', [],compile_and_run,[''])
 test('T25062_V16', [], compile_and_run, [''])
@@ -81,3 +89,4 @@ test('T25062_V64'
     , [''])
 
 test('T25169', [], compile_and_run, [''])
+test('T25455', [], compile_and_run, [''])


=====================================
testsuite/tests/simd/should_run/simd_insert.hs
=====================================
@@ -0,0 +1,37 @@
+{-# LANGUAGE MagicHash, UnboxedTuples #-}
+import GHC.Exts
+
+unpackFloatX4 :: FloatX4# -> (Float, Float, Float, Float)
+unpackFloatX4 v = case unpackFloatX4# v of
+  (# a0, a1, a2, a3 #) -> (F# a0, F# a1, F# a2, F# a3)
+
+unpackDoubleX2 :: DoubleX2# -> (Double, Double)
+unpackDoubleX2 v = case unpackDoubleX2# v of
+  (# a0, a1 #) -> (D# a0, D# a1)
+
+testFloatX4 :: IO ()
+testFloatX4 = do
+  let v = packFloatX4# (# 0.1#, 1.0#, 2.0#, 3.0# #)
+  print $ unpackFloatX4 v
+  let w = insertFloatX4# v 7.0# 0#
+  print $ unpackFloatX4 w
+  let x = insertFloatX4# v 7.0# 1#
+  print $ unpackFloatX4 x
+  let y = insertFloatX4# v 7.0# 2#
+  print $ unpackFloatX4 y
+  let z = insertFloatX4# v 7.0# 3#
+  print $ unpackFloatX4 z
+
+testDoubleX2 :: IO ()
+testDoubleX2 = do
+  let v = packDoubleX2# (# 0.1##, 1.0## #)
+  print $ unpackDoubleX2 v
+  let w = insertDoubleX2# v 7.0## 0#
+  print $ unpackDoubleX2 w
+  let x = insertDoubleX2# v 7.0## 1#
+  print $ unpackDoubleX2 x
+
+main :: IO ()
+main = do
+  testFloatX4
+  testDoubleX2


=====================================
testsuite/tests/simd/should_run/simd_insert.stdout
=====================================
@@ -0,0 +1,8 @@
+(0.1,1.0,2.0,3.0)
+(7.0,1.0,2.0,3.0)
+(0.1,7.0,2.0,3.0)
+(0.1,1.0,7.0,3.0)
+(0.1,1.0,2.0,7.0)
+(0.1,1.0)
+(7.0,1.0)
+(0.1,7.0)


=====================================
testsuite/tests/simd/should_run/simd_insert_array.hs
=====================================
@@ -0,0 +1,49 @@
+{-# LANGUAGE MagicHash, UnboxedTuples #-}
+import Control.Monad
+import Data.Array.Base
+import GHC.Exts
+
+unpackFloatX4 :: FloatX4# -> (Float, Float, Float, Float)
+unpackFloatX4 v = case unpackFloatX4# v of
+  (# a0, a1, a2, a3 #) -> (F# a0, F# a1, F# a2, F# a3)
+
+unpackDoubleX2 :: DoubleX2# -> (Double, Double)
+unpackDoubleX2 v = case unpackDoubleX2# v of
+  (# a0, a1 #) -> (D# a0, D# a1)
+
+indexFloatArrayAsFloatX4 :: UArray Int Float -> Int -> FloatX4#
+indexFloatArrayAsFloatX4 (UArray l u n ba) i = case i - l of I# i# -> indexFloatArrayAsFloatX4# ba i#
+
+indexDoubleArrayAsDoubleX2 :: UArray Int Double -> Int -> DoubleX2#
+indexDoubleArrayAsDoubleX2 (UArray l u n ba) i = case i - l of I# i# -> indexDoubleArrayAsDoubleX2# ba i#
+
+someFloatArray :: UArray Int Float
+someFloatArray = listArray (0, 7) [111.0, 222.0, 333.0, 444.0, 555.0, 666.0, 777.0, 888.0]
+
+someDoubleArray :: UArray Int Double
+someDoubleArray = listArray (0, 7) [111.0, 222.0, 333.0, 444.0, 555.0, 666.0, 777.0, 888.0]
+
+testFloatX4 :: IO ()
+testFloatX4 = forM_ [0,4] $ \i -> do
+  let v = indexFloatArrayAsFloatX4 someFloatArray i
+  let w = insertFloatX4# v 123.45# 0#
+  print $ unpackFloatX4 w
+  let x = insertFloatX4# v 123.45# 1#
+  print $ unpackFloatX4 x
+  let y = insertFloatX4# v 123.45# 2#
+  print $ unpackFloatX4 y
+  let z = insertFloatX4# v 123.45# 3#
+  print $ unpackFloatX4 z
+
+testDoubleX2 :: IO ()
+testDoubleX2 = forM_ [0,2,4,6] $ \i -> do
+  let v = indexDoubleArrayAsDoubleX2 someDoubleArray i
+  let w = insertDoubleX2# v 123.45## 0#
+  print $ unpackDoubleX2 w
+  let x = insertDoubleX2# v 123.45## 1#
+  print $ unpackDoubleX2 x
+
+main :: IO ()
+main = do
+  testFloatX4
+  testDoubleX2


=====================================
testsuite/tests/simd/should_run/simd_insert_array.stdout
=====================================
@@ -0,0 +1,16 @@
+(123.45,222.0,333.0,444.0)
+(111.0,123.45,333.0,444.0)
+(111.0,222.0,123.45,444.0)
+(111.0,222.0,333.0,123.45)
+(123.45,666.0,777.0,888.0)
+(555.0,123.45,777.0,888.0)
+(555.0,666.0,123.45,888.0)
+(555.0,666.0,777.0,123.45)
+(123.45,222.0)
+(111.0,123.45)
+(123.45,444.0)
+(333.0,123.45)
+(123.45,666.0)
+(555.0,123.45)
+(123.45,888.0)
+(777.0,123.45)


=====================================
testsuite/tests/simd/should_run/simd_insert_array_baseline.hs
=====================================
@@ -0,0 +1,49 @@
+{-# LANGUAGE MagicHash, UnboxedTuples #-}
+import Control.Monad
+import Data.Array.Base
+import GHC.Exts
+
+unpackFloatX4 :: FloatX4# -> (Float, Float, Float, Float)
+unpackFloatX4 v = case unpackFloatX4# v of
+  (# a0, a1, a2, a3 #) -> (F# a0, F# a1, F# a2, F# a3)
+
+unpackDoubleX2 :: DoubleX2# -> (Double, Double)
+unpackDoubleX2 v = case unpackDoubleX2# v of
+  (# a0, a1 #) -> (D# a0, D# a1)
+
+indexFloatArrayAsFloatX4 :: UArray Int Float -> Int -> FloatX4#
+indexFloatArrayAsFloatX4 (UArray l u n ba) i = case i - l of I# i# -> indexFloatArrayAsFloatX4# ba i#
+
+indexDoubleArrayAsDoubleX2 :: UArray Int Double -> Int -> DoubleX2#
+indexDoubleArrayAsDoubleX2 (UArray l u n ba) i = case i - l of I# i# -> indexDoubleArrayAsDoubleX2# ba i#
+
+someFloatArray :: UArray Int Float
+someFloatArray = listArray (0, 7) [111.0, 222.0, 333.0, 444.0, 555.0, 666.0, 777.0, 888.0]
+
+someDoubleArray :: UArray Int Double
+someDoubleArray = listArray (0, 7) [111.0, 222.0, 333.0, 444.0, 555.0, 666.0, 777.0, 888.0]
+
+testFloatX4 :: IO ()
+testFloatX4 = forM_ [0,4] $ \i -> do
+  let v = indexFloatArrayAsFloatX4 someFloatArray i
+  let w = insertFloatX4# v 123.45# 0#
+  print $ unpackFloatX4 w
+  let x = insertFloatX4# v 123.45# 1#
+  print $ unpackFloatX4 x
+  let y = insertFloatX4# v 123.45# 2#
+  print $ unpackFloatX4 y
+  let z = insertFloatX4# v 123.45# 3#
+  print $ unpackFloatX4 z
+
+testDoubleX2 :: IO ()
+testDoubleX2 = forM_ [0,2,4,6] $ \i -> do
+  let v = indexDoubleArrayAsDoubleX2 someDoubleArray i
+  let w = insertDoubleX2# v 123.45## 0#
+  print $ unpackDoubleX2 w
+  let x = insertDoubleX2# v 123.45## 1#
+  print $ unpackDoubleX2 x
+
+main :: IO ()
+main = do
+  testFloatX4
+  testDoubleX2


=====================================
testsuite/tests/simd/should_run/simd_insert_array_baseline.stdout
=====================================
@@ -0,0 +1,16 @@
+(123.45,222.0,333.0,444.0)
+(111.0,123.45,333.0,444.0)
+(111.0,222.0,123.45,444.0)
+(111.0,222.0,333.0,123.45)
+(123.45,666.0,777.0,888.0)
+(555.0,123.45,777.0,888.0)
+(555.0,666.0,123.45,888.0)
+(555.0,666.0,777.0,123.45)
+(123.45,222.0)
+(111.0,123.45)
+(123.45,444.0)
+(333.0,123.45)
+(123.45,666.0)
+(555.0,123.45)
+(123.45,888.0)
+(777.0,123.45)


=====================================
testsuite/tests/simd/should_run/simd_insert_baseline.hs
=====================================
@@ -0,0 +1,37 @@
+{-# LANGUAGE MagicHash, UnboxedTuples #-}
+import GHC.Exts
+
+unpackFloatX4 :: FloatX4# -> (Float, Float, Float, Float)
+unpackFloatX4 v = case unpackFloatX4# v of
+  (# a0, a1, a2, a3 #) -> (F# a0, F# a1, F# a2, F# a3)
+
+unpackDoubleX2 :: DoubleX2# -> (Double, Double)
+unpackDoubleX2 v = case unpackDoubleX2# v of
+  (# a0, a1 #) -> (D# a0, D# a1)
+
+testFloatX4 :: IO ()
+testFloatX4 = do
+  let v = packFloatX4# (# 0.1#, 1.0#, 2.0#, 3.0# #)
+  print $ unpackFloatX4 v
+  let w = insertFloatX4# v 7.0# 0#
+  print $ unpackFloatX4 w
+  let x = insertFloatX4# v 7.0# 1#
+  print $ unpackFloatX4 x
+  let y = insertFloatX4# v 7.0# 2#
+  print $ unpackFloatX4 y
+  let z = insertFloatX4# v 7.0# 3#
+  print $ unpackFloatX4 z
+
+testDoubleX2 :: IO ()
+testDoubleX2 = do
+  let v = packDoubleX2# (# 0.1##, 1.0## #)
+  print $ unpackDoubleX2 v
+  let w = insertDoubleX2# v 7.0## 0#
+  print $ unpackDoubleX2 w
+  let x = insertDoubleX2# v 7.0## 1#
+  print $ unpackDoubleX2 x
+
+main :: IO ()
+main = do
+  testFloatX4
+  testDoubleX2


=====================================
testsuite/tests/simd/should_run/simd_insert_baseline.stdout
=====================================
@@ -0,0 +1,8 @@
+(0.1,1.0,2.0,3.0)
+(7.0,1.0,2.0,3.0)
+(0.1,7.0,2.0,3.0)
+(0.1,1.0,7.0,3.0)
+(0.1,1.0,2.0,7.0)
+(0.1,1.0)
+(7.0,1.0)
+(0.1,7.0)



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/ed17b4c500665cf4db48f2600120ecca805f9765...35006face779416fabcd8d11366de55d6b570f84

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/ed17b4c500665cf4db48f2600120ecca805f9765...35006face779416fabcd8d11366de55d6b570f84
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20241116/e00db882/attachment-0001.html>