[Git][ghc/ghc][wip/marge_bot_batch_merge_job] 24 commits: hadrian: Add ghcToolchain to tool args list

Marge Bot (@marge-bot) gitlab at gitlab.haskell.org
Tue Oct 10 16:10:49 UTC 2023



Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC


Commits:
c83ae3b8 by Matthew Pickering at 2023-10-10T12:56:54+00:00
hadrian: Add ghcToolchain to tool args list

This allows you to load ghc-toolchain and ghc-toolchain-bin into HLS.

- - - - -
8fbe6b29 by Matthew Pickering at 2023-10-10T12:56:54+00:00
ghc-toolchain: Normalise triple via config.sub

We were not normalising the target triple anymore like we did with the
old make build system.

Fixes #23856

- - - - -
d4cd6785 by Matthew Pickering at 2023-10-10T12:56:54+00:00
ghc-toolchain: Add missing vendor normalisation

This is copied from m4/ghc_convert_vendor.m4

Towards #23868

- - - - -
671bb1ce by Matthew Pickering at 2023-10-10T12:56:54+00:00
ghc-toolchain: Add loongarch64 to parseArch

Towards #23868

- - - - -
4725c6b7 by Matthew Pickering at 2023-10-10T12:56:54+00:00
Add same LD hack to ghc-toolchain

In the ./configure script, if you pass the `LD` variable then this has
the effect of stopping use searching for a linker and hence passing
`-fuse-ld=...`.

We want to emulate this logic in ghc-toolchain, if a use explicilty
specifies `LD` variable then don't add `-fuse-ld=..` with the goal of
making ./configure and ghc-toolchain agree on which flags to use when
using the C compiler as a linker.

This is quite unsavoury as we don't bake the choice of LD into the
configuration anywhere but what's important for now is making
ghc-toolchain and ./configure agree as much as possible.

See #23857 for more discussion

- - - - -
ab592988 by Ben Gamari at 2023-10-10T12:56:54+00:00
ghc-toolchain: Check for C99 support with -std=c99

Previously we failed to try enabling C99 support with `-std=c99`, as
`autoconf` attempts. This broke on older compilers (e.g. CentOS 7) which
don't enable C99 by default.

Fixes #23879.

- - - - -
10db17bc by Matthew Pickering at 2023-10-10T12:56:54+00:00
ghc-toolchain: Add endianess check using __BYTE_ORDER__ macro

In very old toolchains the BYTE_ORDER macro is not set but thankfully
the __BYTE_ORDER__ macro can be used instead.

- - - - -
28db40d3 by Matthew Pickering at 2023-10-10T12:56:54+00:00
configure: AC_PATH_TARGET_TOOL for LD

We want to make sure that LD is set to an absolute path in order to be
consistent with the `LD=$(command -v ld)` call. The AC_PATH_TARGET_TOOL
macro uses the absolute path rather than AC_CHECK_TARGET_TOOL which
might use a relative path.

- - - - -
822ac14d by Matthew Pickering at 2023-10-10T12:56:54+00:00
ghc-toolchain: Check whether we need -std=gnu99 for CPP as well

In ./configure the C99 flag is passed to the C compiler when used as a C
preprocessor. So we also check the same thing in ghc-toolchain.

- - - - -
7c8219c3 by Matthew Pickering at 2023-10-10T12:56:54+00:00
Check for --target linker flag separately to C compiler

There are situations where the C compiler doesn't accept `--target` but
when used as a linker it does (but doesn't do anything most likely)

In particular with old gcc toolchains, the C compiler doesn't support
--target but when used as a linker it does.

- - - - -
72d251fb by Matthew Pickering at 2023-10-10T12:56:54+00:00
Use Cc to compile test file in nopie check

We were attempting to use the C compiler, as a linker, to compile a file
in the nopie check, but that won't work in general as the flags we pass
to the linker might not be compatible with the ones we pass when using
the C compiler.

- - - - -
e2381d3c by Matthew Pickering at 2023-10-10T12:56:54+00:00
configure: Error when ghc-toolchain fails to compile

This is a small QOL change as if you are working on ghc-toolchain and it
fails to compile then configure will continue and can give you outdated
results.

- - - - -
edddacd5 by Matthew Pickering at 2023-10-10T12:56:54+00:00
configure: Check whether -no-pie works when the C compiler is used as a linker

`-no-pie` is a flag we pass when using the C compiler as a linker (see
pieCCLDOpts in GHC.Driver.Session) so we should test whether the C
compiler used as a linker supports the flag, rather than just the C
compiler.

- - - - -
3888d96e by Matthew Pickering at 2023-10-10T12:56:54+00:00
ghc-toolchain: Remove javascript special case for --target detection

emcc when used as a linker seems to ignore the --target flag, and for
consistency with configure which now tests for --target, we remove this
special case.

- - - - -
2862475a by Ben Gamari at 2023-10-10T12:56:54+00:00
toolchain: Don't pass --target to emscripten toolchain

As noted in `Note [Don't pass --target to emscripten toolchain]`,
emscripten's `emcc` is rather inconsistent with respect to its treatment
of the `--target` flag. Avoid this by special-casing this toolchain
in the `configure` script and `ghc-toolchain`.

Fixes on aspect of #23744.

- - - - -
1dc979f4 by Matthew Pickering at 2023-10-10T12:56:54+00:00
hadrian: Don't pass `--gcc-options` as a --configure-arg to cabal configure

Stop passing -gcc-options which mixed together linker flags and
non-linker flags. There's no guarantee the C compiler will accept both
of these in each mode.

- - - - -
f42c4bfa by Ben Gamari at 2023-10-10T12:56:54+00:00
configure: Probe stage0 link flags

For consistency with later stages and CC.

- - - - -
062beed5 by Sebastian Graf at 2023-10-10T12:09:57-04:00
Stricter Binary.get in GHC.Types.Unit (#23964)

I noticed some thunking while looking at Core.
This change has very modest, but throughout positive ghc/alloc effect:

```
 hard_hole_fits(normal) ghc/alloc    283,057,664    281,620,872  -0.5%

              geo. mean                                          -0.1%
              minimum                                            -0.5%
              maximum                                            +0.0%
```

Fixes #23964.

- - - - -
4c55ca63 by Bryan Richter at 2023-10-10T12:09:58-04:00
rel_eng/upload.sh cleanups

- - - - -
48b0ed9f by doyougnu at 2023-10-10T12:10:01-04:00
ci: add javascript label rule

This adds a rule which triggers the javascript job when the "javascript"
label is assigned to an MR.

- - - - -
55d56ef3 by Matthew Craven at 2023-10-10T12:10:01-04:00
Make 'wWarningFlagsDeps' include every WarningFlag

Fixes #24071.

- - - - -
8d49938a by Andreas Klebinger at 2023-10-10T12:10:01-04:00
Aarch64 NCG: Use encoded immediates for literals.

Try to generate

    instr x2, <imm>

instead of

    mov x1, lit
    instr x2, x1

When possible. This get's rid if quite a few redundant
mov instructions.

I believe this causes a metric decrease for LargeRecords as
we reduce register pressure.

-------------------------
Metric Decrease:
    LargeRecord
-------------------------

- - - - -
97d065ee by Andreas Klebinger at 2023-10-10T12:10:01-04:00
AArch NCG: Refactor getRegister'

Remove some special cases which can be handled just as well by the
generic case.

This increases code re-use while also fixing #23749. Since some of the
special case wasn't upholding Note [Signed arithmetic on AArch64].

- - - - -
c48f2a4f by Andreas Klebinger at 2023-10-10T12:10:01-04:00
Aarch ncg: Optimize immediate use for address calculations

When the offset doesn't fit into the immediate we now just reuse the
general getRegister' code path which is well optimized to compute the
offset into a register instead of a special case for CmmRegOff.

This means we generate a lot less code under certain conditions which is
why performance metrics for these improve.

-------------------------
Metric Decrease:
    T4801
    T5321FD
    T5321Fun
-------------------------

- - - - -


28 changed files:

- .gitlab/generate-ci/gen_ci.hs
- .gitlab/jobs.yaml
- .gitlab/rel_eng/upload.sh
- compiler/GHC/CmmToAsm/AArch64/CodeGen.hs
- compiler/GHC/CmmToAsm/AArch64/Instr.hs
- compiler/GHC/CmmToAsm/AArch64/Ppr.hs
- compiler/GHC/CmmToAsm/AArch64/Regs.hs
- compiler/GHC/Driver/Flags.hs
- compiler/GHC/Driver/Session.hs
- compiler/GHC/Unit/Types.hs
- configure.ac
- hadrian/src/Rules/ToolArgs.hs
- hadrian/src/Settings/Builders/Cabal.hs
- m4/find_merge_objects.m4
- m4/fp_cc_supports_target.m4
- m4/fp_gcc_supports_no_pie.m4
- + m4/fp_prog_cc_linker_target.m4
- m4/ghc_toolchain.m4
- utils/ghc-toolchain/exe/Main.hs
- utils/ghc-toolchain/ghc-toolchain.cabal
- + utils/ghc-toolchain/src/GHC/Toolchain/NormaliseTriple.hs
- utils/ghc-toolchain/src/GHC/Toolchain/ParseTriple.hs
- utils/ghc-toolchain/src/GHC/Toolchain/PlatformDetails.hs
- utils/ghc-toolchain/src/GHC/Toolchain/Program.hs
- utils/ghc-toolchain/src/GHC/Toolchain/Tools/Cc.hs
- utils/ghc-toolchain/src/GHC/Toolchain/Tools/Cpp.hs
- utils/ghc-toolchain/src/GHC/Toolchain/Tools/Cxx.hs
- utils/ghc-toolchain/src/GHC/Toolchain/Tools/Link.hs


Changes:

=====================================
.gitlab/generate-ci/gen_ci.hs
=====================================
@@ -604,6 +604,7 @@ data Rule = ReleaseOnly  -- ^ Only run this job in a release pipeline
 data ValidateRule =
             FullCI       -- ^ Run this job when the "full-ci" label is present.
           | LLVMBackend  -- ^ Run this job when the "LLVM backend" label is present
+          | JSBackend    -- ^ Run this job when the "javascript" label is present
           | FreeBSDLabel -- ^ Run this job when the "FreeBSD" label is set.
           | NonmovingGc  -- ^ Run this job when the "non-moving GC" label is set.
           | IpeData      -- ^ Run this job when the "IPE" label is set
@@ -648,11 +649,12 @@ validateRuleString FullCI = or_all ([ labelString "full-ci"
                                     , branchStringLike "ghc-[0-9]+\\.[0-9]+"
                                     ])
 
-validateRuleString LLVMBackend = labelString "LLVM backend"
+validateRuleString LLVMBackend  = labelString "LLVM backend"
+validateRuleString JSBackend    = labelString "javascript"
 validateRuleString FreeBSDLabel = labelString "FreeBSD"
-validateRuleString NonmovingGc = labelString "non-moving GC"
-validateRuleString IpeData = labelString "IPE"
-validateRuleString TestPrimops = labelString "test-primops"
+validateRuleString NonmovingGc  = labelString "non-moving GC"
+validateRuleString IpeData      = labelString "IPE"
+validateRuleString TestPrimops  = labelString "test-primops"
 
 -- | A 'Job' is the description of a single job in a gitlab pipeline. The
 -- job contains all the information about how to do the build but can be further
@@ -1010,10 +1012,9 @@ job_groups =
      , disableValidate (standardBuildsWithConfig AArch64 (Linux Alpine318) (splitSectionsBroken vanilla))
      , fullyStaticBrokenTests (disableValidate (allowFailureGroup (standardBuildsWithConfig Amd64 (Linux Alpine312) staticNativeInt)))
      , validateBuilds Amd64 (Linux Debian11) (crossConfig "aarch64-linux-gnu" (Emulator "qemu-aarch64 -L /usr/aarch64-linux-gnu") Nothing)
-     , validateBuilds Amd64 (Linux Debian11) (crossConfig "javascript-unknown-ghcjs" (Emulator "js-emulator") (Just "emconfigure")
-        )
-        { bignumBackend = Native
-        }
+
+     , addValidateRule JSBackend (validateBuilds Amd64 (Linux Debian11) javascriptConfig)
+
      , make_wasm_jobs wasm_build_config
      , modifyValidateJobs manual $
          make_wasm_jobs wasm_build_config {bignumBackend = Native}
@@ -1024,6 +1025,8 @@ job_groups =
      ]
 
   where
+    javascriptConfig = (crossConfig "javascript-unknown-ghcjs" (Emulator "js-emulator") (Just "emconfigure"))
+                         { bignumBackend = Native }
 
     -- ghcilink002 broken due to #17869
     --


=====================================
.gitlab/jobs.yaml
=====================================
@@ -5255,7 +5255,7 @@
     ],
     "rules": [
       {
-        "if": "((($CI_MERGE_REQUEST_LABELS =~ /.*full-ci.*/) || ($CI_MERGE_REQUEST_LABELS =~ /.*marge_bot_batch_merge_job.*/) || ($CI_COMMIT_BRANCH == \"master\") || ($CI_COMMIT_BRANCH =~ /ghc-[0-9]+\\.[0-9]+/))) && ($RELEASE_JOB != \"yes\") && ($NIGHTLY == null)",
+        "if": "((($CI_MERGE_REQUEST_LABELS =~ /.*full-ci.*/) || ($CI_MERGE_REQUEST_LABELS =~ /.*marge_bot_batch_merge_job.*/) || ($CI_COMMIT_BRANCH == \"master\") || ($CI_COMMIT_BRANCH =~ /ghc-[0-9]+\\.[0-9]+/)) || ($CI_MERGE_REQUEST_LABELS =~ /.*javascript.*/)) && ($RELEASE_JOB != \"yes\") && ($NIGHTLY == null)",
         "when": "on_success"
       }
     ],


=====================================
.gitlab/rel_eng/upload.sh
=====================================
@@ -1,6 +1,7 @@
-#!/usr/bin/env bash
+#!/usr/bin/env nix-shell
+#! nix-shell -i bash -p moreutils lzip zip lftp gnupg
 
-set -e
+set -Eeuo pipefail
 
 # This is a script for preparing and uploading a release of GHC.
 #
@@ -30,21 +31,15 @@ set -e
 #
 # Prerequisites: moreutils
 
-if [ -z "$SIGNING_KEY" ]; then
-    SIGNING_KEY="=Benjamin Gamari <ben at well-typed.com>"
-fi
+: ${SIGNING_KEY:="=Benjamin Gamari <ben at well-typed.com>"}
 
 
 # Infer release name from directory name
-if [ -z "$rel_name" ]; then
-    rel_name="$(basename $(pwd))"
-fi
+: ${rel_name:=$(basename $(pwd))}
 
 # Infer version from tarball names
-if [ -z "$ver" ]; then
-    ver="$(ls ghc-*.tar.* | sed -ne 's/ghc-\([0-9]\+\.[0-9]\+\.[0-9]\+\(\.[0-9]\+\)\?\).\+/\1/p' | head -n1)"
-    if [ -z "$ver" ]; then echo "Failed to infer \$ver"; exit 1; fi
-fi
+: ${ver:=$(ls ghc-*.tar.* | sed -ne 's/ghc-\([0-9]\+\.[0-9]\+\.[0-9]\+\(\.[0-9]\+\)\?\).\+/\1/p' | head -n1)}
+if [ -z "$ver" ]; then echo "Failed to infer \$ver"; exit 1; fi
 
 host="gitlab-storage.haskell.org"
 
@@ -141,6 +136,7 @@ function upload() {
 }
 
 function purge_all() {
+    dir="$(echo $rel_name | sed s/-release//)"
     # Purge CDN cache
     curl -X PURGE http://downloads.haskell.org/ghc/
     curl -X PURGE http://downloads.haskell.org/~ghc/
@@ -196,6 +192,7 @@ function prepare_docs() {
 }
 
 function recompress() {
+    set -Eeuo pipefail
     combine <(basename -s .xz *.xz) not <(basename -s .lz *.lz) | \
         parallel 'echo "Recompressing {}.xz to {}.lz"; unxz -c {}.xz | lzip - -o {}.lz'
 


=====================================
compiler/GHC/CmmToAsm/AArch64/CodeGen.hs
=====================================
@@ -380,6 +380,114 @@ getSomeReg expr = do
     Fixed rep reg code ->
         return (reg, rep, code)
 
+{- Note [Aarch64 immediates]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Aarch64 with it's fixed width instruction encoding uses leftover space for
+immediates.
+If you want the full rundown consult the arch reference document:
+"ArmĀ® Architecture Reference Manual" - "C3.4 Data processing - immediate"
+
+The gist of it is that different instructions allow for different immediate encodings.
+The ones we care about for better code generation are:
+
+* Simple but potentially repeated bit-patterns for logic instructions.
+* 16bit numbers shifted by multiples of 16.
+* 12 bit numbers optionally shifted by 12 bits.
+
+It might seem like the ISA allows for 64bit immediates but this isn't the case.
+Rather there are some instruction aliases which allow for large unencoded immediates
+which will then be transalted to one of the immediate encodings implicitly.
+
+For example mov x1, #0x10000 is allowed but will be assembled to movz x1, #0x1, lsl #16
+-}
+
+-- | Move (wide immediate)
+-- Allows for 16bit immediate which can be shifted by 0/16/32/48 bits.
+-- Used with MOVZ,MOVN, MOVK
+-- See Note [Aarch64 immediates]
+getMovWideImm :: Integer -> Width -> Maybe Operand
+getMovWideImm n w
+  -- TODO: Handle sign extension/negatives
+  | n <= 0
+  = Nothing
+  -- Fits in 16 bits
+  | sized_n < 2^(16 :: Int)
+  = Just $ OpImm (ImmInteger truncated)
+
+  -- 0x0000 0000 xxxx 0000
+  | trailing_zeros >= 16 && sized_n < 2^(32 :: Int)
+  = Just $ OpImmShift (ImmInteger $ truncated `shiftR` 16) SLSL 16
+
+  -- 0x 0000 xxxx 0000 0000
+  | trailing_zeros >= 32 && sized_n < 2^(48 :: Int)
+  = Just $ OpImmShift (ImmInteger $ truncated `shiftR` 32) SLSL 32
+
+  -- 0x xxxx 0000 0000 0000
+  | trailing_zeros >= 48
+  = Just $ OpImmShift (ImmInteger $ truncated `shiftR` 48) SLSL 48
+
+  | otherwise
+  = Nothing
+  where
+    truncated = narrowU w n
+    sized_n = fromIntegral truncated :: Word64
+    trailing_zeros = countTrailingZeros sized_n
+
+-- | Arithmetic(immediate)
+--  Allows for 12bit immediates which can be shifted by 0 or 12 bits.
+-- Used with ADD, ADDS, SUB, SUBS, CMP, CMN
+-- See Note [Aarch64 immediates]
+getArithImm :: Integer -> Width -> Maybe Operand
+getArithImm n w
+  -- TODO: Handle sign extension
+  | n <= 0
+  = Nothing
+  -- Fits in 16 bits
+  -- Fits in 12 bits
+  | sized_n < 2^(12::Int)
+  = Just $ OpImm (ImmInteger truncated)
+
+  -- 12 bits shifted by 12 places.
+  | trailing_zeros >= 12 && sized_n < 2^(24::Int)
+  = Just $ OpImmShift (ImmInteger $ truncated `shiftR` 12) SLSL 12
+
+  | otherwise
+  = Nothing
+  where
+    sized_n = fromIntegral truncated :: Word64
+    truncated = narrowU w n
+    trailing_zeros = countTrailingZeros sized_n
+
+-- |  Logical (immediate)
+-- Allows encoding of some repeated bitpatterns
+-- Used with AND, ANDS, EOR, ORR, TST
+-- and their aliases which includes at least MOV (bitmask immediate)
+-- See Note [Aarch64 immediates]
+getBitmaskImm :: Integer -> Width -> Maybe Operand
+getBitmaskImm n w
+  | isAArch64Bitmask (opRegWidth w) truncated = Just $ OpImm (ImmInteger truncated)
+  | otherwise = Nothing
+  where
+    truncated = narrowU w n
+
+-- | Load/store immediate.
+-- Depends on the width of the store to some extent.
+isOffsetImm :: Int -> Width -> Bool
+isOffsetImm off w
+  -- 8 bits + sign for unscaled offsets
+  | -256 <= off, off <= 255 = True
+  -- Offset using 12-bit positive immediate, scaled by width
+  -- LDR/STR: imm12: if reg is 32bit: 0 -- 16380 in multiples of 4
+  -- LDR/STR: imm12: if reg is 64bit: 0 -- 32760 in multiples of 8
+  -- 16-bit: 0 .. 8188, 8-bit: 0 -- 4095
+  | 0 <= off, off < 4096 * byte_width, off `mod` byte_width == 0 = True
+  | otherwise = False
+  where
+    byte_width = widthInBytes w
+
+
+
+
 -- TODO OPT: we might be able give getRegister
 --          a hint, what kind of register we want.
 getFloatReg :: HasCallStack => CmmExpr -> NatM (Reg, Format, InstrBlock)
@@ -502,8 +610,14 @@ getRegister' config plat expr
     CmmLit lit
       -> case lit of
 
-        -- TODO handle CmmInt 0 specially, use wzr or xzr.
-
+        -- Use wzr xzr for CmmInt 0 if the width matches up, otherwise do a move.
+        -- TODO: Reenable after https://gitlab.haskell.org/ghc/ghc/-/issues/23632 is fixed.
+        -- CmmInt 0 W32 -> do
+        --   let format = intFormat W32
+        --   return (Fixed format reg_zero (unitOL $ (COMMENT ((text . show $ expr))) ))
+        -- CmmInt 0 W64 -> do
+        --   let format = intFormat W64
+        --   return (Fixed format reg_zero (unitOL $ (COMMENT ((text . show $ expr))) ))
         CmmInt i W8 | i >= 0 -> do
           return (Any (intFormat W8) (\dst -> unitOL $ annExpr expr (MOV (OpReg W8 dst) (OpImm (ImmInteger (narrowU W8 i))))))
         CmmInt i W16 | i >= 0 -> do
@@ -518,8 +632,13 @@ getRegister' config plat expr
         -- Those need the upper bits set. We'd either have to explicitly sign
         -- or figure out something smarter. Lowered to
         -- `MOV dst XZR`
+        CmmInt i w | i >= 0
+                   , Just imm_op <- getMovWideImm i w -> do
+          return (Any (intFormat w) (\dst -> unitOL $ annExpr expr (MOVZ (OpReg w dst) imm_op)))
+
         CmmInt i w | isNbitEncodeable 16 i, i >= 0 -> do
           return (Any (intFormat w) (\dst -> unitOL $ annExpr expr (MOV (OpReg W16 dst) (OpImm (ImmInteger i)))))
+
         CmmInt i w | isNbitEncodeable 32 i, i >= 0 -> do
           let  half0 = fromIntegral (fromIntegral i :: Word16)
                half1 = fromIntegral (fromIntegral (i `shiftR` 16) :: Word16)
@@ -594,7 +713,6 @@ getRegister' config plat expr
           (op, imm_code) <- litToImm' lit
           let rep = cmmLitType plat lit
               format = cmmTypeFormat rep
-              -- width = typeWidth rep
           return (Any format (\dst -> imm_code `snocOL` LDR format (OpReg (formatToWidth format) dst) op))
 
         CmmLabelOff lbl off -> do
@@ -618,18 +736,11 @@ getRegister' config plat expr
       -> return (Fixed (cmmTypeFormat (cmmRegType reg))
                        (getRegisterReg plat reg)
                        nilOL)
-    CmmRegOff reg off | isNbitEncodeable 12 (fromIntegral off) -> do
-      getRegister' config plat $
-            CmmMachOp (MO_Add width) [CmmReg reg, CmmLit (CmmInt (fromIntegral off) width)]
-          where width = typeWidth (cmmRegType reg)
-
-    CmmRegOff reg off -> do
-      (off_r, _off_format, off_code) <- getSomeReg $ CmmLit (CmmInt (fromIntegral off) width)
-      (reg, _format, code) <- getSomeReg $ CmmReg reg
-      return $ Any (intFormat width) (\dst -> off_code `appOL` code `snocOL` ADD (OpReg width dst) (OpReg width reg) (OpReg width off_r))
-          where width = typeWidth (cmmRegType reg)
-
-
+    CmmRegOff reg off ->
+      -- If we got here we will load the address into a register either way. So we might as well just expand
+      -- and re-use the existing code path to handle "reg + off".
+      let !width = cmmRegWidth reg
+      in getRegister' config plat (CmmMachOp (MO_Add width) [CmmReg reg, CmmLit (CmmInt (fromIntegral off) width)])
 
     -- for MachOps, see GHC.Cmm.MachOp
     -- For CmmMachOp, see GHC.Cmm.Expr
@@ -701,33 +812,25 @@ getRegister' config plat expr
     -- 0. TODO This should not exist! Rewrite: Reg +- 0 -> Reg
     CmmMachOp (MO_Add _) [expr'@(CmmReg (CmmGlobal _r)), CmmLit (CmmInt 0 _)] -> getRegister' config plat expr'
     CmmMachOp (MO_Sub _) [expr'@(CmmReg (CmmGlobal _r)), CmmLit (CmmInt 0 _)] -> getRegister' config plat expr'
-    -- 1. Compute Reg +/- n directly.
-    --    For Add/Sub we can directly encode 12bits, or 12bits lsl #12.
-    CmmMachOp (MO_Add w) [(CmmReg reg), CmmLit (CmmInt n _)]
-      | n > 0 && n < 4096 -> return $ Any (intFormat w) (\d -> unitOL $ annExpr expr (ADD (OpReg w d) (OpReg w' r') (OpImm (ImmInteger n))))
-      -- TODO: 12bits lsl #12; e.g. lower 12 bits of n are 0; shift n >> 12, and set lsl to #12.
-      where w' = formatToWidth (cmmTypeFormat (cmmRegType reg))
-            r' = getRegisterReg plat reg
-    CmmMachOp (MO_Sub w) [(CmmReg reg), CmmLit (CmmInt n _)]
-      | n > 0 && n < 4096 -> return $ Any (intFormat w) (\d -> unitOL $ annExpr expr (SUB (OpReg w d) (OpReg w' r') (OpImm (ImmInteger n))))
-      -- TODO: 12bits lsl #12; e.g. lower 12 bits of n are 0; shift n >> 12, and set lsl to #12.
-      where w' = formatToWidth (cmmTypeFormat (cmmRegType reg))
-            r' = getRegisterReg plat reg
+    -- Immediates are handled via `getArithImm` in the generic code path.
 
     CmmMachOp (MO_U_Quot w) [x, y] | w == W8 -> do
       (reg_x, _format_x, code_x) <- getSomeReg x
       (reg_y, _format_y, code_y) <- getSomeReg y
-      return $ Any (intFormat w) (\dst -> code_x `appOL` code_y `snocOL` annExpr expr (UXTB (OpReg w reg_x) (OpReg w reg_x)) `snocOL` (UXTB (OpReg w reg_y) (OpReg w reg_y)) `snocOL` (UDIV (OpReg w dst) (OpReg w reg_x) (OpReg w reg_y)))
+      return $ Any (intFormat w) (\dst -> code_x `appOL` code_y `snocOL` annExpr expr (UXTB (OpReg w reg_x) (OpReg w reg_x)) `snocOL`
+                                                                        (UXTB (OpReg w reg_y) (OpReg w reg_y)) `snocOL`
+                                                                        (UDIV (OpReg w dst) (OpReg w reg_x) (OpReg w reg_y)))
     CmmMachOp (MO_U_Quot w) [x, y] | w == W16 -> do
       (reg_x, _format_x, code_x) <- getSomeReg x
       (reg_y, _format_y, code_y) <- getSomeReg y
-      return $ Any (intFormat w) (\dst -> code_x `appOL` code_y `snocOL` annExpr expr (UXTH (OpReg w reg_x) (OpReg w reg_x)) `snocOL` (UXTH (OpReg w reg_y) (OpReg w reg_y)) `snocOL` (UDIV (OpReg w dst) (OpReg w reg_x) (OpReg w reg_y)))
+      return $ Any (intFormat w) (\dst -> code_x `appOL` code_y `snocOL` annExpr expr (UXTH (OpReg w reg_x) (OpReg w reg_x)) `snocOL`
+                                                                        (UXTH (OpReg w reg_y) (OpReg w reg_y)) `snocOL`
+                                                                        (UDIV (OpReg w dst) (OpReg w reg_x) (OpReg w reg_y)))
 
     -- 2. Shifts. x << n, x >> n.
-    CmmMachOp (MO_Shl w) [x, (CmmLit (CmmInt n _))] | w == W32, 0 <= n, n < 32 -> do
-      (reg_x, _format_x, code_x) <- getSomeReg x
-      return $ Any (intFormat w) (\dst -> code_x `snocOL` annExpr expr (LSL (OpReg w dst) (OpReg w reg_x) (OpImm (ImmInteger n))))
-    CmmMachOp (MO_Shl w) [x, (CmmLit (CmmInt n _))] | w == W64, 0 <= n, n < 64 -> do
+    CmmMachOp (MO_Shl w) [x, (CmmLit (CmmInt n _))]
+      | w == W32 || w == W64
+      , 0 <= n, n < fromIntegral (widthInBits w) -> do
       (reg_x, _format_x, code_x) <- getSomeReg x
       return $ Any (intFormat w) (\dst -> code_x `snocOL` annExpr expr (LSL (OpReg w dst) (OpReg w reg_x) (OpImm (ImmInteger n))))
 
@@ -737,7 +840,8 @@ getRegister' config plat expr
     CmmMachOp (MO_S_Shr w) [x, y] | w == W8 -> do
       (reg_x, _format_x, code_x) <- getSomeReg x
       (reg_y, _format_y, code_y) <- getSomeReg y
-      return $ Any (intFormat w) (\dst -> code_x `appOL` code_y `snocOL` annExpr expr (SXTB (OpReg w reg_x) (OpReg w reg_x)) `snocOL` (ASR (OpReg w dst) (OpReg w reg_x) (OpReg w reg_y)))
+      return $ Any (intFormat w) (\dst -> code_x `appOL` code_y `snocOL` annExpr expr (SXTB (OpReg w reg_x) (OpReg w reg_x)) `snocOL`
+                                                                         (ASR (OpReg w dst) (OpReg w reg_x) (OpReg w reg_y)))
 
     CmmMachOp (MO_S_Shr w) [x, (CmmLit (CmmInt n _))] | w == W16, 0 <= n, n < 16 -> do
       (reg_x, _format_x, code_x) <- getSomeReg x
@@ -745,24 +849,23 @@ getRegister' config plat expr
     CmmMachOp (MO_S_Shr w) [x, y] | w == W16 -> do
       (reg_x, _format_x, code_x) <- getSomeReg x
       (reg_y, _format_y, code_y) <- getSomeReg y
-      return $ Any (intFormat w) (\dst -> code_x `appOL` code_y `snocOL` annExpr expr (SXTH (OpReg w reg_x) (OpReg w reg_x)) `snocOL` (ASR (OpReg w dst) (OpReg w reg_x) (OpReg w reg_y)))
-
-    CmmMachOp (MO_S_Shr w) [x, (CmmLit (CmmInt n _))] | w == W32, 0 <= n, n < 32 -> do
-      (reg_x, _format_x, code_x) <- getSomeReg x
-      return $ Any (intFormat w) (\dst -> code_x `snocOL` annExpr expr (ASR (OpReg w dst) (OpReg w reg_x) (OpImm (ImmInteger n))))
+      return $ Any (intFormat w) (\dst -> code_x `appOL` code_y `snocOL` annExpr expr (SXTH (OpReg w reg_x) (OpReg w reg_x)) `snocOL`
+                                                                         (ASR (OpReg w dst) (OpReg w reg_x) (OpReg w reg_y)))
 
-    CmmMachOp (MO_S_Shr w) [x, (CmmLit (CmmInt n _))] | w == W64, 0 <= n, n < 64 -> do
+    CmmMachOp (MO_S_Shr w) [x, (CmmLit (CmmInt n _))]
+      | w == W32 || w == W64
+      , 0 <= n, n < fromIntegral (widthInBits w) -> do
       (reg_x, _format_x, code_x) <- getSomeReg x
       return $ Any (intFormat w) (\dst -> code_x `snocOL` annExpr expr (ASR (OpReg w dst) (OpReg w reg_x) (OpImm (ImmInteger n))))
 
-
     CmmMachOp (MO_U_Shr w) [x, (CmmLit (CmmInt n _))] | w == W8, 0 <= n, n < 8 -> do
       (reg_x, _format_x, code_x) <- getSomeReg x
       return $ Any (intFormat w) (\dst -> code_x `snocOL` annExpr expr (UBFX (OpReg w dst) (OpReg w reg_x) (OpImm (ImmInteger n)) (OpImm (ImmInteger (8-n)))))
     CmmMachOp (MO_U_Shr w) [x, y] | w == W8 -> do
       (reg_x, _format_x, code_x) <- getSomeReg x
       (reg_y, _format_y, code_y) <- getSomeReg y
-      return $ Any (intFormat w) (\dst -> code_x `appOL` code_y `snocOL` annExpr expr (UXTB (OpReg w reg_x) (OpReg w reg_x)) `snocOL` (ASR (OpReg w dst) (OpReg w reg_x) (OpReg w reg_y)))
+      return $ Any (intFormat w) (\dst -> code_x `appOL` code_y `snocOL` annExpr expr (UXTB (OpReg w reg_x) (OpReg w reg_x)) `snocOL`
+                                                                        (ASR (OpReg w dst) (OpReg w reg_x) (OpReg w reg_y)))
 
     CmmMachOp (MO_U_Shr w) [x, (CmmLit (CmmInt n _))] | w == W16, 0 <= n, n < 16 -> do
       (reg_x, _format_x, code_x) <- getSomeReg x
@@ -770,13 +873,12 @@ getRegister' config plat expr
     CmmMachOp (MO_U_Shr w) [x, y] | w == W16 -> do
       (reg_x, _format_x, code_x) <- getSomeReg x
       (reg_y, _format_y, code_y) <- getSomeReg y
-      return $ Any (intFormat w) (\dst -> code_x `appOL` code_y `snocOL` annExpr expr (UXTH (OpReg w reg_x) (OpReg w reg_x)) `snocOL` (ASR (OpReg w dst) (OpReg w reg_x) (OpReg w reg_y)))
+      return $ Any (intFormat w) (\dst -> code_x `appOL` code_y `snocOL` annExpr expr (UXTH (OpReg w reg_x) (OpReg w reg_x))
+                                                                `snocOL` (ASR (OpReg w dst) (OpReg w reg_x) (OpReg w reg_y)))
 
-    CmmMachOp (MO_U_Shr w) [x, (CmmLit (CmmInt n _))] | w == W32, 0 <= n, n < 32 -> do
-      (reg_x, _format_x, code_x) <- getSomeReg x
-      return $ Any (intFormat w) (\dst -> code_x `snocOL` annExpr expr (LSR (OpReg w dst) (OpReg w reg_x) (OpImm (ImmInteger n))))
-
-    CmmMachOp (MO_U_Shr w) [x, (CmmLit (CmmInt n _))] | w == W64, 0 <= n, n < 64 -> do
+    CmmMachOp (MO_U_Shr w) [x, (CmmLit (CmmInt n _))]
+      | w == W32 || w == W64
+      , 0 <= n, n < fromIntegral (widthInBits w) -> do
       (reg_x, _format_x, code_x) <- getSomeReg x
       return $ Any (intFormat w) (\dst -> code_x `snocOL` annExpr expr (LSR (OpReg w dst) (OpReg w reg_x) (OpImm (ImmInteger n))))
 
@@ -799,17 +901,51 @@ getRegister' config plat expr
           -- withTempFloatReg w op = OpReg w <$> getNewRegNat (floatFormat w) >>= op
 
           -- A "plain" operation.
-          bitOp w op = do
+          bitOpImm w op encode_imm = do
             -- compute x<m> <- x
             -- compute x<o> <- y
             -- <OP> x<n>, x<m>, x<o>
             (reg_x, format_x, code_x) <- getSomeReg x
-            (reg_y, format_y, code_y) <- getSomeReg y
-            massertPpr (isIntFormat format_x == isIntFormat format_y) $ text "bitOp: incompatible"
+            (op_y, format_y, code_y) <- case y of
+              CmmLit (CmmInt n w)
+                | Just imm_operand_y <- encode_imm n w
+                -> return (imm_operand_y, intFormat w, nilOL)
+              _ -> do
+                  (reg_y, format_y, code_y) <- getSomeReg y
+                  return (OpReg w reg_y, format_y, code_y)
+            massertPpr (isIntFormat format_x == isIntFormat format_y) $ text "bitOpImm: incompatible"
             return $ Any (intFormat w) (\dst ->
                 code_x `appOL`
                 code_y `appOL`
-                op (OpReg w dst) (OpReg w reg_x) (OpReg w reg_y))
+                op (OpReg w dst) (OpReg w reg_x) op_y)
+
+          -- A (potentially signed) integer operation.
+          -- In the case of 8- and 16-bit signed arithmetic we must first
+          -- sign-extend both arguments to 32-bits.
+          -- See Note [Signed arithmetic on AArch64].
+          intOpImm :: Bool -> Width -> (Operand -> Operand -> Operand -> OrdList Instr) -> (Integer -> Width -> Maybe Operand) -> NatM (Register)
+          intOpImm {- is signed -} True  w op _encode_imm = intOp True w op
+          intOpImm                 False w op  encode_imm = do
+              -- compute x<m> <- x
+              -- compute x<o> <- y
+              -- <OP> x<n>, x<m>, x<o>
+              (reg_x, format_x, code_x) <- getSomeReg x
+              (op_y, format_y, code_y) <- case y of
+                CmmLit (CmmInt n w)
+                  | Just imm_operand_y <- encode_imm n w
+                  -> return (imm_operand_y, intFormat w, nilOL)
+                _ -> do
+                    (reg_y, format_y, code_y) <- getSomeReg y
+                    return (OpReg w reg_y, format_y, code_y)
+              massertPpr (isIntFormat format_x && isIntFormat format_y) $ text "intOp: non-int"
+              -- This is the width of the registers on which the operation
+              -- should be performed.
+              let w' = opRegWidth w
+              return $ Any (intFormat w) $ \dst ->
+                  code_x `appOL`
+                  code_y `appOL`
+                  op (OpReg w' dst) (OpReg w' reg_x) (op_y) `appOL`
+                  truncateReg w' w dst -- truncate back to the operand's original width
 
           -- A (potentially signed) integer operation.
           -- In the case of 8- and 16-bit signed arithmetic we must first
@@ -855,9 +991,9 @@ getRegister' config plat expr
       case op of
         -- Integer operations
         -- Add/Sub should only be Integer Options.
-        MO_Add w -> intOp False w (\d x y -> unitOL $ annExpr expr (ADD d x y))
+        MO_Add w -> intOpImm False w (\d x y -> unitOL $ annExpr expr (ADD d x y)) getArithImm
         -- TODO: Handle sub-word case
-        MO_Sub w -> intOp False w (\d x y -> unitOL $ annExpr expr (SUB d x y))
+        MO_Sub w -> intOpImm False w (\d x y -> unitOL $ annExpr expr (SUB d x y)) getArithImm
 
         -- Note [CSET]
         -- ~~~~~~~~~~~
@@ -899,8 +1035,8 @@ getRegister' config plat expr
 
         -- N.B. We needn't sign-extend sub-word size (in)equality comparisons
         -- since we don't care about ordering.
-        MO_Eq w     -> bitOp w (\d x y -> toOL [ CMP x y, CSET d EQ ])
-        MO_Ne w     -> bitOp w (\d x y -> toOL [ CMP x y, CSET d NE ])
+        MO_Eq w     -> bitOpImm w (\d x y -> toOL [ CMP x y, CSET d EQ ]) getArithImm
+        MO_Ne w     -> bitOpImm w (\d x y -> toOL [ CMP x y, CSET d NE ]) getArithImm
 
         -- Signed multiply/divide
         MO_Mul w          -> intOp True w (\d x y -> unitOL $ MUL d x y)
@@ -929,10 +1065,10 @@ getRegister' config plat expr
         MO_S_Lt w     -> intOp True  w (\d x y -> toOL [ CMP x y, CSET d SLT ])
 
         -- Unsigned comparisons
-        MO_U_Ge w     -> intOp False w (\d x y -> toOL [ CMP x y, CSET d UGE ])
-        MO_U_Le w     -> intOp False w (\d x y -> toOL [ CMP x y, CSET d ULE ])
-        MO_U_Gt w     -> intOp False w (\d x y -> toOL [ CMP x y, CSET d UGT ])
-        MO_U_Lt w     -> intOp False w (\d x y -> toOL [ CMP x y, CSET d ULT ])
+        MO_U_Ge w     -> intOpImm False w (\d x y -> toOL [ CMP x y, CSET d UGE ]) getArithImm
+        MO_U_Le w     -> intOpImm False w (\d x y -> toOL [ CMP x y, CSET d ULE ]) getArithImm
+        MO_U_Gt w     -> intOpImm False w (\d x y -> toOL [ CMP x y, CSET d UGT ]) getArithImm
+        MO_U_Lt w     -> intOpImm False w (\d x y -> toOL [ CMP x y, CSET d ULT ]) getArithImm
 
         -- Floating point arithmetic
         MO_F_Add w   -> floatOp w (\d x y -> unitOL $ ADD d x y)
@@ -955,9 +1091,9 @@ getRegister' config plat expr
         MO_F_Lt w    -> floatCond w (\d x y -> toOL [ CMP x y, CSET d OLT ]) -- x < y <=> y >= x
 
         -- Bitwise operations
-        MO_And   w -> bitOp w (\d x y -> unitOL $ AND d x y)
-        MO_Or    w -> bitOp w (\d x y -> unitOL $ ORR d x y)
-        MO_Xor   w -> bitOp w (\d x y -> unitOL $ EOR d x y)
+        MO_And   w -> bitOpImm w (\d x y -> unitOL $ AND d x y) getBitmaskImm
+        MO_Or    w -> bitOpImm w (\d x y -> unitOL $ ORR d x y) getBitmaskImm
+        MO_Xor   w -> bitOpImm w (\d x y -> unitOL $ EOR d x y) getBitmaskImm
         MO_Shl   w -> intOp False w (\d x y -> unitOL $ LSL d x y)
         MO_U_Shr w -> intOp False w (\d x y -> unitOL $ LSR d x y)
         MO_S_Shr w -> intOp True  w (\d x y -> unitOL $ ASR d x y)
@@ -1007,7 +1143,7 @@ getRegister' config plat expr
 
   where
     isNbitEncodeable :: Int -> Integer -> Bool
-    isNbitEncodeable n i = let shift = n - 1 in (-1 `shiftL` shift) <= i && i < (1 `shiftL` shift)
+    isNbitEncodeable n_bits i = let shift = n_bits - 1 in (-1 `shiftL` shift) <= i && i < (1 `shiftL` shift)
 
     -- N.B. MUL does not set the overflow flag.
     -- These implementations are based on output from GCC 11.
@@ -1146,20 +1282,8 @@ getAmode :: Platform
 
 -- OPTIMIZATION WARNING: Addressing modes.
 -- Addressing options:
--- LDUR/STUR: imm9: -256 - 255
-getAmode platform _ (CmmRegOff reg off) | -256 <= off, off <= 255
-  = return $ Amode (AddrRegImm reg' off') nilOL
-    where reg' = getRegisterReg platform reg
-          off' = ImmInt off
--- LDR/STR: imm12: if reg is 32bit: 0 -- 16380 in multiples of 4
-getAmode platform W32 (CmmRegOff reg off)
-  | 0 <= off, off <= 16380, off `mod` 4 == 0
-  = return $ Amode (AddrRegImm reg' off') nilOL
-    where reg' = getRegisterReg platform reg
-          off' = ImmInt off
--- LDR/STR: imm12: if reg is 64bit: 0 -- 32760 in multiples of 8
-getAmode platform W64 (CmmRegOff reg off)
-  | 0 <= off, off <= 32760, off `mod` 8 == 0
+getAmode platform w (CmmRegOff reg off)
+  | isOffsetImm off w
   = return $ Amode (AddrRegImm reg' off') nilOL
     where reg' = getRegisterReg platform reg
           off' = ImmInt off
@@ -1168,15 +1292,15 @@ getAmode platform W64 (CmmRegOff reg off)
 -- CmmStore (CmmMachOp (MO_Add w) [CmmLoad expr, CmmLit (CmmInt n w')]) (expr2)
 -- E.g. a CmmStoreOff really. This can be translated to `str $expr2, [$expr, #n ]
 -- for `n` in range.
-getAmode _platform _ (CmmMachOp (MO_Add _w) [expr, CmmLit (CmmInt off _w')])
-  | -256 <= off, off <= 255
+getAmode _platform w (CmmMachOp (MO_Add _w) [expr, CmmLit (CmmInt off _w')])
+  | isOffsetImm (fromIntegral off) w
   = do (reg, _format, code) <- getSomeReg expr
        return $ Amode (AddrRegImm reg (ImmInteger off)) code
 
-getAmode _platform _ (CmmMachOp (MO_Sub _w) [expr, CmmLit (CmmInt off _w')])
-  | -256 <= -off, -off <= 255
+getAmode _platform w (CmmMachOp (MO_Sub _w) [expr, CmmLit (CmmInt off _w')])
+  | isOffsetImm (fromIntegral $ -off) w
   = do (reg, _format, code) <- getSomeReg expr
-       return $ Amode (AddrRegImm reg (ImmInteger (-off))) code
+       return $ Amode (AddrRegImm reg (ImmInteger $ -off)) code
 
 -- Generic case
 getAmode _platform _ expr


=====================================
compiler/GHC/CmmToAsm/AArch64/Instr.hs
=====================================
@@ -110,6 +110,7 @@ regUsageOfInstr platform instr = case instr of
   LSR dst src1 src2        -> usage (regOp src1 ++ regOp src2, regOp dst)
   MOV dst src              -> usage (regOp src, regOp dst)
   MOVK dst src             -> usage (regOp src, regOp dst)
+  MOVZ dst src             -> usage (regOp src, regOp dst)
   MVN dst src              -> usage (regOp src, regOp dst)
   ORR dst src1 src2        -> usage (regOp src1 ++ regOp src2, regOp dst)
   ROR dst src1 src2        -> usage (regOp src1 ++ regOp src2, regOp dst)
@@ -251,6 +252,7 @@ patchRegsOfInstr instr env = case instr of
     LSR o1 o2 o3   -> LSR  (patchOp o1) (patchOp o2) (patchOp o3)
     MOV o1 o2      -> MOV  (patchOp o1) (patchOp o2)
     MOVK o1 o2     -> MOVK (patchOp o1) (patchOp o2)
+    MOVZ o1 o2     -> MOVZ (patchOp o1) (patchOp o2)
     MVN o1 o2      -> MVN  (patchOp o1) (patchOp o2)
     ORR o1 o2 o3   -> ORR  (patchOp o1) (patchOp o2) (patchOp o3)
     ROR o1 o2 o3   -> ROR  (patchOp o1) (patchOp o2) (patchOp o3)
@@ -381,9 +383,8 @@ mkSpillInstr config reg delta slot =
     where
         a .&~. b = a .&. (complement b)
 
-        fmt = case reg of
-            RegReal (RealRegSingle n) | n < 32 -> II64
-            _                                  -> FF64
+        fmt = fmtOfRealReg (case reg of { RegReal r -> r; _ -> panic "Expected real reg"})
+
         mkIp0SpillAddr imm = ANN (text "Spill: IP0 <- SP + " <> int imm) $ ADD ip0 sp (OpImm (ImmInt imm))
         mkStrSp imm = ANN (text "Spill@" <> int (off - delta)) $ STR fmt (OpReg W64 reg) (OpAddr (AddrRegImm (regSingle 31) (ImmInt imm)))
         mkStrIp0 imm = ANN (text "Spill@" <> int (off - delta)) $ STR fmt (OpReg W64 reg) (OpAddr (AddrRegImm (regSingle 16) (ImmInt imm)))
@@ -408,9 +409,7 @@ mkLoadInstr config reg delta slot =
     where
         a .&~. b = a .&. (complement b)
 
-        fmt = case reg of
-            RegReal (RealRegSingle n) | n < 32 -> II64
-            _                                  -> FF64
+        fmt = fmtOfRealReg (case reg of { RegReal r -> r; _ -> panic "Expected real reg"})
 
         mkIp0SpillAddr imm = ANN (text "Reload: IP0 <- SP + " <> int imm) $ ADD ip0 sp (OpImm (ImmInt imm))
         mkLdrSp imm = ANN (text "Reload@" <> int (off - delta)) $ LDR fmt (OpReg W64 reg) (OpAddr (AddrRegImm (regSingle 31) (ImmInt imm)))
@@ -619,7 +618,7 @@ data Instr
     | MOV Operand Operand -- rd = rn  or  rd = #i
     | MOVK Operand Operand
     -- | MOVN Operand Operand
-    -- | MOVZ Operand Operand
+    | MOVZ Operand Operand
     | MVN Operand Operand -- rd = ~rn
     | ORN Operand Operand Operand -- rd = rn | ~op2
     | ORR Operand Operand Operand -- rd = rn | op2
@@ -708,6 +707,7 @@ instrCon i =
       LSR{} -> "LSR"
       MOV{} -> "MOV"
       MOVK{} -> "MOVK"
+      MOVZ{} -> "MOVZ"
       MVN{} -> "MVN"
       ORN{} -> "ORN"
       ORR{} -> "ORR"
@@ -783,6 +783,9 @@ wzr = OpReg W32 (RegReal (RealRegSingle (-1)))
 sp  = OpReg W64 (RegReal (RealRegSingle 31))
 ip0 = OpReg W64 (RegReal (RealRegSingle 16))
 
+reg_zero :: Reg
+reg_zero = RegReal (RealRegSingle (-1))
+
 _x :: Int -> Operand
 _x i = OpReg W64 (RegReal (RealRegSingle i))
 x0,  x1,  x2,  x3,  x4,  x5,  x6,  x7  :: Operand


=====================================
compiler/GHC/CmmToAsm/AArch64/Ppr.hs
=====================================
@@ -418,6 +418,7 @@ pprInstr platform instr = case instr of
     | isFloatOp o1 || isFloatOp o2 -> op2 (text "\tfmov") o1 o2
     | otherwise                    -> op2 (text "\tmov") o1 o2
   MOVK o1 o2    -> op2 (text "\tmovk") o1 o2
+  MOVZ o1 o2    -> op2 (text "\tmovz") o1 o2
   MVN o1 o2     -> op2 (text "\tmvn") o1 o2
   ORN o1 o2 o3  -> op3 (text "\torn") o1 o2 o3
   ORR o1 o2 o3  -> op3 (text "\torr") o1 o2 o3


=====================================
compiler/GHC/CmmToAsm/AArch64/Regs.hs
=====================================
@@ -77,6 +77,8 @@ litToImm (CmmInt i w)        = ImmInteger (narrowS w i)
                 -- narrow to the width: a CmmInt might be out of
                 -- range, but we assume that ImmInteger only contains
                 -- in-range values.  A signed value should be fine here.
+                -- AK: We do call this with out of range values, however
+                -- it just truncates as we would expect.
 litToImm (CmmFloat f W32)    = ImmFloat f
 litToImm (CmmFloat f W64)    = ImmDouble f
 litToImm (CmmLabel l)        = ImmCLbl l
@@ -147,6 +149,13 @@ classOfRealReg (RealRegSingle i)
         | i < 32        = RcInteger
         | otherwise     = RcDouble
 
+fmtOfRealReg :: RealReg -> Format
+fmtOfRealReg real_reg =
+  case classOfRealReg real_reg of
+            RcInteger -> II64
+            RcDouble  -> FF64
+            RcFloat   -> panic "No float regs on arm"
+
 regDotColor :: RealReg -> SDoc
 regDotColor reg
  = case classOfRealReg reg of


=====================================
compiler/GHC/Driver/Flags.hs
=====================================
@@ -695,7 +695,7 @@ data WarningFlag =
    | Opt_WarnIncompleteRecordSelectors               -- Since 9.10
    | Opt_WarnBadlyStagedTypes                        -- Since 9.10
    | Opt_WarnInconsistentFlags                       -- Since 9.8
-   deriving (Eq, Ord, Show, Enum)
+   deriving (Eq, Ord, Show, Enum, Bounded)
 
 -- | Return the names of a WarningFlag
 --


=====================================
compiler/GHC/Driver/Session.hs
=====================================
@@ -2165,127 +2165,125 @@ wWarningFlags :: [FlagSpec WarningFlag]
 wWarningFlags = map snd (sortBy (comparing fst) wWarningFlagsDeps)
 
 wWarningFlagsDeps :: [(Deprecation, FlagSpec WarningFlag)]
-wWarningFlagsDeps = mconcat [
+wWarningFlagsDeps = [minBound..maxBound] >>= \x -> case x of
 -- See Note [Updating flag description in the User's Guide]
 -- See Note [Supporting CLI completion]
--- Please keep the list of flags below sorted alphabetically
-  warnSpec    Opt_WarnAlternativeLayoutRuleTransitional,
-  warnSpec    Opt_WarnAmbiguousFields,
-  depWarnSpec Opt_WarnAutoOrphans
-              "it has no effect",
-  warnSpec    Opt_WarnCPPUndef,
-  warnSpec    Opt_WarnUnbangedStrictPatterns,
-  warnSpec    Opt_WarnDeferredTypeErrors,
-  warnSpec    Opt_WarnDeferredOutOfScopeVariables,
-  warnSpec    Opt_WarnDeprecatedFlags,
-  warnSpec    Opt_WarnDerivingDefaults,
-  warnSpec    Opt_WarnDerivingTypeable,
-  warnSpec    Opt_WarnDodgyExports,
-  warnSpec    Opt_WarnDodgyForeignImports,
-  warnSpec    Opt_WarnDodgyImports,
-  warnSpec    Opt_WarnEmptyEnumerations,
-  subWarnSpec "duplicate-constraints"
-              Opt_WarnDuplicateConstraints
-              "it is subsumed by -Wredundant-constraints",
-  warnSpec    Opt_WarnRedundantConstraints,
-  warnSpec    Opt_WarnDuplicateExports,
-  depWarnSpec Opt_WarnHiShadows
-              "it is not used, and was never implemented",
-  warnSpec    Opt_WarnInaccessibleCode,
-  warnSpec    Opt_WarnImplicitPrelude,
-  depWarnSpec Opt_WarnImplicitKindVars
-              "it is now an error",
-  warnSpec    Opt_WarnIncompletePatterns,
-  warnSpec    Opt_WarnIncompletePatternsRecUpd,
-  warnSpec    Opt_WarnIncompleteUniPatterns,
-  warnSpec    Opt_WarnInlineRuleShadowing,
-  warnSpec    Opt_WarnIdentities,
-  warnSpec    Opt_WarnMissingFields,
-  warnSpec    Opt_WarnMissingImportList,
-  warnSpec    Opt_WarnMissingExportList,
-  subWarnSpec "missing-local-sigs"
-              Opt_WarnMissingLocalSignatures
-              "it is replaced by -Wmissing-local-signatures",
-  warnSpec    Opt_WarnMissingLocalSignatures,
-  warnSpec    Opt_WarnMissingMethods,
-  depWarnSpec Opt_WarnMissingMonadFailInstances
-              "fail is no longer a method of Monad",
-  warnSpec    Opt_WarnSemigroup,
-  warnSpec    Opt_WarnMissingSignatures,
-  warnSpec    Opt_WarnMissingKindSignatures,
-  warnSpec    Opt_WarnMissingPolyKindSignatures,
-  subWarnSpec "missing-exported-sigs"
-              Opt_WarnMissingExportedSignatures
-              "it is replaced by -Wmissing-exported-signatures",
-  warnSpec    Opt_WarnMissingExportedSignatures,
-  warnSpec    Opt_WarnMonomorphism,
-  warnSpec    Opt_WarnNameShadowing,
-  warnSpec    Opt_WarnNonCanonicalMonadInstances,
-  depWarnSpec Opt_WarnNonCanonicalMonadFailInstances
-              "fail is no longer a method of Monad",
-  warnSpec    Opt_WarnNonCanonicalMonoidInstances,
-  warnSpec    Opt_WarnOrphans,
-  warnSpec    Opt_WarnOverflowedLiterals,
-  warnSpec    Opt_WarnOverlappingPatterns,
-  warnSpec    Opt_WarnMissedSpecs,
-  warnSpec    Opt_WarnAllMissedSpecs,
-  warnSpec'   Opt_WarnSafe setWarnSafe,
-  warnSpec    Opt_WarnTrustworthySafe,
-  warnSpec    Opt_WarnInferredSafeImports,
-  warnSpec    Opt_WarnMissingSafeHaskellMode,
-  warnSpec    Opt_WarnTabs,
-  warnSpec    Opt_WarnTypeDefaults,
-  warnSpec    Opt_WarnTypedHoles,
-  warnSpec    Opt_WarnPartialTypeSignatures,
-  warnSpec    Opt_WarnUnrecognisedPragmas,
-  warnSpec    Opt_WarnMisplacedPragmas,
-  warnSpec'   Opt_WarnUnsafe setWarnUnsafe,
-  warnSpec    Opt_WarnUnsupportedCallingConventions,
-  warnSpec    Opt_WarnUnsupportedLlvmVersion,
-  warnSpec    Opt_WarnMissedExtraSharedLib,
-  warnSpec    Opt_WarnUntickedPromotedConstructors,
-  warnSpec    Opt_WarnUnusedDoBind,
-  warnSpec    Opt_WarnUnusedForalls,
-  warnSpec    Opt_WarnUnusedImports,
-  warnSpec    Opt_WarnUnusedLocalBinds,
-  warnSpec    Opt_WarnUnusedMatches,
-  warnSpec    Opt_WarnUnusedPatternBinds,
-  warnSpec    Opt_WarnUnusedTopBinds,
-  warnSpec    Opt_WarnUnusedTypePatterns,
-  warnSpec    Opt_WarnUnusedRecordWildcards,
-  warnSpec    Opt_WarnRedundantBangPatterns,
-  warnSpec    Opt_WarnRedundantRecordWildcards,
-  warnSpec    Opt_WarnRedundantStrictnessFlags,
-  warnSpec    Opt_WarnWrongDoBind,
-  warnSpec    Opt_WarnMissingPatternSynonymSignatures,
-  warnSpec    Opt_WarnMissingDerivingStrategies,
-  warnSpec    Opt_WarnSimplifiableClassConstraints,
-  warnSpec    Opt_WarnMissingHomeModules,
-  warnSpec    Opt_WarnUnrecognisedWarningFlags,
-  warnSpec    Opt_WarnStarBinder,
-  warnSpec    Opt_WarnStarIsType,
-  depWarnSpec Opt_WarnSpaceAfterBang
-              "bang patterns can no longer be written with a space",
-  warnSpec    Opt_WarnPartialFields,
-  warnSpec    Opt_WarnPrepositiveQualifiedModule,
-  warnSpec    Opt_WarnUnusedPackages,
-  warnSpec    Opt_WarnCompatUnqualifiedImports,
-  warnSpec    Opt_WarnInvalidHaddock,
-  warnSpec    Opt_WarnOperatorWhitespaceExtConflict,
-  warnSpec    Opt_WarnOperatorWhitespace,
-  warnSpec    Opt_WarnImplicitLift,
-  warnSpec    Opt_WarnMissingExportedPatternSynonymSignatures,
-  warnSpec    Opt_WarnForallIdentifier,
-  warnSpec    Opt_WarnUnicodeBidirectionalFormatCharacters,
-  warnSpec    Opt_WarnGADTMonoLocalBinds,
-  warnSpec    Opt_WarnTypeEqualityOutOfScope,
-  warnSpec    Opt_WarnTypeEqualityRequiresOperators,
-  warnSpec    Opt_WarnTermVariableCapture,
-  warnSpec    Opt_WarnMissingRoleAnnotations,
-  warnSpec    Opt_WarnImplicitRhsQuantification,
-  warnSpec    Opt_WarnIncompleteExportWarnings,
-  warnSpec    Opt_WarnIncompleteRecordSelectors
- ]
+  Opt_WarnAlternativeLayoutRuleTransitional -> warnSpec x
+  Opt_WarnAmbiguousFields -> warnSpec x
+  Opt_WarnAutoOrphans -> depWarnSpec x "it has no effect"
+  Opt_WarnCPPUndef -> warnSpec x
+  Opt_WarnBadlyStagedTypes -> warnSpec x
+  Opt_WarnUnbangedStrictPatterns -> warnSpec x
+  Opt_WarnDeferredTypeErrors -> warnSpec x
+  Opt_WarnDeferredOutOfScopeVariables -> warnSpec x
+  Opt_WarnDeprecatedFlags -> warnSpec x
+  Opt_WarnDerivingDefaults -> warnSpec x
+  Opt_WarnDerivingTypeable -> warnSpec x
+  Opt_WarnDodgyExports -> warnSpec x
+  Opt_WarnDodgyForeignImports -> warnSpec x
+  Opt_WarnDodgyImports -> warnSpec x
+  Opt_WarnEmptyEnumerations -> warnSpec x
+  Opt_WarnDuplicateConstraints
+    -> subWarnSpec "duplicate-constraints" x "it is subsumed by -Wredundant-constraints"
+  Opt_WarnRedundantConstraints -> warnSpec x
+  Opt_WarnDuplicateExports -> warnSpec x
+  Opt_WarnHiShadows
+    -> depWarnSpec x "it is not used, and was never implemented"
+  Opt_WarnInaccessibleCode -> warnSpec x
+  Opt_WarnImplicitPrelude -> warnSpec x
+  Opt_WarnImplicitKindVars -> depWarnSpec x "it is now an error"
+  Opt_WarnIncompletePatterns -> warnSpec x
+  Opt_WarnIncompletePatternsRecUpd -> warnSpec x
+  Opt_WarnIncompleteUniPatterns -> warnSpec x
+  Opt_WarnInconsistentFlags -> warnSpec x
+  Opt_WarnInlineRuleShadowing -> warnSpec x
+  Opt_WarnIdentities -> warnSpec x
+  Opt_WarnLoopySuperclassSolve -> warnSpec x
+  Opt_WarnMissingFields -> warnSpec x
+  Opt_WarnMissingImportList -> warnSpec x
+  Opt_WarnMissingExportList -> warnSpec x
+  Opt_WarnMissingLocalSignatures
+    -> subWarnSpec "missing-local-sigs" x
+                   "it is replaced by -Wmissing-local-signatures"
+       ++ warnSpec x
+  Opt_WarnMissingMethods -> warnSpec x
+  Opt_WarnMissingMonadFailInstances
+    -> depWarnSpec x "fail is no longer a method of Monad"
+  Opt_WarnSemigroup -> warnSpec x
+  Opt_WarnMissingSignatures -> warnSpec x
+  Opt_WarnMissingKindSignatures -> warnSpec x
+  Opt_WarnMissingPolyKindSignatures -> warnSpec x
+  Opt_WarnMissingExportedSignatures
+    -> subWarnSpec "missing-exported-sigs" x
+                   "it is replaced by -Wmissing-exported-signatures"
+       ++ warnSpec x
+  Opt_WarnMonomorphism -> warnSpec x
+  Opt_WarnNameShadowing -> warnSpec x
+  Opt_WarnNonCanonicalMonadInstances -> warnSpec x
+  Opt_WarnNonCanonicalMonadFailInstances
+    -> depWarnSpec x "fail is no longer a method of Monad"
+  Opt_WarnNonCanonicalMonoidInstances -> warnSpec x
+  Opt_WarnOrphans -> warnSpec x
+  Opt_WarnOverflowedLiterals -> warnSpec x
+  Opt_WarnOverlappingPatterns -> warnSpec x
+  Opt_WarnMissedSpecs -> warnSpec x
+  Opt_WarnAllMissedSpecs -> warnSpec x
+  Opt_WarnSafe -> warnSpec' x setWarnSafe
+  Opt_WarnTrustworthySafe -> warnSpec x
+  Opt_WarnInferredSafeImports -> warnSpec x
+  Opt_WarnMissingSafeHaskellMode -> warnSpec x
+  Opt_WarnTabs -> warnSpec x
+  Opt_WarnTypeDefaults -> warnSpec x
+  Opt_WarnTypedHoles -> warnSpec x
+  Opt_WarnPartialTypeSignatures -> warnSpec x
+  Opt_WarnUnrecognisedPragmas -> warnSpec x
+  Opt_WarnMisplacedPragmas -> warnSpec x
+  Opt_WarnUnsafe -> warnSpec' x setWarnUnsafe
+  Opt_WarnUnsupportedCallingConventions -> warnSpec x
+  Opt_WarnUnsupportedLlvmVersion -> warnSpec x
+  Opt_WarnMissedExtraSharedLib -> warnSpec x
+  Opt_WarnUntickedPromotedConstructors -> warnSpec x
+  Opt_WarnUnusedDoBind -> warnSpec x
+  Opt_WarnUnusedForalls -> warnSpec x
+  Opt_WarnUnusedImports -> warnSpec x
+  Opt_WarnUnusedLocalBinds -> warnSpec x
+  Opt_WarnUnusedMatches -> warnSpec x
+  Opt_WarnUnusedPatternBinds -> warnSpec x
+  Opt_WarnUnusedTopBinds -> warnSpec x
+  Opt_WarnUnusedTypePatterns -> warnSpec x
+  Opt_WarnUnusedRecordWildcards -> warnSpec x
+  Opt_WarnRedundantBangPatterns -> warnSpec x
+  Opt_WarnRedundantRecordWildcards -> warnSpec x
+  Opt_WarnRedundantStrictnessFlags -> warnSpec x
+  Opt_WarnWrongDoBind -> warnSpec x
+  Opt_WarnMissingPatternSynonymSignatures -> warnSpec x
+  Opt_WarnMissingDerivingStrategies -> warnSpec x
+  Opt_WarnSimplifiableClassConstraints -> warnSpec x
+  Opt_WarnMissingHomeModules -> warnSpec x
+  Opt_WarnUnrecognisedWarningFlags -> warnSpec x
+  Opt_WarnStarBinder -> warnSpec x
+  Opt_WarnStarIsType -> warnSpec x
+  Opt_WarnSpaceAfterBang
+    -> depWarnSpec x "bang patterns can no longer be written with a space"
+  Opt_WarnPartialFields -> warnSpec x
+  Opt_WarnPrepositiveQualifiedModule -> warnSpec x
+  Opt_WarnUnusedPackages -> warnSpec x
+  Opt_WarnCompatUnqualifiedImports -> warnSpec x
+  Opt_WarnInvalidHaddock -> warnSpec x
+  Opt_WarnOperatorWhitespaceExtConflict -> warnSpec x
+  Opt_WarnOperatorWhitespace -> warnSpec x
+  Opt_WarnImplicitLift -> warnSpec x
+  Opt_WarnMissingExportedPatternSynonymSignatures -> warnSpec x
+  Opt_WarnForallIdentifier -> warnSpec x
+  Opt_WarnUnicodeBidirectionalFormatCharacters -> warnSpec x
+  Opt_WarnGADTMonoLocalBinds -> warnSpec x
+  Opt_WarnTypeEqualityOutOfScope -> warnSpec x
+  Opt_WarnTypeEqualityRequiresOperators -> warnSpec x
+  Opt_WarnTermVariableCapture -> warnSpec x
+  Opt_WarnMissingRoleAnnotations -> warnSpec x
+  Opt_WarnImplicitRhsQuantification -> warnSpec x
+  Opt_WarnIncompleteExportWarnings -> warnSpec x
+  Opt_WarnIncompleteRecordSelectors -> warnSpec x
 
 warningGroupsDeps :: [(Deprecation, FlagSpec WarningGroup)]
 warningGroupsDeps = map mk warningGroups


=====================================
compiler/GHC/Unit/Types.hs
=====================================
@@ -149,7 +149,8 @@ instance Uniquable Module where
 
 instance Binary a => Binary (GenModule a) where
   put_ bh (Module p n) = put_ bh p >> put_ bh n
-  get bh = do p <- get bh; n <- get bh; return (Module p n)
+  -- Module has strict fields, so use $! in order not to allocate a thunk
+  get bh = do p <- get bh; n <- get bh; return $! Module p n
 
 instance NFData (GenModule a) where
   rnf (Module unit name) = unit `seq` name `seq` ()
@@ -317,13 +318,14 @@ instance Binary InstantiatedUnit where
     cid   <- get bh
     insts <- get bh
     let fs = mkInstantiatedUnitHash cid insts
-    return InstantiatedUnit {
-            instUnitInstanceOf = cid,
-            instUnitInsts = insts,
-            instUnitHoles = unionManyUniqDSets (map (moduleFreeHoles.snd) insts),
-            instUnitFS = fs,
-            instUnitKey = getUnique fs
-           }
+    -- InstantiatedUnit has strict fields, so use $! in order not to allocate a thunk
+    return $! InstantiatedUnit {
+                instUnitInstanceOf = cid,
+                instUnitInsts = insts,
+                instUnitHoles = unionManyUniqDSets (map (moduleFreeHoles.snd) insts),
+                instUnitFS = fs,
+                instUnitKey = getUnique fs
+              }
 
 instance IsUnitId u => Eq (GenUnit u) where
   uid1 == uid2 = unitUnique uid1 == unitUnique uid2
@@ -369,10 +371,12 @@ instance Binary Unit where
   put_ bh HoleUnit =
     putByte bh 2
   get bh = do b <- getByte bh
-              case b of
+              u <- case b of
                 0 -> fmap RealUnit (get bh)
                 1 -> fmap VirtUnit (get bh)
                 _ -> pure HoleUnit
+              -- Unit has strict fields that need forcing; otherwise we allocate a thunk.
+              pure $! u
 
 -- | Retrieve the set of free module holes of a 'Unit'.
 unitFreeModuleHoles :: GenUnit u -> UniqDSet ModuleName


=====================================
configure.ac
=====================================
@@ -55,6 +55,8 @@ USER_CONF_CC_OPTS_STAGE2="$CONF_CC_OPTS_STAGE2"
 USER_CONF_CXX_OPTS_STAGE2="$CONF_CXX_OPTS_STAGE2"
 USER_CONF_GCC_LINKER_OPTS_STAGE2="$CONF_GCC_LINKER_OPTS_STAGE2"
 
+USER_LD="$LD"
+
 dnl ----------------------------------------------------------
 dnl ** Find unixy sort and find commands,
 dnl ** which are needed by FP_SETUP_PROJECT_VERSION
@@ -491,6 +493,7 @@ FP_PROG_LD_IS_GNU
 FP_PROG_LD_NO_COMPACT_UNWIND
 FP_PROG_LD_FILELIST
 
+
 dnl ** Which nm to use?
 dnl --------------------------------------------------------------
 FP_FIND_NM
@@ -624,8 +627,6 @@ dnl     If gcc, make sure it's at least 4.7
 dnl
 FP_GCC_VERSION
 
-dnl ** See whether cc supports -no-pie
-FP_GCC_SUPPORTS_NO_PIE
 
 dnl ** Check support for the extra flags passed by GHC when compiling via C
 FP_GCC_SUPPORTS_VIA_C_FLAGS
@@ -665,9 +666,16 @@ AC_SUBST(LlvmTarget)
 
 dnl ** See whether cc supports --target=<triple> and set
 dnl CONF_CC_OPTS_STAGE[012] accordingly.
-FP_CC_SUPPORTS_TARGET([$CC_STAGE0], [CONF_CC_OPTS_STAGE0], [CONF_CXX_OPTS_STAGE0], [CONF_GCC_LINKER_OPTS_STAGE0])
-FP_CC_SUPPORTS_TARGET([$CC], [CONF_CC_OPTS_STAGE1], [CONF_CXX_OPTS_STAGE1], [CONF_GCC_LINKER_OPTS_STAGE1])
-FP_CC_SUPPORTS_TARGET([$CC], [CONF_CC_OPTS_STAGE2], [CONF_CXX_OPTS_STAGE2], [CONF_GCC_LINKER_OPTS_STAGE2])
+FP_CC_SUPPORTS_TARGET([$CC_STAGE0], [CONF_CC_OPTS_STAGE0], [CONF_CXX_OPTS_STAGE0])
+FP_CC_SUPPORTS_TARGET([$CC], [CONF_CC_OPTS_STAGE1], [CONF_CXX_OPTS_STAGE1])
+FP_CC_SUPPORTS_TARGET([$CC], [CONF_CC_OPTS_STAGE2], [CONF_CXX_OPTS_STAGE2])
+
+FP_PROG_CC_LINKER_TARGET([$CC_STAGE0], [CONF_CC_OPTS_STAGE0], [CONF_GCC_LINKER_OPTS_STAGE0])
+FP_PROG_CC_LINKER_TARGET([$CC], [CONF_CC_OPTS_STAGE1], [CONF_GCC_LINKER_OPTS_STAGE1])
+FP_PROG_CC_LINKER_TARGET([$CC], [CONF_CC_OPTS_STAGE2], [CONF_GCC_LINKER_OPTS_STAGE2])
+
+dnl ** See whether cc used as a linker supports -no-pie
+FP_GCC_SUPPORTS_NO_PIE
 
 dnl Pass -Qunused-arguments or otherwise GHC will have very noisy invocations of Clang
 dnl TODO: Do we need -Qunused-arguments in CXX and GCC linker too?


=====================================
hadrian/src/Rules/ToolArgs.hs
=====================================
@@ -159,6 +159,8 @@ toolTargets = [ binary
               , ghcBoot
               , ghcBootTh
               , ghcPlatform
+              , ghcToolchain
+              , ghcToolchainBin
               , ghcHeap
               , ghci
               , ghcPkg  -- # executable


=====================================
hadrian/src/Settings/Builders/Cabal.hs
=====================================
@@ -188,11 +188,9 @@ configureArgs cFlags' ldFlags' = do
                            , cFlags'
                            ]
         ldFlags  = ldArgs <> ldFlags'
-    cldFlags <- unwords <$> (cFlags <> ldFlags)
     mconcat
         [ conf "CFLAGS"   cFlags
         , conf "LDFLAGS"  ldFlags
-        , not (null cldFlags) ? arg ("--gcc-options=" ++ cldFlags)
         , conf "--with-iconv-includes"    $ arg =<< getSetting IconvIncludeDir
         , conf "--with-iconv-libraries"   $ arg =<< getSetting IconvLibDir
         , conf "--with-gmp-includes"      $ arg =<< getSetting GmpIncludeDir


=====================================
m4/find_merge_objects.m4
=====================================
@@ -45,7 +45,7 @@ AC_DEFUN([FIND_MERGE_OBJECTS],[
       if test "$result" = "1"; then
           AC_MSG_NOTICE([$MergeObjsCmd is broken due to binutils 22266, looking for another linker...])
           MergeObjsCmd=""
-          AC_CHECK_TARGET_TOOL([MergeObjsCmd], [ld])
+          AC_PATH_TARGET_TOOL([MergeObjsCmd], [ld])
           CHECK_FOR_GOLD_T22266($MergeObjsCmd)
           if test "$result" = "1"; then
               AC_MSG_ERROR([Linker is affected by binutils 22266 but couldn't find another unaffected linker. Please set the MergeObjsCmd variable to a functional linker.])


=====================================
m4/fp_cc_supports_target.m4
=====================================
@@ -10,13 +10,17 @@
 # $1 = CC
 # $2 = CC_OPTS variable
 # $3 = CXX_OPTS variable
-# $4 = GCC_LINK_OPTS variable
 AC_DEFUN([FP_CC_SUPPORTS_TARGET],
 [
    AC_REQUIRE([GHC_LLVM_TARGET_SET_VAR])
    AC_MSG_CHECKING([whether $1 supports --target])
+
    echo 'int main() { return 0; }' > conftest.c
-   if $1 --target=$LlvmTarget -Werror conftest.c > /dev/null 2>&1 ; then
+   if test "$target_cpu" = "javascript" ; then
+       # See Note [Don't pass --target to emscripten toolchain] in GHC.Toolchain.Program
+       CONF_CC_SUPPORTS_TARGET=NO
+       AC_MSG_RESULT([no])
+   elif $1 --target=$LlvmTarget -Werror conftest.c > /dev/null 2>&1 ; then
        CONF_CC_SUPPORTS_TARGET=YES
        AC_MSG_RESULT([yes])
    else
@@ -28,7 +32,6 @@ AC_DEFUN([FP_CC_SUPPORTS_TARGET],
    if test $CONF_CC_SUPPORTS_TARGET = YES ; then
        $2="--target=$LlvmTarget $$2"
        $3="--target=$LlvmTarget $$3"
-       $4="--target=$LlvmTarget $$4"
    fi
 ])
 


=====================================
m4/fp_gcc_supports_no_pie.m4
=====================================
@@ -7,8 +7,9 @@ AC_DEFUN([FP_GCC_SUPPORTS_NO_PIE],
    AC_REQUIRE([AC_PROG_CC])
    AC_MSG_CHECKING([whether CC supports -no-pie])
    echo 'int main() { return 0; }' > conftest.c
+   "$CC" $CONF_GCC_CC_OPTS_STAGE2 -c conftest.c
    # Some GCC versions only warn when passed an unrecognized flag.
-   if $CC -no-pie -Werror -x c conftest.c -o conftest > conftest.txt 2>&1 && ! grep -i unrecognized conftest.txt > /dev/null 2>&1; then
+   if "$CC" $CONF_GCC_LINKER_OPTS_STAGE2 -no-pie -Werror conftest.o -o conftest > conftest.txt 2>&1 && ! grep -i unrecognized conftest.txt > /dev/null 2>&1; then
        CONF_GCC_SUPPORTS_NO_PIE=YES
        AC_MSG_RESULT([yes])
    else


=====================================
m4/fp_prog_cc_linker_target.m4
=====================================
@@ -0,0 +1,31 @@
+# FP_PROG_CC_LINKER_TARGET
+# -------------------
+# Check to see if the C compiler used as a linker supports `--target`
+#
+# $1 - The compiler
+# $2 - Variable which contains the options passed to the C compiler when compiling a C file
+# $3 - Variable which contains the options passed to the C compiler when used as
+#      a linker
+AC_DEFUN([FP_PROG_CC_LINKER_TARGET],
+[
+    AC_MSG_CHECKING([whether $CC used as a linker understands --target])
+
+    echo 'int foo() { return 0; }' > conftest1.c
+    echo 'int main() { return 0; }' > conftest2.c
+    "$1" $$2 -c conftest1.c || AC_MSG_ERROR([Failed to compile conftest1.c])
+    "$1" $$2 -c conftest2.c || AC_MSG_ERROR([Failed to compile conftest2.c])
+
+    if test "$target_cpu" = "javascript"
+    then
+        # See Note [Don't pass --target to emscripten toolchain] in GHC.Toolchain.Program
+        CONF_CC_SUPPORTS_TARGET=NO
+        AC_MSG_RESULT([no])
+    elif "$CC" $$3 --target=$LlvmTarget -o conftest conftest1.o conftest2.o;
+    then
+        $3="--target=$LlvmTarget $$3"
+        AC_MSG_RESULT([yes])
+    else
+        AC_MSG_RESULT([no])
+    fi
+    rm -rf conftest*
+])# FP_PROG_CC_LINKER_TARGET


=====================================
m4/ghc_toolchain.m4
=====================================
@@ -106,6 +106,10 @@ AC_DEFUN([FIND_GHC_TOOLCHAIN],
     echo "--readelf=$READELF" >> acargs
     echo "--windres=$WindresCmd" >> acargs
 
+    if test -n "$USER_LD"; then
+      echo "--ld=$USER_LD" >> acargs
+    fi
+
     ENABLE_GHC_TOOLCHAIN_NOT_ARG([locally-executable], [$CrossCompiling])
     ENABLE_GHC_TOOLCHAIN_ARG([unregisterised], [$Unregisterised])
     ENABLE_GHC_TOOLCHAIN_ARG([tables-next-to-code], [$TablesNextToCode])
@@ -144,7 +148,7 @@ AC_DEFUN([FIND_GHC_TOOLCHAIN_BIN],[
                 -ilibraries/ghc-platform/src -iutils/ghc-toolchain/src \
                 -XNoImplicitPrelude \
                 -odir actmp-ghc-toolchain -hidir actmp-ghc-toolchain \
-                utils/ghc-toolchain/exe/Main.hs -o acghc-toolchain
+                utils/ghc-toolchain/exe/Main.hs -o acghc-toolchain || AC_MSG_ERROR([Could not compile ghc-toolchain])
             GHC_TOOLCHAIN_BIN="./acghc-toolchain"
             ;;
         *)


=====================================
utils/ghc-toolchain/exe/Main.hs
=====================================
@@ -31,6 +31,7 @@ import GHC.Toolchain.Tools.Ranlib
 import GHC.Toolchain.Tools.Nm
 import GHC.Toolchain.Tools.MergeObjs
 import GHC.Toolchain.Tools.Readelf
+import GHC.Toolchain.NormaliseTriple (normaliseTriple)
 
 data Opts = Opts
     { optTriple    :: String
@@ -49,6 +50,9 @@ data Opts = Opts
     , optReadelf   :: ProgOpt
     , optMergeObjs :: ProgOpt
     , optWindres   :: ProgOpt
+    -- Note we don't actually configure LD into anything but
+    -- see #23857 and #22550 for the very unfortunate story.
+    , optLd        :: ProgOpt
     , optUnregisterised :: Maybe Bool
     , optTablesNextToCode :: Maybe Bool
     , optUseLibFFIForAdjustors :: Maybe Bool
@@ -91,6 +95,7 @@ emptyOpts = Opts
     , optReadelf   = po0
     , optMergeObjs = po0
     , optWindres   = po0
+    , optLd        = po0
     , optUnregisterised = Nothing
     , optTablesNextToCode = Nothing
     , optUseLibFFIForAdjustors = Nothing
@@ -102,7 +107,7 @@ emptyOpts = Opts
     po0 = emptyProgOpt
 
 _optCc, _optCxx, _optCpp, _optHsCpp, _optCcLink, _optAr, _optRanlib, _optNm,
-    _optReadelf, _optMergeObjs, _optWindres
+    _optReadelf, _optMergeObjs, _optWindres, _optLd
     :: Lens Opts ProgOpt
 _optCc      = Lens optCc      (\x o -> o {optCc=x})
 _optCxx     = Lens optCxx     (\x o -> o {optCxx=x})
@@ -115,6 +120,7 @@ _optNm      = Lens optNm      (\x o -> o {optNm=x})
 _optReadelf = Lens optReadelf (\x o -> o {optReadelf=x})
 _optMergeObjs = Lens optMergeObjs (\x o -> o {optMergeObjs=x})
 _optWindres = Lens optWindres (\x o -> o {optWindres=x})
+_optLd = Lens optLd (\x o -> o {optLd= x})
 
 _optTriple :: Lens Opts String
 _optTriple = Lens optTriple (\x o -> o {optTriple=x})
@@ -169,6 +175,7 @@ options =
     , progOpts "readelf" "readelf utility" _optReadelf
     , progOpts "merge-objs" "linker for merging objects" _optMergeObjs
     , progOpts "windres" "windres utility" _optWindres
+    , progOpts "ld" "linker" _optLd
     ]
   where
     progOpts :: String -> String -> Lens Opts ProgOpt -> [OptDescr (Opts -> Opts)]
@@ -367,18 +374,24 @@ ldOverrideWhitelist a =
     _ -> False
 
 
+
 mkTarget :: Opts -> M Target
 mkTarget opts = do
+    normalised_triple <- normaliseTriple (optTriple opts)
     -- Use Llvm target if specified, otherwise use triple as llvm target
-    let tgtLlvmTarget = fromMaybe (optTriple opts) (optLlvmTriple opts)
-    cc0 <- findCc tgtLlvmTarget (optCc opts)
-    cxx <- findCxx tgtLlvmTarget (optCxx opts)
+    let tgtLlvmTarget = fromMaybe normalised_triple (optLlvmTriple opts)
+
+    (archOs, tgtVendor) <- do
+      cc0 <- findBasicCc (optCc opts)
+      parseTriple cc0 normalised_triple
+
+    cc0 <- findCc archOs tgtLlvmTarget (optCc opts)
+    cxx <- findCxx archOs tgtLlvmTarget (optCxx opts)
     cpp <- findCpp (optCpp opts) cc0
     hsCpp <- findHsCpp (optHsCpp opts) cc0
-    (archOs, tgtVendor) <- parseTriple cc0 (optTriple opts)
     cc <- addPlatformDepCcFlags archOs cc0
     readelf <- optional $ findReadelf (optReadelf opts)
-    ccLink <- findCcLink tgtLlvmTarget (optCcLink opts) (ldOverrideWhitelist archOs && fromMaybe True (optLdOverride opts)) archOs cc readelf
+    ccLink <- findCcLink tgtLlvmTarget (optLd opts) (optCcLink opts) (ldOverrideWhitelist archOs && fromMaybe True (optLdOverride opts)) archOs cc readelf
 
     ar <- findAr tgtVendor (optAr opts)
     -- TODO: We could have


=====================================
utils/ghc-toolchain/ghc-toolchain.cabal
=====================================
@@ -17,6 +17,7 @@ library
                       GHC.Toolchain.Prelude,
                       GHC.Toolchain.Program,
                       GHC.Toolchain.ParseTriple,
+                      GHC.Toolchain.NormaliseTriple,
                       GHC.Toolchain.CheckArm,
                       GHC.Toolchain.Target,
                       GHC.Toolchain.Tools.Ar,


=====================================
utils/ghc-toolchain/src/GHC/Toolchain/NormaliseTriple.hs
=====================================
@@ -0,0 +1,13 @@
+module GHC.Toolchain.NormaliseTriple where
+
+import GHC.Toolchain.Prelude
+import GHC.Toolchain.Program
+import Data.Text (strip, pack, unpack)
+
+-- | Normalise the triple by calling `config.sub` on the given triple.
+normaliseTriple :: String -> M String
+normaliseTriple triple = do
+  let norm = unpack . strip . pack
+  normalised_triple <- norm <$> readProgramStdout shProgram ["config.sub", triple]
+  logInfo $ unwords ["Normalised triple:", triple, "~>", normalised_triple]
+  return normalised_triple


=====================================
utils/ghc-toolchain/src/GHC/Toolchain/ParseTriple.hs
=====================================
@@ -20,12 +20,12 @@ parseTriple cc triple
   | [archName, vendorName, osName] <- parts
   = do arch <- parseArch cc archName
        os   <- parseOs osName
-       return (ArchOS arch os, Just vendorName)
+       return (ArchOS arch os, Just (parseVendor vendorName))
 
   | [archName, vendorName, osName, _abi] <- parts
   = do arch <- parseArch cc archName
        os   <- parseOs osName
-       return (ArchOS arch os, Just vendorName)
+       return (ArchOS arch os, Just (parseVendor vendorName))
 
   | otherwise
   = throwE $ "malformed triple " ++ triple
@@ -54,6 +54,7 @@ parseArch cc arch =
       "hppa" -> pure ArchUnknown
       "wasm32" -> pure ArchWasm32
       "javascript" -> pure ArchJavaScript
+      "loongarch64" -> pure ArchLoongArch64
       _ -> throwE $ "Unknown architecture " ++ arch
 
 parseOs :: String -> M OS
@@ -80,6 +81,20 @@ parseOs os =
       "ghcjs" -> pure OSGhcjs
       _ -> throwE $ "Unknown operating system " ++ os
 
+parseVendor :: String -> String
+parseVendor vendor =
+  case vendor of
+    -- like i686-pc-linux-gnu, i686-gentoo-freebsd8, x86_64-w64-mingw32
+    "pc" -> "unknown"
+    "gentoo" -> "unknown"
+    "w64" -> "unknown"
+    -- like armv5tel-softfloat-linux-gnueabi
+    "softfloat" -> "unknown"
+    -- like armv7a-hardfloat-linux-gnueabi
+    "hardfloat" -> "unknown"
+    -- Pass through by default
+    _ -> vendor
+
 splitOn :: Char -> String -> [String]
 splitOn sep = go
   where


=====================================
utils/ghc-toolchain/src/GHC/Toolchain/PlatformDetails.hs
=====================================
@@ -46,7 +46,7 @@ checkWordSize cc = checking "word size" $ do
 
 checkEndianness :: Cc -> M Endianness
 checkEndianness cc = do
-    checkEndiannessParamH cc <|> checkEndiannessLimitsH cc
+    checkEndiannessParamH cc <|> checkEndiannessLimitsH cc <|> checkEndianness__BYTE_ORDER__ cc
 
 checkEndiannessParamH :: Cc -> M Endianness
 checkEndiannessParamH cc = checking "endianness (param.h)" $ do
@@ -92,6 +92,28 @@ checkEndiannessLimitsH cc = checking "endianness (limits.h)" $ do
         , "#endif"
         ]
 
+checkEndianness__BYTE_ORDER__ :: Cc -> M Endianness
+checkEndianness__BYTE_ORDER__ cc = checking "endianness (__BYTE_ORDER__)" $ do
+    out <- preprocess cc prog
+    case reverse $ lines out of
+      "big":_ -> return BigEndian
+      "little":_ -> return LittleEndian
+      "unknown":_ -> throwE "unknown endianness"
+      _ -> throwE "unrecognized output"
+  where
+    prog = unlines
+        [ "#include <sys/param.h>"
+        , "#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__"
+        , "little"
+        , "#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__"
+        , "big"
+        , "#else"
+        , "unknown"
+        , "#endif"
+        ]
+
+
+
 checkLeadingUnderscore :: Cc -> Nm -> M Bool
 checkLeadingUnderscore cc nm = checking ctxt $ withTempDir $ \dir -> do
     let test_o = dir </> "test.o"


=====================================
utils/ghc-toolchain/src/GHC/Toolchain/Program.hs
=====================================
@@ -1,5 +1,6 @@
 module GHC.Toolchain.Program
     ( Program(..)
+    , shProgram
     , _prgPath
     , _prgFlags
     , addFlagIfNew
@@ -29,6 +30,7 @@ import System.Directory
 import System.Exit
 import System.Process hiding (env)
 
+import GHC.Platform.ArchOS
 import GHC.Toolchain.Prelude
 import GHC.Toolchain.Utils
 
@@ -37,6 +39,9 @@ data Program = Program { prgPath :: FilePath
                        }
     deriving (Read, Eq, Ord)
 
+shProgram :: Program
+shProgram = Program "sh" []
+
 instance Show Program where
   -- Normalise filepaths before showing to aid with diffing the target files.
   show (Program p f) = unwords
@@ -178,17 +183,37 @@ compile ext extraFlags lens c outPath program = do
     callProgram (view lens c) $ extraFlags ++ ["-o", outPath, srcPath]
     expectFileExists outPath "compiler produced no output"
 
+-- Note [Don't pass --target to emscripten toolchain]
+-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+-- Emscripten's CC wrapper is a bit wonky in that it accepts the `--target`
+-- flag when used as a linker yet rejects it as a compiler (e.g. with `-c`).
+-- This is exacerbated by the fact that Cabal currently in some cases
+-- combines (and therefore conflates) link and compilation flags.
+--
+-- Ultimately this should be fixed in Cabal but in the meantime we work around it
+-- by handling this toolchain specifically in the various
+-- "supports --target" checks in `configure` and `ghc-toolchain`.
+--
+-- Fixes #23744.
+
 -- | Does compiler program support the @--target=<triple>@ option? If so, we should
 -- pass it whenever possible to avoid ambiguity and potential compile-time
 -- errors (e.g. see #20162).
-supportsTarget :: Lens compiler Program
+supportsTarget :: ArchOS
+               -> Lens compiler Program
                -> (compiler -> M ()) -- ^ Action to check if compiler with @--target@ flag works
                -> String             -- ^ The LLVM target to use if @cc@ supports @--target@
                -> compiler           -- ^ The compiler to check @--target@ support for
                -> M compiler         -- ^ Return compiler with @--target@ flag if supported
-supportsTarget lens checkWorks llvmTarget c
--- TODO: #23603
-  | any ("--target=" `isPrefixOf`) (view (lens % _prgFlags) c) = return c
+supportsTarget archOs lens checkWorks llvmTarget c
+    -- See Note [Don't pass --target to emscripten toolchain].
+  | ArchJavaScript <- archOS_arch archOs
+  = return c
+
+    -- No reason to check if the options already contain a --target flag
+  | any ("--target=" `isPrefixOf`) (view (lens % _prgFlags) c)
+  = return c
+
   | otherwise
   = let c' = over (lens % _prgFlags) (("--target="++llvmTarget):) c
      in (c' <$ checkWorks (over (lens % _prgFlags) ("-Werror":) c')) <|> return c


=====================================
utils/ghc-toolchain/src/GHC/Toolchain/Tools/Cc.hs
=====================================
@@ -4,12 +4,14 @@
 module GHC.Toolchain.Tools.Cc
     ( Cc(..)
     , _ccProgram
+    , findBasicCc
     , findCc
       -- * Helpful utilities
     , preprocess
     , compileC
     , compileAsm
     , addPlatformDepCcFlags
+    , checkC99Support
     ) where
 
 import Control.Monad
@@ -32,19 +34,29 @@ _ccProgram = Lens ccProgram (\x o -> o{ccProgram=x})
 _ccFlags :: Lens Cc [String]
 _ccFlags = _ccProgram % _prgFlags
 
-findCc :: String -- ^ The llvm target to use if Cc supports --target
-       -> ProgOpt -> M Cc
-findCc llvmTarget progOpt = checking "for C compiler" $ do
+-- We use this to find a minimally-functional compiler needed to call
+-- parseTriple.
+findBasicCc :: ProgOpt -> M Cc
+findBasicCc progOpt = checking "for C compiler" $ do
     -- TODO: We keep the candidate order we had in configure, but perhaps
     -- there's a more optimal one
     ccProgram <- findProgram "C compiler" progOpt ["gcc", "clang", "cc"]
+    return $ Cc{ccProgram}
 
-    cc' <- ignoreUnusedArgs $ Cc {ccProgram}
-    cc  <- ccSupportsTarget llvmTarget cc'
-    checking "whether Cc works" $ checkCcWorks cc
-    checkC99Support cc
-    checkCcSupportsExtraViaCFlags cc
-    return cc
+findCc :: ArchOS
+       -> String -- ^ The llvm target to use if Cc supports --target
+       -> ProgOpt -> M Cc
+findCc archOs llvmTarget progOpt = do
+    cc0 <- findBasicCc progOpt
+    cc1 <- ignoreUnusedArgs cc0
+    cc2 <- ccSupportsTarget archOs llvmTarget cc1
+    checking "whether Cc works" $ checkCcWorks cc2
+    cc3 <- oneOf "cc doesn't support C99" $ map checkC99Support
+        [ cc2
+        , cc2 & _ccFlags %++ "-std=gnu99"
+        ]
+    checkCcSupportsExtraViaCFlags cc3
+    return cc3
 
 checkCcWorks :: Cc -> M ()
 checkCcWorks cc = withTempDir $ \dir -> do
@@ -71,11 +83,12 @@ ignoreUnusedArgs cc
 -- Does Cc support the --target=<triple> option? If so, we should pass it
 -- whenever possible to avoid ambiguity and potential compile-time errors (e.g.
 -- see #20162).
-ccSupportsTarget :: String -> Cc -> M Cc
-ccSupportsTarget target cc = checking "whether Cc supports --target" $
-                             supportsTarget _ccProgram checkCcWorks target cc
+ccSupportsTarget :: ArchOS -> String -> Cc -> M Cc
+ccSupportsTarget archOs target cc =
+    checking "whether Cc supports --target" $
+    supportsTarget archOs _ccProgram checkCcWorks target cc
 
-checkC99Support :: Cc -> M ()
+checkC99Support :: Cc -> M Cc
 checkC99Support cc = checking "for C99 support" $ withTempDir $ \dir -> do
     let test_o = dir </> "test.o"
     compileC cc test_o $ unlines
@@ -84,6 +97,7 @@ checkC99Support cc = checking "for C99 support" $ withTempDir $ \dir -> do
         , "# error \"Compiler does not advertise C99 conformance\""
         , "#endif"
         ]
+    return cc
 
 checkCcSupportsExtraViaCFlags :: Cc -> M ()
 checkCcSupportsExtraViaCFlags cc = checking "whether cc supports extra via-c flags" $ withTempDir $ \dir -> do


=====================================
utils/ghc-toolchain/src/GHC/Toolchain/Tools/Cpp.hs
=====================================
@@ -10,7 +10,7 @@ import GHC.Toolchain.Prelude
 import GHC.Toolchain.Program
 
 import GHC.Toolchain.Tools.Cc
-import GHC.Toolchain.Utils (withTempDir)
+import GHC.Toolchain.Utils (withTempDir, oneOf)
 
 newtype Cpp = Cpp { cppProgram :: Program
                     }
@@ -83,7 +83,12 @@ findCpp :: ProgOpt -> Cc -> M Cpp
 findCpp progOpt cc = checking "for C preprocessor" $ do
   -- Use the specified CPP or try to use the c compiler
   foundCppProg <- findProgram "C preprocessor" progOpt [] <|> pure (programFromOpt progOpt (prgPath $ ccProgram cc) [])
+  -- Check whether the C preprocessor needs -std=gnu99 (only very old toolchains need this)
+  Cc cpp2 <- oneOf "cc doesn't support C99" $ map checkC99Support
+        [ Cc foundCppProg
+        , Cc (foundCppProg & _prgFlags %++ "-std=gnu99")
+        ]
   -- Always add the -E flag to the CPP, regardless of the user options
-  let cppProgram = addFlagIfNew "-E" foundCppProg
+  let cppProgram = addFlagIfNew "-E" cpp2
   return Cpp{cppProgram}
 


=====================================
utils/ghc-toolchain/src/GHC/Toolchain/Tools/Cxx.hs
=====================================
@@ -8,6 +8,8 @@ module GHC.Toolchain.Tools.Cxx
     ) where
 
 import System.FilePath
+
+import GHC.Platform.ArchOS
 import GHC.Toolchain.Prelude
 import GHC.Toolchain.Program
 import GHC.Toolchain.Utils
@@ -19,18 +21,20 @@ newtype Cxx = Cxx { cxxProgram :: Program
 _cxxProgram :: Lens Cxx Program
 _cxxProgram = Lens cxxProgram (\x o -> o{cxxProgram=x})
 
-findCxx :: String -- ^ The llvm target to use if Cc supports --target
+findCxx :: ArchOS
+        -> String -- ^ The llvm target to use if Cc supports --target
         -> ProgOpt -> M Cxx
-findCxx target progOpt = checking "for C++ compiler" $ do
+findCxx archOs target progOpt = checking "for C++ compiler" $ do
     -- TODO: We use the search order in configure, but there could be a more optimal one
     cxxProgram <- findProgram "C++ compiler" progOpt ["g++", "clang++", "c++"]
-    cxx        <- cxxSupportsTarget target Cxx{cxxProgram}
+    cxx        <- cxxSupportsTarget archOs target Cxx{cxxProgram}
     checkCxxWorks cxx
     return cxx
 
-cxxSupportsTarget :: String -> Cxx -> M Cxx
-cxxSupportsTarget target cxx = checking "whether C++ supports --target" $
-                               supportsTarget _cxxProgram checkCxxWorks target cxx
+cxxSupportsTarget :: ArchOS -> String -> Cxx -> M Cxx
+cxxSupportsTarget archOs target cxx =
+    checking "whether C++ supports --target" $
+    supportsTarget archOs _cxxProgram checkCxxWorks target cxx
 
 checkCxxWorks :: Cxx -> M ()
 checkCxxWorks cxx = withTempDir $ \dir -> do


=====================================
utils/ghc-toolchain/src/GHC/Toolchain/Tools/Link.hs
=====================================
@@ -42,21 +42,28 @@ _ccLinkProgram :: Lens CcLink Program
 _ccLinkProgram = Lens ccLinkProgram (\x o -> o{ccLinkProgram=x})
 
 findCcLink :: String -- ^ The llvm target to use if CcLink supports --target
+           -> ProgOpt
            -> ProgOpt
            -> Bool   -- ^ Whether we should search for a more efficient linker
            -> ArchOS -> Cc -> Maybe Readelf -> M CcLink
-findCcLink target progOpt ldOverride archOs cc readelf = checking "for C compiler for linking command" $ do
+findCcLink target ld progOpt ldOverride archOs cc readelf = checking "for C compiler for linking command" $ do
   -- Use the specified linker or try using the C compiler
   rawCcLink <- findProgram "C compiler for linking" progOpt [] <|> pure (programFromOpt progOpt (prgPath $ ccProgram cc) [])
-  ccLinkProgram <- case poFlags progOpt of
-                     Just _ ->
+  -- See #23857 for why we check to see if LD is set here
+  -- TLDR: If the user explicitly sets LD then in ./configure
+  -- we don't perform a linker search (and set -fuse-ld), so
+  -- we do the same here for consistency.
+  ccLinkProgram <- case (poPath ld, poFlags progOpt) of
+                     (_, Just _) ->
                          -- If the user specified linker flags don't second-guess them
                          pure rawCcLink
-                     Nothing -> do
+                     (Just {}, _) ->
+                         pure rawCcLink
+                     _ -> do
                          -- If not then try to find decent linker flags
                          findLinkFlags ldOverride cc rawCcLink <|> pure rawCcLink
   ccLinkProgram <- linkSupportsTarget archOs cc target ccLinkProgram
-  ccLinkSupportsNoPie         <- checkSupportsNoPie  ccLinkProgram
+  ccLinkSupportsNoPie         <- checkSupportsNoPie  cc ccLinkProgram
   ccLinkSupportsCompactUnwind <- checkSupportsCompactUnwind archOs cc ccLinkProgram
   ccLinkSupportsFilelist      <- checkSupportsFilelist cc ccLinkProgram
   ccLinkIsGnu                 <- checkLinkIsGnu archOs ccLinkProgram
@@ -90,12 +97,9 @@ findLinkFlags enableOverride cc ccLink
 linkSupportsTarget :: ArchOS -> Cc -> String -> Program -> M Program
 -- Javascript toolchain provided by emsdk just ignores --target flag so
 -- we have this special case to match with ./configure (#23744)
-linkSupportsTarget archOS _ _ c
-  | ArchJavaScript <- archOS_arch archOS
-  = return c
-linkSupportsTarget _ cc target link
-  = checking "whether cc linker supports --target" $
-    supportsTarget (Lens id const) (checkLinkWorks cc) target link
+linkSupportsTarget archOs cc target link =
+    checking "whether cc linker supports --target" $
+    supportsTarget archOs (Lens id const) (checkLinkWorks cc) target link
 
 -- | Should we attempt to find a more efficient linker on this platform?
 --
@@ -112,16 +116,15 @@ doLinkerSearch = False
 #endif
 
 -- | See Note [No PIE when linking] in GHC.Driver.Session
-checkSupportsNoPie :: Program -> M Bool
-checkSupportsNoPie ccLink = checking "whether the cc linker supports -no-pie" $
+checkSupportsNoPie :: Cc -> Program -> M Bool
+checkSupportsNoPie cc ccLink = checking "whether the cc linker supports -no-pie" $
   withTempDir $ \dir -> do
-    let test_c = dir </> "test.c"
-    writeFile test_c "int main() { return 0; }"
-
+    let test_o  = dir </> "test.o"
     let test = dir </> "test"
+    compileC cc test_o "int main() { return 0; }"
     -- Check output as some GCC versions only warn and don't respect -Werror
     -- when passed an unrecognized flag.
-    (code, out, err) <- readProgram ccLink ["-no-pie", "-Werror", "-x", "c", test_c, "-o", test]
+    (code, out, err) <- readProgram ccLink ["-no-pie", "-Werror", test_o, "-o", test]
     return (isSuccess code && not ("unrecognized" `isInfixOf` out) && not ("unrecognized" `isInfixOf` err))
 
 -- ROMES:TODO: This check is wrong here and in configure because with ld.gold parses "-n" "o_compact_unwind"



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/265e36bf780ae0cb3b2214f699d51af124e82a8e...c48f2a4f2b07a049bbe12c7da64fd362b430c290

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/265e36bf780ae0cb3b2214f699d51af124e82a8e...c48f2a4f2b07a049bbe12c7da64fd362b430c290
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20231010/9a9a61a7/attachment-0001.html>


More information about the ghc-commits mailing list