[Git][ghc/ghc][wip/ncg-simd] 22 commits: Expand LLVM version matching regex for compability with bsd systems

sheaf (@sheaf) gitlab at gitlab.haskell.org
Wed Jul 3 11:57:13 UTC 2024



sheaf pushed to branch wip/ncg-simd at Glasgow Haskell Compiler / GHC


Commits:
77ce65a5 by Matthew Pickering at 2024-06-27T07:57:14-04:00
Expand LLVM version matching regex for compability with bsd systems

sed on BSD systems (such as darwin) does not support the + operation.

Therefore we take the simple minded approach of manually expanding
group+ to groupgroup*.

Fixes #24999

- - - - -
bdfe4a9e by Matthew Pickering at 2024-06-27T07:57:14-04:00
ci: On darwin configure LLVMAS linker to match LLC and OPT toolchain

The version check was previously broken so the toolchain was not
detected at all.

- - - - -
07e03a69 by Matthew Pickering at 2024-06-27T07:57:15-04:00
Update nixpkgs commit for darwin toolchain

One dependency (c-ares) changed where it hosted the releases which
breaks the build with the old nixpkgs commit.

- - - - -
144afed7 by Rodrigo Mesquita at 2024-06-27T07:57:50-04:00
base: Add changelog entry for #24998

- - - - -
eebe1658 by Sylvain Henry at 2024-06-28T07:13:26-04:00
X86/DWARF: support no tables-next-to-code and asm-shortcutting (#22792)

- Without TNTC (tables-next-to-code), we must be careful to not
  duplicate labels in pprNatCmmDecl. Especially, as a CmmProc is
  identified by the label of its entry block (and not of its info
  table), we can't reuse the same label to delimit the block end and the
  proc end.

- We generate debug infos from Cmm blocks. However, when
  asm-shortcutting is enabled, some blocks are dropped at the asm
  codegen stage and some labels in the DebugBlocks become missing.
  We fix this by filtering the generated debug-info after the asm
  codegen to only keep valid infos.

Also add some related documentation.

- - - - -
6e86d82b by Sylvain Henry at 2024-06-28T07:14:06-04:00
PPC NCG: handle JMP to ForeignLabels (#23969)

- - - - -
9e4b4b0a by Sylvain Henry at 2024-06-28T07:14:06-04:00
PPC NCG: support loading 64-bit value on 32-bit arch (#23969)

- - - - -
50caef3e by Sylvain Henry at 2024-06-28T07:14:46-04:00
Fix warnings in genapply

- - - - -
37139b17 by Matthew Pickering at 2024-06-28T07:15:21-04:00
libraries: Update os-string to 2.0.4

This updates the os-string submodule to 2.0.4 which removes the usage of
`TemplateHaskell` pragma.

- - - - -
0f3d3bd6 by Sylvain Henry at 2024-06-30T00:47:40-04:00
Bump array submodule

- - - - -
354c350c by Sylvain Henry at 2024-06-30T00:47:40-04:00
GHCi: Don't use deprecated sizeofMutableByteArray#

- - - - -
35d65098 by Ben Gamari at 2024-06-30T00:47:40-04:00
primops: Undeprecate addr2Int# and int2Addr#

addr2Int# and int2Addr# were marked as deprecated with the introduction
of the OCaml code generator (1dfaee318171836b32f6b33a14231c69adfdef2f)
due to its use of tagged integers. However, this backend has long
vanished and `base` has all along been using `addr2Int#` in the Show
instance for Ptr.

While it's unlikely that we will have another backend which has tagged
integers, we may indeed support platforms which have tagged pointers.
Consequently we undeprecate the operations but warn the user that the
operations may not be portable.

- - - - -
3157d817 by Sylvain Henry at 2024-06-30T00:47:41-04:00
primops: Undeprecate par#

par# is still used in base and it's not clear how to replace it with
spark# (see #24825)

- - - - -
c8d5b959 by Ben Gamari at 2024-06-30T00:47:41-04:00
Primops: Make documentation generation more efficient

Previously we would do a linear search through all primop names, doing a
String comparison on the name of each when preparing the HsDocStringMap.
Fix this.

- - - - -
65165fe4 by Ben Gamari at 2024-06-30T00:47:41-04:00
primops: Ensure that deprecations are properly tracked

We previously failed to insert DEPRECATION pragmas into GHC.Prim's
ModIface, meaning that they would appear in the Haddock documentation
but not issue warnings. Fix this.

See #19629. Haddock also needs to be fixed: https://github.com/haskell/haddock/issues/223

Co-authored-by: Sylvain Henry <sylvain at haskus.fr>

- - - - -
bc1d435e by Mario Blažević at 2024-06-30T00:48:20-04:00
Improved pretty-printing of unboxed TH sums and tuples, fixes #24997

- - - - -
54084843 by sheaf at 2024-07-03T13:55:01+02:00
Testsuite: use py-cpuinfo to compute CPU features

This replaces the rather hacky logic we had in place for checking
CPU features. In particular, this means that feature availability now
works properly on Windows.

- - - - -
87974f83 by sheaf at 2024-07-03T13:55:42+02:00
The X86 SIMD patch.

This commit adds support for 128 bit wide SIMD vectors and vector
operations to GHC's X86 native code generator.

Main changes:

  - Introduction of vector formats (`GHC.CmmToAsm.Format`)
  - Introduction of 128-bit virtual register (`GHC.Platform.Reg`),
    and removal of unused Float virtual register.
  - Refactor of `GHC.Platform.Reg.Class.RegClass`: it now only contains
    two classes, `RcInteger` (for general purpose registers) and `RcFloatOrVector`
    (for registers that can be used for scalar floating point values as well
    as vectors).
  - Modify `GHC.CmmToAsm.X86.Instr.regUsageOfInstr` to keep track
    of which format each register is used at, so that the register
    allocator can know if it needs to spill the entire vector register
    or just the lower 64 bits.
  - Modify spill/load/reg-2-reg code to account for vector registers
    (`GHC.CmmToAsm.X86.Instr.{mkSpillInstr, mkLoadInstr, mkRegRegMoveInstr, takeRegRegMoveInstr}`).
  - Modify the register allocator code (`GHC.CmmToAsm.Reg.*`) to propagate
    the format we are storing in any given register, for instance changing
    `Reg` to `RegFormat` or `GlobalReg` to `GlobalRegUse`.
  - Add logic to lower vector `MachOp`s to X86 assembly
    (see `GHC.CmmToAsm.X86.CodeGen`)
  - Minor cleanups to genprimopcode, to remove the llvm_only attribute
    which is no longer applicable.

Tests for this feature are provided in the "testsuite/tests/simd" directory.

Fixes #7741

Keeping track of register formats adds a small memory overhead to the
register allocator (in particular, regUsageOfInstr now allocates more
to keep track of the `Format` each register is used at). This explains
the following metric increases.

-------------------------
Metric Increase:
    T12707
    T13035
    T13379
    T3294
    T4801
    T5321FD
    T5321Fun
    T783
-------------------------

- - - - -
fefd5622 by sheaf at 2024-07-03T13:56:56+02:00
Add vector fused multiply-add operations

This commit adds fused multiply add operations such as `fmaddDoubleX2#`.
These are handled both in the X86 NCG and the LLVM backends.

- - - - -
28d0c8fd by sheaf at 2024-07-03T13:56:58+02:00
Add vector shuffle primops

This adds vector shuffle primops, such as

```
shuffleFloatX4# :: FloatX4# -> FloatX4# -> (# Int#, Int#, Int#, Int# #) -> FloatX4#
```

which shuffle the components of the input two vectors into the output vector.

NB: the indices must be compile time literals, to match the X86 SHUFPD
instruction immediate and the LLVM shufflevector instruction.

These are handled in the X86 NCG and the LLVM backend.

Tested in simd009.

- - - - -
f493411d by sheaf at 2024-07-03T13:56:58+02:00
Add Broadcast MachOps

This adds proper MachOps for broadcast instructions, allowing us to
produce better code for broadcasting a value than simply packing that
value (doing many vector insertions in a row).

These are lowered in the X86 NCG and LLVM backends. In the LLVM backend,
it uses the previously introduced shuffle instructions.

- - - - -
e2b57729 by sheaf at 2024-07-03T13:56:59+02:00
Fix treatment of signed zero in vector negation

This commit fixes the handling of signed zero in floating-point vector
negation.

A slight hack was introduced to work around the fact that Cmm doesn't
currently have a notion of signed floating point literals
(see get_float_broadcast_value_reg). This can be removed once CmmFloat
can express the value -0.0.

The simd006 test has been updated to use a stricter notion of equality
of floating-point values, which ensure the validity of this change.

- - - - -


30 changed files:

- .gitlab/darwin/nix/sources.json
- .gitlab/darwin/toolchain.nix
- compiler/GHC/Builtin/PrimOps.hs
- compiler/GHC/Builtin/Utils.hs
- compiler/GHC/Builtin/primops.txt.pp
- compiler/GHC/Cmm.hs
- compiler/GHC/Cmm/CLabel.hs
- compiler/GHC/Cmm/CallConv.hs
- compiler/GHC/Cmm/DebugBlock.hs
- compiler/GHC/Cmm/Graph.hs
- compiler/GHC/Cmm/Liveness.hs
- compiler/GHC/Cmm/MachOp.hs
- compiler/GHC/Cmm/Node.hs
- compiler/GHC/Cmm/Opt.hs
- compiler/GHC/Cmm/Parser.y
- compiler/GHC/Cmm/ProcPoint.hs
- compiler/GHC/Cmm/Reg.hs
- compiler/GHC/Cmm/Sink.hs
- compiler/GHC/CmmToAsm.hs
- compiler/GHC/CmmToAsm/AArch64.hs
- compiler/GHC/CmmToAsm/AArch64/CodeGen.hs
- compiler/GHC/CmmToAsm/AArch64/Instr.hs
- compiler/GHC/CmmToAsm/AArch64/Ppr.hs
- compiler/GHC/CmmToAsm/AArch64/Regs.hs
- compiler/GHC/CmmToAsm/Config.hs
- compiler/GHC/CmmToAsm/Format.hs
- compiler/GHC/CmmToAsm/Instr.hs
- compiler/GHC/CmmToAsm/PPC.hs
- compiler/GHC/CmmToAsm/PPC/CodeGen.hs
- compiler/GHC/CmmToAsm/PPC/Instr.hs


The diff was not included because it is too large.


View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/31b1cc978c37551faf1810d2daf5dc87f8a3641d...e2b57729572cd502b27b91635e0f37376306d583

-- 
This project does not include diff previews in email notifications.
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/31b1cc978c37551faf1810d2daf5dc87f8a3641d...e2b57729572cd502b27b91635e0f37376306d583
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240703/719060f2/attachment-0001.html>


More information about the ghc-commits mailing list