[Git][ghc/ghc][wip/ncg-simd] 23 commits: Update directory submodule to latest master

Wed Sep 4 10:51:36 UTC 2024

sheaf pushed to branch wip/ncg-simd at Glasgow Haskell Compiler / GHC

Commits:
c68be356 by Matthew Pickering at 2024-08-26T11:05:11-04:00
Update directory submodule to latest master

The primary reason for this bump is to fix the warning from `ghc-pkg
check`:

```
Warning: include-dirs: /data/home/ubuntu/.ghcup/ghc/9.6.2/lib/ghc-9.6.2/lib/../lib/aarch64-linux-ghc-9.6.2/directory-1.3.8.1/include doesn't exist or isn't a directory
```

This also requires adding the `file-io` package as a boot library (which
is discussed in #25145)

Fixes #23594 #25145

- - - - -
4ee094d4 by Matthew Pickering at 2024-08-26T11:05:47-04:00
Fix aarch64-alpine target platform description

We are producing bindists where the target triple is

aarch64-alpine-linux

when it should be

aarch64-unknown-linux

This is because the bootstrapped compiler originally set the target
triple to `aarch64-alpine-linux` which is when propagated forwards by
setting `bootstrap_target` from the bootstrap compiler target.

In order to break this chain we explicitly specify build/host/target for
aarch64-alpine.

This requires a new configure flag `--enable-ignore-` which just
switches off a validation check that the target platform of the
bootstrap compiler is the same as the build platform. It is the same,
but the name is just wrong.

These commits can be removed when the bootstrap compiler has the correct
target triple (I looked into patching this on ci-images, but it looked
hard to do correctly as the build/host platform is not in the settings
file).

Fixes #25200

- - - - -
e0e0f2b2 by Matthew Pickering at 2024-08-26T11:05:47-04:00
Bump nixpkgs commit for gen_ci script

- - - - -
63a27091 by doyougnu at 2024-08-26T20:39:30-04:00
rts: win32: emit additional debugging information

-- migration from haskell.nix

- - - - -
aaab3d10 by Vladislav Zavialov at 2024-08-26T20:40:06-04:00
Only export defaults when NamedDefaults are enabled (#25206)

This is a reinterpretation of GHC Proposal #409 that avoids a breaking
change introduced in fa0dbaca6c "Implements the Exportable Named Default proposal"

Consider a module M that has no explicit export list:

	module M where
	default (Rational)

Should it export the default (Rational)?

The proposal says "yes", and there's a test case for that:

	default/DefaultImport04.hs

However, as it turns out, this change in behavior breaks existing
programs, e.g. the colour-2.3.6 package can no longer be compiled,
as reported in #25206.

In this patch, we make implicit exports of defaults conditional on
the NamedDefaults extension. This fix is unintrusive and compliant
with the existing proposal text (i.e. it does not require a proposal
amendment). Should the proposal be amended, we can go for a simpler
solution, such as requiring all defaults to be exported explicitly.

Test case: testsuite/tests/default/T25206.hs

- - - - -
3a5bebf8 by Matthew Pickering at 2024-08-28T14:16:42-04:00
simplifier: Fix space leak during demand analysis

The lazy structure (a list) in a strict field in `DmdType` is not fully
forced which leads to a very large thunk build-up.

It seems there is likely still more work to be done here as it seems we
may be trading space usage for work done. For now, this is the right
choice as rather than using all the memory on my computer, compilation
just takes a little bit longer.

See #25196

- - - - -
c2525e9e by Ryan Scott at 2024-08-28T14:17:17-04:00
Add missing parenthesizeHsType in cvtp's InvisP case

We need to ensure that when we convert an `InvisP` (invisible type pattern) to
a `Pat`, we parenthesize it (at precedence `appPrec`) so that patterns such as
`@(a :: k)` will parse correctly when roundtripped back through the parser.

Fixes #25209.

- - - - -
1499764f by Sjoerd Visscher at 2024-08-29T16:52:56+02:00
Haddock: Add no-compilation flag

This flag makes sure to avoid recompilation of the code when generating documentation by only reading the .hi and .hie files, and throw an error if it can't find them.

- - - - -
768fe644 by Andreas Klebinger at 2024-09-03T13:15:20-04:00
Add functions to check for weakly pinned arrays.

This commit adds `isByteArrayWeaklyPinned#` and `isMutableByteArrayWeaklyPinned#` primops.
These check if a bytearray is *weakly* pinned. Which means it can still be explicitly moved
by the user via compaction but won't be moved by the RTS.

This moves us one more stop closer to nailing down #22255.

- - - - -
b16605e7 by Arsen Arsenović at 2024-09-03T13:16:05-04:00
ghc-toolchain: Don't leave stranded a.outs when testing for -g0

This happened because, when ghc-toolchain tests for -g0, it does so by
compiling an empty program.  This compilation creates an a.out.

Since we create a temporary directory, lets place the test program
compilation in it also, so that it gets cleaned up.

Fixes: 25b0b40467d0a12601497117c0ad14e1fcab0b74
Closes: https://gitlab.haskell.org/ghc/ghc/-/issues/25203

- - - - -
83e70b14 by Torsten Schmits at 2024-09-03T13:16:41-04:00
Build foreign objects for TH with interpreter's way when loading from iface

Fixes #25211

When linking bytecode for TH from interface core bindings with
`-fprefer-byte-code`, foreign sources are loaded from the interface as
well and compiled to object code in an ad-hoc manner.

The results are then loaded by the interpreter, whose way may differ
from the current build's target way.

This patch ensures that foreign objects are compiled with the
interpreter's way.

- - - - -
250852e8 by sheaf at 2024-09-04T12:51:07+02:00
The X86 SIMD patch.

This commit adds support for 128 bit wide SIMD vectors and vector
operations to GHC's X86 native code generator.

Main changes:

  - Introduction of vector formats (`GHC.CmmToAsm.Format`)
  - Introduction of 128-bit virtual register (`GHC.Platform.Reg`),
    and removal of unused Float virtual register.
  - Refactor of `GHC.Platform.Reg.Class.RegClass`: it now only contains
    two classes, `RcInteger` (for general purpose registers) and `RcFloatOrVector`
    (for registers that can be used for scalar floating point values as well
    as vectors).
  - Modify `GHC.CmmToAsm.X86.Instr.regUsageOfInstr` to keep track
    of which format each register is used at, so that the register
    allocator can know if it needs to spill the entire vector register
    or just the lower 64 bits.
  - Modify spill/load/reg-2-reg code to account for vector registers
    (`GHC.CmmToAsm.X86.Instr.{mkSpillInstr, mkLoadInstr, mkRegRegMoveInstr, takeRegRegMoveInstr}`).
  - Modify the register allocator code (`GHC.CmmToAsm.Reg.*`) to propagate
    the format we are storing in any given register, for instance changing
    `Reg` to `RegFormat` or `GlobalReg` to `GlobalRegUse`.
  - Add logic to lower vector `MachOp`s to X86 assembly
    (see `GHC.CmmToAsm.X86.CodeGen`)
  - Minor cleanups to genprimopcode, to remove the llvm_only attribute
    which is no longer applicable.

Tests for this feature are provided in the "testsuite/tests/simd" directory.

Fixes #7741

Keeping track of register formats adds a small memory overhead to the
register allocator (in particular, regUsageOfInstr now allocates more
to keep track of the `Format` each register is used at). This explains
the following metric increases.

-------------------------
Metric Increase:
    T12707
    T13035
    T13379
    T3294
    T4801
    T5321FD
    T5321Fun
    T783
-------------------------

- - - - -
fd854ca8 by sheaf at 2024-09-04T12:51:07+02:00
Use xmm registers in genapply

This commit updates genapply to use xmm, ymm and zmm registers, for
stg_ap_v16/stg_ap_v32/stg_ap_v64, respectively.

It also updates the Cmm lexer and parser to produce Cmm vectors rather
than 128/256/512 bit wide scalars for V16/V32/V64, removing bits128,
bits256 and bits512 in favour of vectors.

The Cmm Lint check is weakened for vectors, as (in practice, e.g. on X86)
it is okay to use a single vector register to hold multiple different
types of data, and we don't know just from seeing e.g. "XMM1" how to
interpret the 128 bits of data within.

Fixes #25062

- - - - -
4981eff1 by sheaf at 2024-09-04T12:51:23+02:00
Add vector fused multiply-add operations

This commit adds fused multiply add operations such as `fmaddDoubleX2#`.
These are handled both in the X86 NCG and the LLVM backends.

- - - - -
f704d915 by sheaf at 2024-09-04T12:51:26+02:00
Add vector shuffle primops

This adds vector shuffle primops, such as

```
shuffleFloatX4# :: FloatX4# -> FloatX4# -> (# Int#, Int#, Int#, Int# #) -> FloatX4#
```

which shuffle the components of the input two vectors into the output vector.

NB: the indices must be compile time literals, to match the X86 SHUFPD
instruction immediate and the LLVM shufflevector instruction.

These are handled in the X86 NCG and the LLVM backend.

Tested in simd009.

- - - - -
e2a39d94 by sheaf at 2024-09-04T12:51:26+02:00
Add Broadcast MachOps

This adds proper MachOps for broadcast instructions, allowing us to
produce better code for broadcasting a value than simply packing that
value (doing many vector insertions in a row).

These are lowered in the X86 NCG and LLVM backends. In the LLVM backend,
it uses the previously introduced shuffle instructions.

- - - - -
7d068fd5 by sheaf at 2024-09-04T12:51:26+02:00
Fix treatment of signed zero in vector negation

This commit fixes the handling of signed zero in floating-point vector
negation.

A slight hack was introduced to work around the fact that Cmm doesn't
currently have a notion of signed floating point literals
(see get_float_broadcast_value_reg). This can be removed once CmmFloat
can express the value -0.0.

The simd006 test has been updated to use a stricter notion of equality
of floating-point values, which ensure the validity of this change.

- - - - -
a0eb9e26 by sheaf at 2024-09-04T12:51:26+02:00
Add min/max primops

This commit adds min/max primops, such as

  minDouble# :: Double# -> Double# -> Double#
  minFloatX4# :: FloatX4# -> FloatX4# -> FloatX4#
  minWord16X8# :: Word16X8# -> Word16X8# -> Word16X8#

These are supported in:
  - the X86, AArch64 and PowerPC NCGs,
  - the LLVM backend,
  - the WebAssembly and JavaScript backends.

Fixes #25120

- - - - -
1d2363a8 by sheaf at 2024-09-04T12:51:27+02:00
Add test for C calls & SIMD vectors

- - - - -
16ba9654 by sheaf at 2024-09-04T12:51:27+02:00
Add test for #25169

- - - - -
e9267961 by sheaf at 2024-09-04T12:51:27+02:00
Fix #25169 using Plan A from the ticket

We now compile certain low-level Cmm functions in the RTS multiple
times, with different levels of vector support. We then dispatch
at runtime in the RTS, based on what instructions are supported.

See Note [realArgRegsCover] in GHC.Cmm.CallConv.

Fixes #25169

-------------------------
Metric Increase:
    T10421
    T12425
    T1969
    T9198
-------------------------

- - - - -
15529ecb by sheaf at 2024-09-04T12:51:27+02:00
Fix C calls with SIMD vectors

This commit fixes the code generation for C calls, to take into account
the calling convention.

This is particularly tricky on Windows, where all vectors are expected
to be passed by reference. See Note [The Windows X64 C calling convention]
in GHC.CmmToAsm.X86.CodeGen.

- - - - -
c37faadf by sheaf at 2024-09-04T12:51:27+02:00
LLVM: propagate GlobalRegUse information

This commit ensures we keep track of how any particular global register
is being used in the LLVM backend. This informs the LLVM type
annotations, and avoids type mismatches of the following form:

  argument is not of expected type '<2 x double>'
    call ccc <2 x double> (<2 x double>)
      (<4 x i32> arg)

- - - - -

30 changed files:

- .gitlab/generate-ci/flake.lock
- .gitlab/generate-ci/gen_ci.hs
- .gitlab/jobs.yaml
- .gitmodules
- compiler/GHC/Builtin/primops.txt.pp
- compiler/GHC/ByteCode/Asm.hs
- compiler/GHC/Cmm.hs
- compiler/GHC/Cmm/CallConv.hs
- compiler/GHC/Cmm/Graph.hs
- compiler/GHC/Cmm/Lexer.x
- compiler/GHC/Cmm/Lint.hs
- compiler/GHC/Cmm/Liveness.hs
- compiler/GHC/Cmm/MachOp.hs
- compiler/GHC/Cmm/Node.hs
- compiler/GHC/Cmm/Opt.hs
- compiler/GHC/Cmm/Parser.y
- compiler/GHC/Cmm/ProcPoint.hs
- compiler/GHC/Cmm/Reg.hs
- compiler/GHC/Cmm/Sink.hs
- compiler/GHC/Cmm/Type.hs
- compiler/GHC/CmmToAsm.hs
- compiler/GHC/CmmToAsm/AArch64.hs
- compiler/GHC/CmmToAsm/AArch64/CodeGen.hs
- compiler/GHC/CmmToAsm/AArch64/Instr.hs
- compiler/GHC/CmmToAsm/AArch64/Ppr.hs
- compiler/GHC/CmmToAsm/AArch64/Regs.hs
- compiler/GHC/CmmToAsm/Config.hs
- compiler/GHC/CmmToAsm/Format.hs
- compiler/GHC/CmmToAsm/Instr.hs
- compiler/GHC/CmmToAsm/PPC.hs

The diff was not included because it is too large.

View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/61691275d45fd07552cf46da611cdad9c1fd77c7...c37faadfd5b1e3f484a7b3b83c0e0632ff662b98

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/61691275d45fd07552cf46da611cdad9c1fd77c7...c37faadfd5b1e3f484a7b3b83c0e0632ff662b98
You're receiving this email because of your account on gitlab.haskell.org.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240904/d3053dc4/attachment-0001.html>