[Git][ghc/ghc][wip/T23083] 21 commits: Add fused multiply-add instructions

Sebastian Graf (@sgraf812) gitlab at gitlab.haskell.org
Fri May 12 11:39:15 UTC 2023



Sebastian Graf pushed to branch wip/T23083 at Glasgow Haskell Compiler / GHC


Commits:
87eebf98 by sheaf at 2023-05-11T11:55:22-04:00
Add fused multiply-add instructions

This patch adds eight new primops that fuse a multiplication and an
addition or subtraction:

  - `{fmadd,fmsub,fnmadd,fnmsub}{Float,Double}#`

fmadd x y z is x * y + z, computed with a single rounding step.

This patch implements code generation for these primops in the following
backends:

  - X86, AArch64 and PowerPC NCG,
  - LLVM
  - C

WASM uses the C implementation. The primops are unsupported in the
JavaScript backend.

The following constant folding rules are also provided:

  - compute a * b + c when a, b, c are all literals,
  - x * y + 0 ==> x * y,
  - ±1 * y + z ==> z ± y and x * ±1 + z ==> z ± x.

NB: the constant folding rules incorrectly handle signed zero.
This is a known limitation with GHC's floating-point constant folding
rules (#21227), which we hope to resolve in the future.

- - - - -
ad16a066 by Krzysztof Gogolewski at 2023-05-11T11:55:59-04:00
Add a test for #21278

- - - - -
05cea68c by Matthew Pickering at 2023-05-11T11:56:36-04:00
rts: Refine memory retention behaviour to account for pinned/compacted objects

When using the copying collector there is still a lot of data which
isn't copied (such as pinned, compacted, large objects etc). The logic
to decide how much memory to retain didn't take into account that these
wouldn't be copied. Therefore we pessimistically retained 2* the amount
of memory for these blocks even though they wouldn't be copied by the
collector.

The solution is to split up the heap into two parts, the parts which
will be copied and the parts which won't be copied. Then the appropiate
factor is applied to each part individually (2 * for copying and 1.2 *
for not copying).

The T23221 test demonstrates this improvement with a program which first
allocates many unpinned ByteArray# followed by many pinned ByteArray#
and observes the difference in the ultimate memory baseline between the
two.

There are some charts on #23221.

Fixes #23221

- - - - -
1bb24432 by Cheng Shao at 2023-05-11T11:57:15-04:00
hadrian: fix no_dynamic_libs flavour transformer

This patch fixes the no_dynamic_libs flavour transformer and make
fully_static reuse it. Previously building with no_dynamic_libs fails
since ghc program is still dynamic and transitively brings in dyn ways
of rts which are produced by no rules.

- - - - -
0ed493a3 by Josh Meredith at 2023-05-11T23:08:27-04:00
JS: refactor jsSaturate to return a saturated JStat (#23328)

- - - - -
a856d98e by Pierre Le Marre at 2023-05-11T23:09:08-04:00
Doc: Fix out-of-sync using-optimisation page

- Make explicit that default flag values correspond to their -O0 value.
- Fix -fignore-interface-pragmas, -fstg-cse, -fdo-eta-reduction,
  -fcross-module-specialise, -fsolve-constant-dicts, -fworker-wrapper.

- - - - -
c176ad18 by sheaf at 2023-05-12T06:10:57-04:00
Don't panic in mkNewTyConRhs

This function could come across invalid newtype constructors, as we
only perform validity checking of newtypes once we are outside the
knot-tied typechecking loop.
This patch changes this function to fake up a stub type in the case of
an invalid newtype, instead of panicking.

This patch also changes "checkNewDataCon" so that it reports as many
errors as possible at once.

Fixes #23308

- - - - -
ab63daac by Krzysztof Gogolewski at 2023-05-12T06:11:38-04:00
Allow Core optimizations when interpreting bytecode

Tracking ticket: #23056

MR: !10399

This adds the flag `-funoptimized-core-for-interpreter`, permitting use
of the `-O` flag to enable optimizations when compiling with the
interpreter backend, like in ghci.

- - - - -
c6cf9433 by Ben Gamari at 2023-05-12T06:12:14-04:00
hadrian: Fix mention of non-existent removeFiles function

Previously Hadrian's bindist Makefile referred to a `removeFiles`
function that was previously defined by the `make` build system. Since
the `make` build system is no longer around, this function is now
undefined. Naturally, make being make, this appears to be silently
ignored instead of producing an error.

Fix this by rewriting it to `rm -f`.

Closes #23373.

- - - - -
eb60ec18 by Bodigrim at 2023-05-12T06:12:54-04:00
Mention new implementation of GHC.IORef.atomicSwapIORef in the changelog

- - - - -
d5e5e835 by Sebastian Graf at 2023-05-12T13:37:28+02:00
Cleanup a TODO introduced in 1f94e0f7

The change must have slipped through review of !4412

- - - - -
39dfb943 by Sebastian Graf at 2023-05-12T13:37:28+02:00
More explicit strictness in GHC.Real

- - - - -
e6de6042 by Sebastian Graf at 2023-05-12T13:37:28+02:00
exprIsTrivial: Factor out shared implementation

The duplication between `exprIsTrivial` and `getIdFromTrivialExpr_maybe` has
been bugging me for a long time.

This patch introduces an inlinable worker function `trivial_expr_fold` acting
as the single, shared decision procedure of triviality. It "returns" a
Church-encoded `Maybe (Maybe Id)`, so when it is inlined, it fuses to similar
code as before.
(Better code, even, in the case of `getIdFromTrivialExpr` which presently
allocates a `Just` constructor that cancels away after this patch.)

- - - - -
d6ffcd8a by Sebastian Graf at 2023-05-12T13:37:28+02:00
Simplify: Simplification of arguments in a single function

The Simplifier had a function `simplArg` that wasn't called in `rebuildCall`,
which seems to be the main way to simplify args. Hence I consolidated the code
path to call `simplArg`, too, renaming to `simplLazyArg`.

- - - - -
e5bb6b71 by Sebastian Graf at 2023-05-12T13:37:28+02:00
Core.Ppr: Omit case binder for empty case alternatives

A minor improvement to pretty-printing

- - - - -
043a90d4 by Sebastian Graf at 2023-05-12T13:37:28+02:00
Inlining literals into boring contexts is OK

- - - - -
7fc7a2b2 by Sebastian Graf at 2023-05-12T13:37:28+02:00
Kill SetLevel.notWorthFloating.is_triv (#23270)

We have had it since b84ba676034, when it operated on annotated expressions.
Nowadays it operates on vanilla `CoreExpr` though, so we should just call
`exprIsTrivial`; thus handling empty cases and string literals correctly.

- - - - -
120f90a8 by Sebastian Graf at 2023-05-12T13:37:29+02:00
ANFise string literal arguments (#23270)

This instates the invariant that a trivial CoreExpr translates to an atomic
StgExpr. Nice.

Fixes #23270.

- - - - -
8c7b3670 by Sebastian Graf at 2023-05-12T13:37:29+02:00
Deactivate -fcatch-nonexhaustive-cases in ghc-bignum (#23345)

- - - - -
90bbf404 by Sebastian Graf at 2023-05-12T13:37:29+02:00
CorePrep: Eliminate EmptyCase and unsafeEqualityProof in CoreToStg instead

We eliminate EmptyCase by way of `coreToStg (Case e _ _ []) = coreToStg e` now.
The main reason is that it plays far better in conjunction with eta expansion
(as we aim to do for arguments in CorePrep, #23083), because we can discard
any arguments, `(case e of {}) eta == case e of {}`, whereas in `(e |> co) eta`
it's impossible to discard the argument.

We do also give the same treatment to unsafeCoerce proofs and treat them as
trivial iff their RHS is trivial.

It is also both much simpler to describe than the previous mechanism of emitting
an unsafe coercion and simpler to implement, removing quite a bit of commentary
and `CorePrepProv`.

- - - - -
4fc83c91 by Sebastian Graf at 2023-05-12T13:39:02+02:00
CorePrep: Eta expand arguments (#23083)

Previously, we'd only eta expand let bindings and lambdas,
now we'll also eta expand arguments such as in T23083:
```hs
g f h = f (h `seq` (h $))
```
Unless `-fpedantic-bottoms` is set, we'll now transform to
```hs
g f h = f (\eta -> h eta)
```
in CorePrep.

See the new `Note [Eta expansion of arguments in CorePrep]` for the details.

We only do this optimisation with -O2 because we saw 2-3% ghc/alloc regressions
in T4801 and T5321FD.

Fixes #23083.

- - - - -


30 changed files:

- compiler/GHC/Builtin/primops.txt.pp
- compiler/GHC/Cmm/MachOp.hs
- compiler/GHC/Cmm/Parser.y
- compiler/GHC/CmmToAsm/AArch64/CodeGen.hs
- compiler/GHC/CmmToAsm/AArch64/Instr.hs
- compiler/GHC/CmmToAsm/AArch64/Ppr.hs
- compiler/GHC/CmmToAsm/PPC/CodeGen.hs
- compiler/GHC/CmmToAsm/PPC/Instr.hs
- compiler/GHC/CmmToAsm/PPC/Ppr.hs
- compiler/GHC/CmmToAsm/Wasm/FromCmm.hs
- compiler/GHC/CmmToAsm/X86/CodeGen.hs
- compiler/GHC/CmmToAsm/X86/Instr.hs
- compiler/GHC/CmmToAsm/X86/Ppr.hs
- compiler/GHC/CmmToC.hs
- compiler/GHC/CmmToLlvm/CodeGen.hs
- compiler/GHC/Core.hs
- compiler/GHC/Core/Coercion.hs
- compiler/GHC/Core/Coercion/Opt.hs
- compiler/GHC/Core/FVs.hs
- compiler/GHC/Core/Lint.hs
- compiler/GHC/Core/Opt/ConstantFold.hs
- compiler/GHC/Core/Opt/SetLevels.hs
- compiler/GHC/Core/Opt/Simplify/Iteration.hs
- compiler/GHC/Core/Ppr.hs
- compiler/GHC/Core/TyCo/FVs.hs
- compiler/GHC/Core/TyCo/Rep.hs
- compiler/GHC/Core/TyCo/Subst.hs
- compiler/GHC/Core/TyCo/Tidy.hs
- compiler/GHC/Core/Type.hs
- compiler/GHC/Core/Unfold.hs


The diff was not included because it is too large.


View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/7d9de75154147c86782f9b7192d4d2764c235021...4fc83c9106657a7927428689a3d0077077032f36

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/7d9de75154147c86782f9b7192d4d2764c235021...4fc83c9106657a7927428689a3d0077077032f36
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20230512/d13cb04c/attachment-0001.html>


More information about the ghc-commits mailing list