[Git][ghc/ghc][wip/T23070-unify] 11 commits: Add fused multiply-add instructions

Simon Peyton Jones (@simonpj) gitlab at gitlab.haskell.org
Fri May 12 16:01:51 UTC 2023



Simon Peyton Jones pushed to branch wip/T23070-unify at Glasgow Haskell Compiler / GHC


Commits:
87eebf98 by sheaf at 2023-05-11T11:55:22-04:00
Add fused multiply-add instructions

This patch adds eight new primops that fuse a multiplication and an
addition or subtraction:

  - `{fmadd,fmsub,fnmadd,fnmsub}{Float,Double}#`

fmadd x y z is x * y + z, computed with a single rounding step.
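
To make "a single rounding step" concrete, here is a small reference model in plain Haskell. It is a sketch, not the primops themselves: it computes the exact result over `Rational` and rounds once at the end. All names (`fused`, `fmaddModel`, `demoX`, ...) are hypothetical, and the definitions of the three variants other than fmadd are an assumption read off the primop names, since the message only spells out fmadd.

```haskell
-- Reference model only: these are NOT the new primops.
-- Each function computes the exact result over Rational and rounds once.
-- Valid for finite inputs; it does not model NaN, infinities or the sign
-- of a zero result.
fused :: (Rational -> Rational -> Rational -> Rational)
      -> Double -> Double -> Double -> Double
fused f x y z = fromRational (f (toRational x) (toRational y) (toRational z))

fmaddModel, fmsubModel, fnmaddModel, fnmsubModel
  :: Double -> Double -> Double -> Double
fmaddModel  = fused (\x y z ->         x * y  + z)  -- stated in the message
fmsubModel  = fused (\x y z ->         x * y  - z)  -- assumed from the name
fnmaddModel = fused (\x y z -> negate (x * y) + z)  -- assumed from the name
fnmsubModel = fused (\x y z -> negate (x * y) - z)  -- assumed from the name

-- Inputs chosen so that the exact product 1 - 2^-54 is not representable
-- as a Double and rounds to 1.0 when the multiplication is rounded alone.
demoX, demoY :: Double
demoX = 1 + 2 ^^ (-27 :: Int)
demoY = 1 - 2 ^^ (-27 :: Int)

-- Two roundings: the product rounds to 1.0, so the difference is 0.
unfusedResult :: Double
unfusedResult = demoX * demoY - 1

-- One rounding: the exact value -2^-54 survives.
fusedResult :: Double
fusedResult = fmsubModel demoX demoY 1
```

Loaded in GHCi, `unfusedResult` is `0.0` while `fusedResult` is `-2^-54` (about `-5.55e-17`), which is the precision the fused operation preserves.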

This patch implements code generation for these primops in the following
backends:

  - X86, AArch64 and PowerPC NCG,
  - LLVM
  - C

WASM uses the C implementation. The primops are unsupported in the
JavaScript backend.

The following constant folding rules are also provided:

  - compute a * b + c when a, b, c are all literals,
  - x * y + 0 ==> x * y,
  - ±1 * y + z ==> z ± y and x * ±1 + z ==> z ± x.

NB: the constant folding rules incorrectly handle signed zero.
This is a known limitation with GHC's floating-point constant folding
rules (#21227), which we hope to resolve in the future.
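
As an illustration of the signed-zero caveat, the sketch below compares `x * y + 0` with its folded form `x * y` using ordinary `Double` arithmetic rather than the new primops. The function names are hypothetical, and the `NOINLINE` pragmas are only there so that the zero reaches the addition as a runtime value.

```haskell
-- The expression as written: multiply, then add z.
{-# NOINLINE unfolded #-}
unfolded :: Double -> Double -> Double -> Double
unfolded x y z = x * y + z

-- What the rule  x * y + 0 ==> x * y  would produce.
{-# NOINLINE folded #-}
folded :: Double -> Double -> Double
folded x y = x * y

-- (-0.0) * 1 is -0.0, but (-0.0) + 0.0 is +0.0 under round-to-nearest,
-- so the two forms differ in the sign of zero.
signsDiffer :: Bool
signsDiffer =
  isNegativeZero (folded (-0.0) 1)
    && not (isNegativeZero (unfolded (-0.0) 1 0))
```

The two results still compare equal with `(==)`; the difference only shows up through `isNegativeZero` or through operations such as dividing by the result.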

- - - - -
ad16a066 by Krzysztof Gogolewski at 2023-05-11T11:55:59-04:00
Add a test for #21278

- - - - -
05cea68c by Matthew Pickering at 2023-05-11T11:56:36-04:00
rts: Refine memory retention behaviour to account for pinned/compacted objects

When using the copying collector there is still a lot of data which
isn't copied (such as pinned, compacted, large objects etc). The logic
to decide how much memory to retain didn't take into account that these
wouldn't be copied. Therefore we pessimistically retained 2* the amount
of memory for these blocks even though they wouldn't be copied by the
collector.

The solution is to split up the heap into two parts, the parts which
will be copied and the parts which won't be copied. Then the appropriate
factor is applied to each part individually (2 * for copying and 1.2 *
for not copying).
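
The arithmetic can be sketched as follows. This is a simplified model of the two rules described above, not the RTS code: the record, its field names, and the block counts are all hypothetical.

```haskell
-- Live heap split into the part the copying collector moves and the part
-- it leaves in place (pinned, compacted, large objects). Units are
-- abstract "blocks".
data LiveHeap = LiveHeap
  { copiedBlocks   :: Double
  , uncopiedBlocks :: Double
  }

-- Old rule: retain 2x everything, even data that is never copied.
retainOld :: LiveHeap -> Double
retainOld h = 2 * (copiedBlocks h + uncopiedBlocks h)

-- New rule: 2x for the copied part, 1.2x for the part that stays put.
retainNew :: LiveHeap -> Double
retainNew h = 2 * copiedBlocks h + 1.2 * uncopiedBlocks h

-- A heap dominated by pinned data, as in the scenario T23221 exercises.
mostlyPinned :: LiveHeap
mostlyPinned = LiveHeap { copiedBlocks = 100, uncopiedBlocks = 400 }
```

For `mostlyPinned`, the old rule retains 1000 blocks and the new rule about 680, so the saving grows with the share of data that is never copied.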

The T23221 test demonstrates this improvement with a program which first
allocates many unpinned ByteArray# followed by many pinned ByteArray#
and observes the difference in the ultimate memory baseline between the
two.

There are some charts on #23221.

Fixes #23221

- - - - -
1bb24432 by Cheng Shao at 2023-05-11T11:57:15-04:00
hadrian: fix no_dynamic_libs flavour transformer

This patch fixes the no_dynamic_libs flavour transformer and makes
fully_static reuse it. Previously, building with no_dynamic_libs failed
because the ghc program was still dynamic and transitively brought in dyn
ways of the rts, which no rules produce.

- - - - -
0ed493a3 by Josh Meredith at 2023-05-11T23:08:27-04:00
JS: refactor jsSaturate to return a saturated JStat (#23328)

- - - - -
a856d98e by Pierre Le Marre at 2023-05-11T23:09:08-04:00
Doc: Fix out-of-sync using-optimisation page

- Make explicit that default flag values correspond to their -O0 value.
- Fix -fignore-interface-pragmas, -fstg-cse, -fdo-eta-reduction,
  -fcross-module-specialise, -fsolve-constant-dicts, -fworker-wrapper.

- - - - -
c176ad18 by sheaf at 2023-05-12T06:10:57-04:00
Don't panic in mkNewTyConRhs

This function could come across invalid newtype constructors, as we
only perform validity checking of newtypes once we are outside the
knot-tied typechecking loop.
This patch changes this function to fake up a stub type in the case of
an invalid newtype, instead of panicking.

This patch also changes "checkNewDataCon" so that it reports as many
errors as possible at once.

Fixes #23308

- - - - -
ab63daac by Krzysztof Gogolewski at 2023-05-12T06:11:38-04:00
Allow Core optimizations when interpreting bytecode

Tracking ticket: #23056

MR: !10399

This adds the flag `-funoptimized-core-for-interpreter`, which is on by
default. Disabling it with `-fno-unoptimized-core-for-interpreter` permits
use of the `-O` flag to enable optimizations when compiling with the
interpreter backend, as in GHCi.

- - - - -
c6cf9433 by Ben Gamari at 2023-05-12T06:12:14-04:00
hadrian: Fix mention of non-existent removeFiles function

Previously, Hadrian's bindist Makefile referred to a `removeFiles`
function that was defined by the `make` build system. Since
the `make` build system is no longer around, this function is now
undefined. Naturally, make being make, this appears to be silently
ignored instead of producing an error.

Fix this by rewriting it to `rm -f`.

Closes #23373.

- - - - -
eb60ec18 by Bodigrim at 2023-05-12T06:12:54-04:00
Mention new implementation of GHC.IORef.atomicSwapIORef in the changelog

- - - - -
3ae2fec5 by Simon Peyton Jones at 2023-05-12T17:03:50+01:00
Use the eager unifier in the constraint solver

This patch continues the refactoring of the constraint solver
described in #23070.

The Big Deal in this patch is to call the regular, eager unifier from the
constraint solver, when we want to create new equalities. This
replaces the existing unifyWanted, which amounted to
yet-another-unifier, so it reduces duplication of a rather subtle
piece of technology. See

  * Note [The eager unifier] in GHC.Tc.Utils.Unify
  * GHC.Tc.Solver.Monad.wrapUnifierTcS

I did lots of other refactoring along the way:

* I simplified the treatment of right hand sides that contain CoercionHoles.
  Now, a constraint that contains a hetero-kind CoercionHole is non-canonical,
  and cannot be used for rewriting or unification alike.  This required me
  to add the ch_hetero_kind flag to CoercionHole, with consequent knock-on
  effects. See wrinkle (2) of `Note [Equalities with incompatible kinds]` in
  GHC.Tc.Solver.Equality.

* I refactored the StopOrContinue type to add StartAgain, so that after a
  fundep improvement (for example) we can simply start the pipeline again.

* I got rid of the unpleasant (and inefficient) rewriterSetFromType/Co functions.
  With Richard I concluded that they are never needed.

* I discovered Wrinkle (W1) in Note [Wanteds rewrite Wanteds] in
  GHC.Tc.Types.Constraint, and therefore now prioritise non-rewritten equalities.
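
The StartAgain idea from the list above can be sketched with a toy pipeline. This is a simplified model, not GHC's definition: the real type carries solver-specific data, and the `Stage` alias, `runPipeline`, and the two example stages are hypothetical.

```haskell
-- Outcome of one pipeline stage (simplified).
data StopOrContinue a
  = StartAgain a      -- restart from the first stage with this item
  | ContinueWith a    -- hand the item to the next stage
  | Stop String       -- finished; the String is a hypothetical reason
  deriving (Eq, Show)

type Stage a = a -> StopOrContinue a

-- Run the stages in order. StartAgain rewinds to the full stage list, so
-- termination relies on stages eventually ceasing to ask for a restart.
runPipeline :: [Stage a] -> a -> Either String a
runPipeline stages = go stages
  where
    go []       x = Right x
    go (s : ss) x = case s x of
      StartAgain x'   -> go stages x'
      ContinueWith x' -> go ss x'
      Stop why        -> Left why

-- Toy stages over Int: reject negatives, then shrink large values and
-- restart so the first stage sees the improved item again.
rejectNegative :: Stage Int
rejectNegative n
  | n < 0     = Stop "negative"
  | otherwise = ContinueWith n

halveIfLarge :: Stage Int
halveIfLarge n
  | n > 10    = StartAgain (n `div` 2)
  | otherwise = ContinueWith n
```

With these stages, `runPipeline [rejectNegative, halveIfLarge] 45` restarts three times (45, 22, 11, 5) and yields `Right 5`, while an input of `-1` yields `Left "negative"`.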

Quite a few error messages change, I think always for the better.

Compiler runtime stays about the same, with one outlier: a 17% improvement in T17836.

Metric Decrease:
    T17836
    T18223

- - - - -


30 changed files:

- compiler/GHC/Builtin/primops.txt.pp
- compiler/GHC/Cmm/MachOp.hs
- compiler/GHC/Cmm/Parser.y
- compiler/GHC/CmmToAsm/AArch64/CodeGen.hs
- compiler/GHC/CmmToAsm/AArch64/Instr.hs
- compiler/GHC/CmmToAsm/AArch64/Ppr.hs
- compiler/GHC/CmmToAsm/PPC/CodeGen.hs
- compiler/GHC/CmmToAsm/PPC/Instr.hs
- compiler/GHC/CmmToAsm/PPC/Ppr.hs
- compiler/GHC/CmmToAsm/Wasm/FromCmm.hs
- compiler/GHC/CmmToAsm/X86/CodeGen.hs
- compiler/GHC/CmmToAsm/X86/Instr.hs
- compiler/GHC/CmmToAsm/X86/Ppr.hs
- compiler/GHC/CmmToC.hs
- compiler/GHC/CmmToLlvm/CodeGen.hs
- compiler/GHC/Core/Coercion.hs
- compiler/GHC/Core/Coercion.hs-boot
- compiler/GHC/Core/Opt/ConstantFold.hs
- compiler/GHC/Core/Predicate.hs
- compiler/GHC/Core/Reduction.hs
- compiler/GHC/Core/TyCo/Compare.hs
- compiler/GHC/Core/TyCo/Rep.hs
- compiler/GHC/Core/TyCo/Subst.hs
- compiler/GHC/Core/Type.hs
- compiler/GHC/Driver/Config/StgToCmm.hs
- compiler/GHC/Driver/Flags.hs
- compiler/GHC/Driver/Pipeline/Execute.hs
- compiler/GHC/Driver/Session.hs
- compiler/GHC/HsToCore.hs
- compiler/GHC/JS/Transform.hs


The diff was not included because it is too large.


View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/1553a77aeec7c666797cb9659255016de38b26a6...3ae2fec52d0bb74fba4ed3800a4c0aed0514cb3d
