NCG lowering of sqrt
kavon at farvard.in
Fri Apr 28 13:27:40 UTC 2017
Given a Cmm expression such as
(_c8Gq::F64) = call MO_F64_Sqrt(_s8oX::F64); // CmmUnsafeForeignCall
the native code generator produces an actual call to the C sqrt function. As a side effect, every live floating-point register has to be spilled to the stack around the call, because they are all caller-saved. In the nbody benchmark, this is particularly costly for a rather hot piece of code (see below).
Ideally the NCG would recognize this foreign call and instead use the `sqrtsd` SSE instruction when targeting x86-64.
Does anyone know if the NCG can produce this instruction? I think it would be beneficial, as the code below would shrink to one or two instructions.
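For comparison, here is a minimal C sketch of what "lowering to sqrtsd" looks like outside GHC. It is not NCG code: the names sqrt_inline and distance are hypothetical, and the distance computation is only assumed to resemble the hot code in nbody. On an SSE2 target the intrinsic maps to a single sqrtsd, so no call is made and no registers need to be saved; elsewhere it falls back to the libm call.

```c
#include <math.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics: _mm_set_sd, _mm_sqrt_sd, _mm_cvtsd_f64 */
#endif

/* Hypothetical helper: square root of a double without a libm call
 * when SSE2 is available. */
static double sqrt_inline(double x)
{
#if defined(__SSE2__)
    __m128d v = _mm_set_sd(x);                /* low lane = x, high lane = 0 */
    return _mm_cvtsd_f64(_mm_sqrt_sd(v, v));  /* sqrtsd on the low lane */
#else
    return sqrt(x);                           /* fallback: ordinary C call */
#endif
}

/* Assumed shape of the hot computation: Euclidean distance of a 3-vector. */
double distance(double dx, double dy, double dz)
{
    return sqrt_inline(dx * dx + dy * dy + dz * dz);
}
```

Inspecting the assembly a C compiler emits for distance on x86-64 should show the square root as an inline instruction rather than a call, which is the behaviour being asked of the NCG here.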
Other math functions such as sin/cos require x87 FPU instructions, which as far as I know we're not using.
; NCG generates this in parts of the nbody benchmark
; to compute the sqrt
movsd %xmm9,176(%rsp) ; all floating-point registers
movsd %xmm1,184(%rsp) ; are caller-saved in SysV ABI
;; the loads
;; below are interleaved
;; with computations