GHC vs. GCC on raw vector addition
simonmarhaskell at gmail.com
Thu Jan 19 06:28:03 EST 2006
John Meacham wrote:
> On Wed, Jan 18, 2006 at 08:54:43PM +0300, Bulat Ziganshin wrote:
>> sorry, with the "gcc -O3 -ffast-math -fstrict-aliasing -funroll-loops"
>> the C version is 50 times faster than best Haskell one... it's the
>> loop from C version:
> I believe something similar to what I noted here is the culprit:
> it is fixable, but not without modifying ghc.
Ah, I see what you mean by indirect jumps. Those indirect jumps go away
if you compile with -optc-O2 or -fasm; they're droppings left by
inadequacies in gcc's standard -O optimisation.
Actually, -fasm does better by one instruction than gcc on this example:
vs. gcc -O2:
        movq    (%rbp), %rdx
        cmpq    $1, %rdx
        movq    8(%rbp), %rax
        imulq   %rdx, %rax
        movq    %rdx, (%rbp)
        movq    %rax, 8(%rbp)
        movq    8(%rbp), %r13
        addq    $16, %rbp
We should probably reverse the sense of that branch, like gcc does. The
memory accesses are still there, of course. Hopefully someday I'll get
around to trying to use more registers on x86_64 again.
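
For readers without the source of "this example" to hand: it is not
reproduced in this message, so the following is only a guess based on
the cmpq $1 and imulq instructions in the listing above, which look like
a tail-recursive accumulating multiply. The name fac and the exact shape
of the function are assumptions for illustration, not taken from the
thread.

```haskell
module Main (main) where

-- Hypothetical example (not from the thread): a tail-recursive loop
-- with an accumulator. The comparison against 1 and the multiplication
-- are the operations that would show up as cmpq $1 and imulq.
fac :: Int -> Int -> Int
fac 1 acc = acc                     -- base case: return the accumulator
fac n acc = fac (n - 1) (n * acc)   -- loop: multiply, then count down

main :: IO ()
main = print (fac 10 1)             -- prints 3628800
```

In a loop of this kind both arguments are live on every iteration, which
is why keeping them in memory at (%rbp) and 8(%rbp) rather than in
registers costs a load and a store each time round.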