GHC vs. GCC on raw vector addition
Bulat Ziganshin
bulatz at HotPOP.com
Wed Jan 18 12:54:43 EST 2006
Hello Bulat,
Wednesday, January 18, 2006, 8:34:54 PM, you wrote:
BZ> the only reason this code is only 3 times slower is that the C version
BZ> is really limited by memory speed. when tested on 1000-element
BZ> arrays, it is 20 times slower. i have not yet tried SSE optimization
BZ> for gcc ;)
sorry, with "gcc -O3 -ffast-math -fstrict-aliasing -funroll-loops"
the C version is 50 times faster than the best Haskell one... here is
the loop from the C version:
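L18:
        fldl    (%edx)          # push the double at (%edx) onto the x87 stack
        faddl   (%ecx)          # add the double at (%ecx) to it
        fstpl   (%edx)          # store the sum back to (%edx) and pop
        fldl    8(%edx)         # same load/add/store for the 2nd element
        faddl   8(%ecx)
        fstpl   8(%edx)
        fldl    16(%edx)        # 3rd element
        faddl   16(%ecx)
        fstpl   16(%edx)
        fldl    24(%edx)        # 4th element
        faddl   24(%ecx)
        addl    $4,%ebx         # element counter += 4
        addl    $32,%ecx        # source pointer += 4 doubles (32 bytes)
        fstpl   24(%edx)        # store the 4th sum (%edx not yet advanced)
        addl    $32,%edx        # destination pointer += 32 bytes
        cmpl    -4(%ebp),%ebx   # compare the counter with the value at -4(%ebp)
        jl      L18             # loop again while the counter is less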
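the C source that produced this loop is not included in this message, so the following is only a sketch of what the assembly appears to compute: an in-place addition of one array of doubles into another, unrolled four times. the names vec_add, vec_add_unrolled, a, b and n are made up for illustration, and the tail loop for lengths that are not a multiple of 4 is an addition, since the listing shows only the main loop body.

```c
#include <stddef.h>

/* Plain form: add b into a, element by element.
   Hypothetical reconstruction, not the original source. */
void vec_add(double *a, const double *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] += b[i];
}

/* Hand-unrolled by four, mirroring the shape of the L18 loop:
   four load/add/store triples per iteration, after which the
   index advances by 4 elements (32 bytes of doubles). */
void vec_add_unrolled(double *a, const double *b, size_t n)
{
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        a[i]     += b[i];
        a[i + 1] += b[i + 1];
        a[i + 2] += b[i + 2];
        a[i + 3] += b[i + 3];
    }

    /* Leftover elements when n is not a multiple of 4
       (not visible in the listing above; added here for safety). */
    for (; i < n; i++)
        a[i] += b[i];
}
```

with -funroll-loops gcc can derive the second form from the first on its own, so writing the unrolled version by hand is normally unnecessary; it is spelled out here only to make the correspondence with the assembly visible.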
--
Best regards,
Bulat mailto:bulatz at HotPOP.com
More information about the Glasgow-haskell-users mailing list