[Haskell-cafe] Re: Haskell version of ray tracer code is much slower than the original ML
Philip Armstrong
phil at kantaka.co.uk
Fri Jun 22 09:13:13 EDT 2007
On Fri, Jun 22, 2007 at 01:16:54PM +0100, Simon Marlow wrote:
>Philip Armstrong wrote:
>>IIRC, it is possible to issue an instruction to the x86 FP unit which
>>makes all operations work on 64-bit Doubles, even though there are
>>80-bits available internally. Which then means there's no requirement
>>to spill intermediate results to memory in order to get the rounding
>>correct.
>
>For some background on why GHC doesn't do this, see the comment "MORE
>FLOATING POINT MUSINGS..." in
>
> http://darcs.haskell.org/ghc/compiler/nativeGen/MachInstrs.hs
Twisty. I guess 'slow, but correct, with switches to go faster at the
price of correctness' is about the best option.
>You probably want SSE2. If I ever get around to finishing it, the GHC
>native code generator will be able to generate SSE2 code on x86 someday,
>like it currently does for x86-64. For now, to get good FP performance on
>x86, you probably want
>
> -fvia-C -fexcess-precision -optc-mfpmath=sse2
Reading the gcc manpage, I think you mean

 -optc-msse2 -optc-mfpmath=sse

since -mfpmath=sse2 doesn't appear to be an option.
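
Putting that together, the full invocation would presumably look
something like the following. The -O2 and the output name are my own
assumptions rather than anything from Simon's message, so treat it as a
sketch:

 ghc -O2 -fvia-C -fexcess-precision -optc-msse2 -optc-mfpmath=sse ray.hs -o ray

As I read the manpage, -msse2 enables the SSE2 instruction set and
-mfpmath=sse tells gcc to use the SSE unit for scalar floating point,
which is why both are needed.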
(I note in passing that the ghc darcs head produces binaries from
ray.hs which are about 15% slower than ghc 6.6.1 ones btw. Same
optimisation options used both times.)
cheers, Phil
--
http://www.kantaka.co.uk/ .oOo. public key: http://www.kantaka.co.uk/gpg.txt