[Haskell-cafe] [Somewhat OT] Speed
Andrew Coppin
andrewcoppin at btinternet.com
Wed Oct 29 16:22:24 EDT 2008
Richard O'Keefe wrote:
>
> On 29 Oct 2008, at 8:31 am, Andrew Coppin wrote:
>
>> Hi guys.
>>
>> This isn't specifically to do with Haskell, but... does anybody have
>> any idea roughly how fast various CPU operations are?
>>
>> For example, is integer arithmetic faster or slower than
>> floating-point? Is addition faster or slower than multiplication? How
>> much slower are the trigonometric functions? etc. Does using 8-bit
>> integers make arithmetic any faster than using wider values?
>>
>> Does anybody have a good resource for this kind of information?
>
> lmbench3
>
> http://sourceforge.net/projects/lmbench
>
> Building and running it will give you some answers for your
> machine.
> It's hard to answer your questions as they are posed,
> because of the difference between throughput and latency.
> For example, you might be able to start a new multiplication
> on every cycle on some machine (throughput is 1 multiply
> per cycle) but the result of any one of
> them might not be available for twelve cycles (latency is
> 12 cycles per multiply).
>
> Rough guesses:
> integer adds, subtracts, and compares are fast,
> integer multiplies and divides are much slower,
> slow enough that compilers go to some trouble to
> do something else when multiplying or dividing
> by a constant.
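
(A concrete instance of that constant trick, as an untested Haskell sketch
using Data.Bits; GHC already performs this kind of strength reduction
itself, so this is purely illustration:)

import Data.Bits (shiftL, shiftR, (.&.))

-- Multiplying or dividing by a power of two can be turned into a shift,
-- and a remainder modulo a power of two into a mask.
times8, div8, mod8 :: Int -> Int
times8 x = x `shiftL` 3   -- same result as x * 8
div8   x = x `shiftR` 3   -- same result as x `div` 8
mod8   x = x .&. 7        -- same result as x `mod` 8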
>
> The speed of the trig functions depends on how much hardware
> support you have and on whether your library writer favoured
> speed or accuracy (especially accuracy over a wide range).
> I don't think lmbench measures these, but it wouldn't be hard
> to add something suitable.
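
(For what it's worth, measuring that from Haskell is straightforward with a
benchmarking library; a rough sketch using the criterion package, nothing
to do with lmbench itself:)

import Criterion.Main

main :: IO ()
main = defaultMain
  [ bench "Double (+)" $ whnf ((+ 1.0) :: Double -> Double) 0.5
  , bench "Double (*)" $ whnf ((* 1.1) :: Double -> Double) 0.5
  , bench "sin"        $ whnf (sin   :: Double -> Double) 0.5
  , bench "atan2 1"    $ whnf (atan2 (1.0 :: Double)) 0.5
  ]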
>
> Using 8-bit integers is unlikely to make your program *directly*
> faster unless you are using a compiler which is smart enough to
> exploit SIMD instructions without programmer-written hints. The
> Intel C compiler _is_ that smart. In my code I have found it to
> be extremely good at vectorising trivial loops, with no actual
> effect on the run-time of my programs. However, code written
> with the intention of exploiting that would be different.
>
> The one thing that lmbench3 will tell you that will drop your
> jaw and widen your eyes is the bit at the end where it tells
> you about memory speed. Main memory is exceedingly slow compared
> with cache. This is why I said that switching over to 8-bit
> integers might not make your program *directly* faster; if you
> have a lot of data which you can pack tightly, so that as 32-bit
> words it would not fit into L2 cache but as bytes it does, you
> may well get a very useful speedup from that.
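
(In Haskell terms, that packing might look like an unboxed vector of Word8
rather than Int; a sketch using Data.Vector.Unboxed, assuming a 64-bit
machine where Int takes 8 bytes:)

import Data.Word (Word8)
import qualified Data.Vector.Unboxed as U

-- One million small counters: about 1 MB stored as Word8,
-- about 8 MB stored as Int, which may no longer fit in L2 cache.
packed :: U.Vector Word8
packed = U.replicate 1000000 1

unpacked :: U.Vector Int
unpacked = U.replicate 1000000 1

sumPacked, sumUnpacked :: Int
sumPacked   = U.foldl' (\acc w -> acc + fromIntegral w) 0 packed
sumUnpacked = U.foldl' (+) 0 unpacked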
>
> Or you may not. There is no substitute for measurement.
OK, well thanks for the info.
I'm not really interested in getting down to instruction-level
scheduling. I just want to know, at a high level, whether implementing my
algorithm in integer arithmetic rather than floating-point will make a
measurable difference to overall program speed.
Actually, thinking about it, I suppose the killer question is this: Does
changing between different number representations make any measurable
performance difference at all, or are Haskell programs dominated by
cache misses?
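
(One way to answer that empirically would be a micro-benchmark along these
lines; an untested sketch using criterion and unboxed vectors, with GHC and
-O2 assumed:)

import Criterion.Main
import qualified Data.Vector.Unboxed as U

sumInt :: U.Vector Int -> Int
sumInt = U.foldl' (+) 0

sumDouble :: U.Vector Double -> Double
sumDouble = U.foldl' (+) 0

main :: IO ()
main = defaultMain
  [ bench "sum 1M Int"    $ whnf sumInt    (U.enumFromN 1 1000000)
  , bench "sum 1M Double" $ whnf sumDouble (U.enumFromN 1 1000000)
  ]

If the two come out within noise of each other, the representation is
probably not what dominates; if shrinking or growing the vectors past cache
size changes the picture dramatically, that points at memory rather than
arithmetic.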