[Haskell-cafe] bitSize

Wed Aug 31 05:07:53 CEST 2011

On 30/08/2011, at 7:45 PM, Thomas Davie wrote:

> That's reasonably believable – streaming units on current CPUs can execute multiple floating point operations per cycle.

The figures for cephes_{sinf,cosf} are difficult to believe
because they are so extremely at variance with the figures that
come with the software.

First off, the program as delivered, when compiled, reported that the
computer was a "2000 MHz" one.  It is in fact a 2.66 GHz one.  That
figure turns out to be a #define in the code.  Fix that, and the
report includes:

benching          cephes_sinf .. -> 12779.4 millions of vector evaluations/second
 ->   0 cycles/value on a 2660MHz computer
benching          cephes_cosf .. -> 12756.8 millions of vector evaluations/second
 ->   0 cycles/value on a 2660MHz computer
benching          cephes_expf .. ->    7.7 millions of vector evaluations/second
 ->  86 cycles/value on a 2660MHz computer

The internal documentation in the program claims the following results
on a 2.6 GHz machine:

benching cephes_sinf .. -> 11.6 millions of vector evaluations/second
	->  56 cycles/value on a 2600MHz computer
benching cephes_cosf .. -> 8.7 millions of vector evaluations/second
	->  74 cycles/value on a 2600MHz computer
benching cephes_expf .. -> 3.7 millions of vector evaluations/second
	-> 172 cycles/value on a 2600MHz computer

It seems surpassing strange that code compiled by gcc 4.2 on a 2.66 GHz
machine should run more than a thousand times faster than code compiled
by gcc 4.2 on a 2.60 GHz machine with essentially the same architecture.

Especially as those particular functions are *NOT* vectorised.
They are foils for comparison with the functions that *ARE* vectorised,
for which entirely credible 26.5 million vector evaluations per second
(or 25 cycles per value) are reported.  Considering that sin and cos
are not single floating point operations, 25 cycles per value is well
done.
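
For anyone who wants to check the arithmetic behind these figures, here
is a minimal Python sketch.  It assumes that one "vector evaluation"
covers 4 single-precision values (one 128-bit SSE register); that is
inferred from the reported numbers, not read from the benchmark source.
The name cycles_per_value is made up for illustration.

```python
# Sanity check of the benchmark arithmetic quoted above.
# Assumption: one "vector evaluation" covers 4 single-precision values
# (a 128-bit SSE register). This is inferred from the reported figures,
# not taken from the benchmark source.
LANES = 4


def cycles_per_value(clock_mhz, mevals_per_sec, lanes=LANES):
    """Clock cycles spent per scalar value (hypothetical helper)."""
    # MHz / (millions of vector evaluations per second) = cycles per vector;
    # dividing by the lane count gives cycles per individual value.
    return clock_mhz / mevals_per_sec / lanes


# (name, my run at 2660 MHz, documented run at 2600 MHz),
# both in millions of vector evaluations per second.
reported = [
    ("cephes_sinf", 12779.4, 11.6),
    ("cephes_cosf", 12756.8, 8.7),
    ("cephes_expf", 7.7, 3.7),
]

for name, mine, documented in reported:
    print(f"{name}: {cycles_per_value(2660, mine):.2f} vs "
          f"{cycles_per_value(2600, documented):.1f} cycles/value, "
          f"speed ratio {mine / documented:.0f}x")
```

Under that assumption the "0 cycles/value" printed for sinf and cosf is
really about 0.05 cycles per value, and 26.5 million vector evaluations
per second at 2660 MHz works out to about 25 cycles per value.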

28 cycles per single-precision logarithm is not too shabby either,
IF one can trust a benchmark that has blown its credibility as badly
as this one has.  But it's still not an Integer logarithm, just a
single precision floating point one.