[Haskell-cafe] ANN: ieee version 0.7

Daniel Fischer daniel.is.fischer at web.de
Tue Sep 21 15:08:11 EDT 2010


On Tuesday 21 September 2010 19:46:02, John Millikin wrote:
> On Tue, Sep 21, 2010 at 07:10, Daniel Fischer <daniel.is.fischer at web.de> wrote:
> > And I'd expect it to be a heck of a lot faster than the previous
> > implementation. Have you done any benchmarks?
>
> Only very rough ones -- a few basic Criterion checks, but nothing
> extensive.

Certainly good enough for an indication.

> Numbers for put/get of 64-bit big-endian:
>
>                    getWord   getFloat   putWord   putFloat
> Bitfields (0.4.1)    59 ns    8385 ns   1840 ns   11448 ns
> poke/peek (0.4.2)    59 ns     305 ns   1840 ns     744 ns
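
For the record, a minimal Criterion harness for these four operations
might look something like the sketch below (using the put/get names from
data-binary-ieee754; surely not John's exact benchmark):

import Criterion.Main (bench, defaultMain, whnf)
import Data.Binary.Get (getWord64be, runGet)
import Data.Binary.IEEE754 (getFloat64be, putFloat64be)
import Data.Binary.Put (putWord64be, runPut)
import qualified Data.ByteString.Lazy as BL

main :: IO ()
main = defaultMain
    [ bench "putWord64be"  $ whnf (BL.length . runPut . putWord64be) 42
    , bench "putFloat64be" $ whnf (BL.length . runPut . putFloat64be) 42
    , bench "getWord64be"  $ whnf (runGet getWord64be) bytes
    , bench "getFloat64be" $ whnf (runGet getFloat64be) bytes
    ]
  where
    bytes = BL.pack (0x40 : replicate 7 0)  -- 2^62 / 2.0, big-endian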

Wow. That's a huge difference. I don't think there's much room for doubt 
that it's much faster (the exact ratios will vary, of course).

> unsafeCoerce         59 ns      61 ns   1840 ns     642 ns

Odd that unsafeCoerce gains 244 ns for get, but only 102 ns for put.
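
(For reference, I take it the unsafeCoerce variant is essentially the
one-liner below. It happens to work on stock 64-bit GHC because boxed
Double and boxed Word64 both carry a single 8-byte payload, but that is
an unspecified representation detail, not a guarantee:)

import Data.Word (Word64)
import Unsafe.Coerce (unsafeCoerce)

-- Reinterpret the bits directly; relies on the closure layouts of
-- boxed Double and boxed Word64 coinciding.
doubleToWord64 :: Double -> Word64
doubleToWord64 = unsafeCoerce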

>
> Note: I don't know why the cast-based versions can put a Double faster
> than a Word64;

Strange. putFloat does a putWord plus a transformation; how can that be 
faster than the putWord alone?
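
For concreteness, I'd expect the poke/peek transformation to be
essentially this sketch (not necessarily the package's actual code):

import Data.Binary.Put (Put, putWord64be)
import Data.Word (Word64)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (castPtr)
import Foreign.Storable (peek, poke)
import System.IO.Unsafe (unsafePerformIO)

-- Write the Double into a temporary buffer, then read the same
-- eight bytes back at type Word64.
doubleToWord64 :: Double -> Word64
doubleToWord64 d = unsafePerformIO . alloca $ \p -> do
    poke p d
    peek (castPtr p)

-- putFloat is then just the transformation followed by putWord.
putFloat64be :: Double -> Put
putFloat64be = putWord64be . doubleToWord64

so naively it can only add work on top of putWord64be.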

> Float is (as expected) slower than Word32. Some
> special-case GHC optimization?
>
> > One problem I see with both unsafeCoerce and poke/peek is endianness.
> > Will the bit pattern of a double be interpreted as the same uint64_t
> > on little-endian and on big-endian machines? In other words, is the
> > byte order for doubles endianness-dependent too?
> > If yes, that's fine; if no, it would break between machines of
> > different endianness.
>
> Endianness only matters when marshaling bytes into a single value --
> Data.Binary.Get/Put handles that. Once the data is encoded as a Word,
> endianness is no longer relevant.
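
Sure, and that part I don't doubt; the bytes coming out of putWord64be
are the same on every host, e.g.:

import Data.Binary.Put (putWord64be, runPut)
import Data.Word (Word8)
import qualified Data.ByteString.Lazy as BL

-- putWord64be always emits network (big-endian) byte order,
-- independent of the CPU, so the stream itself is portable.
encodedBytes :: [Word8]
encodedBytes = BL.unpack (runPut (putWord64be (2 ^ 62)))
-- == [0x40, 0, 0, 0, 0, 0, 0, 0] on any machine

My worry is one step earlier, though.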

I mean, take e.g. 2^62 :: Word64. If you poke that to memory, on a big-
endian machine you'd get the byte sequence
40 00 00 00 00 00 00 00,
while on a little-endian one you'd get
00 00 00 00 00 00 00 40,
right?
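
A throwaway helper (my own, hypothetical) to check that on a given box:

import Data.Word (Word64, Word8)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Marshal.Array (peekArray)
import Foreign.Ptr (castPtr)
import Foreign.Storable (poke)

-- Poke a Word64 and read the raw bytes back, exposing the host's
-- integer byte order.
wordBytes :: Word64 -> IO [Word8]
wordBytes w = alloca $ \p -> do
    poke p w
    peekArray 8 (castPtr p)

wordBytes (2^62) should give [0x40,0,0,0,0,0,0,0] on a big-endian
machine and [0,0,0,0,0,0,0,0x40] on a little-endian one.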

If both bit patterns are interpreted the same way as doubles: sign bit = 0, 
exponent bits = 0x400 = 1024, mantissa = 0, thus yielding
1.0*2^(1024 - 1023) = 2.0, fine. But if, on a little-endian machine, the 
floating-point handling is not little-endian, the number is interpreted 
as sign bit = 0, exponent bits = 0, mantissa = 0x40, hence the denormal
2^(-46)*2^(-1022) = 2^(-1068), havoc.
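
A small field splitter (again my own helper, not from the package) makes
such checks easy:

import Data.Bits (shiftR, (.&.))
import Data.Word (Word64)

-- Split a binary64 bit pattern into (sign, exponent, mantissa).
fields :: Word64 -> (Word64, Word64, Word64)
fields w =
    ( w `shiftR` 63              -- 1 sign bit
    , (w `shiftR` 52) .&. 0x7FF  -- 11 exponent bits
    , w .&. 0xFFFFFFFFFFFFF      -- 52 mantissa bits
    )

fields (2^62) gives (0, 0x400, 0), i.e. 2.0; fields 0x40 gives
(0, 0, 0x40), the denormal above.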

I simply didn't know whether that could happen. According to
http://en.wikipedia.org/wiki/Endianness#Floating-point_and_endianness, it 
could.
On the other hand, "no standard for transferring floating point values has 
been made. This means that floating point data written on one machine may 
not be readable on another", so if it breaks on weird machines, it's at 
least a general problem (and not Haskell's).


