[Haskell-cafe] question about Data.Binary and Double instance

Wed Apr 18 20:20:21 EDT 2007

On Wed, 2007-04-18 at 08:30 -0700, David Roundy wrote:
> On Wed, Apr 18, 2007 at 12:34:58PM +1000, Duncan Coutts wrote:
> > We can't actually guarantee that we have any IEEE format types
> > available. The isIEEE will tell you if a particular type is indeed IEEE
> > but what do we do if isIEEE CDouble = False ?
> 
> All the computer architectures I've ever used had IEEE format types.
> Perhaps we could add to the standard libraries an IEEEDouble type and
> conversions between it and ordinary types.  This would put the ugly ARM
> hackery where it belongs, I suppose.

From the point of view of this library that would be ideal, yes. I'm not
sure the Haskell implementation maintainers would see it the same way.

> > Perhaps we just don't care about ARM or other arches where GHC runs that
> > do not use IEEE formats, I don't know. If that were the case we'd say
> > something like:
> 
> I don't.

:-)

> > instance Binary Double where
> >   put d = assert (isIEEE (undefined :: Double)) $ do
> >             write (poke d)
> 
> I'd rather have this or nothing.  It may be that there are people out there
> who want to serialize and read Doubles to and from Haskell, but I imagine
> most people want to read or write formats that can interoperate with other
> languages (which is the only reason I'm looking into Binary now).
> 
> It's rather inconvenient (and took me quite some time to track down) having
> such a non-standard serialization for Double.
> 
> If there were no Binary instance for Double, I could write this myself, but
> alas, once an instance is declared, there's no way to undeclare it, and the
> workarounds aren't pretty.  I suppose I can

By the way, perhaps it is not obvious yet, but the library is supposed
to be split in two halves, serving different audiences and purposes. One
is to interoperate with existing externally defined binary data formats.
It sounds like your application falls into that category. The other is
to serialise Haskell data structures. For the latter case we use the
Binary class. You should not care what format you get from using this
class, only that it has some useful properties like round-tripping (on
the same machine and across architectures and Haskell implementations).
If you do care what the format is, you should not be using the Binary
class. You should instead be using the other side of the library.
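
To make the round-tripping property concrete, here is a minimal sketch.
It uses only encode and decode from Data.Binary; the names roundTrips
and checks are made up for this illustration. It deliberately says
nothing about the bytes in between, which is the point of the class.

```haskell
import Data.Binary (Binary, decode, encode)

-- | True when a value survives a trip through the Binary class unchanged.
-- The byte layout in between is deliberately not inspected.
roundTrips :: (Binary a, Eq a) => a -> Bool
roundTrips x = decode (encode x) == x

-- A few sample values of different types (hypothetical test data).
checks :: [Bool]
checks =
  [ roundTrips (1.5 :: Double)
  , roundTrips (42 :: Int)
  , roundTrips [("pi", 3.14159 :: Double)]
  ]
```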

Now at the moment the other side is under-developed; it only provides a
few primitives. But you can see why using a Binary class is not going to
work for these cases where people care about the format: whatever
instance we pick for a particular type will be wrong for some format, so
we get:

> newtype DDouble = D Double
> unD (D d) = d
> 
> instance Binary DDouble where

and people will forever be defining newtype wrappers or complaining
that the whole library isn't parametrised by the endianness or whatever.
For existing formats you need much more flexibility and control. The
Binary class is to make it really convenient to serialise Haskell types,
and it's built on top of the layer that gives you full control.
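
As a rough sketch of what "full control" could look like for the Double
case discussed here, the following writes the IEEE 754 bit pattern
directly with the Put and Get primitives, with the byte order chosen by
the caller rather than fixed by an instance. Endian, putIEEEDouble and
getIEEEDouble are hypothetical names, not part of the library. The
sketch assumes the native Double is already IEEE binary64, which is what
castDoubleToWord64 from GHC.Float relies on, so it does not address the
non-IEEE case raised earlier in the thread.

```haskell
import Data.Binary.Get (Get, getWord64be, getWord64le, runGet)
import Data.Binary.Put (Put, putWord64be, putWord64le, runPut)
import qualified Data.ByteString.Lazy as L
import Data.Word (Word8)
import GHC.Float (castDoubleToWord64, castWord64ToDouble)

-- Hypothetical type: the caller picks the byte order per call,
-- instead of one being fixed by a class instance.
data Endian = Big | Little

-- Write a Double as its 8-byte IEEE 754 bit pattern.
-- Assumption: the native Double is IEEE binary64.
putIEEEDouble :: Endian -> Double -> Put
putIEEEDouble Big    = putWord64be . castDoubleToWord64
putIEEEDouble Little = putWord64le . castDoubleToWord64

-- Read 8 bytes in the given byte order and reinterpret them as a Double.
getIEEEDouble :: Endian -> Get Double
getIEEEDouble Big    = castWord64ToDouble <$> getWord64be
getIEEEDouble Little = castWord64ToDouble <$> getWord64le

-- The bytes of 1.0 in big-endian order; 1.0 is 0x3FF0000000000000.
oneBE :: [Word8]
oneBE = L.unpack (runPut (putIEEEDouble Big 1.0))
```

Nothing here needs a newtype wrapper: a format that stores doubles
little-endian simply calls putIEEEDouble Little.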

We intend to work more on this other side of the library in the coming
couple of months. If you could tell us a bit more about your use case,
that'd be great.

> > If we do care about ARM and the like then we need some way to translate
> > from the native Double encoding to an IEEE double external format. I
> > don't know how to do that. I also worry we'll end up with lots of
> > #ifdefs.
> 
> I'd say lots of #ifdefs are okay.  This is a low-level library dealing with
> low-level architecture differences.

Yeah, maybe, but it makes me grumble. :-)

> > The other problem with doing this efficiently is that we have to worry
> > about alignment for that poke d operation. If we don't know the
> > alignment we have to poke into an aligned side buffer and copy over.
> > Similar issues apply to reading.
> 
> Right now, efficiency is less of a concern to me than ease.  I imagine the
> efficiency can be fixed up later?

Possibly. We would like to get as much efficiency as possible for free,
without having to clutter the library API with stuff. Alignment is one of
the harder cases since it looks like we don't have quite enough
information. It's not clear yet; we're still pondering it.
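
For reference, the "aligned side buffer" fallback mentioned above can be
sketched with the standard Foreign modules. This is only an illustration
of the idea, not how the library does it: pokeDoubleAt and demo are
made-up names, and the bytes come out in the host's native layout and
byte order.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray)
import Foreign.Marshal.Utils (copyBytes, with)
import Foreign.Ptr (Ptr, castPtr, plusPtr)
import Foreign.Storable (sizeOf)

-- | Store a Double at an arbitrary byte offset of a buffer.
-- 'with' pokes the value into a temporary buffer that is correctly
-- aligned for Double; 'copyBytes' then moves the raw bytes to the
-- destination, which needs no particular alignment.
pokeDoubleAt :: Ptr Word8 -> Int -> Double -> IO ()
pokeDoubleAt dst off d =
  with d $ \src -> copyBytes (dst `plusPtr` off) (castPtr src) (sizeOf d)

-- Write 1.0 at the odd offset 3 of a 16-byte buffer and read the
-- 8 bytes back one at a time.
demo :: IO [Word8]
demo = allocaBytes 16 $ \buf -> do
  pokeDoubleAt buf 3 1.0
  peekArray 8 (buf `plusPtr` 3)
```

The extra copy is the cost being discussed: if the writer knew the
destination was aligned, it could poke straight into it.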

> I'd think you could statically check the alignment with a bit of type
> hackery

Aye, we've thought a little about that, though not quite enough to have
any concrete ideas to try yet.

> (and note that I said I thought *you* could, not *I* could).

Yes, I note the difference :-).

>  Something like creating two monad types, an aligned
> one and an arbitrary one, and at run-time select which monad to use,
> so the check could occur just once.

That requires that the format is alignment preserving, which is
information that is hard to recover given an API that has primitives for
packing various sized objects in a sequence.

Duncan