getting a Binary module into the standard libs
Alastair Reid
alastair@reid-consulting-uk.ltd.uk
07 Nov 2002 19:42:39 +0000
[copied to Malcolm]
From the attached mail, it sounds like Simon has made some worthwhile
additions to the Binary interface but left out a few things. The only
omission that seems fundamental is that Simon's version supports
reading/writing bytes whilst Malcolm's supports reading/writing bits.
Malcolm:
- How important is this?
- Assuming that supporting bits slows down the whole interface, is there
a cunning implementation trick which would have very low overhead if
you're doing a byte-aligned read/write (e.g., if all previous
reads/writes have been multiples of bytes)?
- Or, would it be appropriate to build one as a layer on top of the
other so that programmers can express their choice by using one type
or another? (I suggest a layered approach in the hope that this
would lead to more code sharing, reduce the tendency for API divergence,
etc., but I have no concrete thoughts on what a layered approach might
look like.)
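
For the sake of discussion only, here is a rough sketch of how a bit
layer might sit on top of a byte layer, with a cheap path when a write
is byte-aligned. Everything in it is an assumption made for
illustration: ByteHandle, BitHandle, writtenBytes and flushBits are
hypothetical names, the byte layer is just an in-memory list, and
neither the GHC nor the NHC library is claimed to work this way.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)
import Data.Word (Word8)

-- Byte layer (hypothetical stand-in for a real BinHandle):
-- bytes are accumulated in reverse order in an IORef.
newtype ByteHandle = ByteHandle (IORef [Word8])

newByteHandle :: IO ByteHandle
newByteHandle = ByteHandle <$> newIORef []

putByte :: ByteHandle -> Word8 -> IO ()
putByte (ByteHandle ref) w = modifyIORef' ref (w :)

-- All bytes written so far, in the order they were written.
writtenBytes :: ByteHandle -> IO [Word8]
writtenBytes (ByteHandle ref) = reverse <$> readIORef ref

-- Bit layer: a byte handle plus a partial byte, stored as
-- (pending bits in the low end, number of pending bits in 0..7).
data BitHandle = BitHandle ByteHandle (IORef (Word8, Int))

newBitHandle :: ByteHandle -> IO BitHandle
newBitHandle bh = BitHandle bh <$> newIORef (0, 0)

-- Write the low n bits of w (0 <= n <= 8), most significant bit first.
putBits :: BitHandle -> Int -> Word8 -> IO ()
putBits (BitHandle bh ref) n w = do
  (acc, used) <- readIORef ref
  if used == 0 && n == 8
    then putByte bh w  -- fast path: aligned whole byte, no shifting
    else do
      let mask k = (1 `shiftL` k) - 1 :: Int
          total  = (fromIntegral acc `shiftL` n) .|. (fromIntegral w .&. mask n)
          used'  = used + n
      if used' >= 8
        then do
          -- Emit the top 8 bits and keep the remainder pending.
          let rest = used' - 8
          putByte bh (fromIntegral (total `shiftR` rest))
          writeIORef ref (fromIntegral (total .&. mask rest), rest)
        else writeIORef ref (fromIntegral total, used')

-- Pad any pending bits with zeros up to the next byte boundary.
flushBits :: BitHandle -> IO ()
flushBits (BitHandle bh ref) = do
  (acc, used) <- readIORef ref
  if used == 0
    then return ()
    else do
      putByte bh (acc `shiftL` (8 - used))
      writeIORef ref (0, 0)
```

In this shape a programmer who only needs bytes uses ByteHandle
directly and pays nothing for bit support, while one who needs bits
wraps it in a BitHandle; the aligned case costs one extra read and
comparison per write.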
--
Alastair
> > I was wondering if it was on the list of things to do to get a Binary
> > module into the standard libraries. I know SimonM has a
> > version for GHC
> > and there's an NHC version (I think the original). I don't know about
> > Hugs.
> >
> > I ask because by putting it in the standard libs, library
> > developers could
> > feel more pressured to release their data structures with Binary
> > instances.
>
> Indeed. The only reason I didn't put my version into the libraries yet
> was because it differed somewhat from the NHC version, and I thought it
> would be a good idea to discuss what the interface should look like
> first.
>
> FYI, the main differences between GHC's Binary library and NHC's are
> described below. Keep in mind that GHC's Binary library is heavily
> tuned for speed, because we use it for reading/writing interface files.
>
> GHC's Binary class:
>
> class Binary a where
>   put_ :: BinHandle -> a -> IO ()
>   put  :: BinHandle -> a -> IO (Bin a)
>   get  :: BinHandle -> IO a
>
> NHC's Binary class:
>
> class Binary a where
>   put    :: BinHandle -> a -> IO (Bin a)
>   get    :: BinHandle -> IO a
>   getF   :: BinHandle -> Bin a -> (a, Bin b)
>
>   putAt  :: BinHandle -> Bin a -> a -> IO ()
>   getAt  :: BinHandle -> Bin a -> IO a
>   getFAt :: BinHandle -> Bin a -> a
>
> - For GHC, I added the put_ method. The reason is efficiency:
> you can often write a tail-recursive definition of put_, but not
> put, and one rarely needs the return value of put (I found). Each
> function has a default definition defined in terms of the other
> (in fact, I think I use put_ exclusively in GHC, and put could
> be taken out of the class altogether).
>
> - For GHC, I didn't implement getF. Instead, I have explicit
> lazyGet and lazyPut operations, to give me more control over
> the laziness: I only want laziness in a few well-defined places.
>
> - I implemented putAt and getAt as functions rather than class
> methods. There are lots of instances of Binary, so you save a
> few dictionary fields, and I didn't come across a case where I
> needed to override either of these.
>
> - GHC's library also works in terms of bytes rather than bits, again
> for efficiency (time over space). There are putByte and getByte
> functions for writing your own instances of Binary, whereas
> NHC has putBits and getBits.
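
To make the put_/put and putAt/getAt points above concrete, here is a
minimal sketch of a class with mutually defined defaults and with
putAt and getAt as ordinary functions. It is an illustration under
stated assumptions, not GHC's actual code: the BinHandle here is a
hypothetical in-memory buffer (a list plus a cursor, which is slow),
and openBinMem, tellBin and seekBin are helper names assumed for the
sketch.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import Data.Word (Word8)

-- Hypothetical in-memory stand-in for BinHandle: buffer plus cursor.
data BinHandle = BinHandle (IORef [Word8]) (IORef Int)

-- A typed offset into the buffer.
newtype Bin a = Bin Int deriving (Eq, Show)

openBinMem :: IO BinHandle
openBinMem = BinHandle <$> newIORef [] <*> newIORef 0

tellBin :: BinHandle -> IO (Bin a)
tellBin (BinHandle _ pos) = Bin <$> readIORef pos

seekBin :: BinHandle -> Bin a -> IO ()
seekBin (BinHandle _ pos) (Bin p) = writeIORef pos p

-- Write one byte at the cursor (overwriting or appending) and advance.
putByte :: BinHandle -> Word8 -> IO ()
putByte (BinHandle buf pos) w = do
  p  <- readIORef pos
  bs <- readIORef buf
  writeIORef buf (take p bs ++ [w] ++ drop (p + 1) bs)
  writeIORef pos (p + 1)

-- Read one byte at the cursor and advance (errors past the end).
getByte :: BinHandle -> IO Word8
getByte (BinHandle buf pos) = do
  p  <- readIORef pos
  bs <- readIORef buf
  writeIORef pos (p + 1)
  return (bs !! p)

class Binary a where
  put_ :: BinHandle -> a -> IO ()
  put  :: BinHandle -> a -> IO (Bin a)
  get  :: BinHandle -> IO a

  -- Each default is defined via the other; an instance supplies one.
  put_ bh a = put bh a >> return ()
  put  bh a = do { p <- tellBin bh; put_ bh a; return p }
  {-# MINIMAL get, (put_ | put) #-}

instance Binary Word8 where
  put_ = putByte
  get  = getByte

instance Binary Bool where
  put_ bh b = putByte bh (if b then 1 else 0)
  get  bh   = (/= 0) <$> getByte bh

-- put_ can end in a tail call because there is no position to return.
instance Binary a => Binary [a] where
  put_ bh []       = putByte bh 0
  put_ bh (x : xs) = putByte bh 1 >> put_ bh x >> put_ bh xs
  get bh = do
    tag <- getByte bh
    if tag == 0 then return [] else (:) <$> get bh <*> get bh

-- Plain functions rather than class methods: no extra dictionary fields.
putAt :: Binary a => BinHandle -> Bin a -> a -> IO ()
putAt bh p x = seekBin bh p >> put_ bh x

getAt :: Binary a => BinHandle -> Bin a -> IO a
getAt bh p = seekBin bh p >> get bh
```

With this sketch, instances only ever define put_ and get, and a
caller that needs the position of a value still gets it from the
default put.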
>
> There are more differences in the rest of the interface, but these are
> the most fundamental ones.
>
> Cheers,
> Simon