getting a Binary module into the standard libs

Alastair Reid alastair@reid-consulting-uk.ltd.uk
07 Nov 2002 19:42:39 +0000


[copied to Malcolm]

From the attached mail, it sounds like Simon has made some worthwhile
additions to the Binary interface but left out a few things.  The only
omission that seems fundamental is that Simon's version supports
reading/writing bytes whilst Malcolm's supports reading/writing bits.

Malcolm: 

- How important is this? 

- Assuming that supporting bits slows down the whole interface, is there
  a cunning implementation trick which would have very low overhead if
  you're doing a byte-aligned read/write (e.g., if all previous
  reads/writes have been multiples of bytes)?

- Or, would it be appropriate to build one as a layer on top of the
  other so that programmers can express their choice by using one type
  or another?  (I suggest a layered approach in the hope that this
  would lead to more code sharing, reduce the tendency for API
  divergence, etc., but I have no concrete thoughts on what a layered
  approach might look like.)
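
To make the last two questions a little more concrete, here is a rough
sketch of a bit-level writer layered on top of a byte-level one, with a
fast path for the byte-aligned case.  Everything in it is hypothetical
(ByteHandle, BitHandle, newBitHandle, flushBits and the in-memory sink
are made up for illustration, and this putByte/putBits pair is not the
signature of either existing library); it only shows the shape such a
layering might take.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)
import Data.Word (Word8)

-- Lower layer (hypothetical): a byte-only sink, bytes kept newest-first.
newtype ByteHandle = ByteHandle (IORef [Word8])

newByteHandle :: IO ByteHandle
newByteHandle = ByteHandle <$> newIORef []

putByte :: ByteHandle -> Word8 -> IO ()
putByte (ByteHandle r) w = modifyIORef' r (w :)

-- Bytes written so far, oldest first.
contents :: ByteHandle -> IO [Word8]
contents (ByteHandle r) = reverse <$> readIORef r

-- Upper layer (hypothetical): a bit writer that owns a byte handle plus
-- the pending bits: (count in 0..7, bits held in the low-order positions).
data BitHandle = BitHandle ByteHandle (IORef (Int, Word8))

newBitHandle :: ByteHandle -> IO BitHandle
newBitHandle h = BitHandle h <$> newIORef (0, 0)

-- Write the low n bits of w (0 <= n <= 8), most significant bit first.
putBits :: BitHandle -> Int -> Word8 -> IO ()
putBits (BitHandle h ref) n w = do
  (k, acc) <- readIORef ref
  if k == 0 && n == 8
    then putByte h w                      -- aligned: straight to the byte layer
    else do
      let total    = k + n
          combined = (fromIntegral acc `shiftL` n)
                       .|. (fromIntegral w .&. mask n) :: Int
      if total >= 8
        then do
          let rest = total - 8            -- bits left over after one full byte
          putByte h (fromIntegral (combined `shiftR` rest))
          writeIORef ref (rest, fromIntegral (combined .&. mask rest))
        else writeIORef ref (total, fromIntegral combined)
  where
    mask i = (1 `shiftL` i) - 1

-- Pad any pending bits with zeros up to the next byte boundary.
flushBits :: BitHandle -> IO ()
flushBits (BitHandle h ref) = do
  (k, acc) <- readIORef ref
  if k == 0
    then pure ()
    else do
      putByte h (acc `shiftL` (8 - k))
      writeIORef ref (0, 0)
```

In this sketch a caller who only ever writes whole bytes pays one
IORef read and one comparison per write on top of the byte layer; a
caller who uses the ByteHandle directly pays nothing.  Whether that
overhead is low enough for something like interface files is exactly
the open question.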


--
Alastair

> > I was wondering if it was on the list of things to do to get a Binary
> > module into the standard libraries.  I know SimonM has a version for
> > GHC and there's an NHC version (I think the original).  I don't know
> > about Hugs.
> >
> > I ask because by putting it in the standard libs, library developers
> > could feel more pressured to release their data structures with Binary
> > instances.
> 
> Indeed.  The only reason I didn't put my version into the libraries yet
> was because it differed somewhat from the NHC version, and I thought it
> would be a good idea to discuss what the interface should look like
> first.
> 
> FYI, the main differences between GHC's Binary library and NHC's are
> described below.  Keep in mind that GHC's Binary library is heavily
> tuned for speed, because we use it for reading/writing interface files.
> 
> GHC's Binary class:
> 
>   class Binary a where
>     put_   :: BinHandle -> a -> IO ()
>     put    :: BinHandle -> a -> IO (Bin a)
>     get    :: BinHandle -> IO a
> 
> NHC's Binary class:
> 
>   class Binary a where
>     put    :: BinHandle -> a -> IO (Bin a)
>     get    :: BinHandle -> IO a
>     getF   :: BinHandle -> Bin a -> (a, Bin b)
> 
>     putAt  :: BinHandle -> Bin a -> a -> IO ()
>     getAt  :: BinHandle -> Bin a -> IO a
>     getFAt :: BinHandle -> Bin a -> a
> 
>   - For GHC, I added the put_ method.  The reason is efficiency:
>     you can often write a tail-recursive definition of put_, but not
>     put, and one rarely needs the return value of put (I found).  Each
>     function has a default definition defined in terms of the other
>     (in fact, I think I use put_ exclusively in GHC, and put could
>     be taken out of the class altogether).
> 
>   - For GHC, I didn't implement getF.  Instead, I have explicit
>     lazyGet and lazyPut operations, to give me more control over
>     the laziness: I only want laziness in a few well-defined places.
> 
>   - I implemented putAt and getAt as functions rather than class
>     methods.  There are lots of instances of Binary, so you save a
>     few dictionary fields, and I didn't come across a case where I
>     needed to override either of these.
> 
>   - GHC's library also works in terms of bytes rather than bits, again
>     for efficiency (time over space).  There are putByte and getByte
>     functions for writing your own instances of Binary, whereas
>     NHC has putBits and getBits.
> 
> There are more differences in the rest of the interface, but these are
> the most fundamental ones.
> 
> Cheers,
>         Simon
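
For readers who want to see how the put_/put defaults and the
non-method putAt/getAt described above fit together, here is a minimal
sketch.  The class follows the GHC signatures quoted in the mail; the
in-memory BinHandle, openBinMem, the tellBin/seekBin helpers as written
here, and the one-byte list length are assumptions made only to keep
the example self-contained, not a description of GHC's implementation.

```haskell
import Control.Monad (replicateM)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import Data.Word (Word8)

-- Hypothetical in-memory handle: a cursor position and the bytes so far.
data BinHandle = BinHandle (IORef Int) (IORef [Word8])

-- A typed offset into a handle.
newtype Bin a = BinPtr Int deriving (Eq, Show)

openBinMem :: IO BinHandle
openBinMem = BinHandle <$> newIORef 0 <*> newIORef []

tellBin :: BinHandle -> IO (Bin a)
tellBin (BinHandle pos _) = BinPtr <$> readIORef pos

seekBin :: BinHandle -> Bin a -> IO ()
seekBin (BinHandle pos _) (BinPtr p) = writeIORef pos p

-- Write one byte at the cursor (overwriting if one is there) and advance.
putByte :: BinHandle -> Word8 -> IO ()
putByte (BinHandle pos buf) w = do
  p  <- readIORef pos
  bs <- readIORef buf
  let (before, after) = splitAt p bs
  writeIORef buf (before ++ [w] ++ drop 1 after)
  writeIORef pos (p + 1)

-- Read the byte at the cursor and advance (no bounds check in this sketch).
getByte :: BinHandle -> IO Word8
getByte (BinHandle pos buf) = do
  p  <- readIORef pos
  bs <- readIORef buf
  writeIORef pos (p + 1)
  pure (bs !! p)

class Binary a where
  put_ :: BinHandle -> a -> IO ()
  put  :: BinHandle -> a -> IO (Bin a)
  get  :: BinHandle -> IO a

  -- Each default is defined in terms of the other, so an instance
  -- has to supply at least one of put_ and put.
  put_ bh a = put bh a >> pure ()
  put  bh a = do { p <- tellBin bh; put_ bh a; pure p }
  {-# MINIMAL (put_ | put), get #-}

instance Binary Word8 where
  put_ = putByte
  get  = getByte

instance Binary Bool where
  put_ bh b = putByte bh (if b then 1 else 0)
  get  bh   = (/= 0) <$> getByte bh

instance Binary a => Binary [a] where
  put_ bh xs = do
    put_ bh (fromIntegral (length xs) :: Word8)  -- sketch: lengths < 256 only
    mapM_ (put_ bh) xs
  get bh = do
    n <- get bh :: IO Word8
    replicateM (fromIntegral n) (get bh)

-- Ordinary functions rather than class methods: no extra dictionary fields.
putAt :: Binary a => BinHandle -> Bin a -> a -> IO ()
putAt bh p x = seekBin bh p >> put_ bh x

getAt :: Binary a => BinHandle -> Bin a -> IO a
getAt bh p = seekBin bh p >> get bh
```

With these definitions, `p <- put bh True` records where the Bool was
written, and a later `putAt bh p False` followed by `getAt bh p` patches
and re-reads that slot without any instance having to mention putAt or
getAt.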