getting a Binary module into the standard libs

Hal Daume III hdaume@ISI.EDU
Mon, 11 Nov 2002 10:12:57 -0800 (PST)


> I'd do it this way:
> 
>   putBits :: BinHandle -> Int{-size-} -> Int{-value-} -> IO ()
> 
> and similarly for getBits.  It will be easiest if the size is not
> allowed to go over 8, because then you have to deal with endianness, and
> in any case we already have put for Int16, Int32 etc. written in terms
> of putWord8.

Any reason not to make those Word8s instead of Ints?  That way we don't
have to throw errors when things are too big; additionally, Word implies
more of a bit-wise tihng to me than Int, about which I always worry about
whether shifts are going to mess with my sign bit, etc...

I imagine 'putBits bh 3 word' means to write the 3 least significant bits
of word?  That is, 'putBits bh 3 (32 + 7)' is the same as 'putBits bh 3
7', right?

> BinMem and BinIO differ quite a bit here: for BinMem you can write
> straight into the array, whereas for BinIO we need a cache - a single
> byte at the least, but ideally more.  BinMem is the most important case
> to optimise (for us in GHC anyhow), since BinIO is already significantly
> slower due to the overhead of the Handle interface.

Currently the BinIO implementation doesn't do any caching, though,
right?  I actually rarely use the BinMem implementation, so I would be
more interested in improving the speed of BinIO.  Presumably, however, the
Handle actually does buffering, which should take care of a large part of
this, right?

> There should really be a closeBin function too; it's quite simple to
> add.

On files this would flush the current byte then close the file handle,
right?  What would it do on BinMems (other than flush the byte)?