getting a Binary module into the standard libs

Malcolm Wallace Malcolm.Wallace@cs.york.ac.uk
Fri, 8 Nov 2002 15:33:48 +0000


Alastair Reid <alastair@reid-consulting-uk.ltd.uk> writes:

> From the attached mail, it sounds like Simon has made some worthwhile
> additions to the Binary interface but left out a few things.  The only
> omission that seems fundamental is that Simon's version supports
> reading/writing bytes whilst Malcolm's supports reading/writing bits.

That does seem to be the main difference.

> - How important is this? 

The motivation for my Binary library was to save space, whereas
Simon's motivation was to be fast (at any rate to be faster than
parsing text).  Thus, a bitstream is potentially far more compact
than a bytestream, depending of course on the natural size of the
objects to be serialised.  But the tradeoff is that a bytestream
is far quicker to build/read, because there is no tricky logic
required to ensure that bits are shifted to the right place etc.

Different applications will require different characteristics.
There is no one-size-fits-all.

> - Assuming that supporting bits slows down the whole interface, is there
>   a cunning implementation trick which would have very low overhead if
>   you're doing a byte-aligned read/write (e.g., if all previous
>   reads/writes have been multiples of bytes)?

Well, with a bit-stream implementation you need to test whether a
read/write is fully aligned (i.e. both that the buffer position is
aligned to an appropriate boundary, and that the data to be added/read
is of exactly the right size to take you to another boundary),
but after that it should be just the same speed as if you read/write
the bytes directly.  So the question is really how efficient the
test is relative to the actual read/write.
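
To make the alignment test concrete, here is a minimal sketch.  It
assumes the buffer position and the item size are both tracked as bit
counts in an Int; the names isByteAligned, Path and choosePath are
hypothetical and not taken from either library.

```haskell
import Data.Bits ((.&.))

-- Hypothetical helper: both arguments are measured in bits.
-- A read/write can take the fast path only if it starts on a byte
-- boundary and its size is a whole number of bytes.
isByteAligned :: Int -> Int -> Bool
isByteAligned bitPos nBits = bitPos .&. 7 == 0 && nBits .&. 7 == 0

-- Which code path a given read/write would take.
data Path = FastBytes | SlowBits deriving (Eq, Show)

choosePath :: Int -> Int -> Path
choosePath bitPos nBits
  | isByteAligned bitPos nBits = FastBytes
  | otherwise                  = SlowBits
```

Under these assumptions the test is two mask-and-compare operations per
read/write, which is the overhead that would have to be weighed against
the cost of the byte copy itself.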

> - Or, would it be appropriate to build one as a layer on top of the
>   other so that programmers can express their choice by using one type
>   or another.

Yes, it is possible that there is a suitable separation (using MPTC no
doubt) to allow the choice of either bit-wise or byte-wise (perhaps
even Word16 or Word32) implementations of the same basic interface.
Something like

    class Binary impl a where ...
    data BitStream
    data ByteStream

    instance Binary BitStream Bool where ...
    instance Binary ByteStream Bool where ...
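
Filled out a little, the idea might look like the sketch below.  The
method names put and get, and the representation of the two stream
types as pure lists, are assumptions made only for illustration; a
real implementation would use a mutable buffer or a file handle.

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}

import Data.Word (Word8)

-- Hypothetical stream representations: plain lists, for illustration.
newtype BitStream  = BitStream  [Bool]  deriving (Eq, Show)
newtype ByteStream = ByteStream [Word8] deriving (Eq, Show)

-- One interface, parameterised over the stream implementation.
class Binary impl a where
  put :: a -> impl -> impl         -- append a value to the stream
  get :: impl -> Maybe (a, impl)   -- read a value from the front

-- In a bit stream a Bool occupies a single bit.
instance Binary BitStream Bool where
  put b (BitStream bs)    = BitStream (bs ++ [b])
  get (BitStream (b:bs))  = Just (b, BitStream bs)
  get (BitStream [])      = Nothing

-- In a byte stream the same Bool occupies a whole byte.
instance Binary ByteStream Bool where
  put b (ByteStream ws)   = ByteStream (ws ++ [if b then 1 else 0])
  get (ByteStream (w:ws)) = Just (w /= 0, ByteStream ws)
  get (ByteStream [])     = Nothing

-- Size of each stream in bits, to compare the two encodings.
bitStreamSize :: BitStream -> Int
bitStreamSize (BitStream bs) = length bs

byteStreamSize :: ByteStream -> Int
byteStreamSize (ByteStream ws) = 8 * length ws
```

The programmer then expresses the choice purely through the type of
the stream: put True (BitStream []) costs one bit, whereas
put True (ByteStream []) costs eight.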

An alternative would be to provide (in a single class) both
the original bit-wise ops, and in addition, the byte-aligned
"fast-entry-point" methods, so for example you could mix the two,
perhaps requiring the use of some operation like "alignBuffer" when
you switch from one style to the other.
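
A rough sketch of how the mixed style could work follows.  It uses
plain functions over a pure write buffer rather than class methods,
and everything except the name alignBuffer (Buffer, putBit,
putByteAligned, toBytes, and the choice to pad with zero bits) is an
assumption for illustration.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.Word (Word8)

-- Hypothetical write buffer.
data Buffer = Buffer
  { done    :: [Word8]  -- completed bytes, most recent first
  , partial :: Word8    -- bits gathered so far, in the low-order bits
  , nbits   :: Int      -- number of valid bits in 'partial' (0..7)
  }

emptyBuffer :: Buffer
emptyBuffer = Buffer [] 0 0

-- Slow path: append a single bit, most significant bit first.
putBit :: Bool -> Buffer -> Buffer
putBit b (Buffer ds p n)
  | n == 7    = Buffer (p' : ds) 0 0     -- byte is now complete
  | otherwise = Buffer ds p' (n + 1)
  where p' = (p `shiftL` 1) .|. (if b then 1 else 0)

-- Pad with zero bits up to the next byte boundary.
alignBuffer :: Buffer -> Buffer
alignBuffer buf@(Buffer ds p n)
  | n == 0    = buf
  | otherwise = Buffer ((p `shiftL` (8 - n)) : ds) 0 0

-- Fast entry point: no shifting, but only valid on a byte boundary.
putByteAligned :: Word8 -> Buffer -> Buffer
putByteAligned w (Buffer ds p n)
  | n == 0    = Buffer (w : ds) p n
  | otherwise = error "putByteAligned: buffer not byte-aligned"

-- Flush any partial byte and return the bytes in write order.
toBytes :: Buffer -> [Word8]
toBytes = reverse . done . alignBuffer
```

For example, writing the bits 1,0,1 and then calling alignBuffer pads
them out to the byte 0xA0, after which putByteAligned can be used
freely until the next bit-wise write.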

Regards,
    Malcolm