getting a Binary module into the standard libs

Hal Daume III hdaume@ISI.EDU
Wed, 13 Nov 2002 10:24:26 -0800 (PST)

I was waiting a while before replying, hoping to get other comments from
people who know more about this stuff than I do, but that doesn't seem to
be happening.  As I see it, there are three things left on the table:

  1) Should putBits support >8 bit operations?
  2) How should we support flushByte?
  3) How should we buffer BinIO?

I'll address each of these in turn.

1) I vote in favor of "no".  I fear my opinion on this is influenced by
the fact that I offered to implement it.  Of course you could always
define putGT8Bits in terms of putLEQ8Bits, but that would probably be less
efficient than defining putGT8Bits "natively."  I don't see a real need
for it: as Simon said, most of the use for this will be for constructors
and Booleans.  I could probably be persuaded otherwise, but I'd like to
hear a good, strong example of why more than 8 bit puts are essential.
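
To make the "define putGT8Bits in terms of putLEQ8Bits" option concrete,
here is a rough sketch.  The names chunksLEQ8 and putGT8BitsVia are made up
for illustration, and the narrow put is passed in as a parameter with an
assumed shape (bit width, then bits); none of this is the actual Binary
API.  It sends the most significant chunk first, which is also only an
assumption.

```haskell
import Data.Bits (shiftL, shiftR, (.&.))
import Data.Word (Word8)

-- | Split the low n bits of v into (width, bits) chunks of at most 8 bits,
-- most significant chunk first.  The leading chunk takes the odd bits.
chunksLEQ8 :: Int -> Integer -> [(Int, Word8)]
chunksLEQ8 n v
  | n <= 0    = []
  | otherwise = (w, fromIntegral ((v `shiftR` rest) .&. mask)) : chunksLEQ8 rest v
  where
    w    = if n `mod` 8 == 0 then 8 else n `mod` 8
    rest = n - w
    mask = (1 `shiftL` w) - 1

-- | Hypothetical wide put built from a narrow one.  The first argument
-- stands in for putLEQ8Bits (assumed to take a width and the bits).
putGT8BitsVia :: Monad m => (Int -> Word8 -> m ()) -> Int -> Integer -> m ()
putGT8BitsVia putLEQ8 n v = mapM_ (uncurry putLEQ8) (chunksLEQ8 n v)
```

Each wide put becomes ceiling(n/8) narrow puts, which is where the extra
cost relative to a "native" version would come from.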

2) The proposals for flushByte, as I see it, are:

  a) flushBytes h n aligns the stream to the next 2^n byte 
     (bit?) boundary

  b) flushBytes h m n aligns the stream such that the position 
     p satisfies p = n (mod 2^m)

  c) encoding (b) as a single integer (as per Dean's suggestion)

This is something I don't really know enough about to comment on.  Clearly
(a) is the simplest, implementation-wise, and probably the
fastest.  (b) would be a bit more work and I don't understand what it
would gain you, but since it seems to be well known I'll admit that I just
know too little to say.  (c) wouldn't be much more work than (b), but I
wonder if it's getting too complicated.  My vote is probably for (a), but
my vote should only count epsilon in this context.  Perhaps (b) is the
right thing to do (I don't need too much convincing here).
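
For what it's worth, the arithmetic behind (a) and (b) can be sketched as
below.  The function names are hypothetical, positions are plain Integers
(bytes or bits, whichever unit is chosen), the exponents are assumed to be
non-negative, and an already aligned position is assumed to need no
padding.

```haskell
-- | Option (a): padding needed so that position p becomes a multiple of 2^n.
padToBoundary :: Int -> Integer -> Integer
padToBoundary n p = negate p `mod` (2 ^ n)

-- | Option (b): padding needed so that the new position q satisfies
-- q = r (mod 2^m).  Haskell's mod is non-negative for a positive divisor.
padToResidue :: Int -> Integer -> Integer -> Integer
padToResidue m r p = (r - p) `mod` (2 ^ m)
```

With r = 0, padToResidue gives the same answer as padToBoundary, so (a) is
the special case of (b) with a zero offset.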

3) I think we can all agree that we should buffer BinIOs.  There are a few
questions, given this:

  a) Should multiple threads be allowed to write to the same BinHandle
     simultaneously?  If not, is an error thrown or is the behaviour
     just left "unspecified"?

  b) Should multiple threads be allowed to read from the same
     BinHandle simultaneously?  If not, ...

  c) Should one thread be allowed to write and another to read from
     the same BH simultaneously?  If not, ...

I would probably say:

  a) No & left unspecified
  b) Yes
  c) Yes

That said, we probably need a dupBin function as Simon suggests.  I must
say here that I don't know enough about how Handles are implemented in GHC
to know where to start on this.  I know that they are already MVars of
Handle__s which basically hold the file pointer and some other stuff, but
I don't know what would need to be done to accomplish such a dupBin.
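
As a starting point for discussion only, here is a toy model of what a
dupBin might mean.  ToyBin and all the functions below are invented for
illustration and have nothing to do with how GHC's Handles are really
implemented.  The assumption is that a duplicate shares the underlying
contents but keeps its own read position.

```haskell
import Control.Concurrent.MVar (MVar, modifyMVar, modifyMVar_, newMVar, readMVar)
import Data.Word (Word8)

-- | Toy stand-in for a buffered binary handle (not GHC's Handle__).
data ToyBin = ToyBin
  { tbBytes  :: MVar [Word8]  -- shared contents, in write order
  , tbOffset :: MVar Int      -- this handle's own read position
  }

newToyBin :: IO ToyBin
newToyBin = ToyBin <$> newMVar [] <*> newMVar 0

-- | Append one byte; the MVar serialises concurrent writers.
putByteToy :: ToyBin -> Word8 -> IO ()
putByteToy h b = modifyMVar_ (tbBytes h) (\bs -> pure (bs ++ [b]))

-- | Read the byte at this handle's offset, advancing it if one is there.
getByteToy :: ToyBin -> IO (Maybe Word8)
getByteToy h = modifyMVar (tbOffset h) $ \i -> do
  bs <- readMVar (tbBytes h)
  pure $ case drop i bs of
    (b : _) -> (i + 1, Just b)
    []      -> (i, Nothing)

-- | Hypothetical dupBin: same contents, independent read position.
dupToyBin :: ToyBin -> IO ToyBin
dupToyBin h = do
  i   <- readMVar (tbOffset h)
  off <- newMVar i
  pure h { tbOffset = off }
```

In this model two readers on separate duplicates never disturb each
other's position, and a reader can run alongside a writer.  It goes
further than my vote on (a) by serialising writers, and a real version
would use a proper buffer rather than a list.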

That said, I put it out to the rest of you for comments/persuasions.

 - Hal