[Haskell-cafe] Re: Declarative binary protocols

Tue Jan 19 00:25:54 EST 2010

On Mon, Jan 18, 2010 at 10:39 PM, Antoine Latter <aslatter at gmail.com> wrote:
> Cafe,
>
> We have some fantastic tools for binary parsing in packages like
> binary and cereal (and presumably attoparsec, which I've not used).
> But they don't quite scratch an itch I have when writing
> implementations of binary communication protocols.
>
> A good example of my problem is in my implementation of the memcached
> binary wire protocol: http://hackage.haskell.org/package/starling
>
> What I've tried to do is divide up the library into a declarative
> protocol description and an imperative machine to sit on a handle and
> link together a server response with the source request.
>
> In the declarative core all of the types which come off of the wire
> have an associated Data.Binary.Get action - but this isn't quite good
> enough. Data.Binary works on ByteStrings, but I have a handle. I don't
> want to use hGetContents because I have trouble working out when lazy
> IO is and is not correct. I can't use hGet because I don't know how
> much to get until I'm in the middle of the Get action.
>

Now that I've posted I've come up with a solution.

I will start with binary-strict:Data.Binary.Strict.IncrementalGet [1].

It currently defines

>>>>>
data Result a
  = Failed String
  | Finished ByteString a
  | Partial (ByteString -> Result a)
<<<<<

Which I will change to:

>>>>>
data Result a
  = Failed String
  | Finished ByteString a
  | Partial Int (ByteString -> Result a)
<<<<<

Where the new Int field includes some information as to why the result
is partial. This means that I will change the current function:

suspend :: Get r ()

to:

require :: Int -> Get r ()

We will then only return the 'Partial' result on a call to 'require'.
Any other attempts to read beyond the so-far fetched byte-string will
result in failure.
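
A minimal sketch of the 'require' behaviour described above, not the
binary-strict implementation. It assumes a simplified 'Get a' (a plain
function from the buffered bytes to a Result) in place of the library's
CPS-style 'Get r a', and it assumes the Int in 'Partial' is the number
of bytes still missing. 'getWord8' here is a hand-written stand-in that
fails instead of suspending.

```haskell
import Control.Monad (ap, liftM)
import qualified Data.ByteString as B
import Data.ByteString (ByteString)
import Data.Word (Word8)

-- The proposed result type. Assumption: the Int is the number of
-- bytes still missing before parsing can continue.
data Result a
  = Failed String
  | Finished ByteString a
  | Partial Int (ByteString -> Result a)

-- Simplified stand-in for binary-strict's 'Get r a' (assumption).
newtype Get a = Get { runGet :: ByteString -> Result a }

instance Functor Get where
  fmap = liftM

instance Applicative Get where
  pure a = Get (\s -> Finished s a)
  (<*>)  = ap

instance Monad Get where
  Get g >>= f = Get (step . g)
    where
      step (Failed e)        = Failed e
      step (Finished rest a) = runGet (f a) rest
      step (Partial n k)     = Partial n (step . k)

-- The only action that may suspend: wait until at least n bytes
-- are buffered, without consuming any of them.
require :: Int -> Get ()
require n = Get go
  where
    go s
      | B.length s >= n = Finished s ()
      | otherwise       =
          Partial (n - B.length s) (\more -> go (B.append s more))

-- Ordinary reads fail rather than suspend when the buffer runs out.
getWord8 :: Get Word8
getWord8 = Get $ \s -> case B.uncons s of
  Nothing        -> Failed "getWord8: read past the required input"
  Just (w, rest) -> Finished rest w
```

With these definitions, running 'require 2 >> getWord8' on a one-byte
input yields 'Partial 1 k', and running 'getWord8' alone on an empty
input yields 'Failed'.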

I'll then have a function of type:

runFromHandle :: Handle -> Get r r -> IO r

Which leaves the handle in a usable state and never seeks ahead in the
handle, and also a function:

runFromBytes :: ByteString -> Get r r -> {- some sensible return type -}

The idea is that the partial return type is hidden from the users of
the library, but we take advantage of it to read from the handle in a
sensible way.
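
One way the two runners could be driven by the 'Partial Int' result is
sketched below. The names 'feedFromHandle' and 'feedFromBytes' are
hypothetical, they take an already-started Result rather than a Get
action, and the 'Either String' return types are only a guess at the
"sensible return type". The handle version reads exactly the number of
bytes asked for, so it never reads past what the parser required.

```haskell
import qualified Data.ByteString as B
import Data.ByteString (ByteString)
import System.IO (Handle)

-- Same shape as the proposed type; the Int is assumed to be the
-- number of additional bytes the parser is waiting for.
data Result a
  = Failed String
  | Finished ByteString a
  | Partial Int (ByteString -> Result a)

-- Hypothetical driver: pull exactly the requested bytes off the handle.
-- Bytes that were required but not consumed are dropped here; a real
-- version would have to decide what to do with them.
feedFromHandle :: Handle -> Result a -> IO (Either String a)
feedFromHandle _ (Failed e)     = return (Left e)
feedFromHandle _ (Finished _ a) = return (Right a)
feedFromHandle h (Partial n k)  = do
  chunk <- B.hGet h n   -- blocks until n bytes arrive or EOF is hit
  if B.length chunk < n
    then return (Left "unexpected end of input")
    else feedFromHandle h (k chunk)

-- Hypothetical pure driver: the same loop over an in-memory string,
-- returning the value together with the unused input.
feedFromBytes :: ByteString -> Result a -> Either String (a, ByteString)
feedFromBytes _     (Failed e)        = Left e
feedFromBytes input (Finished rest a) = Right (a, B.append rest input)
feedFromBytes input (Partial n k)
  | B.length input < n = Left "not enough input"
  | otherwise          = feedFromBytes later (k now)
  where
    (now, later) = B.splitAt n input
```

Because both drivers only look at the Result constructors, library users
would never need to see 'Partial' themselves.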

This is all based on a quick read-through of binary-strict. But it
fits how I think about a lot of the binary protocols I write:

getResponse = do
  require 256
  x <- getX
  len <- getWord16be
  y <- getY
  z <- getZ
  require (fromIntegral len * 8)
  a <- getA
  b <- getB
  return $ Response x y z a b

The only weird part is that I only ever intend to write the "require"
statements at the top-level - maybe 'getA' and the like can be written
in some restricted version of the Get monad which doesn't permit
'require' declarations.
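
A rough sketch of what such a restricted monad might look like. The
names 'Field', 'word8' and 'requiring' are hypothetical and nothing
here comes from binary-strict. 'Field' is a plain parser with no way to
suspend, and only the top-level 'requiring' can produce a 'Partial'.

```haskell
import Control.Monad (ap, liftM)
import qualified Data.ByteString as B
import Data.ByteString (ByteString)
import Data.Word (Word8)

data Result a
  = Failed String
  | Finished ByteString a
  | Partial Int (ByteString -> Result a)

-- Restricted parser (hypothetical): it can consume bytes or fail,
-- but it has no constructor for asking for more input.
newtype Field a = Field
  { runField :: ByteString -> Either String (a, ByteString) }

instance Functor Field where
  fmap = liftM

instance Applicative Field where
  pure a = Field (\s -> Right (a, s))
  (<*>)  = ap

instance Monad Field where
  Field g >>= f = Field $ \s -> case g s of
    Left e          -> Left e
    Right (a, rest) -> runField (f a) rest

word8 :: Field Word8
word8 = Field $ \s -> case B.uncons s of
  Nothing        -> Left "word8: read past the required input"
  Just (w, rest) -> Right (w, rest)

-- Top level only: wait for n bytes, then run a Field parser on them.
requiring :: Int -> Field a -> ByteString -> Result a
requiring n p = go
  where
    go s
      | B.length s < n =
          Partial (n - B.length s) (\more -> go (B.append s more))
      | otherwise = case runField p s of
          Left e          -> Failed e
          Right (a, rest) -> Finished rest a
```

The type system then enforces the convention: a 'Field' parser such as
'word8' cannot call anything like 'require', because no such operation
exists for that type.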

Any comments? Is there an easier way to do this?

Antoine

[1] http://hackage.haskell.org/packages/archive/binary-strict/0.4.6/doc/html/Data-Binary-Strict-IncrementalGet.html