[Haskell-cafe] Binary parser combinators and pretty printing

Einar Karttunen ekarttun at cs.helsinki.fi
Tue Sep 13 11:03:00 EDT 2005


Hello

I am trying to figure out the best interface for binary parsing
and pretty-printing combinators for network protocols.

In particular, I am looking for the most natural syntax to express
these parsers in Haskell, and would welcome opinions and
new ideas.

As an example I will use a protocol with the following packet structure
(byte offset on the left; the four header fields are 32-bit big-endian words):
0  message-id
4  sender-id
8  receiver-id
12 number of parameters
16 parameters. Each parameter is prefixed by a 32-bit length followed by
   the data.

We will use the following Haskell datatype:

data Packet = Packet Word32 Word32 Word32 [FastString]

1) Simple monadic interface

getPacket = do mid  <- getWord32BE
               sid  <- getWord32BE
               rid  <- getWord32BE
               nmsg <- getWord32BE
               vars <- replicateM (fromIntegral nmsg) (getWord32BE >>= getBytes . fromIntegral)
               return $ Packet mid sid rid vars

putPacket (Packet mid sid rid vars) = do
  mapM_ putWord32BE [mid, sid, rid, fromIntegral (length vars)]
  mapM_ (\fs -> putWord32BE (fromIntegral (length fs)) >> putBytes fs) vars


This works, but writing the code by hand gets tedious and dull.
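
For reference, here is a minimal runnable sketch of the monadic style.
It makes two assumptions that are not part of the code above: the Get and
Put monads come from the binary package (Data.Binary.Get, Data.Binary.Put),
and a strict ByteString stands in for FastString. The samplePacket value is
made up for the round-trip check.

```haskell
import Control.Monad (replicateM)
import Data.Binary.Get (Get, getByteString, getWord32be, runGet)
import Data.Binary.Put (Put, putByteString, putWord32be, runPut)
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy as L
import Data.Word (Word32)

-- Assumption: strict ByteString replaces FastString in this sketch.
data Packet = Packet Word32 Word32 Word32 [B.ByteString]
  deriving (Eq, Show)

-- Parse the three ids, the parameter count, then that many
-- length-prefixed parameters.
getPacket :: Get Packet
getPacket = do
  mid  <- getWord32be
  sid  <- getWord32be
  rid  <- getWord32be
  n    <- getWord32be
  vars <- replicateM (fromIntegral n)
                     (getWord32be >>= getByteString . fromIntegral)
  return (Packet mid sid rid vars)

-- Write the fields in the same order the parser reads them.
putPacket :: Packet -> Put
putPacket (Packet mid sid rid vars) = do
  mapM_ putWord32be [mid, sid, rid, fromIntegral (length vars)]
  mapM_ (\v -> putWord32be (fromIntegral (B.length v)) >> putByteString v) vars

-- Hypothetical example value, used only for the round-trip check.
samplePacket :: Packet
samplePacket = Packet 1 2 3 [BC.pack "hello", BC.pack "world"]

main :: IO ()
main = do
  let encoded = runPut (putPacket samplePacket)
  print (L.length encoded)                      -- size of the encoded packet
  print (runGet getPacket encoded == samplePacket)  -- round-trip check
```

Note that the parser and the printer repeat the same field order by hand,
which is exactly the duplication the later variants try to remove.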

2) Using better combinators

packet = w32be <> w32be <> w32be <> lengthPrefixList w32be (lengthPrefixList w32be bytes)
getPacket = let (mid,sid,rid,vars)  = getter packet in Packet mid sid rid vars
putPacket (Packet mid sid rid vars) = setter packet mid sid rid vars

Maybe even the tuple could be eliminated by using a little Template Haskell (TH).
Has anyone used combinators like this before, and how well did it work?
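
To make the idea concrete, here is one way such combinators might be
built. This is only a sketch under several assumptions: the Codec type, the
byte codec and the samplePacket value are hypothetical; Get and Put are
taken from the binary package; parameters are represented as [Word8] rather
than FastString; and the pairing operator is spelled <+> instead of <> to
avoid clashing with the <> that current versions of the Prelude export.

```haskell
import Control.Monad (replicateM)
import Data.Binary.Get (Get, getWord8, getWord32be, runGet)
import Data.Binary.Put (Put, putWord8, putWord32be, runPut)
import Data.Word (Word8, Word32)

-- A codec pairs a parser with the matching printer (hypothetical type).
data Codec a = Codec { getter :: Get a, setter :: a -> Put }

w32be :: Codec Word32
w32be = Codec getWord32be putWord32be

byte :: Codec Word8
byte = Codec getWord8 putWord8

-- Sequence two codecs; the result carries both values as a pair.
infixr 5 <+>
(<+>) :: Codec a -> Codec b -> Codec (a, b)
ca <+> cb = Codec
  { getter = do a <- getter ca; b <- getter cb; return (a, b)
  , setter = \(a, b) -> setter ca a >> setter cb b
  }

-- A list preceded by its element count, encoded with the given count codec.
lengthPrefixList :: Integral n => Codec n -> Codec a -> Codec [a]
lengthPrefixList cn ca = Codec
  { getter = do n <- getter cn; replicateM (fromIntegral n) (getter ca)
  , setter = \xs -> setter cn (fromIntegral (length xs)) >> mapM_ (setter ca) xs
  }

data Packet = Packet Word32 Word32 Word32 [[Word8]]
  deriving (Eq, Show)

-- The wire format is described once; both directions are derived from it.
packet :: Codec (Word32, (Word32, (Word32, [[Word8]])))
packet = w32be <+> w32be <+> w32be
     <+> lengthPrefixList w32be (lengthPrefixList w32be byte)

getPacket :: Get Packet
getPacket = do
  (mid, (sid, (rid, vars))) <- getter packet
  return (Packet mid sid rid vars)

putPacket :: Packet -> Put
putPacket (Packet mid sid rid vars) = setter packet (mid, (sid, (rid, vars)))

-- Hypothetical example value, used only for the round-trip check.
samplePacket :: Packet
samplePacket = Packet 1 2 3 [[104, 105], []]

main :: IO ()
main = print (runGet getPacket (runPut (putPacket samplePacket)) == samplePacket)
```

With a right-nested pairing operator the intermediate value is a nested
tuple rather than a flat one, so the conversion to and from Packet still has
to be written by hand; that is the part TH might be able to generate.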

3) Using TH entirely

$(getAndPut 'Packet "w32 w32 w32 lengthPrefixList (w32 bytes)")

Is this better than the combinators in 2)? Also, what sort of
syntax would be best for expressing nontrivial dependencies,
e.g. a checksum calculated from other fields?

4) Using a syntax extension

Erlang does this with the bit syntax 
(http://erlang.se/doc/doc-5.4.8/doc/programming_examples/bit_syntax.html)
and it is very nifty for some purposes. 

getPacket = do << mid:32, sid:32, rid:32, len:32, rest:len/binary >>
               ...

The list of lists gets nontrivial here too...


- Einar Karttunen

