[Haskell-cafe] Binary serialization, was Re: Abstraction leak

Donald Bruce Stewart dons at cse.unsw.edu.au
Wed Jul 4 18:50:42 EDT 2007


phil:
> On Wed, Jul 04, 2007 at 09:44:13PM +1000, Donald Bruce Stewart wrote:
> >Binary instances are pretty easy to write. For a simple data type:
> >
> >      > instance Binary Exp where
> >      >       put (IntE i)          = do put (0 :: Word8)
> >      >                                  put i
> >      >       put (OpE s e1 e2)     = do put (1 :: Word8)
> >      >                                  put s
> >      >                                  put e1
> >      >                                  put e2
> >
> >      >       get = do tag <- getWord8
> >      >                case tag of
> >      >                    0 -> liftM  IntE get
> >      >                    1 -> liftM3 OpE  get get get
> 
> That's quite verbose! Plus I'm a bit concerned by the boxing implied
> by those IntE / OpE constructors in get. If you were using those
> values in a pattern match on the result of get, would the compiler be
> able to eliminate them and refer directly to the values in the source
> data?

Well, here you're flattening a Haskell structure, so it has to get
reboxed. If it were ByteString chunks, or Ints, then you could avoid any
serious copying. The 'get' just tags a value.
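
For reference, a minimal self-contained sketch of the instance quoted
above. The Exp declaration is an assumption (the thread never shows it),
as are the roundTrip helper and the catch-all case for unknown tags:

```haskell
import Control.Monad (liftM, liftM3)
import Data.Binary (Binary (..), decode, encode)
import Data.Binary.Get (getWord8)
import Data.Word (Word8)

-- Assumed definition of the expression type; not shown in the thread.
data Exp
  = IntE Int
  | OpE String Exp Exp
  deriving (Eq, Show)

instance Binary Exp where
  -- Each constructor is written as a one-byte tag followed by its fields.
  put (IntE i) = do
    put (0 :: Word8)
    put i
  put (OpE s e1 e2) = do
    put (1 :: Word8)
    put s
    put e1
    put e2

  -- Read the tag back and rebuild (rebox) the matching constructor.
  get = do
    tag <- getWord8
    case tag of
      0 -> liftM IntE get
      1 -> liftM3 OpE get get get
      _ -> fail "Exp: unknown constructor tag"

-- Hypothetical helper: serialise to a lazy ByteString and read it back.
roundTrip :: Exp -> Exp
roundTrip = decode . encode
```

Under these assumptions, roundTrip applied to a value such as
OpE "+" (IntE 1) (IntE 2) should give back an equal value.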

> 
> >Data.Binary comes with a tool to derive these. The DrIFT preprocessor
> >also can, as can Stefan O'Rear's SYB deriver.
> >
> >I just write them by hand, or use the tool that comes with the lib.
> >
> >More docs here,
> >   http://hackage.haskell.org/packages/archive/binary/0.3/doc/html/Data-Binary.html
> 
> This doesn't seem to deal with endianness. Am I missing something?

That's the Haskell serialisation layer. Look at Data.Binary.Get/Put for
the endian-aware primitives, to be used instead of 'get', e.g. getWord16be.
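
A small sketch of what that looks like in practice. The Header type and
its field layout are made up for illustration; only the Data.Binary.Get
and Data.Binary.Put primitives come from the library:

```haskell
import Data.Binary.Get (Get, getWord16be, getWord32le, runGet)
import Data.Binary.Put (Put, putWord16be, putWord32le, runPut)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word16, Word32)

-- Hypothetical wire header: a big-endian 16-bit length followed by a
-- little-endian 32-bit identifier.
data Header = Header
  { hdrLength :: Word16
  , hdrIdent  :: Word32
  } deriving (Eq, Show)

-- Write each field with an explicit byte order.
putHeader :: Header -> Put
putHeader (Header len ident) = do
  putWord16be len
  putWord32le ident

-- Read the fields back in the same order and byte order.
getHeader :: Get Header
getHeader = do
  len   <- getWord16be
  ident <- getWord32le
  return (Header len ident)

encodeHeader :: Header -> BL.ByteString
encodeHeader = runPut . putHeader

decodeHeader :: BL.ByteString -> Header
decodeHeader = runGet getHeader
```

With this layout, Header 0x0102 0x0A0B0C0D is written as the six bytes
01 02 0D 0C 0B 0A: the byte order is fixed by the primitive you pick,
not by the host machine.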

> 
> >>>>world, you could operate on the packets in place in Haskell where
> >>>>possible and save the deserialization overhead...
> >>>
> >>>Data.ByteString.* for this.
> 
> Ah, does Data.Binary fuse with ByteString.* then?

They know about each other, and Binary avoids copying if you're reading
ByteStrings.
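
As a rough illustration, here is a parser that pulls a ByteString payload
out of the input with getByteString. The one-byte length prefix is an
invented framing, and the sketch makes no claim about exactly when the
library shares or copies the underlying buffer:

```haskell
import Data.Binary.Get (Get, getByteString, getWord8, runGet)
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL

-- Hypothetical framing: one length byte, then that many payload bytes.
getFrame :: Get B.ByteString
getFrame = do
  n <- getWord8
  -- The payload comes back as a strict ByteString, not a list of bytes.
  getByteString (fromIntegral n)

-- Run the parser over lazy input; any trailing bytes are ignored.
parseFrame :: BL.ByteString -> B.ByteString
parseFrame = runGet getFrame
```

The result can then be handed straight to the Data.ByteString.* functions
without first being unpacked into a Haskell list.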

> 
> >Hack those bytes! Quickly! :-)
> 
> :)
> 
> It's a shame the layout definition is so verbose. Erlang's is quite
> compact. I wonder if something could be done with template haskell to
> translate an Erlang-style data layout definition to the Data.Binary
> form?

Right, simple but a bit verbose. The Erlang bit syntax is a nice pattern
matching/layout syntax for bit/byte data. There's a couple of ports of
this to Haskell -- one using pattern guards, another using Template
Haskell. Look on hackage.haskell.org for bitsyntax if you're interested.

> (Bonus points for being able to parse ASN.1 and generate appropriate
> Haskell datatypes & serialization primitives automatically :-) )

I think there's at least an ASN.1 definition in the crypto library.
Dominic might be able to enlighten us on that.

-- Don

