[Haskell-cafe] ANNOUNCE: binary: high performance, pure binary serialisation
Donald Bruce Stewart
dons at cse.unsw.edu.au
Thu Jan 25 21:51:01 EST 2007
Binary: high performance, pure binary serialisation for Haskell
----------------------------------------------------------------------
The Binary Strike Team is pleased to announce the release of a new,
pure, efficient binary serialisation library for Haskell, now available
from Hackage:
    tarball:  http://hackage.haskell.org/cgi-bin/hackage-scripts/package/binary/0.2
    darcs:    darcs get http://darcs.haskell.org/binary
    haddocks: http://www.cse.unsw.edu.au/~dons/binary/Data-Binary.html
The 'binary' package provides efficient serialisation of Haskell values
to and from lazy ByteStrings. ByteStrings constructed this way may then
be written to disk, written to the network, or further processed (e.g.
stored in memory directly, or compressed in memory with zlib or bzlib).
Encoding and decoding are achieved by the functions:
    encode :: Binary a => a -> ByteString
    decode :: Binary a => ByteString -> a
which mirror the read/show functions. Convenience functions for serialising to
disk are also provided:
    encodeFile :: Binary a => FilePath -> a -> IO ()
    decodeFile :: Binary a => FilePath -> IO a
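
As a minimal sketch of how these four functions fit together, the following
round-trips a value first through a lazy ByteString and then through a file.
The Record type synonym, the roundTrip helper and the file name "example.bin"
are illustrative assumptions, not part of the library.

```haskell
import Data.Binary (encode, decode, encodeFile, decodeFile)
import qualified Data.ByteString.Lazy as L

-- Hypothetical example type: any type with a Binary instance would do.
type Record = (Int, String, [Bool])

-- Serialise to a lazy ByteString and read the value straight back.
roundTrip :: Record -> Record
roundTrip = decode . encode

main :: IO ()
main = do
  let original = (42, "hello", [True, False]) :: Record
  -- Size in bytes of the encoded form.
  print (L.length (encode original))
  -- In-memory round trip.
  print (roundTrip original == original)
  -- On-disk round trip; "example.bin" is a placeholder path.
  encodeFile "example.bin" original
  restored <- decodeFile "example.bin" :: IO Record
  print (restored == original)
```

Note that decode needs a type annotation (or enough context) to know which
Binary instance to use, much as read does.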
To serialise your Haskell data, all you need do is write an instance of
Binary for your type. For example, suppose in an interpreter we had the
data type:
    import Data.Binary
    import Control.Monad

    data Exp = IntE Int
             | OpE String Exp Exp
We can serialise this to bytestring form with the following instance:
    instance Binary Exp where
        put (IntE i)      = putWord8 0 >> put i
        put (OpE s e1 e2) = putWord8 1 >> put s >> put e1 >> put e2
        get = do tag <- getWord8
                 case tag of
                   0 -> liftM IntE get
                   1 -> liftM3 OpE get get get
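
To see the instance in use, here is a self-contained sketch that encodes a
small expression tree and decodes it again. It repeats the type and instance
from above so that it compiles on its own. The deriving clause, the catch-all
branch that fails on an unknown tag, and the sample value are additions made
for this illustration only.

```haskell
import Data.Binary
import Control.Monad (liftM, liftM3)

data Exp = IntE Int
         | OpE String Exp Exp
  deriving (Eq, Show)  -- added here so results can be compared and printed

instance Binary Exp where
  -- One tag byte identifies the constructor, followed by its fields.
  put (IntE i)      = putWord8 0 >> put i
  put (OpE s e1 e2) = putWord8 1 >> put s >> put e1 >> put e2
  get = do
    tag <- getWord8
    case tag of
      0 -> liftM IntE get
      1 -> liftM3 OpE get get get
      -- Assumption: reject unknown tags instead of leaving the case partial.
      _ -> fail "Exp: unknown constructor tag"

-- Hypothetical sample expression: 1 + (2 * 3)
sample :: Exp
sample = OpE "+" (IntE 1) (OpE "*" (IntE 2) (IntE 3))

main :: IO ()
main = do
  let bytes = encode sample
  print (decode bytes :: Exp)
  print (decode bytes == sample)
```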
The binary library has been heavily tuned for performance, particularly for
writing speed. Throughput of up to 160M/s has been achieved in practice, and in
general speed is on par with or better than NewBinary, with the advantage of a pure
interface. Efforts are underway to improve performance still further. Plans are
also taking shape for a parser combinator library on top of binary, for bit
parsing and foreign structure parsing (e.g. network protocols).
Several projects are using binary already for serialisation:
    lambdabot  : state file serialisation
    hmp3       : mp3 file database
    hpaste.org : pastes are stored in memory as compressed bytestrings, and
                 serialised to disk on MACID checkpoints
Binary was developed by a team of 8 during the Haskell Hackathon, Hac
07, and received 200+ commits over that period. You can see the commit graph
here:
http://www.cse.unsw.edu.au/~dons/images/commits/community/binary-commits.png
The use of QuickCheck was critical to the rapid, safe development of the
library. The API was developed in conjunction with the QuickCheck properties
that checked the API for sanity. We were thus able to improve performance while
maintaining stability. We feel that QuickCheck should be an integral part of
the development strategy for all new Haskell libraries. Don't write code
without it!
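
The properties used during development are not reproduced in this message. As
a rough illustration of the kind of property meant, the sketch below checks
that decoding inverts encoding for a few standard types. The name
prop_roundTrip and the choice of types are assumptions, and the example needs
the QuickCheck package to be installed.

```haskell
import Data.Binary (Binary, encode, decode)
import Test.QuickCheck (quickCheck)

-- Decoding an encoded value should give back the original value.
prop_roundTrip :: (Binary a, Eq a) => a -> Bool
prop_roundTrip x = decode (encode x) == x

main :: IO ()
main = do
  -- Each call runs the property against randomly generated inputs.
  quickCheck (prop_roundTrip :: Int -> Bool)
  quickCheck (prop_roundTrip :: String -> Bool)
  quickCheck (prop_roundTrip :: [(Bool, Integer)] -> Bool)
```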
Binary is portable, using the foreign function interface and cpp, and is
tested with Hugs and GHC.
Happy hacking!
The Binary Strike Team,
    Lennart Kolmodin
    Duncan Coutts
    Don Stewart
    Spencer Janssen
    David Himmelstrup
    Bjorn Bringert
    Ross Paterson
    Einar Karttunen