[Haskell-cafe] Re: ANNOUNCE: binary: high performance, pure binary serialisation

Simon Marlow simonmar at microsoft.com
Tue Jan 30 08:21:43 EST 2007


Donald Bruce Stewart wrote:
>         Binary: high performance, pure binary serialisation for Haskell
>      ---------------------------------------------------------------------- 
> 
> The Binary Strike Team is pleased to announce the release of a new,
> pure, efficient binary serialisation library for Haskell, now available
> from Hackage:
>     
>  tarball:    http://hackage.haskell.org/cgi-bin/hackage-scripts/package/binary/0.2
>  darcs:      darcs get http://darcs.haskell.org/binary
>  haddocks:   http://www.cse.unsw.edu.au/~dons/binary/Data-Binary.html

A little benchmark I had lying around shows that this Binary library beats the 
one in GHC by more than a factor of 2 (at least on this example):

GHC's binary library (quite heavily tuned by me):

Write time:   2.41
Read time:    1.44
1,312,100,072 bytes allocated in the heap
      96,792 bytes copied during GC (scavenged)
     744,752 bytes copied during GC (not scavenged)
  32,492,592 bytes maximum residency (6 sample(s))

        2384 collections in generation 0 (  0.01s)
           6 collections in generation 1 (  0.00s)

          63 Mb total memory in use

   INIT  time    0.00s  (  0.00s elapsed)
   MUT   time    3.78s  (  3.84s elapsed)
   GC    time    0.02s  (  0.02s elapsed)
   EXIT  time    0.00s  (  0.00s elapsed)
   Total time    3.79s  (  3.86s elapsed)

Data.Binary:

Write time:   0.99
Read time:    0.65
1,949,205,456 bytes allocated in the heap
 204,986,944 bytes copied during GC (scavenged)
   5,154,600 bytes copied during GC (not scavenged)
  70,247,720 bytes maximum residency (8 sample(s))

        3676 collections in generation 0 (  0.25s)
           8 collections in generation 1 (  0.19s)

         115 Mb total memory in use

   INIT  time    0.00s  (  0.00s elapsed)
   MUT   time    1.08s  (  1.13s elapsed)
   GC    time    0.44s  (  0.52s elapsed)
   EXIT  time    0.00s  (  0.00s elapsed)
   Total time    1.51s  (  1.65s elapsed)

This example writes a lot of 'Maybe Int' values.  I'm surprised by the extra 
heap used by Data.Binary: this was on a 64-bit machine, so Ints should have been 
encoded as 64 bits by both libraries.  Also, the GC seems to be working quite 
hard with Data.Binary; I'd be interested to know why that is.
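
For anyone who wants to check the encoded size of these values, here is a
minimal sketch, not the benchmark above. It assumes a recent release of the
binary package, where an Int is written as 8 bytes and a Maybe adds a one-byte
tag; the `sample` helper and the element count are made up for illustration.

```haskell
-- Hypothetical round trip of 'Maybe Int' values through Data.Binary.
-- This is an illustration only, not the benchmark quoted in this message.
import Data.Binary (decode, encode)
import qualified Data.ByteString.Lazy as BL

-- Made-up sample data: even numbers become Just, odd ones Nothing.
sample :: Int -> [Maybe Int]
sample n = [ if even i then Just i else Nothing | i <- [1 .. n] ]

main :: IO ()
main = do
  let xs    = sample 1000
      bytes = encode xs
      ys    = decode bytes :: [Maybe Int]
  -- Assumed encoding: 'Just x' is a tag byte followed by a 64-bit Int,
  -- 'Nothing' is a tag byte on its own.
  print (BL.length (encode (Just 42 :: Maybe Int)))
  print (BL.length (encode (Nothing :: Maybe Int)))
  -- Size of the whole list, including its length prefix.
  print (BL.length bytes)
  -- The decoded list should equal the original.
  print (ys == xs)
```

Running this with +RTS -s prints the same kind of allocation and GC statistics
as shown above, which makes it easy to compare heap behaviour across versions.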

Anyway, this result is good enough for me, and I'd like to use Data.Binary in GHC 
as soon as we can.  Unfortunately we have to support older compilers, so there 
will be some build-system issues to surmount.  We also need a way to pass state 
around while serialising/deserialising - what's the current plan for this?

Cheers,
	Simon


