[Haskell-cafe] Re: ANNOUNCE: binary: high performance,
pure binary serialisation
Donald Bruce Stewart
dons at cse.unsw.edu.au
Tue Jan 30 19:13:07 EST 2007
simonmar:
> Donald Bruce Stewart wrote:
> > Binary: high performance, pure binary serialisation for Haskell
> > ----------------------------------------------------------------------
> >The Binary Strike Team is pleased to announce the release of a new,
> >pure, efficient binary serialisation library for Haskell, now available
> >from Hackage:
> >
> > tarball:
> > http://hackage.haskell.org/cgi-bin/hackage-scripts/package/binary/0.2
> > darcs: darcs get http://darcs.haskell.org/binary
> > haddocks: http://www.cse.unsw.edu.au/~dons/binary/Data-Binary.html
>
> A little benchmark I had lying around shows that this Binary library beats
> the one in GHC by a factor of 2 (at least on this example):
Very nice. We've been benchmarking against NewBinary, for various
Word-sized operations, with the following results, on x86:
NewBinary, fairly tuned (lots of fastMutInt#s):
    10MB of Word8  in chunks of 1:  10.68MB/s write, 9.16MB/s read
    10MB of Word16 in chunks of 16:  7.89MB/s write, 6.65MB/s read
    10MB of Word32 in chunks of 16:  7.99MB/s write, 7.29MB/s read
    10MB of Word64 in chunks of 16:  5.10MB/s write, 5.75MB/s read
Data.Binary:
    10MB of Word8  in chunks of 1  (Host endian):  11.7 MB/s write,  2.4 MB/s read
    10MB of Word16 in chunks of 16 (Host endian):  89.3 MB/s write,  3.6 MB/s read
    10MB of Word16 in chunks of 16 (Big endian):   83.3 MB/s write,  1.6 MB/s read
    10MB of Word32 in chunks of 16 (Host endian): 178.6 MB/s write,  7.2 MB/s read
    10MB of Word32 in chunks of 16 (Big endian):  156.2 MB/s write,  2.5 MB/s read
    10MB of Word64 in chunks of 16 (Host endian):  78.1 MB/s write, 11.3 MB/s read
    10MB of Word64 in chunks of 16 (Big endian):   44.6 MB/s write,  2.8 MB/s read
Note that we're much faster writing, in general, but read speed lags.
The 'get' monad hasn't received much attention yet, though we know what
needs tuning.
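
For readers who want to try the host-endian versus big-endian write paths themselves, here is a minimal sketch. It is not the benchmark that produced the figures above. It assumes the putWord32be and putWord32host primitives from Data.Binary.Put as found in current releases of the binary package, and the helper names (writeWords, bigEndian, hostEndian) are made up for illustration.

```haskell
import Data.Binary.Put (Put, runPut, putWord32be, putWord32host)
import qualified Data.ByteString.Lazy as L
import Data.Word (Word32)

-- Hypothetical helper: serialise the first n Word32 values (0, 1, 2, ...)
-- with whichever primitive writer is passed in.
writeWords :: (Word32 -> Put) -> Int -> L.ByteString
writeWords putW n = runPut (mapM_ putW (take n [0 ..]))

-- The two write paths being compared: fixed big-endian byte order
-- versus the byte order of the machine running the program.
bigEndian, hostEndian :: Int -> L.ByteString
bigEndian  = writeWords putWord32be
hostEndian = writeWords putWord32host
```

Both functions produce 4 bytes per value; only the byte order differs, and on a big-endian host the two outputs would be identical. Timing L.length (bigEndian n) against L.length (hostEndian n) for a large n is one rough way to see the gap between the two paths.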
> GHC's binary library (quite heavily tuned by me):
>
> Write time: 2.41
> Read time: 1.44
> 1,312,100,072 bytes allocated in the heap
> 96,792 bytes copied during GC (scavenged)
> 744,752 bytes copied during GC (not scavenged)
> 32,492,592 bytes maximum residency (6 sample(s))
>
> 2384 collections in generation 0 ( 0.01s)
> 6 collections in generation 1 ( 0.00s)
>
> 63 Mb total memory in use
>
> INIT time 0.00s ( 0.00s elapsed)
> MUT time 3.78s ( 3.84s elapsed)
> GC time 0.02s ( 0.02s elapsed)
> EXIT time 0.00s ( 0.00s elapsed)
> Total time 3.79s ( 3.86s elapsed)
>
> Data.Binary:
>
> Write time: 0.99
> Read time: 0.65
> 1,949,205,456 bytes allocated in the heap
> 204,986,944 bytes copied during GC (scavenged)
> 5,154,600 bytes copied during GC (not scavenged)
> 70,247,720 bytes maximum residency (8 sample(s))
>
> 3676 collections in generation 0 ( 0.25s)
> 8 collections in generation 1 ( 0.19s)
>
> 115 Mb total memory in use
>
> INIT time 0.00s ( 0.00s elapsed)
> MUT time 1.08s ( 1.13s elapsed)
> GC time 0.44s ( 0.52s elapsed)
> EXIT time 0.00s ( 0.00s elapsed)
> Total time 1.51s ( 1.65s elapsed)
>
> This example writes a lot of 'Maybe Int' values. I'm surprised by the
> extra heap used by Data.Binary: this was on a 64-bit machine, so Ints
> should have been encoded as 64 bits by both libraries. Also, the GC seems
> to be working quite hard with Data.Binary, I'd be interested to know why
> that is.
Very interesting! Is this benchmark online?
I'm a little surprised by the read times; reading is still fairly
unoptimised compared to writing.
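
As a small illustration of the 'Maybe Int' workload discussed above, the sketch below round-trips a short list through encode and decode. It assumes the Binary instances shipped with current releases of the binary package, where an Int is written as 64 bits and a Maybe is written as a one-byte tag followed by the payload; the 0.2 release may differ in detail. The names sample, encoded and roundTrip are invented for the example.

```haskell
import Data.Binary (encode, decode)
import qualified Data.ByteString.Lazy as L

-- A tiny stand-in for the 'Maybe Int' workload.
sample :: [Maybe Int]
sample = [Just 1, Nothing, Just (-1)]

-- Serialise the whole list to a lazy ByteString.
encoded :: L.ByteString
encoded = encode sample

-- Read it back; the result type selects the Binary instance used.
roundTrip :: [Maybe Int]
roundTrip = decode encoded
```

Under those assumptions a Just value costs 9 bytes (tag plus 8-byte Int) and a Nothing costs 1 byte, so comparing L.length encoded with the heap figures is one way to reason about where extra allocation might come from.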
> Anyway, this result is good enough for me, I'd like to use Data.Binary in
> GHC as soon as we can. Unfortunately we have to support older compilers,
> so there will be some build-system issues to surmount. Also we need a way
> to pass state around while serialising/deserialising - what's the current
> plan for this?
The plan was to use StateT Put or StateT Get, I think. But we don't have
a demo for this yet. Duncan, Lennart, any suggestions?
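
In the absence of a demo, here is one possible shape for the StateT idea, offered as a sketch rather than the library's settled design. It assumes StateT from the transformers package layered over PutM and Get as they exist in current releases of binary, and it uses a simple counter as the state. The type synonyms PutS and GetS and the functions putMaybe, getMaybe, encodeAll and decodeN are hypothetical names.

```haskell
import Control.Monad (replicateM)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State (StateT, execStateT, runStateT, modify)
import Data.Binary.Get (Get, runGet, getWord8, getWord32be)
import Data.Binary.Put (PutM, runPutM, putWord8, putWord32be)
import qualified Data.ByteString.Lazy as L
import Data.Word (Word32)

-- Hypothetical state: a count of values written or read so far.
type PutS = StateT Int PutM
type GetS = StateT Int Get

-- Write a one-byte tag, then the payload if present, and bump the counter.
putMaybe :: Maybe Word32 -> PutS ()
putMaybe Nothing  = lift (putWord8 0) >> modify (+ 1)
putMaybe (Just w) = lift (putWord8 1 >> putWord32be w) >> modify (+ 1)

-- Mirror image of putMaybe on the reading side.
getMaybe :: GetS (Maybe Word32)
getMaybe = do
  tag <- lift getWord8
  modify (+ 1)
  case tag of
    0 -> return Nothing
    _ -> Just <$> lift getWord32be

-- Run the stateful writer; returns the final counter and the bytes.
encodeAll :: [Maybe Word32] -> (Int, L.ByteString)
encodeAll xs = runPutM (execStateT (mapM_ putMaybe xs) 0)

-- Run the stateful reader for n values; returns the values and the counter.
decodeN :: Int -> L.ByteString -> ([Maybe Word32], Int)
decodeN n = runGet (runStateT (replicateM n getMaybe) 0)
```

The state here is deliberately trivial. A real client such as a compiler would presumably thread something richer, like a symbol table, through the same StateT layer, and whether the extra layer costs too much performance is an open question.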
-- Don