Pickling a finite map (Binary + zlib) [was: [Haskell-cafe] Data.Binary poor read performance]

Duncan Coutts duncan.coutts at worc.ox.ac.uk
Mon Feb 23 21:13:05 EST 2009


On Mon, 2009-02-23 at 17:03 -0800, Don Stewart wrote:

> Here's a quick demo using Data.Binary directly.

[...]

>     $ time ./A dict +RTS -K20M
>     52848
>     "done"
>     ./A dict +RTS -K20M  1.51s user 0.06s system 99% cpu 1.582 total
> 
> 
> Ok. So 1.5s to decode a 1.3M Map. There may be better ways to build the Map
> since we know the input will be sorted, but the Data.Binary instance can't
> do that.
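
A minimal sketch of the "input is already sorted" idea, not taken from the
program quoted above: serialise the map as a plain ascending list of pairs and
rebuild it with Map.fromDistinctAscList, which runs in linear time. The names
encodeAsList and decodeSorted and the String/Int element types are made up for
illustration, and the precondition (strictly ascending keys) is not checked.

```haskell
import qualified Data.Map as Map
import Data.Binary (Binary, encode, decode)
import qualified Data.ByteString.Lazy as L

-- Hypothetical helper: write the map out as an ascending list of pairs.
encodeAsList :: (Binary k, Binary v) => Map.Map k v -> L.ByteString
encodeAsList = encode . Map.toAscList

-- Hypothetical helper: decode the list and rebuild the map in O(n).
-- Precondition (unchecked): keys in the input are strictly ascending,
-- which holds for anything produced by encodeAsList.
decodeSorted :: (Binary k, Binary v) => L.ByteString -> Map.Map k v
decodeSorted = Map.fromDistinctAscList . decode

main :: IO ()
main = do
  let m  = Map.fromList (zip ["apple", "banana", "cherry"] [1 :: Int ..])
      m' = decodeSorted (encodeAsList m) :: Map.Map String Int
  -- Round-trip check: the rebuilt map should equal the original.
  print (m' == m)
```

Whether this beats the stock instance would have to be measured; it only
shows how the sorted-input assumption could be exploited.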

[...]

>     $ time ./A dict.gz
>     52848
>     "done"
>     ./A dict.gz  0.28s user 0.03s system 98% cpu 0.310 total
> 
> Interesting. So extracting the Map from a compressed bytestring in memory is
> a fair bit faster than loading it directly, uncompressed from disk.

That's actually rather surprising. The system time is negligible, and the
difference between total and user time does not leave much for time
wasted doing I/O. So that's a real difference in user time. So what is
going on? We're doing the same amount of binary decoding in each, right?
We're also allocating the same number of buffers, in fact slightly more
in the case that uses compression. The time taken to cat a meg through
a Handle using lazy bytestring is nothing. So where is all that time
going?
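
One way to narrow this down would be to time the stages separately, entirely
in memory, so that file I/O is out of the picture. The following is only a
sketch under assumptions: the Dict type, the generated test data and the
timed helper are hypothetical, and it uses CPU time from System.CPUTime
rather than a proper benchmarking harness.

```haskell
import qualified Data.Map as Map
import Data.Binary (encode, decode)
import qualified Data.ByteString.Lazy as L
import qualified Codec.Compression.GZip as GZip
import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)

-- Assumed element types; the types used in the quoted program are not shown.
type Dict = Map.Map String Int

-- Hypothetical helper: run an action and report the CPU time it took.
-- getCPUTime returns picoseconds, hence the division by 1e12.
timed :: String -> IO a -> IO a
timed label act = do
  t0 <- getCPUTime
  x  <- act
  t1 <- getCPUTime
  putStrLn (label ++ ": " ++ show (fromIntegral (t1 - t0) / 1e12 :: Double) ++ "s CPU")
  return x

main :: IO ()
main = do
  let dict   = Map.fromList [(show i, i) | i <- [1 .. 50000 :: Int]] :: Dict
      plain  = encode dict
      packed = GZip.compress plain
  -- Force both inputs completely before any timing starts.
  _ <- evaluate (L.length plain)
  _ <- evaluate (L.length packed)
  -- Map.size forces the spine of the decoded map, not necessarily every value.
  n1    <- timed "decode only" (evaluate (Map.size (decode plain :: Dict)))
  bytes <- timed "decompress only" (evaluate (L.length (GZip.decompress packed)))
  n2    <- timed "decompress + decode"
             (evaluate (Map.size (decode (GZip.decompress packed) :: Dict)))
  print (n1, bytes, n2)
```

If "decode only" and "decompress + decode" come out about the same here, the
difference seen above would have to come from how the uncompressed file is
read rather than from the decoding itself.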

Duncan


