Pickling a finite map (Binary + zlib) [was: [Haskell-cafe] Data.Binary poor read performance]

Paulo Tanimoto tanimoto at arizona.edu
Tue Feb 24 12:05:44 EST 2009


On Tue, Feb 24, 2009 at 2:36 AM, Don Stewart <dons at galois.com> wrote:
> This idea was the motivation for the new Seq instance, which uses
> internals to build quickly.
>    Encoding to disk, the dictionary,
>        $ time ./binary /usr/share/dict/cracklib-small
>        "done"
>        ./binary /usr/share/dict/cracklib-small  0.07s user 0.01s system 94% cpu 0.088 total
>    Decoding,
>        $ time ./binary dict.gz
>        52848
>        "done"
>        ./binary dict.gz  0.07s user 0.01s system 97% cpu 0.079 total
>    instance (Binary e) => Binary (Seq.Seq e) where
>        put s = put (Seq.length s) >> Fold.mapM_ put s
>        get = do n <- get :: Get Int
>                 rep Seq.empty n get
>          where rep xs 0 _ = return $! xs
>                rep xs n g = xs `seq` n `seq` do
>                               x <- g
>                               rep (xs Seq.|> x) (n-1) g
> Just a lot better. :)
> So ... Data.Map, we're looking at you!

Indeed, that was the motivation for writing the patch for Seq.  [Ross,
thank you again for the help.]  I had performance issues with lists,
but noticed that switching to Sequence wasn't helping at all.  This
new definition takes advantage of the features that Seq has and List

Regarding Map, I like Felipe's idea of having a separate Internal.

I know that Lemmih has a few other implementations of Map (compactMap,
BerkeleyDB).  If I remember correctly, he made BerkeleyDB an instance
of Binary.  That can probably give us some insight too.


More information about the Haskell-Cafe mailing list