[Haskell-cafe] ANN: unordered-containers - a new, faster hashing-based containers library

Louis Wasserman wasserman.louis at gmail.com
Wed Feb 23 03:53:04 CET 2011


Here's the approach I was attempting: a while back there was a proposal to
modify Data.Map to support a sort of zipper, including the following API:

data Location k a -- represents a map with a "hole" at a particular key

search :: k -> Map k a -> (Maybe a, Location k a)

key :: Location k a -> k

assign :: a -> Location k a -> Map k a
clear :: Location k a -> Map k a

and from here you might define

insert :: k -> a -> Map k a -> Map k a
insert k a m = case search k m of
   (_, loc) -> assign a loc

The surprising thing was that this approach *increased* the speed of the Map
implementation by something like 7%.  I don't know what happened to the
original proposal, but I was going to try something similar, with something
like

data HashMap k v = HM !Int !(Tree k v)

insert k v (HM sz tree) = case search k tree of
    (Nothing, loc) -> HM (sz + 1) (assign v loc)
    (Just _, loc) -> HM sz (assign v loc)

Louis Wasserman
wasserman.louis at gmail.com
http://profiles.google.com/wasserman.louis


On Tue, Feb 22, 2011 at 8:45 PM, Johan Tibell <johan.tibell at gmail.com>wrote:

> On Tue, Feb 22, 2011 at 6:28 PM, Louis Wasserman
> <wasserman.louis at gmail.com> wrote:
> > Apologies!
>
> Accepted. :)
>
> > I was, in point of fact, working on such a patch, but I think I've been
> > convinced that it's a legitimate position for a package to take.  (I'm
> also
> > curious, Johann, about the approach you figured out that didn't cost a
> word
> > per node!)
>
> I'm working on a patch that provides O(1) size right now. The trick is
> to define HashMap as:
>
> data HashMap k v = HM {-# UNPACK #-} !Int !(Tree k v)
>
> data Tree k v
>    = Nil
>    | Tip {-# UNPACK #-} !Hash
>          {-# UNPACK #-} !(FL.FullList k v)
>    | Bin {-# UNPACK #-} !Prefix
>          {-# UNPACK #-} !Mask
>          !(Tree k v)
>          !(Tree k v)
>
> And have insert, delete, and other operations that change the size
> return both the updated tree and the size delta (-1, 0, or 1 in the
> case of insert/delete). The size delta is then applied to the stored
> size.
>
> At the moment I'm trying to figure out the performance impact of
> making this change.
>
> Johan
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20110222/f0ef6e91/attachment.htm>


More information about the Haskell-Cafe mailing list