[Haskell-cafe] ANN: unordered-containers - a new, faster hashing-based containers library

Wed Feb 23 12:30:42 CET 2011

I took a rapid look and they seem a replacement for pure Map's, but not for
 mutable HashTable's. Sorry if it isn't the case. I don´t know if
Data.HashTable has improved, but the performance used to be very poor in
comparison with other languages.

The point is that pure data structures can not be used as shared data in
multithreaded environments. They must be encapsulated in mutable blocking
references, such is MVars, and the whole update process blocks any other
thread . (it is worst when using TVars)  So they are not a replacement for
mutable data structures such is Data.HashTable.

For this reason I think that an inprovement/mutable-replacement of
Data.HashTable is needed. If this hasn´t been done already. Are there
some improvements on it that I don't know?

2011/2/23 Max Bolingbroke <batterseapower at hotmail.com>

> On 23 February 2011 05:31, Johan Tibell <johan.tibell at gmail.com> wrote:
> > On Tue, Feb 22, 2011 at 9:19 PM, Johan Tibell <johan.tibell at gmail.com>
> wrote:
> >> Initial numbers suggest that lookup gets 3% slower and insert/delete
> >> 6% slower. The upside is O(1) size.
> >
> > Can someone come up with a real world example where O(1) size is
> important?
>
> I'm a bit sceptical that it is (I was not convinced by the earlier
> strict-set-inclusion argument, since that's another Data.Map feature
> I've never used). I thought of some other possibilities though:
>  1. If copying an unordered-collection to a flat array you can improve
> the constant factors (not the asymptotics) with O(1) size to
> pre-allocate the array
>  2. If building a map in a fixed point loop (I personally do this a
> lot) where you know that the key uniquely determines the element, you
> can test if a fixed point is reached in O(1) by just comparing the
> sizes. Depending what you are taking a fixed point of, this may change
> the asymptotics
>  3. Some map combining algorithms work more efficiently when one of
> their two arguments are smaller. For example, Data.Map.union is most
> efficient for (bigmap `union` smallmap). If you don't care about which
> of the two input maps wins when they contain the same keys, you can
> improve constant factors by testing the size of the map input to size
> (in O(1)) and flipping the arguments if you got (smallmap `union`
> bigmap) instead of the desirable way round.
>
> Personally I don't find any of these *particularly* compelling. But a
> ~6% slowdown for this functionality is not too bad - have you had a
> chance to look at the core to see if the cause of the slowdown
> manifests itself at that level? Perhaps it is possible to tweak the
> code to make this cheaper.
>
> Also, what was the size of the collections you used in your benchmark?
> I would expect the relative cost of maintaining the size to get lower
> as you increased the size of the collection.
>
> Max
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20110223/fc969c7e/attachment.htm>