[Haskell-cafe] ANN: unordered-containers - a new, faster hashing-based containers library
johan.tibell at gmail.com
Wed Feb 23 23:04:03 CET 2011
On Wed, Feb 23, 2011 at 1:55 PM, Max Bolingbroke
<batterseapower at hotmail.com> wrote:
> Thanks for bringing some data to the table. There are definitely some
> common patterns in what you sent me:
> 1) For defining Binary instances, you need to write set size before
> you write the elements: ~7 occurrences
> 2) Tests against small constants (typically <= 0 or 1, but also 2 and
> 3!): ~15 occurrences
> 3) A surprise to me: generating fresh names! People keep a set of all
> names generated so far, and then just take size+1 as their fresh name.
> Nice trick. ~17 occurrences
> 4) Turning sizes into strings for human consumption: ~19 occurrences
> 5) Just reexporting the functions somehow. Uninformative. ~8 occurrences
> There were ~38 occurrences over ~13 repos where it appeared to be
> somehow fundamental to an algorithm (I didn't look into most of these
> in detail). I've put those after my message.
> Frankly I am surprised how much "size" gets used. It seems that making
> it fast is more important than I thought.
Nice analysis. Does this apply to maps as well as sets or are sets use
differently than maps somehow?
IntMap (which shares data structure with HashMap) only hash O(n) size.
I wonder if people avoid using IntMap because of this.
I wonder if there's a way to implement size that doesn't mess up the
code so badly (see the commit on GitHub to see how badly it messes up
More information about the Haskell-Cafe