export toDescList from Data.Map

Evan Laforge qdunkan at gmail.com
Fri Sep 26 10:20:02 EDT 2008


>  - Since someone raised the issue of test suites for performance and
> correctness, I think it would also be interesting to investigate what effect
> the strictness annotations in the tree constructors have on performance.
> Everyone takes for granted that bangs=faster, but I've noticed, as have
> others, that removing these strictness annotations actually make things run
> faster. Instead, the tree construction _functions_ should be made strict
> using seq. The Map construction functions are already strict enough on
> account of the balancing. Try it for yourself, remove the bangs from the
> sub-tree fields in the Bin constructor, and run a little benchmark. This
> would open up the possibility for lazy versions of functions like mapWithKey
> and mapKeysMonotonic. I had mentioned this previously, but few seemed to
> care, so I just made the change locally.
> http://www.haskell.org/pipermail/libraries/2008-August/010371.html . If we
> we're going to start a little Map/Set/IntMap/IntSet working group, then I'd
> like to throw that idea back into the mix.

Of course, anything that boosts performance is a win.  Does anyone
know of any Map benchmarks?  I suppose I could write some, but I'd be
sort of surprised if someone else hasn't already done that.  Then we
could throw in all the various avl / redblack / whatever map variants
in there and see how they stack up.

> IntMap is hard coded to Int, which is 32 bits on a 32 bit architecture, and
> 64 bits on a 64 bit architecture. I think the main reason for this is so
> that the key can be
> unpacked/specialized for extra performance, since that is the primary
> purpose of the library. If you just need sublinear insert/lookup/delete then
> use a Map.

To be honest, I had no particular rational reason for wanting IntMap
(and in fact I'm not using it).  I just heard some mumbling on the
list about "Data.Map so slow, IntMap so much faster especially for
unions".  Though to be honest, probably what most matters to me is how
much garbage an insert generates (and similarly how much the resulting
structures will share).  I didn't run any tests or anything, so I
wasn't that upset to not be able to use IntMap :)

> Are you galled that the key type is not hardcoded to Int64, or that the key
> is not allowed to be any instance of the Bits class? In the former case,
> you'd inflate the size of the tree and take a small speed hit on 32bit
> archs, and in the latter I think you'd have even worse performance. I think
> the choice of Int is wise because it is the fastest, and that is the point
> of the library, and if you think about it you're not going to be dealing
> with more that 2^32 Ints on a 32 bit computer. (What would they be
> referencing? Not array positions or file offsets.)  If you're trying to

Points in time.  I was using Int64 at the time, but I'm actually using
Doubles now.  Plain Data.Map is working fine in practice.

> optimize, say, "Map (Int32,Int32) a" into "IntMap64 a", there are other ways
> to accomplish that:

Ah yes, splitting the numbers would work for 64 bit words.


More information about the Libraries mailing list