[Haskell-cafe] lookup tables & style guidelines
Adrian Hey
ahey at iee.org
Thu Apr 24 11:33:55 EDT 2008
Ketil Malde wrote:
> Don Stewart <dons at galois.com> writes:
>
>>> 1) what is the most performant lookup table/hashtable/dictionary solution
>>> for Haskell?
>
>> Data.IntMap is awfully good.
>
> Is it benchmarked anywhere? Compared to the Judy bindings, or Adrian
> Hey's AVL trees, or Data.Hashtable?
I must get around to putting the AVL clones of Data.Map/Set in Hackage
sometime. Meanwhile anyone wanting to roll their own maps with an API
of their chosing could do a lot worse than the raw AVL lib..
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/AvlTree-3.1
Also, if you're likely to be using union/intersection a lot you should
know that Data.Map/Set are very slow for this because they use the
not efficient hedge algorithm :-)
There's a really cool demo I found from wikipedia that shows why it is
that AVL trees perform so well in the "pathological" situation of sorted
insertions..
http://www.csi.uottawa.ca/~stan/csi2514/applets/avl/BT.html
If you try it you can see that after 2^n-1 sorted insertions you always
get a perfectly balanced tree. This explains these benchmark results..
http://groups.google.co.uk/group/comp.lang.functional/msg/74a422ea04ff1217
DData is what became Data.Map/Set and it would appear to be the worst
performing of the four tree types tested there :-(
Don is right about Data.IntMap/IntSet. For Ints it will be much faster
than Data.Map/Set or even (polymorphic) AVL tree. But an AVL tree of
unboxed Ints gives faster lookup than IntMap/Set and about the same
insert/delete times..
http://www.haskell.org/pipermail/libraries/2005-October/004518.html
Union and Intersection times for AVL aren't so great, but I think
I know what to do about that (especially intersection).
But the real way ahead has to be Tries for non-trivial keys and (I
suspect) AVL trees of unboxed Ints for simple keys (serialisable
as 1 machine word). This is what that GSoC project is all about.
At the moment we have the exact opposite, Tries for Ints and balanced
trees for non-trivial keys. Oh well..
Regards
--
Adrian Hey
More information about the Haskell-Cafe
mailing list