[Haskell] combining IntMaps
daan at cs.uu.nl
Sat Jul 30 14:07:12 EDT 2005
Adrian Hey wrote:
>>Thanks! It's interesting the way your AVL tree library is set up --
>>there seems to be a much broader degree of functionality than that
>>provided by Data.Set. But I'm trying to see, is there a significant
>>difference in the fundamental data structure.
> Well Data.Set is based on a different balanced tree type (weight
> balanced trees), similar to those used in the Adams paper. I'm
> also quite sceptical about the Hedge algorithm, so the AVL library
> doesn't use it. It uses divide and conquer, but not quite as Adams
> describes it.
Please note that the original mail was about "IntMap" and "IntSet" and these data
structures use patricia trees and have a much different union algorithm alltogether.
(a very good one actually -- one of the main reasons to make a special Map and Set
instance for integers :-) )
> But IMO the biggest problem with Data.Set is the inflexible API.
> For example..
> union :: Ord a => Set a -> Set a -> Set a
> intersect :: Ord a => Set a -> Set a -> Set a
> genUnion :: (e -> e -> COrdering e) -> AVL e -> AVL e -> AVL e
> genIntersection :: (a -> b -> COrdering c) -> AVL a -> AVL b -> AVL c
You are right, but there is a reason for this: with the Set api I wanted to expose
only operations that do not violate the notion of a mathematical set: ie. the
ordering on the elements must be fixed or otherwise the invariants might not hold any
longer. One way to insure this is by passing an ordering function when constructing a
set, but a nicer way is to use overloading (since only a single instance can hold for
Of course, maybe it is a good idea to make two modules: a 'collection' data type that
exposes internal operations, and a 'Set' module on top that implements sets and does
not expose operations that could violate set invariants.
You are right that these considerations do not make as much sense in the context of
general AVL trees since no guarantee op "Set" functionality is given there. Same
holds for the IntMap and Map interfaces. (but not for "Bag").
> or is the main point that
> the additional functionality could not have otherwise been provided in
> an efficient way without going into the guts of Data.Set?
Efficiency wise, I think one should only provide functions that ensure that the
complexity does not get worse -- if one also considers functionality that improves on
a constant factor (like traversing twice) there is no end to the number of functions
that can be provided (as one is basically doing deforestation by hand). But this is
of course just my personal opinion on how to control the size of the API.
All the best,
-- Daan Leijen.
> Of course there's no reason why similar functions could not be
> provided by Data.Set, but they're not there at present.
>>or is the main point that
>>the additional functionality could not have otherwise been provided in
>>an efficient way without going into the guts of Data.Set?
> Yes. This why producing useable libraries like this is so difficult.
> There's plenty of reasonable things you just can't do efficiently
> with Data.Set. Same is probably true of Data.Tree.AVL of course, but
> I'm trying to make it more complete all the time.
> Anyway, please try out AVL and let me know if there's anything more
> Adrian Hey
> Haskell mailing list
> Haskell at haskell.org
More information about the Haskell