Improving containers library
Axel Simon
Axel.Simon at in.tum.de
Wed Mar 3 10:28:44 EST 2010
Hi Milan,
On 03.03.2010, at 15:23, Milan Straka wrote:
> Hi all,
>
> I have started an internship in Cambridge and will spend 3 months on
> Haskell. One of the possible projects is improving the containers
> library. Things I am thinking about:
> - measuring the performance of existing implementations (notably Map
> and
> Set) and possibly improve them (either without or with API changes),
> - adding Queue and a PriorityQueue,
> - maybe adding a generalized trie,
> - maybe adding a hashtable (more like a trie of the hashes,
> something in
> the line of Ideal hash trees; and maybe a 'classic' hash modifiable
> in
> the ST monad?)
>
> I would be grateful for any comments, thoughts or wishes.
In the context of program analysis, one tracks a lot of information
about the state of a program at a given program point. This state is
propagated around loops and conditionals. The implicit sharing of
nodes in functional data structures are very well suited for this.
However, whenever two states need to be joined at the loop head or at
the end of a conditional, the two trees that represent the state need
to be joined by performing some sort of join operation on each element
of a tree. For all sensible domains a `join` a = a. Given that a loop
or branch of a conditional usually touches only few variables, it is
prudent not to perform the join operation on all elements of the tree.
Instead, a tree-difference operation is required that traverses the
two trees to be joined and calculates the difference between them. In
particular, whenever two references are, in fact, the same pointer,
the subtree does not need to be considered. This way, a join between
two trees of size n reduces to joining only m elements where m is the
maximum number of elements in each tree. This is a tremendous win for
n >> m.
I've implemented such a difference operation on a home-grown splay
tree implementation. I would like to see this on standard AVL trees.
Maybe other people could indicate if such a difference operation on
trees would be useful in their applications.
The difference operation in my implementation has the following
signature:
difference :: (Ord k) => (a -> a -> Bool) -> Map k a -> Map k a ->
([k], [k])
Calculate the difference between the two trees. Returns the elements
that were in the first but not in the second and the elements that
were in the second but not in the first. Both lists are unsorted. In
case two element have the same key the given equality predicate is a
applied to them. If the predicate returns False the elements are added
to their respective difference lists. The contents of both trees
remain unchanged.
differenceElems :: (Ord k) => (a -> a -> Bool) -> Map k a -> Map k a -
> ([a], [a])
As difference, but extracts the elements, not their keys.
Cheers,
Axel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/libraries/attachments/20100303/bb722f09/attachment.html
More information about the Libraries
mailing list