Proposal: Further performance improvements of Data.Set

Milan Straka fox at
Tue Sep 14 12:30:26 EDT 2010



The following text is the description of ticket 4312:

This proposal depends on #4280.

This proposal further improves the performance of Data.Set. It consists of five patches:

   1. making the set datatype to store the elements evaluated (by using a bang). This is in accordance with a poll on libraries@…, see point 1) of
   2. improvements to union and difference implementation (by eliminating high-order function)
   3. improvements to balance function which is used by nearly all methods modifying a set (making one monolithic function which allows perform only one pattern-match on the tree nodes)
   4. correction of the test (the Arbitrary instance could generate unbalanced trees, which the patch 3. discovered)
   5. added benchmark of union, difference and intersection methods 

On i386 Intel Core2 and GHC 6.12.1, the improvements are

delete       13.46
deleteMax    13.58
deleteMin    15.93
difference   26.17
insert       12.20
intersection 2.21
member       9.52
union        27.59

The repository of the containers package with these patches (and also several others) is at

The patches are also attached (including #4280)

More information about the Libraries mailing list