[Haskell-cafe] ANN: unordered-containers - a new, faster hashing-based containers library
Ross Paterson
ross at soi.city.ac.uk
Thu Feb 24 13:36:42 CET 2011
On Wed, Feb 23, 2011 at 08:45:47AM +0000, Max Bolingbroke wrote:
> 3. Some map combining algorithms work more efficiently when one of
> their two arguments are smaller. For example, Data.Map.union is most
> efficient for (bigmap `union` smallmap). If you don't care about which
> of the two input maps wins when they contain the same keys, you can
> improve constant factors by testing the size of the map input to size
> (in O(1)) and flipping the arguments if you got (smallmap `union`
> bigmap) instead of the desirable way round.
This is a most unfortunate feature of Data.Map, forcing users to bend the
logic of their application to suit the library. Ideally you'd want binary
operations that are symmetric and associative performance-wise, freeing
users to follow the structure of their problem domain. The documentation
(and the original paper) doesn't quantify the gain for such unions,
but hopefully it achieves the optimum: O(m log(n/m)), where m and n are
the sizes of the smaller and larger trees respectively. Then building
a tree costs O(n log(n)), regardless of the way you do it.
More information about the Haskell-Cafe
mailing list