Proposal: more general unionWith for Data.Map
Christian Sattler
sattler.christian at gmail.com
Tue Jan 24 18:35:11 CET 2012
Hi all,
There are some high-level operations on maps which take two tree
traversals only because the interface fails to expose sufficiently
general functions. This proposal is concerned with an analogue of
unionWithKey of type Ord k => (k -> a -> a -> Maybe a) -> Map k a -> Map
k a -> Map k a, with the intended semantics that if a key is present in
both maps and the operation applied to the key and corresponding values
returns Nothing, the key is deleted from the result.
Example use: sparse vectors are represented as maps from indices to
non-zero coordinates. Often, a highly sparse vector u needs to be a
added to a rather dense sparse vector v, i.e. we have m << n where m and
n are the counts of non-zero elements in u and v, respectively. I need
this operation to be O(m + n) as well as O(m * log(n)), and it should be
as efficient as reasonable possible. This seems like a straightforward
use case for unionWith since its hedge-union algorithm guarantees O(m *
log(n)). The only problem occurs when u[i] = x and v[i] = -x for some
index i and non-zero x, resulting in a zero entry I would rather have
deleted from the result (this is a common case when dealing with
matrices with small integer entries). The same problems occurs for
coordinate-wise vector multiplication over non-integral domains, this
time with intersectionWith.
Remarks:
- It would align the types of unionWith and intersectionWith with
differenceWith.
- The concept of allowing operations with return type Maybe a to avoid
an additional high-level operation is already expressed in functions
like alter or mapMaybeWithKey.
- It would enable an efficient implementation of symmetricDifference =
unionMaybeWithKey (const Nothing).
- This concerns Data.Map, Data.IntMap, Data.Set, Data.IntSet and the
With[Key] versions of union/intersection/difference
- At least for Data.Map.unionWithKey, modifying the existing
implementation seems trivial, requiring a merge instead of a join in the
new case.
I apologize if some of these points have been brought up before.
Christian
More information about the Libraries
mailing list