Proposal: more general unionWith for Data.Map

Milan Straka fox at ucw.cz
Tue Jan 24 18:47:41 CET 2012


Hi,

> There are some high-level operations on maps which take two tree
> traversals only because the interface fails to expose sufficiently
> general functions. This proposal is concerned with an analogue of
> unionWithKey of type Ord k => (k -> a -> a -> Maybe a) -> Map k a ->
> Map k a -> Map k a, with the intended semantics that if a key is
> present in both maps and the operation applied to the key and
> corresponding values returns Nothing, the key is deleted from the
> result.
> 
> Example use: sparse vectors are represented as maps from indices to
> non-zero coordinates. Often, a highly sparse vector u needs to be a
> added to a rather dense sparse vector v, i.e. we have m << n where m
> and n are the counts of non-zero elements in u and v, respectively.
> I need this operation to be O(m + n) as well as O(m * log(n)), and
> it should be as efficient as reasonable possible. This seems like a
> straightforward use case for unionWith since its hedge-union
> algorithm guarantees O(m * log(n)). The only problem occurs when
> u[i] = x and v[i] = -x for some index i and non-zero x, resulting in
> a zero entry I would rather have deleted from the result (this is a
> common case when dealing with matrices with small integer entries).
> The same problems occurs for coordinate-wise vector multiplication
> over non-integral domains, this time with intersectionWith.
> 
> Remarks:
> - It would align the types of unionWith and intersectionWith with
> differenceWith.

FYI, development version of containers has generalized intersectionWith[Key] with type
  intersectionWithKey :: Ord k => (k -> a -> b -> Maybe c) -> Map k a -> Map k b -> Map k c

If others agree, it is indeed simple to generalize type of
unionWith[Key] from
  unionWithKey :: Ord k => (k -> a -> a -> a) -> Map k a -> Map k a -> Map k a
to
  unionWithKey :: Ord k => (k -> a -> a -> Maybe a) -> Map k a -> Map k a -> Map k a
These functions are used quite frequently in my opinion, so it will
probably result in quite a lot of code breaks (unionWith (++) has to be
changed to unionWith (\a b -> Just (a ++ b))).

Any opinion, noble almighty others?

Cheers,
Milan



More information about the Libraries mailing list