Proposal: IntMap.differenceKeysSet for removing an IntSet of keys

Milan Straka fox at ucw.cz
Wed Jun 15 10:40:25 CEST 2011


Hello,

thanks for patience with me.

<snip>
> > My point was, that there are many combinations you can think of --
> > IntMap minus IntSet, IntMap minus Map Int, IntMap minus list of Ints,
> > IntMap union Map Int, IntMap intersection IntSet, IntMap intersection
> > Set Int and so on.
> 
> Actually, it's just IntMap and IntSet, since they share the same
> underlying patricia tree implementation.
<snip>
> > My biggest problem is "Does difference with IntSet deserves to have
> > a special function implementing it?" I do not believe so.
> 
> The point is that IntSet and IntMap are identical internally.

The same implementation means we can implement those. But does it also
imply a reason for doing so? I am not sure we are doing it just because
we can or because it is useful.

Does some other people think it is worth having
   intersection m (fromSet f s)    and    difference m (fromSet g s)
perform equally well as intersection/difference between pairs of maps,
instead of a penalty for converting the set to the map?

> > But I do not like depending on rules alone either. If we promise that
> > difference map (fromSet set) works without creating intermediate map,
> > then it
> > - works only for GHC
> > - is nontrivial to use -- which functions do really fuse with fromSet?
> 
> I'm not promising anything performance-wise… it still produces the
> same result without the rules, just that an intermediate IntMap will
> be produced and consumed. With the rules, it's the same O(m+n) time
> complexity, but with a much smaller constant.
> 
> How is this any more controversial than list fusion?

Hm, you are right. I was still thinking about deterministically calling
differenceKeysSet instead of just invisible optimization.

> > For me, right now no functions working with different data structures
> > are worth it. The API is complicated as it is and I do not feel adding
> > cross-structure functions is a good thing to do. I am happy with
> > being able to convert between the structures efficiently.
> 
> For the record, the only API extension I'm proposing is:
> 
>     fromSet :: (Key -> a) -> IntSet -> IntMap a

and I am happy with this one.

> Nothing else changes, as far as users are concerned, except that
> 
>     intersection m (fromSet f s)    and    difference m (fromSet g s)
> 
> will perform equally well as intersection/difference between pairs of maps.

The last thing I am worried about is whether the new duplicated code
(intersectionKeysSet and differenceKeysSet) is worth the speedup. I will
wait for some other opinions on this.

> > Thanks for all the pathes! The patch 0006 is unfortunately missing the
> > new module and also the changes to IntSet.
> 
> Those are in patch 0001.

Oh, I am sorry I did not read the first one.


Also, the IntMap is considered to be a replacement of Map -- I would
like to add the fromSet method to the Map module too. But I am not sure
I want to add the specialization of intersectionKeysSet and
differenceKeysSet, at least not now -- I want to tune the representation
of Map and Set and less code I need to change the better.

Cheers,
Milan



More information about the Libraries mailing list