Proposal: IntMap.differenceKeysSet for removing an IntSet of keys

Milan Straka fox at ucw.cz
Wed Jul 13 12:45:58 CEST 2011


Hi,

> On 13 June 2011 21:40, Milan Straka <fox at ucw.cz> wrote:
> > My point was, that there are many combinations you can think of --
> > IntMap minus IntSet, IntMap minus Map Int, IntMap minus list of Ints,
> > IntMap union Map Int, IntMap intersection IntSet, IntMap intersection
> > Set Int and so on.
> 
> Actually, it's just IntMap and IntSet, since they share the same
> underlying patricia tree implementation.
> 
> > I am quite unhappy about exporting differenceKeysSet and other hybrid functions
> > working on different data structures.
> 
> Agreed. I don't think we should export differenceKeysSet either.
> 
> > But I do not like depending on rules alone either. If we promise that
> > difference map (fromSet set) works without creating intermediate map,
> > then it
> > - works only for GHC
> > - is nontrivial to use -- which functions do really fuse with fromSet?
> 
> I'm not promising anything performance-wise… it still produces the
> same result without the rules, just that an intermediate IntMap will
> be produced and consumed. With the rules, it's the same O(m+n) time
> complexity, but with a much smaller constant.
> 
> How is this any more controversial than list fusion?
> 
> > My biggest problem is "Does difference with IntSet deserves to have
> > a special function implementing it?" I do not believe so.
> 
> The point is that IntSet and IntMap are identical internally.
> 
> > For me, right now no functions working with different data structures
> > are worth it. The API is complicated as it is and I do not feel adding
> > cross-structure functions is a good thing to do. I am happy with
> > being able to convert between the structures efficiently.
> 
> For the record, the only API extension I'm proposing is:
> 
>     fromSet :: (Key -> a) -> IntSet -> IntMap a
> 
> Nothing else changes, as far as users are concerned, except that
> 
>     intersection m (fromSet f s)    and    difference m (fromSet g s)
> 
> will perform equally well as intersection/difference between pairs of maps.
> 
> > Thanks for all the pathes! The patch 0006 is unfortunately missing the
> > new module and also the changes to IntSet.
> 
> Those are in patch 0001.

Sorry for the long inactivity. Now we have a repo on github, so I am
returning to this.

I want to add fromSet to Data.Map and Data.IntMap.

For the difference specialization, could you please benchmark what is
the difference of having differenceKeysSet versus not having and
performing difference + fromSet? We should verify that there is
reasonable benefit in adding differenceKeysSet.

I am now leaving on vacation and won't be reading mail till Mon 17.

Cheers,
Milan



More information about the Libraries mailing list