Proposal: IntMap.differenceKeysSet for removing an IntSet of keys

Milan Straka fox at ucw.cz
Mon Jun 13 14:40:33 CEST 2011


> Milan Straka <fox <at> ucw.cz> writes:
> > > Ticket: http://hackage.haskell.org/trac/ghc/ticket/5242
> > another problem is that more methods could be overloaded for IntSets --
> > also union and intersect. Adding the variant only for difference seems
> > quite arbitrary, but adding all is bloating the API.
> 
> For intersection, sure. I can see plausible use-cases for this too.
> Patch 0007.
> 
> For union, where would it get a value from, if a key was found in
> the set but not the map? Take a function to generate the value from
> the key like fromSet? Given union is left-biased in its effect but
> not in its type, I'm going to need separate versions to handle the
> two ways a map and a set can be combined. Not sure I want to go down
> this route, especially as I can't think of a good use-case.

My point was, that there are many combinations you can think of --
IntMap minus IntSet, IntMap minus Map Int, IntMap minus list of Ints,
IntMap union Map Int, IntMap intersection IntSet, IntMap intersection
Set Int and so on.

With fromSet and fromDistinctAscList, you can handle all of these easily,
although with some time penalty. For example the union case,
  union map (fromSet (const 1) set)
seems fine.

> > I liked the second idea with the
> >   fromSet :: (Key -> a) -> IntSet.IntSet -> IntMap a
> > but I would not add the rewriting rules. The intermediate IntMap has to
> > be constructed, but I think it is reasonable price for doing
> > cross-structure operations and not extending the API.
> 
> I don't understand your reasoning behind not adding the rewrite rules.
> Simply don't export differenceKeysSet, as patch 0005 does.

(Johan, please read this too.)

I am quite unhappy about exporting differenceKeysSet and other hybrid functions
working on different data structures.

But I do not like depending on rules alone either. If we promise that
difference map (fromSet set) works without creating intermediate map,
then it
- works only for GHC
- is nontrivial to use -- which functions do really fuse with fromSet?

My biggest problem is "Does difference with IntSet deserves to have
a special function implementing it?" I do not believe so. There are many
functions that could fuse with fromSet, you could even implement better
implementations of e.g. union with Map Int. But how do we measure which
combinations are worth it?

For me, right now no functions working with different data structures
are worth it. The API is complicated as it is and I do not feel adding
cross-structure functions is a good thing to do. I am happy with
being able to convert between the structures efficiently.

Johan, what is your opinion?

> 
> > Instead we could make fromSet more efficient -- if the IntMap could
> > access the internal representation of IntSet, the can implement fromSet
> > just by adding values to each node.
> 
> Right. I factored "data IntSet" out to a separate module so that
> this was possible. This takes its time complexity from O(n*min(n,W))
> down to O(n), I believe.
> 
> > The strictness annotations are a bit
> > harmful here -- if they were not there, only the parts of the structure
> > needed for the following difference (union, intersect) would be
> > constructed. BTW, seeing the IntSet representation in IntMap would allow
> > us to define keysSet more efficient too.
> 
> Done both! Patch 0006.

Thanks for all the pathes! The patch 0006 is unfortunately missing the
new module and also the changes to IntSet.

Cheers,
Milan



More information about the Libraries mailing list