Proposal: containers: add indexing operations for Set

Milan Straka fox at ucw.cz
Thu Jul 5 17:01:16 CEST 2012


> > Agreed. These indexing methods work only for OrderededSet and
> > OrderedMap, not for HashMap and HashSet.
> >
> > If the Data.Set and Data.Map modules were called Data.OrdSet
> > and Data.OrdMap, then there would be no problem including the indexing
> > API in them. Personally I consider Data.Set to be Data.OrdSet, not
> > a generic Set API. If we every create a Set class, the indexing API will
> > definitely _not_ be part of it.
> 
> I've also considered Map and Set to be OrdMap and OrdSet for quite a
> while. If we ever add a type class to abstract over e.g. HashMap and
> Map we'd probably call it Data.Map.Class or something.
> 
> A potential downside to adding these functions is that I don't know if
> we can support them efficiently with a different implementation of
> ordered trees than the size-balanced ones. I still think we should add
> them though.

Any balanced tree can support indexing API, as long as every subtree
stores its size. That is obviously trivially true for size-balanced
trees, as the subtree sizes are used for balancing. If for example AVL
trees were used, sizes would have to be stored explicitly in every node
for indexing API to work fast.

The complexity of 'size' also depends on whether sizes are stored
explicitly in every node -- Data.Map.size is constant-time operation.

Consider IntMap -- there sizes are not stored in the nodes, so we have
no indexing API and 'size' is linear-time operation.

BTW, when I last benchmarked it, adding size to every node of
a red-black tree increased the time complexity of insert, lookup and
delete by 0-5%.


Nevertheless, I also think we should add indexing API to Data.Set.

Cheers,
Milan



More information about the Libraries mailing list