# [Haskell-cafe] ANN: unordered-containers - a new, faster hashing-based containers library

wren ng thornton wren at freegeek.org
Fri Feb 25 00:40:57 CET 2011

```On 2/23/11 8:06 PM, wren ng thornton wrote:
> On 2/23/11 4:42 PM, Sterling Clover wrote:
>> A quick grep of some of my own source reveals that I've used M.size
>> and S.size only to test for sizes equal to 1. So, for my purposes at
>> least, an O(1) isSingleton operation would be just as useful as an
>> O(1) size.
>
> I agree, a fast isSingleton function would cover a very common use
> case--- especially for set-like containers.

On reflection, in order to handle all the comparisons of size to some
small constant, it would be better to have two functions: one for
comparing the size of a collection against a boundary condition, and one
for comparing the size of two collections. That is, you should add
analogues of the functions in Data.List.Extras.LazyLength[1].

The lazy boundary condition function is O(min(m,n)) where m is the size
of the boundary and n is the size of the collection. For detecting
singletons, doubletons, etc, this reduces to O(1) though there may be a
constant factor hit vs dedicated isSingleton/isDoubleton/et functions.
It would be a slight difference that diminishes as m increases, but it
could be worth benchmarking...

Similarly, the lazy comparative size function is O(min(m,n)) where m and
n are the sizes of the two collections. This is always less than the
O(m)+O(n) of taking the sizes individually[2], but it's remarkably less
when one of the collections is known to be much smaller than the other
(even if you don't know which is which).

[1]

[2] The O(m)+O(n) can be parallelized to yield O(max(m,n)) but that's
still greater than O(min(m,n)). With some form of chunked-lazy natural
numbers you may be able to get the parallel version to approach
O(max(m,n)) and then get into a battle of constant factors.

--
Live well,
~wren

```