Proposal: add 'findLess' and variants to containers

Twan van Laarhoven twanvl at gmail.com
Sun Feb 12 14:14:46 CET 2012


Hello all,


== Overview ==

I have several times needed functions to find items less than a given key in a 
container. So I propose we add these to the containers library:

     -- | /O(log n)/.
     --   Find the largest element in the set that is @<@ the query.
     findLess :: Ord a => a -> Set a -> Maybe a
     -- | /O(log n)/.
     --   Find the largest element in the set that is @<=@ the query.
     findLessEqual :: Ord a => a -> Set a -> Maybe a
     -- | /O(log n)/.
     --   Find the smallest element in the set that is @>@ the query.
     findGreater :: Ord a => a -> Set a -> Maybe a
     -- | /O(log n)/.
     --   Find the smallest element in the set that is @>=@ the query.
     findGreaterEqual :: Ord a => a -> Set a -> Maybe a

And for Maps:

     findLess, findLessEqual, findGreater, findGreaterEqual
        :: Ord a => a -> Map a b -> Maybe (a,b)

As well as the equivalent functions for IntSet and IntMap.


== Previous proposal and related functions ==

There was a previous proposal to add such functions about two years ago [1]. 
That proposal died due to concerns over adding related 'delete' and 'update' 
functions, and that the API would become too large.

I think the 'find' functions should be added regardless of whether we also add 
'update' and 'delete' variants. On the other hand, for consitency it would be 
better to also add 'deleteLess', etc.

We should perhaps mimic the findMin/findMax functions. The only issue with that 
idea is that findMin does not return a Maybe, but rather fails for an empty 
container. I think that not returning its result in a Maybe would make the 
findLess useless in practice, since there is no such easy precondition for findLess.

As for the API size: This proposal would add a lot of functions to Data.Map and 
Data.Set (especially when we also include 'update' and 'delete' variants). 
However, the mental burden of these new functions is not so large. Library users 
already know what 'find', 'update' and 'delete' mean. Moreover, there is a nice 
symmetry between the 4 different conditions, which means that you can almost 
think of them as one thing.


== Bikeshedding / naming ==

There are several possible names for these functions:
   1. findLess / findLessEqual / findGreater / findGreaterEqual
   2. findMaxLT / findMaxLE / findMinGT / findMinGE
   3. findLower / findFloor / findHigher / findCeiling

I personally prefer either 1 or 2. To me option 3 (which Java uses) is very 
confusing.


== Implementation ==

These functions can be implemented in terms of split, but it is a pain (and 
error prone) to do so:

     findLess         a m = findMax' (fst (split a m))
     findLessEqual    a m = case splitLookup a s of
                              (_,Just b,_)   -> Just (a,b)
                              (m',Nothing,_) -> findMax' m'
     findGreater      a m = findMin' (snd (split a m))
     findGreaterEqual a m = case splitLookup a m of
                              (_,Just b,_)   -> Just (a,b)
                              (_,Nothing,m') -> findMin' m'

where

     findMin' = fmap fst . minViewWithKey
     findMax' = fmap fst . maxViewWithKey

This is really something that shouldn't have to be done by users of the library. 
And we might also be able to come up with a more efficient implementation.


== Discussion period ==

Discussion deadline: 2 weeks from now; Sunday, 26 February 2012.


Twan


[1] http://www.haskell.org/pipermail/libraries/2010-February/013060.html
      continued here:
     http://www.haskell.org/pipermail/libraries/2010-March/013062.html




More information about the Libraries mailing list