Proposal: add 'findLess' and variants to containers
Twan van Laarhoven
twanvl at gmail.com
Fri Feb 17 23:20:10 CET 2012
On 2012-02-17 22:21, Twan van Laarhoven wrote:
> On 16/02/12 20:37, Johan Tibell wrote:
>> Hi,
>>
>> First, thanks for writing such a detailed proposal.
>>
>> I don't have strong feelings one way or the other. Like always I'm
>> worried about API growth. How about performance? Can these functions
>> be implemented efficiently outside the library?
>
> I also did some performance tests, comparing an implementation that uses only
> the exposed API (split,findMin) to one that uses the constructors. The benchmark
> code is attached, it is mostly copy-pasted from benchmarks/Map.hs. Here is a
> general idea of the results (ghc 7.0.4 on windows 7) :
>
> I haven't yet compared the performance for Set yet (which I expect will be
> similar), nor for IntMap (where I have no clue).
Here are the results for IntMap:
benchmarking GE split present
mean: 313.7273 us, lb 312.2985 us, ub 314.8706 us, ci 0.950
std dev: 7.010637 us, lb 4.061641 us, ub 9.947760 us, ci 0.950
benchmarking GE def present
mean: 197.7439 us, lb 195.8572 us, ub 199.6307 us, ci 0.950
std dev: 9.474155 us, lb 9.388414 us, ub 9.482203 us, ci 0.950
benchmarking GE split absent
mean: 625.5818 us, lb 621.1370 us, ub 630.5826 us, ci 0.950
std dev: 24.49309 us, lb 21.45281 us, ub 26.63424 us, ci 0.950
benchmarking GE Craig absent
mean: 239.5751 us, lb 237.8357 us, ub 241.3148 us, ci 0.950
std dev: 9.259975 us, lb 7.455007 us, ub 11.33443 us, ci 0.950
benchmarking GE def absent
mean: 186.1479 us, lb 184.3935 us, ub 187.7270 us, ci 0.950
std dev: 8.600419 us, lb 8.080284 us, ub 8.809609 us, ci 0.950
As you can see, when the key is present in the map, a specialized version
("def") takes 2/3 as long as one built on top of split. But when the key is not
in the map, the difference becomes larger, around 1/3. For findGreater (as
opposed to findGreaterEqual), we are essentially always in the 'not found in the
map' case.
By the way, in all cases, both for Map and for IntMap, passing a default
argument is better than returning nothing and then later unpacking that again.
What I mean is:
go _ Tip = Nothing
go k (Bin _ kx x l r) =
case compare k kx of
LT -> case go k l of
Nothing -> Just (kx,x)
ret -> ret
GT -> go k r
EQ -> Just (kx,x)
vs.
go def _ Tip = def
go def k (Bin _ kx x l r) =
case compare k kx of
LT -> go (Just (kx,x)) k l
GT -> go def k r
EQ -> Just (kx,x)
If you want to include the attached code in the library, you should use
Map_FindGE.findGreater{Equal}3 and IntMap_FindGE.findGreater{Equal}3, those are
the most efficient versions.
Twan
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: IntMap_FindGE.hs
URL: <http://www.haskell.org/pipermail/libraries/attachments/20120217/da6a08fe/attachment.asc>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: findGE-IntMap.hs
URL: <http://www.haskell.org/pipermail/libraries/attachments/20120217/da6a08fe/attachment.txt>
More information about the Libraries
mailing list