AVL size

Isaac Dupree isaacdupree at charter.net
Mon Aug 20 14:43:43 EDT 2007


Adrian Hey wrote:
> Isaac Dupree wrote:
>> Since it's a fairly well balanced tree, I would think there could be 
>> an "approximate size" in O(log(n)) time?  Of course this would be 
>> another exported function... would making it produce equal results for 
>> all equal maps be difficult?
> 
> I did consider something like this. It's certainly true that you could
> derive definite upper and lower bounds on size in O(log n) time.
> 
> Trouble is, the result of this would depend of details of tree structure
> which AVL and Data.Map make non-observable by defining equality
> in terms of (association) lists. So if a function like this was exposed
> you could have two "equal" trees (Maps) that gave different answers.

Yes, assuming we can't come up with some clever way to avoid that (and 
it's not seeming possible to me at the moment)

> I should mention that there's no particular technical reason why an AVL
> based Data.Map implementation could not provide O(1) size and O(log n)
> indexing. I just didn't think it was worth the extra cost and effort on
> my part.
> 
> It would still cost O(n) in extra heap storage and indexing a Map is bit
> suspect IMO. The only reasonable use of indexing is covered by the
> clones OMap type or the AVL zipper I think, and I couldn't think of any
> good reason why anybody would want the size of a Map unless they were
> about to do something that was O(n) anyway (or worse).

Maybe a use being "if it's small, do something O(n^2)" or generally, 
choosing algorithm efficiency based on size (maybe random sampling if 
it's _really_ big...).  I don't suppose there's an easy way to offer 
caching of information like that to those who have a use for it AT THE 
SAME TIME AS efficiency, for those who don't...

I don't see a use for _indexing_ either. Reminds be a bit of C++ Boost's 
multi_container (I think) that could be a map and a sequence and another 
kind of map all at the same time, sort of...

>> Also there is are "small length"-like functions that tell what the 
>> size of something is, up to a point, or "more than that"; this can 
>> probably be done with lazy "elems"? (e.g. minMapSize of two maps, at 
>> O(min(m,n))
> 
> Yes, something like this would be possible and easy.
> I'll think about adding it.

Consider that you need something isomorphic to the lazy Peano numeral in 
order to compare any two arbitrary possibly-very-large structures' 
sizes... so I don't know what a really good interface for that is.

Isaac


More information about the Libraries mailing list