# from array update algorithm to nice Haskell code

Daan Leijen daanleijen at xs4all.nl
Sat Jan 10 10:24:43 EST 2004

```Hi Wolfgang,

>> I think that you are mixing up worst case bounds O(..) with some specific
>> case. I guess that you mean by "O(1) elements" a known and constant number
>> of elements.
>
> I mean a constant maximum number of elements.  An example would be that the
> set b is guaranteed to have not more than one element.

Ah, I think I have the gist of your message now. For DData.Set, the
worst case bound for intersect is O(n+m), however, for your particular
case, it will behave as you want. i.e. for an intersection where one
of the sets is empty, it will run in constant time; for an intersection
where one of the sets has one element, it will take at most O(log n) time
(where n is the number of elements in the other set). This is again a
worst case bound: if n is zero (the empty set), it will be a constant
time operation. Even stronger, if you ask for the intersection of two large
sets that have no common elements, the operation will take (log n + log m) time.

Still, in the worst case, where both trees have about the same size and the
same kind of elements, the algorithm will degenerate to O(n+m). (btw. this is not
the case when you would simply iterate over one set and test membership of
the other (= O(m*log n)))

I am not entirely sure if this also holds for Data.Set, but I guess it is
the same. A small issue is that it *does* pay off to swap the arguments if
their size differs (just as in your list example), and I don't know
if Data.Set does that -- you should look in the CVS sources to find out.

If you think that application to the sets with single values happens often,
maybe I should add some special-case code to make this more (absolutely) efficient.

All the best,
Daan.

>> However, in the worst case, this will degenerate to some "m" number of
>> elements, and your function would take O(m*log n) time, i.e. much worse than
>> O(n+m).
>
> I don't understand this.  If m doesn't depend on any input values but is
> really constant then O(m * log n) = O(log n)—regardless of how large m is.
>
> Well, the specification of O(n + m) time does only say that the time is less
> than c * (n + m) of some constant c which is no contradiction to an O(log n)
> time.  But it leaves the possibility open that the calculation really needs
> linear time.
>
>> All the best,
>>   Daan.
>
>> [...]
>
> Wolfgang
>
>  Schöning, Theoretische Informatik – kurzgefasst, Spektrum Akademischer
> Verlag
>
> _______________________________________________