set representation question
Stefan Karrmann
sk at mathematik.uni-ulm.de
Wed Nov 12 21:07:45 EST 2003
Dear Nicholas,
Nicholas Nethercote (Wed, Nov 12, 2003 at 11:32:54AM +0000):
> On Wed, 12 Nov 2003, Tom Pledger wrote:
>
> > Hal Daume III writes:
> > :
> > | *all* i care about is being able to quickly calculate the size of
> > | the intersection of two sets. these sets are, in general, very
> > | sparse, which means that the intersections tend to be small.
> > |
> > | for example, i might have two sets reprsented by the arrays:
> > |
> > | {0,1,10,346,398,1039,3289,3853,9811,89231,50913}
> > | {0,3,98,183,398,1038,5319,7642,9811,13893,93123}
> > |
> > | and all i need to be able to do is respond with "3" very very
> > | quickly. using sorted arrays, this takes O(n+m) time (where n and
> > | m are the sizes of the arrays).
> >
> > The total time (including the up front time for building the data
> > structure) can't go below O(n+m), because if it did, you'd be
> > neglecting to look at some of the elements at all.
>
> Isn't it O(min(m,n))? You don't have to look at all elements for the
> intersection. Eg:
>
> {0,1,10}
> {0,3,98,183,398,1038,5319,7642,9811,13893,93123}
O(f) describes the worst case of the algorithm. It is O((m,n)->m+n).
The average cost may be lower, but it depends on the distribution of the
data.
Sincerly,
--
Stefan Karrmann
More information about the Haskell
mailing list