DiffArray Performance
Alastair Reid
alastair at reid-consulting-uk.ltd.uk
Mon Oct 27 16:17:05 EST 2003
> [Binary tree] seems to be taking about 4.8 seconds (of 5.1 seconds total
> program execution time) for the input I'm using. I thought using
> DiffArrays might be faster, but no such luck. Execution time rose
> to 9.5 *minutes*.
>
> Is this what I should expect to see?
You don't mention _how_ you use diffarrays so I'll guess you're keeping the
array in ascending order, finding the insertion point by binary search and
inserting by copying bigger elements one place to the right.
Using binary trees (the code you copied), the cost of N insertions will be O(N
* log N) assuming the input is in random order.
Using 'true' arrays (all operations constant time), the cost of copying values
will dominate and inserting N elements will take O(N^2) time. So, even in
ideal circumstances, I'd expect it to be much slower for large N.
I think DiffArrays where most elements have been assigned to more than once
have constant time update but linear time lookup. So the total cost of
moving the elements when you insert is still O(N^2) but finding where to
insert will cost something closer to O(N * N * log N) since the cost of a
single binary search will be O(N * log N).
Ignoring all the constant factors (a dubious thing to do) and choosing N=10^6,
these relative costs look something like this:
binary tree: 10^6 * 20 = 2 * 10^7
true array: 10^6 * 10^6 = 10^12
diff array: 10^6 * 10^6 * 20 = 2 * 10^13
(The constant overheads would have to be pretty large to mask differences like
this.)
--
Alastair Reid www.haskell-consulting.com
More information about the Glasgow-haskell-users
mailing list