need help optimizing a function

David Roundy droundy@jdj5.mit.edu
Wed, 9 Oct 2002 14:29:26 -0400


On Tue, Oct 08, 2002 at 07:52:52AM -0700, Hal Daume III wrote:
> I already saw one reply mentioning using state threaded arrays.  This is
> probably your best (in terms of performance) option.
> 
> Keep in mind that Haskell arrays are *not* your standard imperative
> arrays.  Like imperative arrays, they give you O(1) lookup, but only
> O(n) insert.  If you want to keep with a functional style, I'd suggest
> using a different data structure for the data.  A FiniteMap should work
> fine and should give you pretty good (at least much
> better) performance.  And you won't have to give up the FP style (for
> whatever that's worth).

Thanks everyone for the advice! I think I now understand pretty well why my
old code didn't work.  I ended up switching to a binary tree structure,
since I didn't really need O(1) lookup anyways (since I had to do a binary
search in the first place to decide which element if any to modify.  This
changes the algorithm as a whole (I believe) from being O(n^2) back to O(n
log n) as it should be.

I get a speedup of about a factor of 6 for the test files I was using (of
course, this would depend on file size), and find that now only 2% of my
time is spent in that function.  I'm still something like 100 times slower
than GNU diff, so I think (hope, certainly!) there's still room for more
improvement (another day).  I'm sure I'll have more questions in a few
days, once I've tracked down what the new bottlenecks are.
-- 
David Roundy
http://civet.berkeley.edu/droundy/