need help optimizing a function
David Roundy
droundy@jdj5.mit.edu
Wed, 9 Oct 2002 14:29:26 -0400
On Tue, Oct 08, 2002 at 07:52:52AM -0700, Hal Daume III wrote:
> I already saw one reply mentioning using state threaded arrays. This is
> probably your best (in terms of performance) option.
>
> Keep in mind that Haskell arrays are *not* your standard imperative
> arrays. Like imperative arrays, they give you O(1) lookup, but only
> O(n) insert. If you want to keep with a functional style, I'd suggest
> using a different data structure for the data. A FiniteMap should work
> fine and should give you pretty good (at least much
> better) performance. And you won't have to give up the FP style (for
> whatever that's worth).
Thanks everyone for the advice! I think I now understand pretty well why my
old code didn't work. I ended up switching to a binary tree structure,
since I didn't really need O(1) lookup anyways (since I had to do a binary
search in the first place to decide which element if any to modify. This
changes the algorithm as a whole (I believe) from being O(n^2) back to O(n
log n) as it should be.
I get a speedup of about a factor of 6 for the test files I was using (of
course, this would depend on file size), and find that now only 2% of my
time is spent in that function. I'm still something like 100 times slower
than GNU diff, so I think (hope, certainly!) there's still room for more
improvement (another day). I'm sure I'll have more questions in a few
days, once I've tracked down what the new bottlenecks are.
--
David Roundy
http://civet.berkeley.edu/droundy/