need help optimizing a function

Hal Daume III hdaume@ISI.EDU
Tue, 8 Oct 2002 07:52:52 -0700 (PDT)


I already saw one reply mentioning using state threaded arrays.  This is
probably your best (in terms of performance) option.

Keep in mind that Haskell arrays are *not* your standard imperative
arrays.  Like imperative arrays, they give you O(1) lookup, but only
O(n) insert.  If you want to keep with a functional style, I'd suggest
using a different data structure for the data.  A FiniteMap should work
fine and should give you pretty good (at least much
better) performance.  And you won't have to give up the FP style (for
whatever that's worth).

 - Hal

p.s., I sense your next question is going to be something like "why can't
the compiler detect that the array can be updated in place instead of
copied" and the answer, from what i can tell, is simply that it doesn't
try.

--
Hal Daume III

 "Computer science is no more about computers    | hdaume@isi.edu
  than astronomy is about telescopes." -Dijkstra | www.isi.edu/~hdaume

On Tue, 8 Oct 2002, David Roundy wrote:

> Hello.  If you followed my previous post (which was about a bug in my lcs
> function), this is still the same code (only now bug-free, it seems).
> 
> It turns out that my lcs diff function is MUCH slower than GNU diff on some
> test files (much worse than 1000 time slower!).The test case where its
> speed was close was IO bound on the output to an xterm (silly me).
> 
> I've profiled my lcs code, and it is spending about 95% of its time in the
> "hunt_one_char" function.  It also claims to spend 0.0% of its time in
> my_bs, which does a binary search on the array, and is the only other
> function that gets called by hunt_one_char (not counting itself).
> 
> I've told the compiler (ghc) to inline hunt_one_char and my_bs to no avail.
> I guess maybe it can't or won't inline a recursive function, or maybe that
> just doesn't gain me anything.
> 
> My only guess is that for some reason it is making an entirely new copy of
> the array each time it recurses, but that doesn't make any sense, since the
> array is getting threaded through, so it should be able to just modify the
> array in place.  But then, I'm not entirely clear as to what the compiler
> can and cannot do as it optimizes.
> 
> >>data Threshold a = Thresh Int! [a]
> >>                 deriving (Show)
> >>
> >>hunt_one_char :: String -> [Int] -> Array Int (Threshold String) ->
> >>                 Array Int (Threshold String)
> >>hunt_one_char c [] th = th
> >>hunt_one_char c (j:js) th =
> >>    case my_bs (Thresh j [c]) th of
> >>    Nothing -> hunt_one_char c js th
> >>    Just k ->
> >>        case th!(k-1) of
> >>        Thresh _ rest ->
> >>            hunt_one_char c js th//[(k,Thresh j (c:rest))]
> 
> For what it's worth, the calling function is:
> 
> >>hunt_internal :: [String] -> [[Int]] -> 
> >>                 Array Int (Threshold String) ->
> >>                 Array Int (Threshold String)
> >>hunt_internal [] _ th = th
> >>hunt_internal _ [] th = th
> >>hunt_internal (c:cs) (m:ms) th =
> >>    hunt_internal cs ms $ hunt_one_char c m th
> 
> Each string in this list of strings is a line from one of the files (and
> the [[Int]] is a list of line numbers in the other file that that line
> matches).  The array 'th' has size (1,n) where n is the number of lines in
> the file.
> 
> My simplest test case consists of a 20k line file, each line of which is
> unique and a second file which is simply a permutation of the first (moving
> the first line to the last), so hunt_one_char should be called exactly once
> for each line, and it is always called with its second argument having
> exactly one element.  In this test case, according to the ghc profiler,
> 89.7% of the time is spent in hunt_one_char:
>                                               individual     inherited
> COST CENTRE              MODULE     entries  %time %alloc   %time %alloc
>       hunt_internal      Main         20001    0.2   0.0     91.4  95.3
>        hunt_one_char     Main         40000   89.7  95.0     91.1  95.3
>         my_bs            Main         20000    0.2   0.0      1.4   0.3
>          my_helper_bs    Main        307231    1.2   0.3      1.2   0.3
> 
> This has me at a loss.  It seems like such a simple function... of course,
> it is called 20k times, which could certainly add up, but while each call
> of my_bs should take O(log 20k) time, each call of hunt_one_char (apart
> from its one call to my_bs) should only take O(1), so most of the time
> should be spent in my_bs unless something is terribly wrong (which must be
> the case here).
> 
> If it would help, I'd be happy to send a listing of the complete code, but
> it's 6.5k so I figured it'd be better not to send it, since it seems
> unlikely that anyone would want to run it anyways.
> -- 
> David Roundy
> http://civet.berkeley.edu/droundy/
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>