[Haskell-cafe] Re: Memoizing longest-common-subsequence

Wed Aug 23 05:43:30 EDT 2006

> From: Bayley, Alistair 
>
> The performance I'm alluding to is not something I've tested 
> with any rigor, or even metrics to support. I simply needed 
> to compare a couple of large text files which GNU diff didn't 
> handle (I think when the input is over a certain size it 
> breaks it into chunks and diffs the chunks).

Sorry, I'm telling complete porkies here...

I've found the folder in which I did some of this testing, and GNU diff
has no problem with the input files; they're only 7M, My program spends
70% of its time doing String-IO (so 30% in the algorithm), and peaks at
about 350M of memory (which seems quite high, but then it does convert
the String input into STArrays). The algorithm itself takes about 18secs
on this input (wall-clock time), but the profile says a lot less CPU
time is used.

I think using ByteStrings would be a big improvement; maybe I'll find
time to try that later.

Alistair
*****************************************************************
Confidentiality Note: The information contained in this message,
and any attachments, may contain confidential and/or privileged
material. It is intended solely for the person(s) or entity to
which it is addressed. Any review, retransmission, dissemination,
or taking of any action in reliance upon this information by
persons or entities other than the intended recipient(s) is
prohibited. If you received this in error, please contact the
sender and delete the material from any computer.
*****************************************************************