[Haskell-cafe] Substring replacements

Daniel Fischer daniel.is.fischer at web.de
Wed Dec 21 09:20:18 EST 2005


Am Mittwoch, 21. Dezember 2005 08:18 schrieb Branimir Maksimovic:
> From: Bulat Ziganshin <bulatz at HotPOP.com>
>
> >Reply-To: Bulat Ziganshin <bulatz at HotPOP.com>
> >To: "Branimir Maksimovic" <bmaxa at hotmail.com>
> >CC: haskell-cafe at haskell.org
> >Subject: Re: [Haskell-cafe] Substring replacements
> >Date: Tue, 20 Dec 2005 23:55:22 +0300
> >
> >Hello Branimir,
> >
> >Tuesday, December 20, 2005, 9:48:48 PM, you wrote:
> >
> >BM> I've finally performed test on amd64 and result is a same as on intel.
> >BM> KMP always wins. So KMP is best suited for non indexed strings
> >BM> and I guess should be used in library as prefered search/replace
> >method.
> >BM> This test favors straightforward search.
> >
> >i'm 90% sure that straightforward method must be faster for one-time
> >searches.

I'm far less sure of that. If you have a really short search-pattern, I think 
that probably straightforward search is indeed faster (at least, if the 
search-pattern isn't supplied at compile-time). But if you have a pattern of 
length 100, say, and lots of relatively long partial matches, chances are 
that KMP will move on something like 50 chars at once, which would have to be 
re-checked by straightforward. I've no idea, when that would outweigh the 
extra time of building the KMP-arrays, though. In my test -- extremely 
unrealistic, but provides a more or less worst case example -- 
straightforward must make roughly half a million character comparisons to 
pass each 'c', while KMP does 2000 (+/-2) (one way through the array and 
back), so there are situations where KMP definitely is faster. But 
ordinarily, on my computer, your version of straightforward is 10-15% faster 
than KMP (search-patterns are now supplied on the command line -- which has 
no big impact; searched string is " able sea..."; all compiled with -O2; 
NOINLINE makes no difference -- at least with optimisations on --; without 
optimisations, KMP suffers far worse than straightforward).

>
> KMP is O(m) while straightforward is O(m*n).

Where m is the length of the input and n is the length of the searched-for 
pattern, I think?
But these are worst-case complexities, I believe, ordinarily, straightforward 
will be O(m), too.

>
> your test may give better results with KMP algorithm just
>
> >because you repeat the same search many times and it was automatically
> >"run-time compiled" as described on the wiki page about KMP

My results seem to witness against that.

>
> My test favors straightforward, in any other case KMP wins by order of
> magnitude.
Can you give example tests?

> I think that straightfoirward is better then KMP with optimisations
> off is due more complex program.
>
> >try to add
> >
> >{-# NOINLINE replace #-}
> >
> >to both programs and repeat comparision
>
> Greetings, Bane.
>
Cheers,
Daniel



More information about the Haskell-Cafe mailing list