[Haskell-cafe] Substring replacements
Daniel Fischer
daniel.is.fischer at web.de
Wed Dec 21 09:20:18 EST 2005
Am Mittwoch, 21. Dezember 2005 08:18 schrieb Branimir Maksimovic:
> From: Bulat Ziganshin <bulatz at HotPOP.com>
>
> >Reply-To: Bulat Ziganshin <bulatz at HotPOP.com>
> >To: "Branimir Maksimovic" <bmaxa at hotmail.com>
> >CC: haskell-cafe at haskell.org
> >Subject: Re: [Haskell-cafe] Substring replacements
> >Date: Tue, 20 Dec 2005 23:55:22 +0300
> >
> >Hello Branimir,
> >
> >Tuesday, December 20, 2005, 9:48:48 PM, you wrote:
> >
> >BM> I've finally performed test on amd64 and result is a same as on intel.
> >BM> KMP always wins. So KMP is best suited for non indexed strings
> >BM> and I guess should be used in library as prefered search/replace
> >method.
> >BM> This test favors straightforward search.
> >
> >i'm 90% sure that straightforward method must be faster for one-time
> >searches.
I'm far less sure of that. If you have a really short search-pattern, I think
that probably straightforward search is indeed faster (at least, if the
search-pattern isn't supplied at compile-time). But if you have a pattern of
length 100, say, and lots of relatively long partial matches, chances are
that KMP will move on something like 50 chars at once, which would have to be
re-checked by straightforward. I've no idea, when that would outweigh the
extra time of building the KMP-arrays, though. In my test -- extremely
unrealistic, but provides a more or less worst case example --
straightforward must make roughly half a million character comparisons to
pass each 'c', while KMP does 2000 (+/-2) (one way through the array and
back), so there are situations where KMP definitely is faster. But
ordinarily, on my computer, your version of straightforward is 10-15% faster
than KMP (search-patterns are now supplied on the command line -- which has
no big impact; searched string is " able sea..."; all compiled with -O2;
NOINLINE makes no difference -- at least with optimisations on --; without
optimisations, KMP suffers far worse than straightforward).
>
> KMP is O(m) while straightforward is O(m*n).
Where m is the length of the input and n is the length of the searched-for
pattern, I think?
But these are worst-case complexities, I believe, ordinarily, straightforward
will be O(m), too.
>
> your test may give better results with KMP algorithm just
>
> >because you repeat the same search many times and it was automatically
> >"run-time compiled" as described on the wiki page about KMP
My results seem to witness against that.
>
> My test favors straightforward, in any other case KMP wins by order of
> magnitude.
Can you give example tests?
> I think that straightfoirward is better then KMP with optimisations
> off is due more complex program.
>
> >try to add
> >
> >{-# NOINLINE replace #-}
> >
> >to both programs and repeat comparision
>
> Greetings, Bane.
>
Cheers,
Daniel
More information about the Haskell-Cafe
mailing list