[Haskell-cafe] Substring replacements (was: Differences in
optimisiation ...)
Daniel Fischer
daniel.is.fischer at web.de
Sat Dec 10 16:56:10 EST 2005
Am Samstag, 10. Dezember 2005 02:51 schrieben Sie:
> From: Daniel Fischer <daniel.is.fischer at web.de>
>
> >To: "Branimir Maksimovic" <bmaxa at hotmail.com>
> >CC: Haskell-Cafe at haskell.org
> >Subject: Re: Differences in optimisiation with interactive and compiled mo
> >Date: Fri, 9 Dec 2005 23:27:00 +0100
> >
> >Still doesn't work, though:
> >
> >*Main> searchr "hahal" "jupp" "hahahalala"
> >"hahahalala"
> >
> >The problem is that the string to replace may contain a repeated pattern
> >and the pattern that begins the actual occurence might be consumed before
> > a failure is detected.
>
> Yes, I've corrected it. Now it is just 25% faster and that is only with -O2
> flag.
> Here is whole thing, I hope there are no more bugs left :) :
>
None that sprang to my eyes. However, on my machine, yours is not faster than
Lemmih's.
Now, using the new Strings, I get the following times:
-O2 -O1 no opt
Lemmih's: 38.9 sec 38.9 sec 76.7 sec
Yours : 40.1 sec 41.5 sec 131.1 sec
Mine : 32.9 sec 33.1 sec 82.8 sec.
However, there's a problem with Lemmih's replace:
*Main> searchr "ababcab" "###" "ababcababcabab"
"###abcab"
*Main> replace "ababcab" "###" "ababcababcabab"
"ababc###ab"
due to the fact that Lemmih's version scans the input from right to left
(that's easily changed by a few reverses, though -- but costly for long
inputs), more serious is
Prelude Main> replace "ja" "aja" "jjjjjjja"
"ajajajajajajaja".
The fastest -- and nicely simple above -- that I could come up with is
replace :: String -> String -> String -> String
replace _ _ "" = ""
replace "" _ str = str
replace src dst inp
= process inp
where
n = length src
process "" = ""
process st@(c:cs)
| src `isPrefixOf` st = dst ++ process (drop n st)
| otherwise = c:process cs
It's roughly 10% faster than my other version on "seasearch" ...
and if you try it on
main2 :: IO ()
main2 = let src = replicate 1000 'r'
dst = " # "
str = replicate 999 'r' ++ 'c': replicate 1000 'r'
out = replace src dst $ concat $ replicate 500 str
out1 = replace src dst $ concat $ replicate 501 str
in do putStrLn $ "Working very long"
putStrLn $ show (out == out1) ++ "\nDone"
you'll see a real difference. I'm not sure, why your algorithm pays a so much
higher penalty, though. Maybe, it'll be faster if you make searchr' &c local
functions? I'll try.
Cheers,
Daniel
More information about the Haskell-Cafe
mailing list