[Haskell-cafe] Substring replacements
Branimir Maksimovic
bmaxa at hotmail.com
Wed Dec 14 04:16:36 EST 2005
>From: Daniel Fischer <daniel.is.fischer at web.de>
>To: "Branimir Maksimovic" <bmaxa at hotmail.com>
>CC: Haskell-Cafe at haskell.org
>Subject: Re: [Haskell-cafe] Substring replacements
>Date: Tue, 13 Dec 2005 11:23:29 +0100
>
After seeing that your program is fastest (I've also tried one from
http://haskell.org/hawiki/RunTimeCompilation but perhaps I'm not
that good in converting to search replace?) I've decided to
try with Rabin-Karp algorithm.
This algorithm performs same operation as straightforward search,
but compares hashes instead of chars.
With ability to rotate hash (remove first, add next) characters
there is also optimisation, that hash is calculated only for single
next character rather again for whole substring.
Unfortunatelly on my machine it is very cheap to compare
characters so with my test hashing overweights character compare,
except in your test when hash searching is faster then straightforward
search.
This is best I can write in terms of performance and readability.
I've tried with getFst that returns Maybe but it was slower so I decided
to return '\0' in case that argument is empty list, which renders '\0'
unusable, but then I really doubt that 0 will be used in strings.
-- Rabin-Karp string search algorithm, it is very effective in searching of
set
-- of patterns of length n on same string
-- this program is for single pattern search, but can be crafted
-- for multiple patterns of length m
hSearchReplace :: String -> String -> String -> String
hSearchReplace sr rp xs
| not (null remaining) = found ++ rp
++ hSearchReplace sr rp (drop (length sr)
remaining)
| otherwise = found
where
(found,remaining) = hSearch sr xs
hSearch :: String -> String -> (String,String)
hSearch sr xs = hSearch' sr xs hcmp ""
where
hsrch = hash sr
hcmp = hash $ take ls xs
cmp = take ls xs
ls = length sr
hSearch' [] xs _ _= (xs,[])
hSearch' sr [] _ fndFail = (reverse fndFail,[])
hSearch' srch xxs@(x:xs) hcmps fndFail
= if hsrch == hcmps
then if isPrefixOf srch xxs
then (reverse fndFail,xxs)
else searchAgain
else searchAgain
where
searchAgain
= hSearch' srch xs
(hashRotate (getFst xxs) (getFst nextxxs) (ls-1)
hcmps)
(x:fndFail)
nextxxs = drop ls xxs
getFst :: String -> Char
getFst [] = '\0'
getFst (a:as) = a
hash :: String -> Int
hash str
= hash' str (length str - 1)
where
hash' :: String -> Int -> Int
hash' [] _ = 0
hash' (s:str) pow = (101 ^ pow) *(fromEnum s)
+ hash' str (pow-1)
hashRotate :: Char -> Char -> Int -> Int -> Int
hashRotate cout cin pow hsh
= (hsh - ((101 ^ pow) * (fromEnum cout)))*101
+ (fromEnum cin)
Greetings, Bane.
_________________________________________________________________
Express yourself instantly with MSN Messenger! Download today it's FREE!
http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/
More information about the Haskell-Cafe
mailing list