[Haskell-cafe] Substring replacements
bmaxa at hotmail.com
Thu Dec 15 13:21:48 EST 2005
This is what I got for BM. Performance dissapoints as BM is really
suited for indexed strings like arrays.It mainly operates on indexes.
This is simple BM, as I didn't want to go
for more complex variant,becauses takes and drops and recalculation of next
is too pricey for non indexed structure. So, clear winner is KMP for
non indexed strings. There is also finite automaton algorithm
but this works well if search strings are precompiled, so I'll
implement it only for education purposes. I hope my Haskell improves
as I've learned how to reduce number of paramaters.
searchReplaceBM :: String -> String -> String -> String
searchReplaceBM "" _ str = str
searchReplaceBM sr rp str = searchReplace str
table :: UArray Int Int
table = array (0,255) ([(i,0) | i <- [0..255]] ++ proc sr 1)
proc  _ = 
proc (s:st) i = (ord s,i):proc st (i+1)
len = length sr
rsrch = reverse sr
| null remaining = if found then rp
|found = rp ++ searchReplace remaining
| otherwise = passed ++ searchReplace remaining
(passed,remaining,found) = searchReplace' str
= if j == 0
then ("",drop len str,True)
where failed = case drop (j-1) str of
 -> (str,"",False)
(c:_) -> (take sk str, drop sk str, False)
where md = j - table ! ord c
sk = if md > 0
j = srch rsrch (reverse $ take len str) len
where srch "" "" _ = 0
srch _ "" l = l
srch (s:str) (s':str') l
| s == s' = srch str str' (l-1)
| otherwise = l
>From: "Branimir Maksimovic" <bmaxa at hotmail.com>
>To: bmaxa at hotmail.com, daniel.is.fischer at web.de
>CC: Haskell-Cafe at haskell.org
>Subject: Re: [Haskell-cafe] Substring replacements
>Date: Thu, 15 Dec 2005 01:39:57 +0000
>>From: "Branimir Maksimovic" <bmaxa at hotmail.com>
>>To: daniel.is.fischer at web.de
>>CC: Haskell-Cafe at haskell.org
>>Subject: Re: [Haskell-cafe] Substring replacements
>>Date: Thu, 15 Dec 2005 00:55:02 +0000
>>>From: Daniel Fischer <daniel.is.fischer at web.de>
>>>To: "Branimir Maksimovic" <bmaxa at hotmail.com>
>>>CC: Haskell-Cafe at haskell.org
>>>Subject: Re: [Haskell-cafe] Substring replacements
>>>Date: Wed, 14 Dec 2005 20:40:06 +0100
>>>nice algorithm. Since comparing chars _is_ cheap, it is to be expected
>>>all the hash-rotating is far more costly for short search patterns. The
>>>longer the pattern, the better this gets, I think -- though nowhere near
>>>(or would it?).
>>Yes,KMP is superior in single pattern search. We didn't tried Boyer-Moore
>>algorithm yet, though. But I think it would be
>>difficult to implement it in Haskell efficiently as it searches backwards
>>and jumps around, and we want memory savings.
>>Though, I even didn't tried yet, but it is certainly very interesting.
>Forget what I've said.
>Boyer-Moore *can* be implemented efficiently, it is similar to KMP it goes
>forward, but when it finds last character in pattern, than starts to search
>This can be implemented easilly as Haskell lists naturaly reverse order
>when putting from one list to other.
>Heh, never say never :)
>As I see from documents Boyer-Moore has best performance on average
>and should be better than KMP.
>FREE pop-up blocking with the new MSN Toolbar - get it now!
Don't just search. Find. Check out the new MSN Search!
More information about the Haskell-Cafe