[Haskell-cafe] Re: Am I doing it right?
Daniel Fischer
daniel.is.fischer at web.de
Wed Sep 24 19:53:38 EDT 2008
Am Donnerstag, 25. September 2008 00:39 schrieb Achim Schneider:
> "Creighton Hogg" <wchogg at gmail.com> wrote:
> > So as another step in my education, I wanted to ask if the following
> > code at least feels 'morally' correct as a find/replace function,
> > replacing each occurrence of the sub-list before with after in the
> > list.
>
> Besides not using head, tail and isPrefixOf I would write it the same,
> except that I'd write it like this:
>
> -->8
>
> import Data.ByteString.Lazy.Char8 as B
> import Prelude as P
>
> cut =
> let f _ s [] = [s]
> f n s (x:xs) =
> let (h,t) = B.splitAt (x - n) s
> in h:f x t xs
> in f 0
>
>
> replace before@(b:_) after this =
> let
> before' = B.pack before
> after' = B.pack after
> f s | B.tail before' `B.isPrefixOf` B.tail s =
> B.append after' $ B.drop (B.length before') s
>
> | otherwise = s
>
> in B.concat $ P.map f $ cut this $ B.elemIndices b this
>
> main =
> B.interact $ replace "th" "s"
>
> -->8
>
> As we all love benchmarks so dearly, here are the times:
>
> ./replace < /usr/share/dict/words > /dev/null 1.29s user 0.02s system
> 87% cpu 1.498 total
>
> ./replace < /usr/share/dict/words > /dev/null 0.25s user 0.02s system
> 84% cpu 0.313 total
If you love benchmarks, how about:
dafis at linux:~/Documents/haskell/move> time ./replaceAS <
/usr/share/dict/ger-eng.txt > /dev/null
real 0m1.227s
user 0m1.130s
sys 0m0.100s
dafis at linux:~/Documents/haskell/move> time ./replaceKMP <
/usr/share/dict/ger-eng.txt > /dev/null
real 0m0.179s
user 0m0.130s
sys 0m0.050s
where the latter is
---------------------------------------------
module Main (main) where
import qualified Data.ByteString.Lazy.Char8 as B
import qualified Data.ByteString.Search.KnuthMorrisPratt as KMP
replace before after this =
let pat = B.pack before
rep = B.pack after
lp = B.length pat
ixs = KMP.matchLL pat this
offs = zipWith (-) ixs (0:map (+ lp) ixs)
wrk bs [] = [bs]
wrk bs (o:os) =
let (h,t) = B.splitAt o bs
rm = B.drop lp t
in h:rep:wrk rm os
in B.concat $ wrk this offs
main = B.interact $ replace "th" "s"
-------------------------------------------------------
just to stay close to your code and because I didn't want to think about a
really fast search&replace implementation. Note that longer patterns give a
better advantage to the stringsearch package, searching for "this" (replacing
with "that") gives
dafis at linux:~/Documents/haskell/move> time ./replaceAS <
/usr/share/dict/ger-eng.txt > /dev/null
real 0m1.222s
user 0m1.130s
sys 0m0.080s
dafis at linux:~/Documents/haskell/move> time ./replaceKMP <
/usr/share/dict/ger-eng.txt > /dev/null
real 0m0.124s
user 0m0.080s
sys 0m0.040s
dafis at linux:~/Documents/haskell/move> time ./replaceBM <
/usr/share/dict/ger-eng.txt > /dev/null
real 0m0.093s
user 0m0.060s
sys 0m0.030s
The fast searching function on ByteStrings has already been written for you :)
More information about the Haskell-Cafe
mailing list