[Haskell-cafe] Re: Am I doing it right?

Daniel Fischer daniel.is.fischer at web.de
Wed Sep 24 19:53:38 EDT 2008


Am Donnerstag, 25. September 2008 00:39 schrieb Achim Schneider:
> "Creighton Hogg" <wchogg at gmail.com> wrote:
> > So as another step in my education, I wanted to ask if the following
> > code at least feels 'morally' correct as a find/replace function,
> > replacing each occurrence of the sub-list before with after in the
> > list.
>
> Besides not using head, tail and isPrefixOf I would write it the same,
> except that I'd write it like this:
>
> -->8
>
> import Data.ByteString.Lazy.Char8 as B
> import Prelude as P
>
> cut =
>     let f _ s [] = [s]
>         f n s (x:xs) =
>             let (h,t) = B.splitAt (x - n) s
>             in h:f x t xs
>     in f 0
>
>
> replace before@(b:_) after this =
>     let
>         before' = B.pack before
>         after' = B.pack after
>         f s | B.tail before' `B.isPrefixOf` B.tail s =
>                 B.append after' $ B.drop (B.length before') s
>
>             | otherwise = s
>
>     in B.concat $ P.map f $ cut this $ B.elemIndices b this
>
> main =
>     B.interact $ replace "th" "s"
>
> -->8
>
> As we all love benchmarks so dearly, here are the times:
>
> ./replace < /usr/share/dict/words > /dev/null  1.29s user 0.02s system
> 87% cpu 1.498 total
>
> ./replace < /usr/share/dict/words > /dev/null  0.25s user 0.02s system
> 84% cpu 0.313 total

If you love benchmarks, how about:
dafis at linux:~/Documents/haskell/move> time ./replaceAS < 
/usr/share/dict/ger-eng.txt > /dev/null

real    0m1.227s
user    0m1.130s
sys     0m0.100s
dafis at linux:~/Documents/haskell/move> time ./replaceKMP < 
/usr/share/dict/ger-eng.txt > /dev/null

real    0m0.179s
user    0m0.130s
sys     0m0.050s

where the latter is

---------------------------------------------
module Main (main) where

import qualified Data.ByteString.Lazy.Char8 as B
import qualified Data.ByteString.Search.KnuthMorrisPratt as KMP

replace before after this =
    let pat = B.pack before
        rep = B.pack after
        lp = B.length pat
        ixs = KMP.matchLL pat this
        offs = zipWith (-) ixs (0:map (+ lp) ixs)
        wrk bs [] = [bs]
        wrk bs (o:os) =
            let (h,t) = B.splitAt o bs
                rm = B.drop lp t
            in h:rep:wrk rm os
    in B.concat $ wrk this offs

main = B.interact $ replace "th" "s"
-------------------------------------------------------

just to stay close to your code and because I didn't want to think about a 
really fast search&replace implementation. Note that longer patterns give a 
better advantage to the stringsearch package, searching for "this" (replacing 
with "that") gives
dafis at linux:~/Documents/haskell/move> time ./replaceAS < 
/usr/share/dict/ger-eng.txt > /dev/null

real    0m1.222s
user    0m1.130s
sys     0m0.080s
dafis at linux:~/Documents/haskell/move> time ./replaceKMP < 
/usr/share/dict/ger-eng.txt > /dev/null

real    0m0.124s
user    0m0.080s
sys     0m0.040s
dafis at linux:~/Documents/haskell/move> time ./replaceBM < 
/usr/share/dict/ger-eng.txt > /dev/null

real    0m0.093s
user    0m0.060s
sys     0m0.030s


The fast searching function on ByteStrings has already been written for you :)



More information about the Haskell-Cafe mailing list