[Haskell-beginners] please critique my first stab at systems programming
Daniel Fischer
daniel.is.fischer at googlemail.com
Thu Apr 14 14:50:54 CEST 2011
On Thursday 14 April 2011 14:30:59, Felipe Almeida Lessa wrote:
> On Thu, Apr 14, 2011 at 8:50 AM, Daniel Fischer
>
> <daniel.is.fischer at googlemail.com> wrote:
> > $ cabal install stringsearch
> > -- faster searches for a pattern than provided by bytestring
> >
> > import qualified Data.ByteString.Char8 as BC
> > import qualified Data.ByteString.Lazy as L
> > import qualified Data.ByteString.Lazy.Search as L
> >
> > blockSize = 512
> >
> > main = do
> > stuff <- L.readFile "blocks"
> > case L.indices (BC.pack "PART") stuff of
> > [] -> putStrLn "Not Found."
> > (i:_) -> putStrLn $ "Found at " ++ show (i `quot` blockSize) ++
> > "." putStrLn "Done."
>
> This solves a different problem. The original looked for PART only on
> the beginning of each block,
Well, I would have expected that, but
> searchForPattern' index handle pat = do
>
> eof <- hIsEOF handle
> if eof then return Nothing
>
> else do
>
> bytes <- B.hGet handle chunkSize
> case BC.breakSubstring pat bytes of
>
> (x, y) | BC.null y -> searchForPattern' (index + 1) handle pat
>
> | otherwise -> return (Just index)
searches the entire block and checks whether the second component is empty
to see whether there's any match at all. So I thought that'd be the desired
behaviour.
If one wants to only look at the start of the blocks, isPrefixOf would be
a much more efficient way than breakSubstring. It would probably be still
more efficient to read larger chunks than 512 bytes and step through them
with 512-byte steps to check for a "PART" there (unless the "PART" appears
in the first few blocks).
> while your solution will find everything
> that has PART in it. Because this is I/O bound, probably the real
> time taken won't be very different, but the program will surely use
> more CPU time and report more false positives.
>
> Cheers,
More information about the Beginners
mailing list