[Haskell-cafe] Iteratee performance
Daniel Fischer
daniel.is.fischer at web.de
Wed Mar 17 12:42:40 EDT 2010
Am Mittwoch 17 März 2010 17:01:06 schrieb Gregory Collins:
> Vasyl Pasternak <vasyl.pasternak at gmail.com> writes:
> > Hi Cafe,
> >
> > Yesterday I played with iteratee package, and wanted to check its
> > performance. I tried to count lines in a file, as Oleg in his famous
> > lazy_vs_correct[1] article. The results somewhat disappointed me.
>
> eris:benchmark greg$ time ./IterateeTest /usr/share/dict/words
> 234936
>
> real 0m0.027s
> user 0m0.010s
> sys 0m0.015s
> eris:benchmark greg$ time ./ByteStringTest /usr/share/dict/words
> 234936
>
> real 0m0.024s
> user 0m0.015s
> sys 0m0.007s
>
>
> Note also that the Bytestring I/O functions use a 32KB buffer and the
> iteratee "enumFd" function uses a 4KB buffer; if the buffers were the
> same the performance would be comparable. Here is my code:
Which gave not so good performance on a large file, it ate ~530MB memory
for a 260MB file with 20915399 lines (after the first run, when the other
apps' memory had been swapped out, it wasn't awful anymore, just mediocre).
It was, however, a good memory citizen - and pretty fast - after putting a
BangPattern on go's argument.
Times:
wc -l:
0.90s
your code with the BangPattern:
1.10s
Vasyl's Iteratee code:
7.61s
Vasyl's ByteString code:
1.40s
Code from http://www.haskell.org/haskellwiki/Wc [1]:
0.76s
(all Haskell code compiled with -O2 by ghc-6.12.1). I used iteratee-0.3.4
from hackage.
So I'd say, if you know how, Iteratees can already be fast, but it seems to
be easier to write slow code than with ByteString.
[1] The winner:
import qualified Data.ByteString.Lazy.Char8 as L
main :: IO ()
main = L.getContents >>= print . L.count '\n'
>
> ------------------------------------------------------------------------
>
> {-# LANGUAGE RankNTypes, ScopedTypeVariables #-}
>
> import qualified Data.Iteratee.IO.Fd as I
> import qualified Data.Iteratee as I
> import qualified Data.Iteratee.WrappedByteString as I
> import qualified Data.ByteString.Char8 as S
>
> import System.Environment
> import System.IO
>
>
> count :: FilePath -> IO Int
> count s = I.fileDriverFd cnt s
>
> cnt :: (Monad m) => I.IterateeG I.WrappedByteString Char m Int
> cnt = go 0
> where
> go n = I.IterateeG $ \ch ->
> case ch of
> (I.EOF Nothing) -> return $ I.Done n ch
> (I.EOF (Just e)) -> return $ I.Cont cnt (Just e)
> (I.Chunk (I.WrapBS s)) -> do
> let n' = n + S.count '\n' s
> return $ I.Cont (go n') Nothing
>
> main :: IO ()
> main = do
> [f] <- getArgs
> print =<< count f
More information about the Haskell-Cafe
mailing list