[Haskell-cafe] Iteratee performance

Daniel Fischer daniel.is.fischer at web.de
Wed Mar 17 12:42:40 EDT 2010


Am Mittwoch 17 März 2010 17:01:06 schrieb Gregory Collins:
> Vasyl Pasternak <vasyl.pasternak at gmail.com> writes:
> > Hi Cafe,
> >
> > Yesterday I played with iteratee package, and wanted to check its
> > performance. I tried to count lines in a file, as Oleg in his famous
> > lazy_vs_correct[1] article. The results somewhat disappointed me.
>
> eris:benchmark greg$ time ./IterateeTest /usr/share/dict/words
> 234936
>
> real	0m0.027s
> user	0m0.010s
> sys	0m0.015s
> eris:benchmark greg$ time ./ByteStringTest /usr/share/dict/words
> 234936
>
> real	0m0.024s
> user	0m0.015s
> sys	0m0.007s
>
>
> Note also that the Bytestring I/O functions use a 32KB buffer and the
> iteratee "enumFd" function uses a 4KB buffer; if the buffers were the
> same the performance would be comparable. Here is my code:

Which gave not so good performance on a large file, it ate ~530MB memory 
for a 260MB file with 20915399 lines (after the first run, when the other 
apps' memory had been swapped out, it wasn't awful anymore, just mediocre).

It was, however, a good memory citizen - and pretty fast - after putting a 
BangPattern on go's argument.


Times:

wc -l:
  0.90s
your code with the BangPattern:
  1.10s
Vasyl's Iteratee code:
  7.61s
Vasyl's ByteString code:
  1.40s
Code from http://www.haskell.org/haskellwiki/Wc [1]:
  0.76s

(all Haskell code compiled with -O2 by ghc-6.12.1). I used iteratee-0.3.4 
from hackage.

So I'd say, if you know how, Iteratees can already be fast, but it seems to 
be easier to write slow code than with ByteString.

[1] The winner:
import qualified Data.ByteString.Lazy.Char8 as L
 
main :: IO ()
main = L.getContents >>= print . L.count '\n'


>
> ------------------------------------------------------------------------
>
> {-# LANGUAGE RankNTypes, ScopedTypeVariables #-}
>
> import qualified Data.Iteratee.IO.Fd as I
> import qualified Data.Iteratee as I
> import qualified Data.Iteratee.WrappedByteString as I
> import qualified Data.ByteString.Char8 as S
>
> import System.Environment
> import System.IO
>
>
> count :: FilePath -> IO Int
> count s = I.fileDriverFd cnt s
>
> cnt :: (Monad m) => I.IterateeG I.WrappedByteString Char m Int
> cnt = go 0
>   where
>     go n = I.IterateeG $ \ch ->
>         case ch of
>           (I.EOF Nothing)        -> return $ I.Done n ch
>           (I.EOF (Just e))       -> return $ I.Cont cnt (Just e)
>           (I.Chunk (I.WrapBS s)) -> do
>               let n' = n + S.count '\n' s
>               return $ I.Cont (go n') Nothing
>
> main :: IO ()
> main = do
>   [f] <- getArgs
>   print =<< count f



More information about the Haskell-Cafe mailing list