[Haskell-cafe] Iteratee performance

Gregory Collins greg at gregorycollins.net
Wed Mar 17 12:01:06 EDT 2010

Vasyl Pasternak <vasyl.pasternak at gmail.com> writes:

> Hi Cafe,
> Yesterday I played with iteratee package, and wanted to check its
> performance. I tried to count lines in a file, as Oleg in his famous
> lazy_vs_correct[1] article. The results somewhat disappointed me.

eris:benchmark greg$ time ./IterateeTest /usr/share/dict/words 

real	0m0.027s
user	0m0.010s
sys	0m0.015s
eris:benchmark greg$ time ./ByteStringTest /usr/share/dict/words 

real	0m0.024s
user	0m0.015s
sys	0m0.007s

Note also that the Bytestring I/O functions use a 32KB buffer and the
iteratee "enumFd" function uses a 4KB buffer; if the buffers were the
same the performance would be comparable. Here is my code:


{-# LANGUAGE RankNTypes, ScopedTypeVariables #-}

import qualified Data.Iteratee.IO.Fd as I
import qualified Data.Iteratee as I
import qualified Data.Iteratee.WrappedByteString as I
import qualified Data.ByteString.Char8 as S

import System.Environment
import System.IO

count :: FilePath -> IO Int
count s = I.fileDriverFd cnt s

cnt :: (Monad m) => I.IterateeG I.WrappedByteString Char m Int
cnt = go 0
    go n = I.IterateeG $ \ch ->
        case ch of
          (I.EOF Nothing)        -> return $ I.Done n ch
          (I.EOF (Just e))       -> return $ I.Cont cnt (Just e)
          (I.Chunk (I.WrapBS s)) -> do
              let n' = n + S.count '\n' s
              return $ I.Cont (go n') Nothing

main :: IO ()
main = do
  [f] <- getArgs
  print =<< count f

Gregory Collins <greg at gregorycollins.net>

More information about the Haskell-Cafe mailing list