[Haskell-cafe] Iteratee performance

Vasyl Pasternak vasyl.pasternak at gmail.com
Wed Mar 17 12:55:05 EDT 2010


Gregory,

Thank you, your code helps, now my it runs in the speed of lazy
bytestring test but uses less memory with it.

I've only added to your code more strictness in the recursion, my
version is below.

BTW, I think it is more useful to let user set the chunk size for
reading, so I'd like to see this possibility in the iteratee package.

{-# LANGUAGE RankNTypes, ScopedTypeVariables #-}

import qualified Data.Iteratee.IO.Fd as I
import qualified Data.Iteratee as I
import qualified Data.Iteratee.WrappedByteString as I
import qualified Data.ByteString.Char8 as S

import System.Environment
import System.IO


count :: FilePath -> IO Int
count s = I.fileDriverFd cnt s

cnt :: (Monad m) => I.IterateeG I.WrappedByteString Char m Int
cnt = go 0
 where
   go n = I.IterateeG $ \ch ->
       case ch of
         (I.EOF Nothing)        -> return $ I.Done n ch
         (I.EOF (Just e))       -> return $ I.Cont cnt (Just e)
         (I.Chunk (I.WrapBS s)) -> do
             let n' = n + S.count '\n' s
             return $ n' `seq` I.Cont (go n') Nothing

main :: IO ()
main = do
  [f] <- getArgs
  print =<< count f


2010/3/17 Gregory Collins <greg at gregorycollins.net>:
> Vasyl Pasternak <vasyl.pasternak at gmail.com> writes:
>
>> Hi Cafe,
>>
>> Yesterday I played with iteratee package, and wanted to check its
>> performance. I tried to count lines in a file, as Oleg in his famous
>> lazy_vs_correct[1] article. The results somewhat disappointed me.
>
>
> eris:benchmark greg$ time ./IterateeTest /usr/share/dict/words
> 234936
>
> real    0m0.027s
> user    0m0.010s
> sys     0m0.015s
> eris:benchmark greg$ time ./ByteStringTest /usr/share/dict/words
> 234936
>
> real    0m0.024s
> user    0m0.015s
> sys     0m0.007s
>
>
> Note also that the Bytestring I/O functions use a 32KB buffer and the
> iteratee "enumFd" function uses a 4KB buffer; if the buffers were the
> same the performance would be comparable. Here is my code:
>
> ------------------------------------------------------------------------
>
> {-# LANGUAGE RankNTypes, ScopedTypeVariables #-}
>
> import qualified Data.Iteratee.IO.Fd as I
> import qualified Data.Iteratee as I
> import qualified Data.Iteratee.WrappedByteString as I
> import qualified Data.ByteString.Char8 as S
>
> import System.Environment
> import System.IO
>
>
> count :: FilePath -> IO Int
> count s = I.fileDriverFd cnt s
>
> cnt :: (Monad m) => I.IterateeG I.WrappedByteString Char m Int
> cnt = go 0
>  where
>    go n = I.IterateeG $ \ch ->
>        case ch of
>          (I.EOF Nothing)        -> return $ I.Done n ch
>          (I.EOF (Just e))       -> return $ I.Cont cnt (Just e)
>          (I.Chunk (I.WrapBS s)) -> do
>              let n' = n + S.count '\n' s
>              return $ I.Cont (go n') Nothing
>
> main :: IO ()
> main = do
>  [f] <- getArgs
>  print =<< count f
>
> --
> Gregory Collins <greg at gregorycollins.net>
>


More information about the Haskell-Cafe mailing list