[Haskell-cafe] ANNOUNCE: enumerator, an alternative iteratee package
John Millikin
jmillikin at gmail.com
Thu Aug 19 18:24:43 EDT 2010
On Thu, Aug 19, 2010 at 14:29, wren ng thornton <wren at freegeek.org> wrote:
> I was under the impression Jason was asking about the performance of the
> iteratee package vs the enumerator package. I'd certainly be interested in
> seeing that. Right now I'm using attoparsec-iteratee, but if I could
> implement an attoparsec-enumerator which has the same/better performance,
> then I might switch over.
Oh, sorry -- both packages have the same performance. At least, if
there is a difference, it's less than the margin of error on my
benchmark (counting lines in the Ubuntu 10.04 ISO, with cleared
filesystem caches).
Here's my Iteratee benchmark. I think this is the proper way to
implement "wc -l", but if you see any errors in it which could cause
poor performance, please let me know.
--------------------------------------------------------------------------------
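-- Imports below the first are assumed (iteratee 0.3-series module layout).
import qualified Data.ByteString.Char8 as B
import Data.Iteratee
import Data.Iteratee.WrappedByteString
import Data.Word (Word8)

-- Count newlines, threading the running total through each continuation.
iterLines :: Monad m => IterateeG WrappedByteString Word8 m Integer
iterLines = IterateeG (step 0) where
    step acc s@(EOF _) = return $ Done acc s
    step acc (Chunk wrapped) = return $ Cont (IterateeG (step acc')) Nothing
      where
        acc' = acc + countChar '\n' (unWrap wrapped)

-- Number of occurrences of a character in one chunk.
countChar :: Char -> B.ByteString -> Integer
countChar c = B.foldl (\acc c' -> if c' == c then acc + 1 else acc) 0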
--------------------------------------------------------------------------------
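For readers without the iteratee package installed, the shape of the step function above can be sketched with a simplified, self-contained type. The names Stream, Iter and feedChunks below are hypothetical stand-ins for illustration, not the API of either package, and B.count replaces the hand-written fold.

```haskell
import qualified Data.ByteString.Char8 as B

-- Simplified stand-ins for the library types (hypothetical, illustration only).
data Stream = Chunk B.ByteString | EOF

data Iter a
    = Done a                   -- finished with a result
    | Cont (Stream -> Iter a)  -- waiting for more input

-- Count newlines, carrying the running total from chunk to chunk.
iterLines :: Iter Integer
iterLines = Cont (step 0) where
    step acc EOF = Done acc
    step acc (Chunk bytes) =
        let acc' = acc + fromIntegral (B.count '\n' bytes)
        in acc' `seq` Cont (step acc')

-- Feed a list of chunks followed by EOF, then extract the result.
feedChunks :: [B.ByteString] -> Iter a -> Maybe a
feedChunks _ (Done x) = Just x
feedChunks [] (Cont k) = case k EOF of
    Done x -> Just x
    Cont _ -> Nothing  -- the iteratee asked for more input after EOF
feedChunks (c:cs) (Cont k) = feedChunks cs (k (Chunk c))
```

For example, feedChunks [B.pack "a\nb\n", B.pack "c\n"] iterLines evaluates to Just 3. The real packages additionally run the step in a monad and carry leftover input and error information, which this sketch leaves out.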
And here are typical times for various implementations -- numbers are
real / user / sys in seconds, as reported by "time". They're mostly as expected,
except (to my surprise) lazy bytestrings are as fast as strict
bytestrings:
wc -l
====================
5.451 / 0.030 / 0.190
5.426 / 0.060 / 0.150
5.466 / 0.130 / 0.200

enumerator
====================
8.235 / 5.270 / 1.010
8.278 / 5.270 / 0.880
8.264 / 5.370 / 0.860

iteratee
====================
8.239 / 5.270 / 0.980
8.255 / 5.320 / 0.790
8.265 / 5.140 / 0.900

strict bytestrings
====================
5.425 / 2.030 / 0.360
5.402 / 2.180 / 0.330
5.446 / 2.240 / 0.400

lazy bytestrings
====================
5.467 / 1.910 / 0.260
5.428 / 1.990 / 0.280
5.433 / 2.140 / 0.190

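The strict and lazy bytestring implementations are not shown in this message. As a rough sketch only, and not the code behind the numbers above, line counters of that kind might look like the following. The function names and the 64 KiB chunk size are assumptions.

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Data.ByteString.Lazy.Char8 as BL
import System.IO (Handle, IOMode(ReadMode), withBinaryFile)

-- Strict: read fixed-size chunks in a loop (chunk size is an assumption).
countStrict :: FilePath -> IO Integer
countStrict path = withBinaryFile path ReadMode (loop 0) where
    loop :: Integer -> Handle -> IO Integer
    loop acc h = do
        chunk <- B.hGet h 65536
        if B.null chunk
            then return acc  -- an empty read signals end of file
            else do
                let acc' = acc + fromIntegral (B.count '\n' chunk)
                acc' `seq` loop acc' h

-- Lazy: the lazy bytestring streams the file chunk by chunk on demand.
countLazy :: FilePath -> IO Integer
countLazy path = do
    contents <- BL.readFile path
    return $! fromIntegral (BL.count '\n' contents)
```

Both versions do the same per-chunk work, since a lazy bytestring is internally a list of strict chunks, which may be one reason the two sets of timings are so close.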
> So far I've been very pleased with John Lato's work, quality-wise. Reducing
> dependencies is nice, but my main concern is the lack of documentation. I
> know the ideas behind iteratee and have read numerous tutorials on various
> people's simplified versions. However, because the iteratee package uses
> somewhat different terminology and types, it's not always clear exactly how
> to translate my knowledge into being able to use the library effectively.
> The enumerator package seems to have fixed this :)
Glad to hear it. My goal is not to supplant "iteratee", but to
supplement it -- if enumerator becomes the simple/learning version,
and most major packages use "iteratee", that's fine.