[Haskell-cafe] ANNOUNCE: enumerator, an alternative iteratee package

Conrad Parker conrad at metadecks.org
Thu Aug 19 21:18:12 EDT 2010


On 20 August 2010 06:29, wren ng thornton <wren at freegeek.org> wrote:
> John Millikin wrote:
>>
>> On Wed, Aug 18, 2010 at 23:33, Jason Dagit <dagit at codersbase.com> wrote:
>>>
>>> The main reason I would use iteratees is for performance reasons.  To help
>>> me, as a potential consumer of your library, could you please provide
>>> benchmarks for comparing the performance of enumerator with say, a)
>>> iteratee, b) lazy/strict bytestring, and c) Prelude functions?
>>> I'm interested in both max memory consumption and run-times.  Using
>>> criterion and/or progression to get the run-times would be icing on an
>>> already delicious cake!
>>
>> Oleg has some benchmarks of his implementation at <
>> http://okmij.org/ftp/Haskell/Iteratee/Lazy-vs-correct.txt >, which
>> clock iteratees at about twice as fast as lazy IO. He also compares
>> them to a native "wc", but his comparison is flawed, because he's
>> comparing a String iteratee vs byte-based wc.
>
> I was under the impression Jason was asking about the performance of the
> iteratee package vs the enumerator package. I'd certainly be interested in
> seeing that. Right now I'm using attoparsec-iteratee, but if I could
> implement an attoparsec-enumerator which has the same/better performance,
> then I might switch over.
>
> So far I've been very pleased with John Lato's work, quality-wise. Reducing
> dependencies is nice, but my main concern is the lack of documentation. I
> know the ideas behind iteratee and have read numerous tutorials on various
> people's simplified versions. However, because the iteratee package uses
> somewhat different terminology and types, it's not always clear exactly how
> to translate my knowledge into being able to use the library effectively.
> The enumerator package seems to have fixed this :)

To be fair, John Lato's in-development branch of iteratee also fixes
the naming problem (i.e. it is closer to Oleg's original naming for
Iteratees, Enumerators and Enumeratees).

I've been developing applications using iteratee for the past few
weeks. As for documentation, I don't think there is a lack of
published characters on the topic. Oleg's series of emails introducing
Iteratee and John Lato's article in the Monad.Reader were useful. John
Millikin's documentation for enumerator is a welcome addition. However,
there is a deeper issue: Iteratees are semantically complex, and the
existing documentation does not really address that complexity. It
mostly covers the various APIs, the design motivation (an extension
of the left-fold enumerator), and evangelism (comparisons to lazy IO).
I found it difficult to grok the reasons for the types, and what the
operational control flow is (e.g. how and why EOF gets propagated,
how a seek request is communicated, etc.).
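
To make the EOF question concrete, here is a minimal sketch of the
control flow using simplified, pure types in the style of the tutorials.
All of the names below (Stream, Step, sumI, headI, enumList, run) are
made up for illustration and are not the API of iteratee or enumerator;
the real packages also thread a monad through these types, and seek
requests are not modelled here at all.

```haskell
-- Hypothetical, simplified types for illustration only.
data Stream a = Chunk [a] | EOF
  deriving (Show, Eq)

-- An iteratee is either waiting for input or finished.
data Step a b
  = Continue (Stream a -> Step a b)  -- wants more input
  | Yield b (Stream a)               -- result, plus unconsumed input

-- An enumerator feeds some input to an iteratee and hands it back.
type Enumerator a b = Step a b -> Step a b

-- Sums its input. It can only produce a result once it is told EOF.
sumI :: Step Int Int
sumI = go 0
  where
    go acc = Continue $ \s -> case s of
      Chunk xs -> go (acc + sum xs)
      EOF      -> Yield acc EOF

-- Returns the first element, if any. It may finish before EOF,
-- leaving the rest of the chunk as leftover input.
headI :: Step a (Maybe a)
headI = Continue step
  where
    step (Chunk [])     = headI
    step (Chunk (x:xs)) = Yield (Just x) (Chunk xs)
    step EOF            = Yield Nothing EOF

-- Feeds a list as one chunk. It deliberately does not send EOF,
-- so that several enumerators can be applied one after another.
enumList :: [a] -> Enumerator a b
enumList xs (Continue k) = k (Chunk xs)
enumList _  done         = done

-- Only 'run' sends EOF, once, after all enumerators are finished.
-- An iteratee that still wants input after EOF is treated as an error.
run :: Step a b -> Maybe b
run (Yield b _)  = Just b
run (Continue k) = case k EOF of
  Yield b _  -> Just b
  Continue _ -> Nothing
```

With these definitions, run (enumList [4,5] (enumList [1,2,3] sumI))
evaluates to Just 15. In this sketch EOF is sent by run rather than by
the enumerator, which is what lets two enumerators be applied in
sequence to the same iteratee.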

In general there seems to be a lot of interest in Iteratees recently
as a way of dealing with resource management in IO. It's great to have
a few different implementations to compare, but once performance has
been benchmarked and the semantics have been given a precise
denotation, it would be nice to converge on a single implementation
and build a platform of libraries on it (for compression etc.), as
was done for Lazy ByteString.

cheers,

Conrad.

