[Haskell-cafe] Fwd: Semantics of iteratees, enumerators, enumeratees?

Tue Aug 24 16:23:10 EDT 2010

Here's my (uneducated, half-baked) two cents:

There's really no need for an "Iteratee" type at all, aside from the
utility of defining Functor/Monad/etc instances for it. The core type
is "step", which one can define (ignoring errors) as:

    data Step a b = Continue (a -> Step a b)
                  | Yield b [a]

Input chunking is simply an implementation detail, but it's important
that the "yield" case be allowed to contain (>= 0) inputs. This allows
steps to consume multiple values before deciding what to generate.

In this representation, enumerators are functions from a Continue to a Step.

    type Enumerator a b = (a -> Step a b) -> Step a b

I'll leave off discussion of enumeratees, since they're just a
specialised type of enumerator.

-------------

Things become a bit more complicated when error handling is added.
Specifically, steps must have some response to EOF:

    data Step a b = Continue (a -> Step a b) (Result a b)
                  | Result a b

    data Result a b = Yield b [a]
                    | Error String

In this representation, "Continue" has two branches. One for receiving
more data, and another to be returned if there is no more input. This
avoids the "divergent iteratee" problem, since it's not possible for
Continue to be returned in response to EOF.

Enumerators are similarly modified, except they are allowed to return
"Continue" when their inner data source runs out. Therefore, both the
"continue" and "eof" parameters are Step.

    type Enumerator a b = (a -> Step a b) -> Step a b -> Step a b

-------------

Finally, support for monads is added. I don't know if denotational
semantics typically considers monads, but I feel they're important
when discussing enumerators/iteratees. After all, the entire point of
the iteratee abstraction is to serve an alternative to lazy IO.

    data Step a m b = Continue (a -> m (Step a m b)) (m (Result a b))
                    | Result a b

    data Result a b = Yield b [a]
                    | Error String

    type Enumerator m a b = (a -> m (Step a m b)) -> m (Step a m b) ->
m (Step a m b)

This is mostly the same as the second representation, except that it
makes obvious at which point each value is calculated from the
underlying monad.

>From here, it's trivial to define the "Iteratee" type, if desired:

    type Iteratee a m b = m (Step a m b)

    data Step a m b = Continue (a -> Iteratee a m b) (m (Result a b))
                    | Result a b

    data Result a b = Yield b [a]
                    | Error String

    type Enumerator m a b = (a -> Iteratee a m b) -> Iteratee a m b ->
Iteratee a m b

-------------

Note: the data types I've arrived at here are significantly different
from those defined by Oleg. Given my relative level of competency with
logical programming in general and Haskell in particular, I suspect
his definitions are superiour in some way I do not understand.