[Haskell-cafe] Fwd: Semantics of iteratees, enumerators, enumeratees?

Wed Aug 25 04:33:49 EDT 2010

From: John Millikin <jmillikin at gmail.com>

>
> Here's my (uneducated, half-baked) two cents:
>
> There's really no need for an "Iteratee" type at all, aside from the
> utility of defining Functor/Monad/etc instances for it. The core type
> is "step", which one can define (ignoring errors) as:
>
>    data Step a b = Continue (a -> Step a b)
>                  | Yield b [a]
>
> Input chunking is simply an implementation detail, but it's important
> that the "yield" case be allowed to contain (>= 0) inputs. This allows
> steps to consume multiple values before deciding what to generate.
>
> In this representation, enumerators are functions from a Continue to a
> Step.
>
>    type Enumerator a b = (a -> Step a b) -> Step a b
>
> I'll leave off discussion of enumeratees, since they're just a
> specialised type of enumerator.
>
> -------------
>
> Things become a bit more complicated when error handling is added.
> Specifically, steps must have some response to EOF:
>
>    data Step a b = Continue (a -> Step a b) (Result a b)
>                  | Result a b
>
>    data Result a b = Yield b [a]
>                    | Error String
>
> In this representation, "Continue" has two branches. One for receiving
> more data, and another to be returned if there is no more input. This
> avoids the "divergent iteratee" problem, since it's not possible for
> Continue to be returned in response to EOF.
>

Is this really true?  Consider iteratees that don't have a sensible default
value (e.g. head) and an empty stream.  You could argue that they should
really return a Maybe, but then they wouldn't be divergent in other
formulations either.  Although I do find it interesting that EOF is no
longer part of the stream at all.  That may open up some possibilities.
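
A minimal sketch of the head case under the quoted formulation. It rests on assumptions: the second Step constructor is read as wrapping a Result value and is renamed Done here to avoid the name clash, and headStep, headMaybe, runList and finish are hypothetical names that do not appear in the original post.

```haskell
-- The quoted formulation, with the finishing constructor renamed to Done
-- (an assumption made for this sketch).
data Step a b
  = Continue (a -> Step a b) (Result a b) -- input handler, answer at EOF
  | Done (Result a b)

data Result a b
  = Yield b [a]
  | Error String

-- head with no default value: EOF on an empty stream must be an Error.
headStep :: Step a a
headStep = Continue (\x -> Done (Yield x [])) (Error "head: empty stream")

-- head returning Maybe: EOF on an empty stream yields Nothing.
headMaybe :: Step a (Maybe a)
headMaybe = Continue (\x -> Done (Yield (Just x) [])) (Yield Nothing [])

-- Hypothetical driver: feed a list, then signal EOF by taking the
-- second field of Continue. Leftover input is discarded.
runList :: [a] -> Step a b -> Either String b
runList _      (Done r)       = finish r
runList []     (Continue _ r) = finish r
runList (x:xs) (Continue k _) = runList xs (k x)

finish :: Result a b -> Either String b
finish (Yield b _) = Right b
finish (Error e)   = Left e
```

In this sketch, runList "" headStep gives Left "head: empty stream" and runList "" headMaybe gives Right Nothing. Either way the step has to commit to some Result for EOF, so the choice between an error and a Maybe is made in the type rather than left open.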

Also, I found this confusing because you're using Result as a data
constructor for the Step type, but also as a separate type constructor.  I
expect this could lead to very confusing error messages ("What do you mean
'Result b a' doesn't have type 'Result'?")
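
To illustrate the naming point, here is a sketch that assumes the second Step constructor was meant to wrap a Result value. Haskell keeps type names and data constructor names in separate namespaces, so the clash compiles, but a distinct constructor name is easier to read. Step', Continue', Done, rename and finished are hypothetical names introduced only for this comparison.

```haskell
data Result a b
  = Yield b [a]
  | Error String

-- As posted (under the wrapping assumption): "Result" names both a
-- type constructor and a data constructor of Step.
data Step a b
  = Continue (a -> Step a b) (Result a b)
  | Result (Result a b)

-- Hypothetical alternative with a distinct constructor name.
data Step' a b
  = Continue' (a -> Step' a b) (Result a b)
  | Done (Result a b)

-- Translating one into the other shows they carry the same information.
rename :: Step a b -> Step' a b
rename (Continue k eof) = Continue' (rename . k) eof
rename (Result r)       = Done r

-- True once a step has produced its final Result.
finished :: Step' a b -> Bool
finished (Done _) = True
finished _        = False
```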

>
> Enumerators are similarly modified, except they are allowed to return
> "Continue" when their inner data source runs out. Therefore, both the
> "continue" and "eof" parameters are Step.
>
>    type Enumerator a b = (a -> Step a b) -> Step a b -> Step a b
>

I find this unclear as well, because you've unpacked the continue parameter
but not the eof.  I would prefer to see this as:
    type Enumerator a b = (a -> Step a b) -> Result a b -> Step a b

However, is it useful to do so?  That is, would there ever be a case where
you would want to use branches from separate iteratees?  If not, then why
bother unpacking instead of just using
    type Enumerator a b = Step a b -> Step a b
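
A minimal sketch of this last form, reading the signature as Step a b -> Step a b and assuming the second Step constructor wraps a Result (renamed Done here). The names enumList, run, finish and sumStep are hypothetical and only serve the illustration.

```haskell
data Result a b
  = Yield b [a]
  | Error String

data Step a b
  = Continue (a -> Step a b) (Result a b) -- input handler, answer at EOF
  | Done (Result a b)

-- An enumerator transforms a whole step; both branches of a Continue
-- always come from the same step.
type Enumerator a b = Step a b -> Step a b

-- Hypothetical list enumerator: feed elements until the step finishes
-- or the list runs out. Unconsumed list elements are dropped here.
enumList :: [a] -> Enumerator a b
enumList (x:xs) (Continue k _) = enumList xs (k x)
enumList _      step           = step

-- Hypothetical runner: signal EOF by selecting the EOF branch.
run :: Step a b -> Either String b
run (Continue _ eof) = finish eof
run (Done r)         = finish r

finish :: Result a b -> Either String b
finish (Yield b _) = Right b
finish (Error e)   = Left e

-- A step that sums its input and yields the total at EOF.
sumStep :: Int -> Step Int Int
sumStep acc = Continue (\x -> sumStep (acc + x)) (Yield acc [])
```

Here enumList returns a Continue when its list runs out, so enumerators compose by plain function application: run (enumList [4,5] (enumList [1,2,3] (sumStep 0))) evaluates to Right 15.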

John