[Haskell-cafe] safe lazy IO or Iteratee?

Fri Feb 5 09:04:38 EST 2010

> Subject: Re: [Haskell-cafe] safe lazy IO or Iteratee?
>
> Downside: iteratees are very hard to understand. I wrote a
> decently-sized article about them trying to figure out how to make
> them useful, and some comments in one of Oleg's implementations
> suggest that the "iteratee" package is subtly wrong. Oleg has written
> at least three versions (non-monadic, monadic, monadic CPS) and I've
> no idea why or whether their differences are important. Even dons says
> he didn't understand them until after writing his own iteratee-based
> IO layer.

More significant than, and orthogonal to, the differences between
non-monadic and monadic are the two primary implementations Oleg has
written.  They are[1]:

Design 1:
newtype Iteratee el m a = Iteratee{runIter:: Stream el -> m (IterV el m a)}
data IterV el m a = IE_done a (Stream el)
                  | IE_cont (Iteratee el m a) (Maybe ErrMsg)

Design 2:
newtype Iteratee el m a = Iteratee{runIter:: m (IterV el m a)}
data IterV el m a = IE_done a (Stream el)
                  | IE_cont (Stream el -> Iteratee el m a) (Maybe ErrMsg)

With the first design, it's impossible to get the state of an iteratee
without feeding it a chunk.  There are other consequences too.  The
second design seems to require some specialized combinators, that is
(>>==) and ($$), which are not required for the first version.
Neither situation is ideal.  The CPS version appears to remedy both
flaws, but at the expense of introducing CPS at a low level (this can
be hidden from the end user in many cases).  I already think of
iteratees as holding continuations, so the so-called "CPS version"
is, to me, a double CPS.
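
To make the difference concrete, here is a minimal sketch that puts
both designs side by side.  The types are renamed with a 1/2 suffix so
they fit in one module.  ErrMsg and Stream are simplified stand-ins,
and count1, count2, isDone1, isDone2, feed2 and countVia2 are
hypothetical helpers written for this illustration; none of them is
taken from IterateeM.hs.

```haskell
import Data.Functor.Identity (Identity, runIdentity)

-- Assumed, simplified supporting types for this sketch.
type ErrMsg = String
data Stream el = EOF (Maybe ErrMsg) | Chunk [el]

-- Design 1: an iteratee is a function awaiting a stream.
newtype Iteratee1 el m a =
  Iteratee1 { runIter1 :: Stream el -> m (IterV1 el m a) }
data IterV1 el m a
  = IE_done1 a (Stream el)
  | IE_cont1 (Iteratee1 el m a) (Maybe ErrMsg)

-- Design 2: an iteratee is a monadic action yielding its state.
newtype Iteratee2 el m a =
  Iteratee2 { runIter2 :: m (IterV2 el m a) }
data IterV2 el m a
  = IE_done2 a (Stream el)
  | IE_cont2 (Stream el -> Iteratee2 el m a) (Maybe ErrMsg)

-- The same element counter written against each design.
count1 :: Monad m => Int -> Iteratee1 el m Int
count1 n = Iteratee1 step
  where
    step (Chunk xs) = return (IE_cont1 (count1 (n + length xs)) Nothing)
    step s@(EOF _)  = return (IE_done1 n s)

count2 :: Monad m => Int -> Iteratee2 el m Int
count2 n = Iteratee2 (return (IE_cont2 step Nothing))
  where
    step (Chunk xs) = count2 (n + length xs)
    step s@(EOF _)  = Iteratee2 (return (IE_done2 n s))

-- Design 1: the only way to reach an IterV1 is to supply a stream,
-- so even a state probe has to feed something (here an empty chunk).
isDone1 :: Monad m => Iteratee1 el m a -> m Bool
isDone1 it = do
  v <- runIter1 it (Chunk [])
  return (case v of
            IE_done1 _ _ -> True
            IE_cont1 _ _ -> False)

-- Design 2: the state is available by running the action, no input needed.
isDone2 :: Monad m => Iteratee2 el m a -> m Bool
isDone2 it = do
  v <- runIter2 it
  return (case v of
            IE_done2 _ _ -> True
            IE_cont2 _ _ -> False)

-- Design 2: feeding needs a helper that first runs the action and then
-- applies the continuation. Finished or errored iteratees are left as-is.
feed2 :: Monad m => Stream el -> Iteratee2 el m a -> Iteratee2 el m a
feed2 s it = Iteratee2 $ do
  v <- runIter2 it
  case v of
    IE_cont2 k Nothing -> runIter2 (k s)
    _                  -> return v

-- Pure usage in the Identity monad: feed one chunk, then EOF.
countVia2 :: [el] -> Maybe Int
countVia2 xs = runIdentity $ do
  v <- runIter2 (feed2 (EOF Nothing) (feed2 (Chunk xs) (count2 0)))
  return (case v of
            IE_done2 n _ -> Just n
            IE_cont2 _ _ -> Nothing)
```

Here countVia2 "abc" evaluates to Just 3.  Note that isDone1 cannot be
written without passing some Stream value, while feed2 shows the kind
of run-then-apply plumbing that Design 2 pushes into combinators.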

Both designs appear to offer similar performance in aggregate,
although there are differences for particular functions.  I haven't
yet had a chance to test the performance of the CPS variant, although
Oleg has indicated he expects it will be higher.

The monadic/non-monadic issue is related.  Non-monadic iteratees are
iteratees that can't perform monadic effects when they're running
(although they can still be fed from a monadic enumerator).
Essentially it's the difference between "fold" and "foldM".  They are
simpler and more efficient because of this, but also much less
powerful.  Any iteratee design can support both non-monadic and
monadic, but *I* don't want to support both.  At least, I don't want
to have double modules for everything for nearly identical functions,
and polymorphic code that can handle non-monadic and monadic iteratees
is non-trivial[2].
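
The "fold" versus "foldM" analogy can be sketched directly on lists.
sumPure and sumWith are hypothetical names chosen for this
illustration; only foldl' and foldM come from base.

```haskell
import Control.Monad (foldM)
import Data.List (foldl')

-- Non-monadic: the step function is pure, so no effects can occur
-- while the input is being consumed.
sumPure :: [Int] -> Int
sumPure = foldl' (+) 0

-- Monadic: each step may run an effect in m before continuing.
-- The caller chooses the effect through the observe argument.
sumWith :: Monad m => (Int -> m ()) -> [Int] -> m Int
sumWith observe = foldM step 0
  where
    step acc x = do
      observe x
      return (acc + x)
```

For example, sumWith print [1,2,3] prints each element as it is
consumed and then returns 6 in IO, whereas sumPure [1,2,3] can only
return 6.  A non-monadic iteratee corresponds to the first shape, a
monadic one to the second.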

Much of my recent work has been on the consequences of these various
design considerations for the next version of the iteratee library.
I'm currently undecided, although I'm leaning towards CPS.  It seems to
solve a lot of problems, and the implementation details are generally
cleaner too.
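
For readers who have not seen a CPS iteratee, here is one plausible
encoding, obtained by Church-encoding the IterV type of Design 2.
This is a sketch under that assumption and not necessarily Oleg's
exact definition.  ErrMsg and Stream are simplified stand-ins, and
IterateeC, doneC, contC, countC, isDoneC, feedC and countViaC are
hypothetical names.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Functor.Identity (Identity, runIdentity)

-- Assumed, simplified supporting types for this sketch.
type ErrMsg = String
data Stream el = EOF (Maybe ErrMsg) | Chunk [el]

-- An iteratee takes one handler per state (done, continue) and
-- calls whichever matches its current state.
newtype IterateeC el m a = IterateeC
  { runIterC :: forall r.
         (a -> Stream el -> m r)                                   -- done
      -> ((Stream el -> IterateeC el m a) -> Maybe ErrMsg -> m r)  -- continue
      -> m r
  }

doneC :: a -> Stream el -> IterateeC el m a
doneC a s = IterateeC (\onDone _ -> onDone a s)

contC :: (Stream el -> IterateeC el m a) -> IterateeC el m a
contC k = IterateeC (\_ onCont -> onCont k Nothing)

-- Element counter; no Monad constraint is needed to define it.
countC :: Int -> IterateeC el m Int
countC n = contC step
  where
    step (Chunk xs) = countC (n + length xs)
    step s@(EOF _)  = doneC n s

-- The state can be inspected without feeding any input.
isDoneC :: Monad m => IterateeC el m a -> m Bool
isDoneC it = runIterC it (\_ _ -> return True) (\_ _ -> return False)

-- Feeding is handler selection plus function application.
-- Finished or errored iteratees are passed through unchanged.
feedC :: Stream el -> IterateeC el m a -> IterateeC el m a
feedC s it = IterateeC (\onDone onCont ->
  runIterC it onDone (\k e -> case e of
    Nothing -> runIterC (k s) onDone onCont
    Just _  -> onCont k e))

-- Pure usage in the Identity monad: feed one chunk, then EOF.
countViaC :: [el] -> Maybe Int
countViaC xs =
  runIdentity (runIterC it (\n _ -> return (Just n)) (\_ _ -> return Nothing))
  where
    it = feedC (EOF Nothing) (feedC (Chunk xs) (countC 0))
```

Here countViaC "hello" evaluates to Just 5.  In this encoding the
state is observable without input (unlike Design 1), and feeding does
not have to run a monadic action before it can reach the continuation
(unlike Design 2).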

Cheers,
John

[1] Both taken from
http://okmij.org/ftp/Haskell/Iteratee/IterateeM.hs.  Design 1 is
commented out on that page.

[2] At least for me.  Maybe others can provide a better solution.