[Haskell-cafe] Re: Left fold enumerator - a real pearl
ekirpichov at gmail.com
Mon Mar 2 01:58:44 EST 2009
OK, I'm far from being a supergeek, but anyways.
I'm not considering the lazy IO approach, as it doesn't involve any
form of control over resources.
With the traditional approach, you manually ask a stream do to
something (read a block of bytes, seek to a position etc.), and your
program is a mixture of stuff that asks the stream to do something and
stuff that deals with the results.
With the iteratee approach, you split the program into two parts:
Iteratee (a thing that encapsulates the state of your work with the
stream: either you're done and you don't need any new data, or you
need another block of bytes to do more work (and you know *which*
work), or you need to seek to a different position and you know what
you're going to do after that) and Enumerator : a thing that 'runs' an
iteratee, looking at what is is demanding, performing the
corresponding action, giving its result to the iteratee and seeing
what it wants next repeatedly.
Simplifying all the monadic stuff, we end up with something like this:
data Iteratee a = Done a
| NeedAnotherChunk (Maybe Chunk -> Iteratee a)
-- Will be given Nothing if we're at EOF
| NeedToSeek Int (Maybe Chunk -> Iteratee a)
type Enumerator a = Iteratee a -> Iteratee a
The key point is that you, by construction, *always* know whether you
need the stream or not.
Thus, since the data processing loop is concentrated in one place,
namely the particular enumerator, this loop *always* knows whether
it's time to close the stream or not. It is time, if the iteratee has
become a Done, or if the stream was closed or encountered an error.
Another key point is that both the iteratees and enumerators are
highly composable; and iteratees also form a monad, thus becoming
suitable for easily writing parsers.
Also, there's no recursion in the iteratees, and they are fusable and
thus extremely performant.
Further, you better read Oleg's article and code.
2009/3/2 Gü?nther Schmidt <gue.schmidt at web.de>:
> Hi everyone,
> after reading all the responses I would like to ask someone, anyone, to kind
> of summarize the merits of the left-fold-enumerator approach.
> From all that I read so far about it all I was able to gather was that it
> has significance but I'm still not even sure what for and what not for.
> Apparently Oleg has done various CS work, this particular piece just being
> one. But he also broaches the topic at very high level, ok, too high for me,
> ie. no CS or higher math background.
> Would one of the super geeks please summarize it up? (In RWH kind of style
> if possible)
> John Lato schrieb:
>> Hi Don,
>> Would you please elaborate on what features or capabilities you think
>> are missing from left-fold that would elevate it out of the special
>> purpose category? I think that the conception is so completely
>> different from bytestrings that just saying it's not a bytestring
>> equivalent doesn't give me any ideas as to what would make it more
>> useful. Since the technique is being actively developed and
>> researched, IMO this is a good time to be making changes.
>> Incidentally, in my package I've made newtypes that read data into
>> strict bytestrings. It would be relatively simple to use
>> unsafeInterleaveIO in an enumerator to create lazy bytestrings using
>> this technique. I don't see why anyone would want to do so, however,
>> since it would have all the negatives of lazy IO and be less efficient
>> than simply using lazy bytestrings directly.
>> On Sat, Feb 28, 2009 at 10:54 PM, Don Stewart <dons at galois.com> wrote:
>>>> There are a few iteratee/enumerator design questions that remain,
>>>> which Oleg and others would like to explore more fully. The results
>>>> of that research will likely find there way into this library.
>>> I agree. There's no left-fold 'bytestring' equivalent. So it remains a
>>> special purpose technique.
>>> -- Don
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
Web IR developer, market.yandex.ru
More information about the Haskell-Cafe