[Haskell-cafe] Re: Fwd: Semantics of iteratees, enumerators, enumeratees?

Tue Aug 24 02:41:33 EDT 2010

Here's a way I've been tinkering with to think about iteratees clearly.

For simplicity, I'll stick with pure, error-free iteratees for now, and take
chunks to be strings.  Define a function that runs the iteratee:

> runIter :: Iteratee a -> [String] -> (a, [String])

Note that chunking is explicit here.

Next, define a relation saying that an iteratee implements a given
specification, where the specification is a state transformer:

> sat :: Iteratee a -> State String a -> Bool

Define sat in terms of concatenating chunks:

> sat it st =
>   second concat . runIter it == runState st . concat

where the equality on the RHS is between functions (pointwise/extensionally),
and runState uses the representation of State directly:

> runState :: State s a -> s -> (a,s)

(I think this sat definition is what Conrad was alluding to.)
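
To make the definition concrete, here is a minimal sketch of the model.  The
Iteratee representation, headI, headSpec, and satOn are all hypothetical names
made up for illustration, not taken from any iteratee library, and State is
defined locally so the block needs only base.  Since equality of functions
cannot be decided in general, satOn checks the sat equation at one particular
list of chunks, applying the state transformer to the concatenated input.

```haskell
import Data.Bifunctor (second)

-- Assumed toy representation: an iteratee is either done, carrying a result
-- and the unconsumed rest of the current chunk, or waiting for more input.
-- Nothing signals end of input.
data Iteratee a
  = Done a String
  | Cont (Maybe String -> Iteratee a)

-- State defined directly by its representation.
newtype State s a = State { runState :: s -> (a, s) }

-- Feed chunks one at a time; return the result and the leftover chunks.
runIter :: Iteratee a -> [String] -> (a, [String])
runIter (Done a rest) cs     = (a, rest : cs)
runIter (Cont k)      (c:cs) = runIter (k (Just c)) cs
runIter (Cont k)      []     = case k Nothing of
  Done a rest -> (a, [rest])
  Cont _      -> error "iteratee did not finish at end of input"

-- Example iteratee: take the first character, if there is one.
headI :: Iteratee (Maybe Char)
headI = Cont step
  where
    step Nothing       = Done Nothing ""
    step (Just "")     = Cont step          -- empty chunk: keep waiting
    step (Just (c:cs)) = Done (Just c) cs

-- Its specification as a state transformer on the whole input.
headSpec :: State String (Maybe Char)
headSpec = State $ \s -> case s of
  []     -> (Nothing, "")
  (c:cs) -> (Just c, cs)

-- The sat equation, checked at a single chunking of the input.
satOn :: Eq a => Iteratee a -> State String a -> [String] -> Bool
satOn it st chunks =
  second concat (runIter it chunks) == runState st (concat chunks)
```

For instance, satOn headI headSpec ["", "ab", "c"] compares the iteratee's
answer on three chunks with the specification's answer on "abc".  A check like
this can only refute sat, or raise confidence in it; it is not a proof.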

Now use sat to specify and verify operations on iteratees and to
*synthesize* those operations from their specifications.  Some iteratees
might not satisfy *any* (State-based) specification.  For instance, an
iteratee could look at the lengths or number of its chunks and produce
results accordingly.  I think of such iteratees as abstraction leaks.  Can
the iteratee vocabulary be honed to make only well-behaved (specifiable)
iteratees possible to express?  If so, can we preserve performance benefits?
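
As a small illustration of such a leak, here is a sketch of an iteratee that
counts its chunks.  It reuses the same assumed toy Iteratee representation
(Done/Cont, with Nothing for end of input); countChunks and leak are
hypothetical names.  Two chunkings with the same concatenation give different
results, so no function of the concatenated input, and hence no State-based
specification, can agree with it on both.

```haskell
-- Assumed toy representation, as a self-contained sketch.
data Iteratee a
  = Done a String
  | Cont (Maybe String -> Iteratee a)   -- Nothing signals end of input

-- Feed chunks one at a time; return the result and the leftover chunks.
runIter :: Iteratee a -> [String] -> (a, [String])
runIter (Done a rest) cs     = (a, rest : cs)
runIter (Cont k)      (c:cs) = runIter (k (Just c)) cs
runIter (Cont k)      []     = case k Nothing of
  Done a rest -> (a, [rest])
  Cont _      -> error "iteratee did not finish at end of input"

-- A leaky iteratee: its result depends on how the input was chunked.
countChunks :: Iteratee Int
countChunks = go 0
  where
    go n = Cont $ \mc -> case mc of
      Nothing -> Done n ""
      Just _  -> go (n + 1)

-- Both inputs concatenate to "abc", yet the results differ.
leak :: Bool
leak = fst (runIter countChunks ["ab", "c"])
    /= fst (runIter countChunks ["abc"])
```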

If indeed the abstraction leaks can be fixed, I expect there will be a
simpler & more conventional semantics than sat above.

  - Conal

On Tue, Aug 24, 2010 at 2:55 PM, Conrad Parker <conrad at metadecks.org> wrote:

> On 24 August 2010 14:47, Jason Dagit <dagit at codersbase.com> wrote:
> >
> >
> > On Mon, Aug 23, 2010 at 10:37 PM, Conrad Parker <conrad at metadecks.org>
> > wrote:
> >>
> >> On 24 August 2010 14:14, Jason Dagit <dagit at codersbase.com> wrote:
> >> > I'm not a semanticist, so I apologize right now if I say something
> >> > stupid or incorrect.
> >> >
> >> > On Mon, Aug 23, 2010 at 9:57 PM, Conal Elliott <conal at conal.net> wrote:
> >> >>>
> >> >>> So perhaps this could be a reasonable semantics?
> >> >>>
> >> >>> Iteratee a = [Char] -> Maybe (a, [Char])
> >> >>
> >> >> I've been tinkering with this model as well.
> >> >>
> >> >> However, it doesn't really correspond to the iteratee interfaces I've
> >> >> seen, since those interfaces allow an iteratee to notice size and
> >> >> number of chunks.  I suspect this ability is an accidental
> >> >> abstraction leak, which raises the question of how to patch the leak.
> >> >
> >> > From a purely practical viewpoint I feel that treating the chunking as
> >> > an abstraction leak might be missing the point.  If you said you wanted
> >> > the semantics to acknowledge the chunking but be invariant under the
> >> > size or number of the chunks then I would be happier.
> >>
> >> I think that's the point, ie. to specify what the invariants should
> >> be. For example (to paraphrase, very poorly, something Conal wrote on
> >> the whiteboard behind me):
> >>
> >> run [concat [chunk]] == run [chunk]
> >>
> >> ie. the (a, [Char]) you maybe get from running an iteratee over any
> >> partitioning of chunks should be the same, ie. the same as from
> >> running it over the concatenation of all chunks, which is the whole
> >> input [Char].
> >
> > I find this notation foreign.  I get [Char], that's the Haskell String
> > type, but what is [chunk]?  I doubt you mean a list of one element.
>
> sorry, that was just my way of writing "the list of chunks" or perhaps
> "the stream of chunks that represents the input".
>
> Conrad.
>
> >
> >>
> >> > I use iteratees when I need to be explicit about chunking and when I
> >> > don't want the resources to "leak outside" of the stream processing.
> >> > If you took those properties away, I wouldn't want to use it anymore
> >> > because then it would just be an inelegant way to do things.
> >>
> >> Then I suppose the model for Enumerators is different than that for
> >> Iteratees; part of the point of an Enumerator is to control the size
> >> of the chunks, so that needs to be part of the model. An Iteratee, on
> >> the other hand, should not have to know the size of its chunks. So you
> >> don't want to be able to know the length of a chunk (ie. a part of the
> >> stream), but you do want to be able to, say, fold over it, and to be
> >> able to stop the computation at any time (these being the main point
> >> of iteratees ...).
> >
> > I think I agree with that.
> > Jason
>