[Haskell-cafe] Sending messages up-and-down the iteratee-enumerator chain [Was: iterIO-0.1]

oleg at okmij.org oleg at okmij.org
Fri May 13 11:57:38 CEST 2011


David Mazie'res wrote:

> What you really want is the ability to send both upstream and
> downstream control messages.  Right now, I'd say iterIO has better
> support for upstream control messages, while iteratee has better
> support for downstream messages, since iteratee can just embed an
> Exception in a Stream.  (I'm assuming you could have something like a
> 'Flush' exception to cause output to be flushed by an Iteratee that
> was for some reason buffering some.)

> Can you explain how iteratee could keep track of the stream position?
> I'm not saying it's impossible, just that it's a challenging puzzle to
> make the types come out and I'd love to see the solution.  

The code described in this message does exactly that. We illustrate
enumerator's telling something to iteratees in the middle of the stream
(to Flush their buffers) as well as iteratee's asking an enumerator of
something (the stream position). The chunk of a stream and
EOF are themselves `control' messages that an enumerator may send;
the request for a new chunk, just like the request for a stream position,
is just one of the requests an iteratee may ask.

The set of messages an enumerator may send and the set of requests an
iteratee may ask are both treated as open unions.  We illustrate the
explicit coding of open unions, to let the type checker ensure that
what an iteratee may ask an enumerator can answer, and what an
enumerator may tell an iteratee can understand.

In process calculus lingo, we implement external (to iteratee) choice,
internal choice, and a form of session types.

For clarity, we implement a greatly simplified version of
iteratees. We assume a single-character chunk, which an iteratee
always consumes.  Chunking of a stream and look-ahead are orthogonal
concerns and have been discussed already.

The stream represents the external, producer choice:

> data Stream ie = 
>       Chunk Char              -- the current character
>     | SExc ie                 -- A message from the enumerator

The chunk is an ever-present option; other choices include EOF and
Flush:

> data EOF = EOF
> data Flush = Flush


The iteratee represents the internal, consumer choice:

> data Iter ee ie a = 
>       Done a
>     | Cont (Stream ie -> Iter ee ie a)
>     | IExc (ee (Iter ee ie) a)		-- other requests

Cont is a typical request from an iteratee. It is so typical that we
wire it in (we could've treated it as other requests, like Tell).
The iteratee is parametrised by what messages it understands
and what requests it may ask.

Iteratees compose as a monad:

> instance Bindable ee => Monad (Iter ee ie) where
>     return = Done
>     Done a  >>= f = f a
>     Cont k  >>= f = Cont (\x -> k x >>= f)
>     IExc ee >>= f = IExc (comp ee f)

All requests must be bindable, so they can percolate

> class Bindable ee where
>     comp :: Monad m => ee m a -> (a -> m b) -> ee m b

An exception is a sort of request (especially if the exception
is resumable)

> data Err m a  = Err (() -> m a)

> instance Bindable Err where
>     comp (Err k) f = Err (\x -> k x >>= f)


Another sort of request is to tell the position

> data Tell m a = Tell (Int -> m a)
> instance Bindable Tell where
>     comp (Tell k) f = Tell (\x -> k x >>= f)


We use Either (or higher-kinded E2) to build unions:

> class Sum e c where
>     inj :: e -> c
>     prj :: c -> Maybe e

> class Sum2 (e :: (* -> *) -> * -> *) (c :: (* -> *) -> * -> *) where
>     inj2 :: e m a -> c m a
>     prj2 :: c m a -> Maybe (e m a)


Iteratees are explicit in what they receive on the stream,
the external choices they may handle.
But they leave the requests polymorphic to ease composing with
other iteratees which may asks more requests.

Here is the simplest iteratee, which doesn't do anything but asks for
trouble

> ierr :: Sum2 Err c => Iter c ie a
> ierr = IExc . inj2 $ Err (\_ -> ierr)


A typical iteratee, like the head below, asks for little and accepts
little:

> iehead :: Sum2 Err c => Iter c EOF Char
> iehead = Cont step
>  where
>  step (Chunk a)  = Done a
>  step (SExc EOF) = ierr


We can ask for the current position:

> itell :: Sum2 Tell c => Iter c ie Int
> itell = IExc . inj2 $ Tell Done


Enumerators, in contrast, are explicit in what requests they may
satisfy, but implicit in what they may send on the stream.
A typical, small enumerator requires that an iteratee understand at
least EOF, and answers no requests beyond errors.

> en_str :: Sum EOF ie => String -> Iter Err ie x -> Iter Err ie x
> en_str _ i at Done{} = i
> en_str _ (IExc x) | Just (Err _) <- prj2 x = ierr
> en_str "" (Cont k) = k eof
> en_str (h:t) (Cont k) = en_str t $ k (Chunk h)

A typical enumeratee, like the following keeper of
positions, is explicit in requests it accepts: only Tell and Err.
The Tell requests are satisfied and not propagated. 
The stream messages are relayed from the outer to the inner stream:

> en_pos :: Int -> Iter (E2 Err Tell) ie x -> Iter Err ie x
> en_pos _ (Done x) = return x
> en_pos n (Cont k) = Cont (\s -> en_pos (n+1) (k s))
> en_pos _ (IExc x) | Just (Err _)  <- prj2 x = ierr
> en_pos n (IExc x) | Just (Tell k) <- prj2 x = en_pos n (k n)

Here are some of the examples:

> t1 = irun $ en_str "x" iehead

> -- Type error! en_str doesn't know how to handle Tell
> -- tb2 = irun $ en_str "x" itell

We interpose en_pos enumeratee to deal with positional requests, so
we can run  the whole example:

> t5 = irun $ en_str "xab" $ en_pos 0 $ iter
>  where
>  iter = do
> 	  x <- iehead
> 	  y <- ietell
> 	  return (x,y)

The complete code is available at
	http://okmij.org/ftp/Haskell/Iteratee/UpDown.hs





More information about the Haskell-Cafe mailing list