[Haskell-cafe] Composing a list of Enumeratees

oleg at okmij.org oleg at okmij.org
Sat Oct 8 08:27:40 CEST 2011

```Roma'n Gonza'lez wrote
> how can one compose a list of enumeratees, is it even possible?

It is possible. Composition of enumeratees is interesting because
-- there are several, distinct and useful ways of doing it,
-- one of the composition methods, shell-like pipelining, requires
higher-rank types.

Let us step back to Enumerators and Iteratees. Iteratee is a
monad and composes as a monad. A list of Iteratees can be fused into a
single Iteratee using the ordinary Control.Monad.sequence (or
sequence_).

Enumerator, an iteratee transformer, is a function of the type
Iteratee el m a -> m (Iteratee el m a)
which fits the patters (t -> m t) for some monad m. Enumerators thus
compose as ordinary `monadic functions' using Kleisli composition:

> (>>>) :: Monad m => (a -> m b) -> (b -> m c) -> (a -> m c)
> f >>> g = \x -> f x >>= g

(the operator (>>>) is defined in Control.Category). A list of
enumerators is composed with foldl1 (>>>), which is the same
as foldr1 (>>>) since (>>>) is associative.

As a running example of the list of enumeratees to compose we take

> takes = [take 2, take 3, take 4]

Granted, the example is a bit contrived: normally we want to compose a
variety of enumeratees rather than the single take.  On the other
hand, `take n' makes a neater example.

Enumeratees are enumerators, and so compose as enumerators,
through Kleisli composition:

> c1 :: Monad m => Enumeratee el el m a
> c1 = foldl1 (>>>) takes

Composing enumerators `concatenates' their sources. Therefore,
c1 is equivalent to take (2+3+4):

> c1r = runIdentity \$ run =<< run =<< enum_pure_1chunk [1..15] (c1 stream2list)
> -- [1,2,3,4,5,6,7,8,9]

If e_j is an enumeratee and i is an iteratee (for the inner stream),
(e_i i) is an iteratee for the outer stream. We may use various
iteratee compositions to compose (e_j i).

For example, we use sequential composition

> c2s :: Monad m => Iteratee el m a -> Iteratee el m [a]
> c2s = \i -> sequence \$ map (\e -> runI =<< e i) takes
> c2sr = runIdentity \$ run =<< enum_pure_1chunk [1..15] (c2s stream2list)
> -- [[1,2],[3,4,5],[6,7,8,9]]

or parallel composition

> c2p :: Monad m => Iteratee el m a -> Iteratee el m [a]
> c2p = \i -> parallel \$ map (\e -> runI =<< e i) takes
>  where
>  parallel [i] = liftM (: []) i
>  parallel (i:t) = do
>     (iv,tv) <- enumPair i (parallel t)
>     return (iv:tv)
>
> c2pr = runIdentity \$ run =<< enum_pure_1chunk [1..15] (c2p stream2list)
> -- [[1,2],[1,2,3],[1,2,3,4]]

Finally, we can compose enumeratees `telescopically': enumeratees are
stream converters, which can be arranged into a pipeline -- very much
like the Unix Shell pipeline -- to convert the elements of the outer
stream further and further.

> pipe :: Monad m =>
>  	(forall a. Enumeratee el1 el2 m a) -> Enumeratee el2 el3 m a ->
>  	Enumeratee el1 el3 m a
> pipe e12 e23 = \i3 -> e12 (e23 i3) >>= runI

(see the source code for the step-wise derivation of that
composition).

> c4 :: Monad m => Enumeratee el el m a
> c4 = take 4 `pipe` (take 3 `pipe` take 2)

Piping 'take n_j' into each other is not very interesting. It is
easy to see that the result is equivalent to the single
take (minimum [n_1,...n_j]).

> c4r = runIdentity \$ run =<< enum_pure_1chunk [1..15] (runI =<< c4 stream2list)
> -- [1,2]
> -- indeed, c4 behaves like take (minimum [2,3,4]) == take 2

To perform the pipeline composition on the list of enumeratees via
fold we need impredicative polymorphism -- or at least its emulation,
using the newtype trick. The trick makes type abstractions and
applications explicit.

> newtype EI el1 el2 m = EI{unEI :: forall a. Enumeratee el1 el2 m a}
> takes' = [EI (take 2), EI (take 3), EI (take 4)]

> c4'  = foldl1 (\e1 e2 -> EI (unEI e1 `pipe` unEI e2)) takes'
> c4'' = foldr1 (\e1 e2 -> EI (unEI e1 `pipe` unEI e2)) takes'
> -- both are equivalent to take (minimum [2,3,4])

The complete source code for the article is available at