[Haskell-cafe] Streaming a Conduit into a lazy list

Jules Bean jules at jellybean.co.uk
Wed Nov 4 08:59:48 UTC 2015


Conduits seem like a popular and useful way to compose operations
which consume and produce streams at various speeds. I hadn't used
them before, so apologies for any obvious mistakes below.

Conduits have lots of power in terms of how you can interleave monadic
effects with the streaming; but I found that I had a stream with no
effects at all and I wanted to convert it back into haskells more
simplistic representation, lazy lists.

This is probably a well-known trick, but I couldn't find how to do this
from googling, so here is the solution I found in case it helps
someone else:

> {-# LANGUAGE DeriveDataTypeable, FlexibleContexts  #-}
> module Main where 
> 
> import Data.Conduit
> import Control.Monad.Catch
> import Control.Monad.Trans.Resource
> import Control.Monad.Writer
> import Control.Monad.Identity
> import qualified Data.Conduit.List as CL
> import Data.Typeable
> 
> source123Error :: Monad m => Source m Int
> source123Error = do
>  yield 1
>  yield 2
>  yield 3
>  error "error"

A source which produces some data and then hangs, to test lazy production. 

One solution which doesn't work is CL.consume (as its own docs say) - and instead this happens:

*Main> :t runIdentity (source123Error $$ CL.consume)
runIdentity (source123Error $$ CL.consume) :: [Int]
*Main> runIdentity (source123Error $$ CL.consume)
*** Exception: error

But what we can do instead is push the data out via the Writer monad:

> tellEverything :: MonadWriter [a] m => Sink a m ()
> tellEverything = awaitForever (\x -> tell [x])

*Main> :t source123Error $$ tellEverything
source123Error $$ tellEverything :: MonadWriter [Int] m => m ()
*Main> snd $ runWriter (source123Error $$ tellEverything)
[1,2,3*** Exception: error

Success! A lazily produced list.

If your output is of any size - and depending on the compositions
pattern in your code - it may be much faster to use this version:

> tellEveryEndo :: MonadWriter (Endo [a]) m => Sink a m ()
> tellEveryEndo = awaitForever (\x -> tell (Endo (x:)))

*Main> ($[]) . appEndo . snd . runWriter $ (source123Error $$ tellOne)
[1,2,3*** Exception: error

I found a further case where the Conduit I was trying to run was pure
but had a 'MonadThrow' constraint. You can use the same approach here,
using the ExceptionT transformer to satisfy the MonadThrow constraint.

> data MyError = MyError deriving (Show,Typeable)
> instance Exception MyError
> 
> source123Throw :: MonadThrow m => Source m Int
> source123Throw = do
> yield 1
> yield 2
> yield 3
> throwM MyError

*Main Control.Monad.Except Control.Monad.Trans.Resource> snd . runWriter . runExceptionT $ (source123Error $$ tellEverything)
[1,2,3*** Exception: error
*Main Control.Monad.Except Control.Monad.Trans.Resource> snd . runWriter . runExceptionT $ (source123Throw $$ tellEverything)
[1,2,3]

An aside: I tried to measure the speed difference between the
list-mappend and Endo-mappend approaches with the following
code. count2N uses a binary-tree shaped recursion so it should be
fairly bad for list-mappend with lots of long lists on the left.

> count2N :: Monad m => Int -> Source m Int
> count2N 0 = yield 0
> count2N n = count2N (n-1) >> count2N (n-1)

> speedTest1 = print . length . snd . runWriter $ (count2N 24 $$ tellEverything)
> speedTest2 = print . length . ($[]) . appEndo . snd . runWriter $ (count2N 24 $$ tellEveryEndo)

With -O or -O2 I measure no speed difference between these, they both
take a bit over 2 seconds. With no optimisation flag, speedTest2 is
slower, 17 seconds vs 11 seconds. 

I'm quite surprised the Endo version isn't faster, it seems like
something is rewriting those list appends?

Cheers,

Jules



More information about the Haskell-Cafe mailing list