[Haskell-cafe] Streaming a Conduit into a lazy list

Wed Nov 4 13:04:40 UTC 2015

This got me curious, so I just added a `sourceToList` function to master:

https://github.com/snoyberg/conduit/commit/289f671cb7669c4aec78d8e77f01f2ace165d73a

On Wed, Nov 4, 2015 at 4:43 AM, Michael Snoyman <michael at snoyman.com> wrote:

> conduit isn't designed to be used in this way, though in theory such a
> lazy function would be possible. To get the same effect, you can (ab)use
> the Data.Conduit.Lazy module, which provides a lazy I/O escape hatch. In
> this case:
>
> > lazyConsume source123Error >>= print
>
> On Wed, Nov 4, 2015 at 12:59 AM, Jules Bean <jules at jellybean.co.uk> wrote:
>
>> Conduits seem like a popular and useful way to compose operations
>> which consume and produce streams at various speeds. I hadn't used
>> them before, so apologies for any obvious mistakes below.
>>
>> Conduits have lots of power in terms of how you can interleave monadic
>> effects with the streaming; but I found that I had a stream with no
>> effects at all and I wanted to convert it back into haskells more
>> simplistic representation, lazy lists.
>>
>> This is probably a well-known trick, but I couldn't find how to do this
>> from googling, so here is the solution I found in case it helps
>> someone else:
>>
>> > {-# LANGUAGE DeriveDataTypeable, FlexibleContexts  #-}
>> > module Main where
>> >
>> > import Data.Conduit
>> > import Control.Monad.Catch
>> > import Control.Monad.Trans.Resource
>> > import Control.Monad.Writer
>> > import Control.Monad.Identity
>> > import qualified Data.Conduit.List as CL
>> > import Data.Typeable
>> >
>> > source123Error :: Monad m => Source m Int
>> > source123Error = do
>> >  yield 1
>> >  yield 2
>> >  yield 3
>> >  error "error"
>>
>> A source which produces some data and then hangs, to test lazy production.
>>
>> One solution which doesn't work is CL.consume (as its own docs say) - and
>> instead this happens:
>>
>> *Main> :t runIdentity (source123Error $$ CL.consume)
>> runIdentity (source123Error $$ CL.consume) :: [Int]
>> *Main> runIdentity (source123Error $$ CL.consume)
>> *** Exception: error
>>
>> But what we can do instead is push the data out via the Writer monad:
>>
>> > tellEverything :: MonadWriter [a] m => Sink a m ()
>> > tellEverything = awaitForever (\x -> tell [x])
>>
>> *Main> :t source123Error $$ tellEverything
>> source123Error $$ tellEverything :: MonadWriter [Int] m => m ()
>> *Main> snd $ runWriter (source123Error $$ tellEverything)
>> [1,2,3*** Exception: error
>>
>> Success! A lazily produced list.
>>
>> If your output is of any size - and depending on the compositions
>> pattern in your code - it may be much faster to use this version:
>>
>> > tellEveryEndo :: MonadWriter (Endo [a]) m => Sink a m ()
>> > tellEveryEndo = awaitForever (\x -> tell (Endo (x:)))
>>
>> *Main> ($[]) . appEndo . snd . runWriter $ (source123Error $$ tellOne)
>> [1,2,3*** Exception: error
>>
>> I found a further case where the Conduit I was trying to run was pure
>> but had a 'MonadThrow' constraint. You can use the same approach here,
>> using the ExceptionT transformer to satisfy the MonadThrow constraint.
>>
>> > data MyError = MyError deriving (Show,Typeable)
>> > instance Exception MyError
>> >
>> > source123Throw :: MonadThrow m => Source m Int
>> > source123Throw = do
>> > yield 1
>> > yield 2
>> > yield 3
>> > throwM MyError
>>
>> *Main Control.Monad.Except Control.Monad.Trans.Resource> snd . runWriter
>> . runExceptionT $ (source123Error $$ tellEverything)
>> [1,2,3*** Exception: error
>> *Main Control.Monad.Except Control.Monad.Trans.Resource> snd . runWriter
>> . runExceptionT $ (source123Throw $$ tellEverything)
>> [1,2,3]
>>
>> An aside: I tried to measure the speed difference between the
>> list-mappend and Endo-mappend approaches with the following
>> code. count2N uses a binary-tree shaped recursion so it should be
>> fairly bad for list-mappend with lots of long lists on the left.
>>
>> > count2N :: Monad m => Int -> Source m Int
>> > count2N 0 = yield 0
>> > count2N n = count2N (n-1) >> count2N (n-1)
>>
>> > speedTest1 = print . length . snd . runWriter $ (count2N 24 $$
>> tellEverything)
>> > speedTest2 = print . length . ($[]) . appEndo . snd . runWriter $
>> (count2N 24 $$ tellEveryEndo)
>>
>> With -O or -O2 I measure no speed difference between these, they both
>> take a bit over 2 seconds. With no optimisation flag, speedTest2 is
>> slower, 17 seconds vs 11 seconds.
>>
>> I'm quite surprised the Endo version isn't faster, it seems like
>> something is rewriting those list appends?
>>
>> Cheers,
>>
>> Jules
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/haskell-cafe/attachments/20151104/7d02d6a6/attachment.html>