[Haskell-cafe] Streaming a Conduit into a lazy list

Jules Bean jules at jellybean.co.uk
Wed Nov 4 13:14:09 UTC 2015


Yay! That’s great.

The point of course is that you shouldn’t need to do a lazyIO trick when there is no IO going on.

My actual application of this was in Text.XML.Stream.Parse which certainly can produce output lazily without doing IO.

Jules

On 4 Nov 2015, at 13:04, Michael Snoyman <michael at snoyman.com> wrote:

> This got me curious, so I just added a `sourceToList` function to master:
> 
> https://github.com/snoyberg/conduit/commit/289f671cb7669c4aec78d8e77f01f2ace165d73a
> 
> On Wed, Nov 4, 2015 at 4:43 AM, Michael Snoyman <michael at snoyman.com> wrote:
> conduit isn't designed to be used in this way, though in theory such a lazy function would be possible. To get the same effect, you can (ab)use the Data.Conduit.Lazy module, which provides a lazy I/O escape hatch. In this case:
> 
> > lazyConsume source123Error >>= print
> 
> On Wed, Nov 4, 2015 at 12:59 AM, Jules Bean <jules at jellybean.co.uk> wrote:
> Conduits seem like a popular and useful way to compose operations
> which consume and produce streams at various speeds. I hadn't used
> them before, so apologies for any obvious mistakes below.
> 
> Conduits have lots of power in terms of how you can interleave monadic
> effects with the streaming; but I found that I had a stream with no
> effects at all and I wanted to convert it back into haskells more
> simplistic representation, lazy lists.
> 
> This is probably a well-known trick, but I couldn't find how to do this
> from googling, so here is the solution I found in case it helps
> someone else:
> 
> > {-# LANGUAGE DeriveDataTypeable, FlexibleContexts  #-}
> > module Main where
> >
> > import Data.Conduit
> > import Control.Monad.Catch
> > import Control.Monad.Trans.Resource
> > import Control.Monad.Writer
> > import Control.Monad.Identity
> > import qualified Data.Conduit.List as CL
> > import Data.Typeable
> >
> > source123Error :: Monad m => Source m Int
> > source123Error = do
> >  yield 1
> >  yield 2
> >  yield 3
> >  error "error"
> 
> A source which produces some data and then hangs, to test lazy production.
> 
> One solution which doesn't work is CL.consume (as its own docs say) - and instead this happens:
> 
> *Main> :t runIdentity (source123Error $$ CL.consume)
> runIdentity (source123Error $$ CL.consume) :: [Int]
> *Main> runIdentity (source123Error $$ CL.consume)
> *** Exception: error
> 
> But what we can do instead is push the data out via the Writer monad:
> 
> > tellEverything :: MonadWriter [a] m => Sink a m ()
> > tellEverything = awaitForever (\x -> tell [x])
> 
> *Main> :t source123Error $$ tellEverything
> source123Error $$ tellEverything :: MonadWriter [Int] m => m ()
> *Main> snd $ runWriter (source123Error $$ tellEverything)
> [1,2,3*** Exception: error
> 
> Success! A lazily produced list.
> 
> If your output is of any size - and depending on the compositions
> pattern in your code - it may be much faster to use this version:
> 
> > tellEveryEndo :: MonadWriter (Endo [a]) m => Sink a m ()
> > tellEveryEndo = awaitForever (\x -> tell (Endo (x:)))
> 
> *Main> ($[]) . appEndo . snd . runWriter $ (source123Error $$ tellOne)
> [1,2,3*** Exception: error
> 
> I found a further case where the Conduit I was trying to run was pure
> but had a 'MonadThrow' constraint. You can use the same approach here,
> using the ExceptionT transformer to satisfy the MonadThrow constraint.
> 
> > data MyError = MyError deriving (Show,Typeable)
> > instance Exception MyError
> >
> > source123Throw :: MonadThrow m => Source m Int
> > source123Throw = do
> > yield 1
> > yield 2
> > yield 3
> > throwM MyError
> 
> *Main Control.Monad.Except Control.Monad.Trans.Resource> snd . runWriter . runExceptionT $ (source123Error $$ tellEverything)
> [1,2,3*** Exception: error
> *Main Control.Monad.Except Control.Monad.Trans.Resource> snd . runWriter . runExceptionT $ (source123Throw $$ tellEverything)
> [1,2,3]
> 
> An aside: I tried to measure the speed difference between the
> list-mappend and Endo-mappend approaches with the following
> code. count2N uses a binary-tree shaped recursion so it should be
> fairly bad for list-mappend with lots of long lists on the left.
> 
> > count2N :: Monad m => Int -> Source m Int
> > count2N 0 = yield 0
> > count2N n = count2N (n-1) >> count2N (n-1)
> 
> > speedTest1 = print . length . snd . runWriter $ (count2N 24 $$ tellEverything)
> > speedTest2 = print . length . ($[]) . appEndo . snd . runWriter $ (count2N 24 $$ tellEveryEndo)
> 
> With -O or -O2 I measure no speed difference between these, they both
> take a bit over 2 seconds. With no optimisation flag, speedTest2 is
> slower, 17 seconds vs 11 seconds.
> 
> I'm quite surprised the Endo version isn't faster, it seems like
> something is rewriting those list appends?
> 
> Cheers,
> 
> Jules
> 
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/haskell-cafe/attachments/20151104/c3123068/attachment.html>


More information about the Haskell-Cafe mailing list