[Haskell-cafe] ANNOUNCE: streaming-conduit

Sun Jun 11 20:55:30 UTC 2017

Dear Ivan, 

I'm confused. The package documentation states that the typical problem to avoid is extracting a long list from IO, which is bound to allocate too much memory. However, if I do

preprocess :: IO [a]
preprocess = fmap (map (parse :: String -> a) . lines) $ readFile filename

then this may create a huge pile of thunks if the rest of the program is not written in a streaming style. But if done right, lazy IO will make sure only the necessary part of the list is kept in memory. For example, a strict function like 

postprocess :: [a] -> IO ()
postprocess = mapM_ (print . f)

with some strict f would fit the bill. Did I get the concept of lazy IO wrong? What is the operational semantics of the following fragment?

x:xs <- someIOaction :: IO [a]
Control.Exception.evaluate (Control.DeepSeq.force x)

I used to think that xs is now a thunk which, when evaluated further, may trigger more (read) IO actions. Hence it would behave as if someIOaction was one of your Streams. I do acknowledge, though, that the style above is easy to get wrong [*], whereas explicitly wrapping each list element in its own monadic action makes the intentions more explicit. 
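
To make that mental model concrete, here is a minimal sketch, not taken from any package. The producer lazyCount is a hypothetical stand-in for someIOaction; I am assuming it is built with unsafeInterleaveIO, which is how I understand readFile-style lazy IO to behave. The counter records how many cons cells have actually been produced.

```haskell
import Control.DeepSeq (force)
import Control.Exception (evaluate)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

-- Hypothetical lazy producer: yields n, n+1, ... and bumps the counter
-- only when one more cons cell is actually demanded.
lazyCount :: IORef Int -> Int -> IO [Int]
lazyCount counter n = unsafeInterleaveIO $ do
  modifyIORef' counter (+ 1)
  rest <- lazyCount counter (n + 1)   -- returns an unevaluated thunk
  return (n : rest)

-- Reports the counter after the pattern match, after forcing the head,
-- and after consuming three more elements of the tail.
demo :: IO (Int, Int, [Int], Int)
demo = do
  counter <- newIORef 0
  x : xs <- lazyCount counter 0       -- the match forces one cons cell
  afterMatch <- readIORef counter
  _ <- evaluate (force x)             -- forcing x does not touch xs
  afterForce <- readIORef counter
  firstThree <- evaluate (force (take 3 xs))
  afterTake <- readIORef counter
  return (afterMatch, afterForce, firstThree, afterTake)

main :: IO ()
main = demo >>= print
```

If my understanding is right, this should print (1,1,[1,2,3],4): the tail stays a thunk until it is consumed, and each further element triggers exactly one more effect.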

Cheers,
Olaf

[*] If someIOaction needs to touch every list element in order to ensure the result has the shape _:_ then we will indeed have a huge list of thunks.
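
A minimal sketch of that failure mode, again with a hypothetical producer (strictCount, built on plain mapM, is my own illustration and not from any library): every effect has to run before the result has the shape _:_ at all.

```haskell
import Control.Exception (evaluate)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical strict producer: mapM in IO runs all n effects and
-- builds the whole list before returning it.
strictCount :: IORef Int -> Int -> IO [Int]
strictCount counter n =
  mapM (\i -> modifyIORef' counter (+ 1) >> return i) [0 .. n - 1]

-- Returns how many effects have run once only the head has been inspected.
demoStrict :: IO Int
demoStrict = do
  counter <- newIORef 0
  x : _ <- strictCount counter 1000
  _ <- evaluate x
  readIORef counter

main :: IO ()
main = demoStrict >>= print
```

Here I would expect 1000 rather than 1, since the whole list is materialised before the first element can be looked at.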