[Haskell-cafe] Haskell & monads for newbies

Sun Jul 15 15:28:56 EDT 2007

On Sunday 15 July 2007, Paul Moore wrote:
> On 15/07/07, Andrew Coppin <andrewcoppin at btinternet.com> wrote:
> > I guess because in most normal programming languages you can do I/O
> > anywhere you damn like, it doesn't occur to most programmers that it's
> > possible to make a separation. (Most seem to realise that, e.g., mixing
> > business logic with GUI code is a Bad Thing though...)
>
> Hmm, I would speculate (I have no hard data, in other words...) that
> it's more the case that in imperative languages, you do I/O throughout
> the program, because that defers the I/O (which is slow) to the last
> possible moment, and it allows you to reuse memory buffers.
>
> People's intuition about performance and memory usage says that
> delaying I/O is good, and "separating" I/O and logic (which is taken
> to mean slurping data in all at once, and then processing it) is
> memory intensive and risks doing unnecessary I/O.
>
> Haskell handles this with laziness. The canonical example is counting
> characters in a file, where you just grab the whole file, and use
> length. An imperative programmer's intuition says that this wastes
> huge amounts of memory compared to reading character by character and
> incrementing a count. Lazy I/O means that no more than 1 character
> needs to be in RAM at any one time, without the programmer needing to do
> the bookkeeping.
>
> If lazy I/O was publicised in this way, as separation of concerns (I/O
> and processing) with the compiler and language handling the work of
> minimising memory use and avoiding unnecessary I/O, then maybe the
> message might get through better. However, the only article I've ever
> seen taking this approach (http://blogs.nubgames.com/code/?p=22)
> didn't seem to get a good reception in the Haskell community, sparking
> comments that hGetContents and similar functions had a number of
> issues which made them "bad practice". The result was to leave me with
> a feeling that separating I/O and processing in Haskell really was
> hard, but I never quite understood why...

Because hGetContents only buys you laziness /if you use it lazily/.  And 
laziness is, technically, a denotational property, but it is a very 
operational-feeling denotational property.  And operational reasoning is 
difficult in imperative languages and gets really, really hard in lazy 
functional languages.  And the article you cite falls flat on its face in 
trying to be lazy:

> readWithIncludes :: String -> IO [String]
> readWithIncludes f = do
>   s <- readFile f
>   ss <- mapM expandIncludes (lines s)
>   return (concat ss)

> expandIncludes :: String -> IO [String]
> expandIncludes s =
>   if isInclude s
>     then
>       readWithIncludes (includeFile s)
>     else
>       return [s]

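That's calling mapM, a strict function, on the result of lines s, which is an
arbitrarily long list.  In IO, mapM has to run every action before it returns
anything, so the whole file (and every include) is read before the first line
of output is available.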
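
To make the mapM point concrete, here is a minimal sketch that counts how many
actions run before the first result can be used.  The names visit, countWith
and lazyMapM are made up for illustration (visit stands in for expandIncludes
and reads no files), and lazyMapM is only one possible way to defer the work:
it uses unsafeInterleaveIO, the same primitive GHC's lazy I/O is built on, so
it inherits all the usual caveats about when effects happen.

```haskell
import Control.Exception (evaluate)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

-- Hypothetical stand-in for expandIncludes: note that the action ran,
-- then return the line unchanged.
visit :: IORef Int -> String -> IO [String]
visit counter s = do
  modifyIORef' counter (+ 1)
  return [s]

-- Traverse the lines with the given mapM-like function, demand only the
-- first element of the concatenated result, and report how many actions
-- had run by that point.
countWith :: ((String -> IO [String]) -> [String] -> IO [[String]])
          -> [String] -> IO Int
countWith traverseLines ls = do
  counter <- newIORef 0
  ss <- traverseLines (visit counter) ls
  _ <- evaluate (length (take 1 (concat ss)))
  readIORef counter

-- A deferred variant of mapM (an assumption of this sketch, not a library
-- function): each action runs only when its list cell is demanded.
lazyMapM :: (a -> IO b) -> [a] -> IO [b]
lazyMapM _ []       = return []
lazyMapM f (x : xs) = unsafeInterleaveIO $ do
  y  <- f x
  ys <- lazyMapM f xs
  return (y : ys)

-- mapM in IO: every action runs before the result list exists.
strictCount :: [String] -> IO Int
strictCount = countWith mapM

-- lazyMapM: only the actions whose results are demanded run.
lazyCount :: [String] -> IO Int
lazyCount = countWith lazyMapM
```

In GHCi, strictCount ["a", "b", "c"] gives 3 and lazyCount ["a", "b", "c"]
gives 1, even though both callers look at the first line only.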

More generally, I suspect the Haskell community has a collective memory of 
stream I/O, back when this sort of thing used to be /really, really 
important/, because your program had type [Response] -> [Request] and if it 
wasn't lazy enough in its argument, you'd get a deadlock --- and that 
deadlock had nothing whatsoever to do with the result of applying your 
function to total arguments, so reasoning about it required abandoning every 
Haskeller's instinct to reason about functions only over total (or even 
finite total) arguments.  interact takes a function with a type eerily 
similar to [Response] -> [Request], which means its argument has all the same 
problems.  Laziness is great and everything --- but it's a lot of work, even 
in Haskell.
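
A small sketch of the kind of trap I mean, using [String] -> [String] as a
stand-in for [Response] -> [Request].  Both function names and the dialogue
itself are invented for this example.  The two functions agree on every total
argument, yet only one of them emits its first request before looking at the
input, and under stream I/O the first response does not exist until that
request has gone out.

```haskell
-- Too strict: pattern-matches on the responses before producing anything,
-- so the first request cannot be emitted until the first response exists.
eagerDialogue :: [String] -> [String]
eagerDialogue (name : _) = ["What is your name?", "Hello, " ++ name]
eagerDialogue []         = ["What is your name?"]

-- Lazy enough: the first request is produced before the responses are
-- inspected at all.
lazyDialogue :: [String] -> [String]
lazyDialogue responses =
  "What is your name?" : case responses of
    (name : _) -> ["Hello, " ++ name]
    []         -> []
```

head (lazyDialogue undefined) returns the prompt, while
head (eagerDialogue undefined) diverges.  Either function can be wired up as
interact (unlines . f . lines), and the same difference shows up there: the
eager version waits for a line of input before it prints its question.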

> So I guess that leaves me with the question: is separating I/O and
> processing really the right thing to do (in terms of memory usage and
> performance) in Haskell, and if so, why isn't it advertised more? (And
> for extra credit, please explain why the article I quoted above didn't
> make more of an impact in the Haskell community... :-))

Jonathan Cast
http://sourceforge.net/projects/fid-core
http://sourceforge.net/projects/fid-emacs