[Haskell-cafe] Unwrapping long lines in text files

Ben Millwood haskell at benmachine.co.uk
Sat Aug 14 19:07:18 EDT 2010


On Sat, Aug 14, 2010 at 2:38 AM, michael rice <nowgate at yahoo.com> wrote:
>
> The program below takes a text file and unwraps all lines to 72 columns, but I'm getting an end of file message at the top of my output.
>
> How do I lose the EOF?
>
> Michael
>

While many other people have shown you why you need not necessarily
answer this question, I think it'd be helpful for you to hear the
answer anyway.
Your message is being produced because you are calling getLine when
there is no input left. This raises an exception, and because your
program does not handle it, the runtime prints a diagnostic message and
exits.
Strangely, the message appears before the output of your program. This
isn't terribly important, but for the sake of completeness, it's
because stdout and stderr are buffered differently: stderr is
unbuffered, so the error is printed immediately, while stdout (when it
is not attached to a terminal) is block-buffered, so your output can
sit in the buffer until the program terminates. At least, I think
that's what is happening here.

So, how do you avoid the exception? You can use System.IO.isEOF [1] to
check if there is input available on the standard input handle:

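import Control.Monad (unless)
import System.IO (isEOF)

main = do
  eof <- isEOF
  unless eof realMain
  -- unless from Control.Monad; same as: when (not eof) realMain
 where
  realMain = do realStuff  -- realStuff is a placeholder for your code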
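
To make that concrete, here is a minimal sketch of a complete
read-until-EOF loop built on isEOF. The names processLine and loop are
made up for illustration (processLine just stands in for whatever you
do with each line), so treat this as one possible shape rather than the
way to write it:

```haskell
import Control.Monad (unless)
import System.IO (isEOF)

-- Hypothetical stand-in for the real per-line work.
processLine :: String -> String
processLine line = "> " ++ line

main :: IO ()
main = loop
  where
    -- Check for end of input *before* calling getLine, so getLine
    -- never gets the chance to raise the EOF exception.
    loop = do
      eof <- isEOF
      unless eof $ do
        line <- getLine
        putStrLn (processLine line)
        loop
```

The recursion stops by itself: once isEOF returns True, unless skips
the body and loop simply returns.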

Or you can let getLine throw the exception, but catch it and deal with
it yourself, rather than letting it kill your program.
Catching it is a little less simple, but is considerably more flexible
and powerful. The exceptions situation in Haskell is somewhat
complicated by the fact that the mechanism used by Haskell 98 has been
improved upon, but the new extensible mechanism is incompatible with
the old one, so both have to hang around. Have a look at
Control.Exception [2] and System.IO.Error [3]. In either case you have
'catch' as the basic operation, and more convenient things like 'try',
which runs an action and hands you the outcome as an Either value
(Left for an exception, Right for a normal result) instead of letting
the exception propagate. I'd do something like this:

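import Control.Exception (IOException, try)

main = do
  -- the annotation tells try which exception type to catch
  result <- try getLine :: IO (Either IOException String)
  case result of
    Left _err -> return () -- or do something diagnostic
    Right "" -> putStrLn "" >> main
    Right line -> doStuffWith line -- doStuffWith is a placeholder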
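
If you want to be a bit more careful, you can tell end-of-file apart
from other I/O errors with isEOFError from System.IO.Error. The sketch
below is only one way to arrange it; handleLine is a made-up stand-in
for the real per-line work, and re-throwing the other errors is my
assumption about what you'd want:

```haskell
import Control.Exception (IOException, throwIO, try)
import Data.Char (toUpper)
import System.IO.Error (isEOFError)

-- Hypothetical stand-in for the real per-line work.
handleLine :: String -> String
handleLine = map toUpper

main :: IO ()
main = do
  -- The annotation fixes the exception type that try should catch.
  result <- try getLine :: IO (Either IOException String)
  case result of
    Left err
      | isEOFError err -> return ()    -- end of input: stop quietly
      | otherwise      -> throwIO err  -- any other I/O error: re-raise
    Right line -> putStrLn (handleLine line) >> main
```

Here the Left branch is the termination condition: main only recurses
after a line has been read successfully.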

On a more general note, your main function looks a little suspect
because it *always* recurses into itself - there's no termination
condition. A question nearly as important as "how does my program know
what to do" is "how does it know when to stop" :)

[1] http://hackage.haskell.org/packages/archive/base/4.2.0.1/doc/html/System-IO.html#v%3AisEOF
[2] http://hackage.haskell.org/packages/archive/base/4.2.0.1/doc/html/Control-Exception.html
[3] http://hackage.haskell.org/packages/archive/base/4.2.0.1/doc/html/System-IO-Error.html

