[Haskell-cafe] Reading files efficiently
Pete Chown
1 at 234.cx
Sun Mar 19 09:31:25 EST 2006
I've got another n00b question. Thanks for all the help you have been
giving me!
I want to read a text file. As an example, let's use
/usr/share/dict/words and try to print out the last line of the file.
First of all I came up with this program:
import System.IO
main = readFile "/usr/share/dict/words" >>= putStrLn . last . lines
This program gives the following error, presumably because there is an
ISO-8859-1 character in the dictionary:
"Program error: <handle>: IO.getContents: protocol error (invalid
character encoding)"
How can I tell the Haskell system that it is to read ISO-8859-1 text
rather than UTF-8?
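For reference, here is a minimal sketch of what selecting the encoding
explicitly could look like. It assumes a System.IO that exports
hSetEncoding and latin1, as the one shipped with later GHC releases does;
whether Hugs offers anything equivalent is not established here. The
helper name lastLineLatin1 is made up for illustration.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical helper: decode the file as ISO-8859-1 (Latin-1) and
-- return its last line, or "" if the file has no lines.
-- Assumes hSetEncoding and latin1 are available in System.IO.
lastLineLatin1 :: FilePath -> IO String
lastLineLatin1 path =
  withFile path ReadMode $ \h -> do
    hSetEncoding h latin1
    contents <- hGetContents h
    let ls     = lines contents
        result = if null ls then "" else last ls
    -- hGetContents is lazy, so force the result before withFile closes h.
    _ <- evaluate (length result)
    return result

main :: IO ()
main = lastLineLatin1 "/usr/share/dict/words" >>= putStrLn
```

Setting the encoding on the handle, rather than converting the file with
an external tool, leaves the dictionary on disk untouched.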
I then used iconv to convert the file to UTF-8 and tried again. This
time it worked, but it seems horribly inefficient -- Hugs took 2.8
seconds to read a 96,000-line file. By contrast the equivalent Python
program:
print open("words", "r").readlines()[-1]
took 0.05 seconds. I assume I must be doing something wrong here, and
somehow causing Haskell to use a particularly inefficient algorithm.
Can anyone give me any clues what I should be doing instead?
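One direction that might be worth trying is sketched below: read the file
as raw bytes instead of as a String, which is a lazily built linked list
of characters. This assumes GHC with the bytestring package installed, so
it is not a Hugs solution, and no timing is claimed for it. The helper
name lastLineBytes is made up for illustration.

```haskell
import qualified Data.ByteString.Char8 as B

-- Hypothetical helper: read the whole file strictly as bytes and return
-- the last line, or an empty ByteString if the file has no lines.
-- No character decoding happens, so ISO-8859-1 bytes pass through as-is.
lastLineBytes :: FilePath -> IO B.ByteString
lastLineBytes path = do
  contents <- B.readFile path
  let ls = B.lines contents
  return (if null ls then B.empty else last ls)

main :: IO ()
main = lastLineBytes "/usr/share/dict/words" >>= B.putStrLn
```

Because the bytes are written back out unchanged, the terminal has to be
expecting the same encoding as the file for non-ASCII characters to
display correctly.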
Thanks again,
Pete