[Haskell-beginners] hGetContents, unicode and linux

Yitzchak Gale gale at sefer.org
Sun Nov 28 01:53:56 EST 2010


Michael Snoyman wrote:
> Perhaps a silly question, but are you certain that the input file is
> valid UTF-8?

That is a very good point.

> You could also try using the readFile from utf8-string...
> [or] read the contents as a lazy
> bytestring and then use the decode functions...

Both of those approaches are now deprecated. Either keep
doing what you are doing, which gives you conceptually
simple strings as lists of Char, or, for better efficiency,
use the text package:

>    import qualified Data.Text.Lazy.IO as T
>    main :: IO ()
>    main
>     = do   text <- T.readFile "unicode.txt"
>            T.putStr text

In any case, you still need to have the correct encoding
set on the handles as before. (And the input needs to
be valid for your selected encoding.)
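
For the plain String route, setting the encoding on the handles
might look roughly like the sketch below. It uses only System.IO
from base and assumes the file is meant to be UTF-8. The helper
names readUtf8File and printUtf8File are made up for this
illustration and do not come from any library.

```haskell
import System.IO

-- Hypothetical helper: open a file, force UTF-8 decoding on its handle,
-- and return the contents lazily as a String.
readUtf8File :: FilePath -> IO String
readUtf8File path = do
  h <- openFile path ReadMode
  hSetEncoding h utf8   -- override whatever the locale would have chosen
  hGetContents h        -- lazy read; the handle is closed once all input is consumed

-- Hypothetical helper: copy a UTF-8 file to stdout, encoding the output as UTF-8.
printUtf8File :: FilePath -> IO ()
printUtf8File path = do
  hSetEncoding stdout utf8
  contents <- readUtf8File path
  putStr contents
```

Calling printUtf8File "unicode.txt" from main should then behave
the same whatever locale the program is started in. If the file
is not valid UTF-8, the read will still fail with a decoding
error at the point where the bad bytes are consumed.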

Regards,
Yitz

