[Haskell-cafe] How to use Unicode strings?

Austin Seipp mad.one at gmail.com
Sat Nov 22 09:07:42 EST 2008


Excerpts from Dmitri O.Kondratiev's message of Sat Nov 22 05:40:41 -0600 2008:
> Please advise how to write Unicode string, so this example would work:
> 
> main = do
>   putStrLn "Les signes orthographiques inclus les accents (aigus, grâve,
> circonflexe), le tréma, l'apostrophe, la cédille, le trait d'union et la
> majuscule."
> 
> I get the following error:
> hello.hs:4:68:
>     lexical error in string/character literal (UTF-8 decoding error)
> Failed, modules loaded: none.
> Prelude>
> 
> Also, how to read Unicode characters from standard input?
> 
> Thanks!
> 

Hi,

Check out the utf8-string package on hackage:

http://hackage.haskell.org/cgi-bin/hackage-scripts/package/utf8-string

In particular, you probably want the System.IO.UTF8 functions, which
have the same names and types as their non-UTF-8 counterparts in
System.IO, except that they handle Unicode properly.

More specifically, the two functions to look at are
Codec.Binary.UTF8.String.encodeString and decodeString. In fact, most
of the System.IO.UTF8 functions are defined in terms of these, e.g.
'putStrLn x = IO.putStrLn (encodeString x)' and 'getLine =
liftM decodeString IO.getLine'.
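
As a rough sketch of what those two functions do on their own, here is
a small pure round trip. It assumes the utf8-string package is
installed. The names cedille, encoded and decoded are made up for
illustration, and the accented character is written as a numeric
escape so the example does not depend on how the source file is saved.
It deliberately avoids writing the encoded string to a handle, since
how a handle treats those bytes can depend on the GHC version.

```haskell
-- Requires the utf8-string package (assumed to be installed).
import Codec.Binary.UTF8.String (encodeString, decodeString)

-- "cédille", with the accented character as a numeric escape (U+00E9)
-- so that this file is plain ASCII. (Hypothetical example value.)
cedille :: String
cedille = "c\233dille"

-- encodeString turns each Unicode character into its UTF-8 bytes,
-- with every byte stored as one Char in the range 0..255.
encoded :: String
encoded = encodeString cedille

-- decodeString goes the other way: a String of UTF-8 bytes back to
-- a String of Unicode characters.
decoded :: String
decoded = decodeString encoded

main :: IO ()
main = do
  print (length cedille)      -- 7 characters
  print (length encoded)      -- 8 bytes: U+00E9 takes two bytes in UTF-8
  print (decoded == cedille)  -- True: the round trip is lossless
```

The wrappers quoted above apply the same idea at the I/O boundary:
encode just before writing, decode just after reading.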

Austin

