[Haskell-cafe] What is the state of Unicode in Haskell
implementations?
Piotr Kalinowski
pitkali at gmail.com
Mon Jul 31 08:07:13 EDT 2006
On 31/07/06, Olof Bjarnason <olof.bjarnason at gmail.com> wrote:
> 1) reading UTF-8 coded text files into unicode-enabled Strings, lets call
> them UString
> 2) writing UStrings to UTF-8 coded text files
> 3) using unicode strings in-code, that is in my .hs files
>
In case of GHC:
String (Char, actually) is Unicode-enabled. The current stable version cannot
read UTF-8 encoded source files, though (I've written a converter to work
around this; it escapes the national characters). The development version,
however, can read UTF-8 encoded source files and represents the string
literals it reads as Unicode.
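The converter itself is not shown in this message. As a rough sketch of the idea only, assuming the source text has already been decoded into a String, non-ASCII characters can be replaced by numeric escapes with showLitChar from Data.Char. The name escapeNonAscii is hypothetical, and this naive version escapes every non-ASCII character, not only those inside string literals:

```haskell
import Data.Char (showLitChar)

-- Hypothetical helper: replace every non-ASCII character with a Haskell
-- numeric escape such as \243, leaving ASCII characters untouched.
-- showLitChar inserts "\&" when the following character is a digit, so
-- the escape cannot absorb it.
escapeNonAscii :: String -> String
escapeNonAscii = foldr step ""
  where
    step c rest
      | c > '\x7F' = showLitChar c rest
      | otherwise  = c : rest
```

The result is plain ASCII, so a compiler that only accepts ASCII source can read it, while the string literal still denotes the same Unicode characters.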
However, the IO is not aware of Unicode. So in order to do 1) and 2) you
have to
- read/write a stream of bytes encoding the text in UTF-8 from/to a file
- convert it to/from the Unicode representation (a String of Chars).
The first one is just about reading/writing using normal IO operations. The
second can be done with the following module:
http://repetae.net/john/repos/jhc/UTF8.hs
Note also that the same procedure would apply to simply printing/reading
to/from the screen.
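The two steps can be sketched as follows. This is not the UTF8.hs module linked above; the names encodeUtf8, decodeUtf8, readFileUtf8 and writeFileUtf8 are made up for illustration, and the sketch assumes that a handle opened with openBinaryFile delivers one Char per byte. The decoder substitutes U+FFFD for malformed input but does not reject overlong encodings or surrogates:

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)
import System.IO

-- Encode a Unicode String as a list of UTF-8 bytes.
encodeUtf8 :: String -> [Word8]
encodeUtf8 = concatMap (map fromIntegral . encodeChar . ord)
  where
    encodeChar :: Int -> [Int]
    encodeChar c
      | c < 0x80    = [c]
      | c < 0x800   = [0xC0 .|. (c `shiftR` 6), cont c]
      | c < 0x10000 = [0xE0 .|. (c `shiftR` 12), cont (c `shiftR` 6), cont c]
      | otherwise   = [ 0xF0 .|. (c `shiftR` 18), cont (c `shiftR` 12)
                      , cont (c `shiftR` 6), cont c ]
    -- Continuation byte: 10xxxxxx carrying the low six bits.
    cont x = 0x80 .|. (x .&. 0x3F)

-- Decode UTF-8 bytes into a String; malformed bytes become U+FFFD.
decodeUtf8 :: [Word8] -> String
decodeUtf8 [] = []
decodeUtf8 (b:bs)
  | b < 0x80           = chr (fromIntegral b) : decodeUtf8 bs
  | b .&. 0xE0 == 0xC0 = multi 1 (b .&. 0x1F)
  | b .&. 0xF0 == 0xE0 = multi 2 (b .&. 0x0F)
  | b .&. 0xF8 == 0xF0 = multi 3 (b .&. 0x07)
  | otherwise          = replacement : decodeUtf8 bs
  where
    replacement = '\xFFFD'
    -- Consume n continuation bytes after a lead byte with payload 'lead'.
    multi :: Int -> Word8 -> String
    multi n lead =
      let (tailBytes, rest) = splitAt n bs
          code = foldl step (fromIntegral lead) tailBytes :: Int
      in if length tailBytes == n && all isCont tailBytes && code <= 0x10FFFF
           then chr code : decodeUtf8 rest
           else replacement : decodeUtf8 bs
    isCont x = x .&. 0xC0 == 0x80
    step acc x = (acc `shiftL` 6) .|. fromIntegral (x .&. 0x3F)

-- Step 1 of the question: read raw bytes, then decode them.
readFileUtf8 :: FilePath -> IO String
readFileUtf8 path = do
  h <- openBinaryFile path ReadMode
  raw <- hGetContents h            -- each Char here is a byte, 0..255
  return (decodeUtf8 (map (fromIntegral . ord) raw))

-- Step 2 of the question: encode, then write raw bytes.
writeFileUtf8 :: FilePath -> String -> IO ()
writeFileUtf8 path s = do
  h <- openBinaryFile path WriteMode
  hPutStr h (map (chr . fromIntegral) (encodeUtf8 s))
  hClose h
```

For the screen, the same encodeUtf8 step would sit in front of putStr, presumably after putting stdout into binary mode with hSetBinaryMode stdout True.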
Does that help?
Regards,
Piotr Kalinowski
--
Intelligence is like a river: the deeper it is, the less noise it makes