[Haskell-cafe] Hugs vs GHC (again) was: Re: Some random newbie questions

Malcolm Wallace Malcolm.Wallace at cs.york.ac.uk
Fri Jan 7 07:29:47 EST 2005

"Simon Marlow" <simonmar at microsoft.com> writes:

> Here's a summary of the state of Unicode support in GHC and other
> compilers.  There are several aspects:
>  - Can the Char type hold the full range of Unicode characters?
>    This has been true in GHC for some time, and is now true in Hugs.
>    I don't think it's true in nhc98 (please correct me if I'm wrong).

You're wrong :-).  nhc98 has always had 32-bit characters internally.
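
A quick way to check this claim on any given implementation is to ask
Char for its upper bound.  The sketch below is only an illustration;
the names maxCodePoint, charIsFullUnicode and roundTripsAstral are made
up for this example, and it assumes nothing beyond the standard
Data.Char functions.

```haskell
import Data.Char (chr, ord)

-- | The largest code point a Char can hold in this implementation.
maxCodePoint :: Int
maxCodePoint = ord (maxBound :: Char)

-- | True when Char covers all of Unicode (U+0000 .. U+10FFFF).
charIsFullUnicode :: Bool
charIsFullUnicode = maxCodePoint >= 0x10FFFF

-- | Round-trip a code point outside the 16-bit range through Char.
-- On an implementation with a narrower Char, chr would fail here.
roundTripsAstral :: Bool
roundTripsAstral = ord (chr 0x1D11E) == 0x1D11E
```

Evaluating charIsFullUnicode and roundTripsAstral at an interpreter
prompt should give True wherever Char really does span the full range.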

>  - Do the character class functions (isUpper, isAlpha etc.) work
>    correctly on the full range of Unicode characters?  This is true in
>    Hugs.  It's true with GHC on some systems (basically we were lazy
>    and used the underlying C library's support here, which is patchy).

In nhc98, currently the character class functions work only on the
8-bit Latin-1 range.
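
To see how far the character class functions reach on a particular
system, one can probe a few characters inside and outside Latin-1.
This is a rough sketch, not a conformance test: classProbes and
failedProbes are hypothetical names, and the handful of code points
was picked arbitrarily.  Characters are written as numeric escapes so
that the source file itself stays plain ASCII.

```haskell
import Data.Char (isAlpha, isLower, isUpper, toUpper)

-- | A few spot checks; only the first one lies inside Latin-1.
classProbes :: [(String, Bool)]
classProbes =
  [ ("isUpper U+00C9 (Latin-1 E acute)",     isUpper '\x00C9')
  , ("isUpper U+03A9 (Greek capital omega)", isUpper '\x03A9')
  , ("isLower U+0436 (Cyrillic zhe)",        isLower '\x0436')
  , ("isAlpha U+4E2D (CJK ideograph)",       isAlpha '\x4E2D')
  , ("toUpper U+03C9 gives U+03A9",          toUpper '\x03C9' == '\x03A9')
  ]

-- | Names of the probes that did not come out as Unicode expects.
failedProbes :: [String]
failedProbes = [name | (name, ok) <- classProbes, not ok]
```

An empty failedProbes suggests the functions handle at least these
ranges; an implementation limited to Latin-1 would be expected to list
everything except the first probe.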

>  - Can you use (some encoding of) Unicode for your Haskell source files?
>    I don't think this is true in any Haskell compiler right now.

Many years ago, hbc claimed to be the only compiler with support for this.

>  - Can you do String I/O in some encoding of Unicode?  No Haskell
>    compiler has support for this yet, and there are design decisions
>    to be made.  Some progress has been made on an experimental prototype
>    (see recent discussion on this list).

Apparently some Haskell/XML toolkits already do I/O conversions in a
selection of the encodings permitted by the XML standard, namely ASCII,
Latin-1, UTF-8, and UTF-16 (either byte ordering), but not yet UCS-4
(four possible byte orderings), or EBCDIC.  See for example:
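
As a separate illustration, not taken from any of the toolkits
mentioned above, here is roughly what one such conversion involves: a
hand-written encoder from String to UTF-8 bytes.  The names
encodeCharUtf8 and encodeUtf8 are invented for this sketch, and it
assumes an implementation that provides Data.Bits and Data.Word.  It
does not reject surrogate code points (U+D800 to U+DFFF), which a
careful encoder would.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- | Encode one Char as its UTF-8 byte sequence (1 to 4 bytes).
encodeCharUtf8 :: Char -> [Word8]
encodeCharUtf8 c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [lead 0xC0 6,  cont 0]
  | n < 0x10000 = [lead 0xE0 12, cont 6,  cont 0]
  | otherwise   = [lead 0xF0 18, cont 12, cont 6, cont 0]
  where
    n = ord c
    -- Leading byte: a length tag plus the top bits of the code point.
    lead tag s = fromIntegral (tag .|. (n `shiftR` s))
    -- Continuation byte: 10xxxxxx carrying six bits of the code point.
    cont s = fromIntegral (0x80 .|. ((n `shiftR` s) .&. 0x3F))

-- | Encode a whole String, one character at a time.
encodeUtf8 :: String -> [Word8]
encodeUtf8 = concatMap encodeCharUtf8
```

For instance, encodeUtf8 "\xE9" yields the two bytes 0xC3 0xA9, while
plain ASCII characters pass through as single bytes.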

>  - What about Unicode FilePaths?  This was discussed a few months ago
>    on the haskell(-cafe) list, no support yet in any compiler.

Indeed, AFAIK.
