[Haskell-cafe] Hugs vs GHC (again) was: Re: Some random
Malcolm.Wallace at cs.york.ac.uk
Fri Jan 7 07:29:47 EST 2005
"Simon Marlow" <simonmar at microsoft.com> writes:
> Here's a summary of the state of Unicode support in GHC and other
> compilers. There are several aspects:
> - Can the Char type hold the full range of Unicode characters?
> This has been true in GHC for some time, and is now true in Hugs.
> I don't think it's true in nhc98 (please correct me if I'm wrong).
You're wrong :-). nhc98 has always had 32-bit characters internally.
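
For anyone who wants to check their own implementation, here is a minimal sketch of a probe. The names maxCodePoint and roundTrips are made up for illustration, and the only assumption is that Data.Char exports chr and ord as usual.

```haskell
import Data.Char (chr, ord)

-- Largest code point the implementation's Char can hold.
-- With a full-range Char this is 0x10FFFF.
maxCodePoint :: Int
maxCodePoint = ord (maxBound :: Char)

-- A code point survives the trip Int -> Char -> Int only if Char is
-- wide enough to represent it. (Hypothetical helper for this probe.)
roundTrips :: Int -> Bool
roundTrips n = ord (chr n) == n
```

Trying roundTrips on something above the 16-bit range, say 0x1D11E, should distinguish a 16-bit Char from a full-range one.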
> - Do the character class functions (isUpper, isAlpha etc.) work
> correctly on the full range of Unicode characters? This is true in
> Hugs. It's true with GHC on some systems (basically we were lazy
> and used the underlying C library's support here, which is patchy).
In nhc98, currently the character class functions work only on the
8-bit Latin-1 range.
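
A quick way to see the difference between implementations is to classify a few characters on either side of the Latin-1 boundary. This is only a sketch: the sample list and the names samples, classify and report are my own, and what it prints depends entirely on the compiler and libraries you run it with.

```haskell
import Data.Char (isAlpha, isUpper, toLower)

-- Sample characters: one inside Latin-1, two outside it.
samples :: [(String, Char)]
samples =
  [ ("LATIN CAPITAL LETTER E WITH ACUTE", '\x00C9')  -- inside Latin-1
  , ("GREEK CAPITAL LETTER OMEGA",        '\x03A9')  -- outside Latin-1
  , ("CYRILLIC CAPITAL LETTER ZHE",       '\x0416')  -- outside Latin-1
  ]

-- What the implementation's character class functions say about c.
classify :: Char -> (Bool, Bool, Char)
classify c = (isAlpha c, isUpper c, toLower c)

-- One line of output per sample character.
report :: [String]
report = [ name ++ ": " ++ show (classify c) | (name, c) <- samples ]
```

An implementation limited to Latin-1 would be expected to get the first entry right and to treat the other two as non-alphabetic.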
> - Can you use (some encoding of) Unicode for your Haskell source files?
> I don't think this is true in any Haskell compiler right now.
Many years ago, hbc claimed to be the only compiler with support for this.
> - Can you do String I/O in some encoding of Unicode? No Haskell
> compiler has support for this yet, and there are design decisions
> to be made. Some progress has been made on an experimental prototype
> (see recent discussion on this list).
Apparently some Haskell/XML toolkits already do I/O conversions in a
selection of the encodings permitted by the XML standard, namely ASCII,
Latin-1, UTF-8, and UTF-16 (either byte ordering), but not yet UCS-4
(four possible byte orderings), or EBCDIC. See for example:
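
To give an idea of what such a conversion involves, here is a minimal sketch of the output side for UTF-8. It is not taken from any of those toolkits; encodeCharUtf8 and encodeUtf8 are hypothetical names, and the sketch does not reject surrogate code points as a careful encoder would.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Encode one character as a sequence of UTF-8 bytes.
-- (Hypothetical helper; surrogates are not rejected in this sketch.)
encodeCharUtf8 :: Char -> [Word8]
encodeCharUtf8 c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [lead 0xC0 6, cont 0]
  | n < 0x10000 = [lead 0xE0 12, cont 6, cont 0]
  | otherwise   = [lead 0xF0 18, cont 12, cont 6, cont 0]
  where
    n = ord c
    -- Leading byte: length marker bits plus the top payload bits.
    lead marker s = fromIntegral (marker .|. (n `shiftR` s))
    -- Continuation byte: 10xxxxxx carrying six payload bits.
    cont s = fromIntegral (0x80 .|. ((n `shiftR` s) .&. 0x3F))

-- Encode a whole String by concatenating the per-character encodings.
encodeUtf8 :: String -> [Word8]
encodeUtf8 = concatMap encodeCharUtf8
```

The decoding direction, and the other encodings, need the same kind of bit shuffling plus error handling for malformed input, which is where most of the design decisions are.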
> - What about Unicode FilePaths? This was discussed a few months ago
> on the haskell(-cafe) list, no support yet in any compiler.