[Haskell-cafe] Has character changed in GHC 6.8?

Jules Bean jules at jellybean.co.uk
Wed Jan 23 05:56:17 EST 2008


Peter Verswyvelen wrote:

> Now I'm getting a bit confused here. To summarize, what encoding does 
> GHC 6.8.2 use for [Char]? UCS-32?

How dare you! Such a personal question! This is none of your business.

I jest, but the point is sound: the internal storage of Char is GHC's 
business, and it should not leak to the programmer. All the programmer 
needs to know is that Char is capable of storing Unicode characters. GHC 
might choose some custom storage method, including making Char an ADT 
behind the scenes, or whatever it likes. Other Haskell compilers or 
interpreters are free to choose their own representation.
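
To make that concrete, here is a small sketch of what you can rely on 
without knowing anything about the storage. It only uses ord, chr and 
maxBound from the standard libraries; the particular code point 
(U+20AC, the euro sign) is just my choice of example.

```haskell
import Data.Char (chr, ord)

main :: IO ()
main = do
  -- The largest Char is the top of the Unicode code space (0x10FFFF),
  -- whatever representation the compiler picks internally.
  print (ord (maxBound :: Char))       -- 1114111
  -- A code point well outside Latin-1 round-trips through Char intact.
  print (ord (chr 0x20AC) == 0x20AC)   -- True
```

Nothing here depends on how many bytes a Char occupies in memory.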

In practice, I believe that for GHC it's a wchar_t, which is typically a 
32-bit character type with reasonably efficient libc support.

What *does* matter to the programmer is what encodings putStr and 
getLine use. AFAIK, they use "lower 8 bits of the Unicode code point", 
which is almost functionally equivalent to Latin-1, since Latin-1 
coincides with the first 256 Unicode code points.
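
As a rough model of that behaviour (a sketch of the effect I am 
describing, not GHC's actual I/O code), the hypothetical helper lowByte 
below keeps only the low 8 bits of a code point. Characters in the 
Latin-1 range survive; anything above U+00FF is mangled.

```haskell
import Data.Bits ((.&.))
import Data.Char (chr, ord)

-- Hypothetical helper, for illustration only: drop everything except
-- the low 8 bits of the character's code point.
lowByte :: Char -> Char
lowByte c = chr (ord c .&. 0xFF)

main :: IO ()
main = do
  -- "café": every character is below 256, so nothing changes.
  print (map (ord . lowByte) "caf\xE9")   -- [99,97,102,233]
  -- Euro sign U+20AC: 0x20AC .&. 0xFF = 0xAC, a different character.
  print (ord (lowByte '\x20AC'))          -- 172
```

So if your strings stay within Latin-1 you would not notice the 
difference, which is why I say "almost" equivalent.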

Jules


