[Haskell-cafe] The Nature of Char and String
John Goerzen
jgoerzen at complete.org
Sun Jan 30 09:47:53 EST 2005
Char in Haskell represents a Unicode character. I don't know exactly
what its size is, but it must be at least 16 bits and maybe more.
String, being a list of Char, would then share those properties.
However, I'm accustomed to dealing with data in 8-bit bytes.
So I have some questions:
* If I use hPutStr on a string, is it guaranteed that the number of
8-bit bytes written equals (length stringWritten)?
+ If no, what is the representation written? I'm assuming UTF-8.
How could I find out how many bytes were actually written?
+ If yes, what happens to the upper 8 bits? Are they simply
stripped off?
* If I run hGetChar, is it possible that it would consume more than
one byte of input? How can I determine whether or not this has
happened?
* Does Haskell treat the "this is a Unicode file" marker (the byte
order mark) specially in any way?
* The same questions apply to withCString and related
String<->CString conversions.
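
To make the first question concrete, here is a minimal sketch of how one
could compare (length s) with the number of bytes a UTF-8 encoding of s
would occupy. It assumes UTF-8 is the encoding in question, which is only
my guess above; it says nothing about what hPutStr actually writes in any
given implementation. The names maxCharCode, utf8Width and utf8Length are
made up for illustration.

```haskell
import Data.Char (ord)

-- Largest code point a Char can hold; this gives a lower bound on its size.
maxCharCode :: Int
maxCharCode = ord (maxBound :: Char)

-- Bytes needed for one Char, ASSUMING the output encoding is UTF-8.
utf8Width :: Char -> Int
utf8Width c
  | n < 0x80    = 1
  | n < 0x800   = 2
  | n < 0x10000 = 3
  | otherwise   = 4
  where
    n = ord c

-- Bytes needed for a whole String under the same UTF-8 assumption.
utf8Length :: String -> Int
utf8Length = sum . map utf8Width

main :: IO ()
main = do
  print maxCharCode
  -- "\239" is i-diaeresis (U+00EF), "\8364" is the euro sign (U+20AC).
  let s = "na\239ve \8364"
  print (length s, utf8Length s)
```

This prints 1114111 and then (7,10): the sample string has 7 Chars but
would need 10 bytes in UTF-8, so the two counts can only agree if the
upper bits are dropped or the text is pure ASCII.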
-- John