[Haskell-cafe] UTF-8 in Haskell.

Magicloud Magiclouds magicloud.magiclouds at gmail.com
Mon Dec 27 02:40:15 CET 2010


Thanks for the ideas.
In this case, ssh is a transport-layer protocol, which means it
does not convert anything. For example, if the server and the client
both use ASCII, all is well. If the client uses UTF-8 instead, the
user might get a broken display, but ssh itself would not care.
I thought of CString because in C this is easy: "I" do not have to pay
attention to which encoding the given string uses.
But I am not sure how CString works in Haskell. If it just converts
things to ASCII, then that is bad.
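
To make the CString question concrete, here is a minimal sketch, not a
definitive answer. As far as I know, withCString from Foreign.C.String
marshals a Haskell String through some character encoding, so it is not
a byte-for-byte pass-through. Data.ByteString, by contrast, hands the
raw bytes to C untouched. The names rawBytes and roundTrip below are
made up for illustration, and the example assumes the data contains no
NUL byte, since a C string would end there.

```haskell
import qualified Data.ByteString as B

-- Three raw bytes that happen to be one UTF-8 encoded character.
-- Nothing in this program knows or cares which encoding they use.
rawBytes :: B.ByteString
rawBytes = B.pack [0xE4, 0xBD, 0xA0]

-- Copy the bytes into a NUL-terminated C string and read them back.
-- useAsCString and packCString move bytes only; no decoding happens.
roundTrip :: B.ByteString -> IO B.ByteString
roundTrip bs = B.useAsCString bs B.packCString

main :: IO ()
main = do
  out <- roundTrip rawBytes
  print (B.unpack out)   -- prints [228,189,160]
```

So for a transport that should not interpret the data, ByteString looks
closer to the "C way" than String plus CString.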

On Thu, Dec 23, 2010 at 7:18 PM, Max Bolingbroke
<batterseapower at hotmail.com> wrote:
> On 23 December 2010 05:29, Magicloud Magiclouds
> <magicloud.magiclouds at gmail.com> wrote:
>>  If so, OK, then I think I could make a packInt which turns an Int
>> into 4 Word8 first. Thus in every situation (ASCII, UTF-8, or even
>> UTF-32), my program always sends 4 bytes through the network. Is that
>> OK?
>
> I think you are describing the UTF-32 encoding (under the assumption
> that fromEnum on Char returns the Unicode code point of that
> character, which I think is true). UTF-32 is capable of describing
> every Unicode code point so this is indeed non-lossy. UTF-32 is a
> reasonable wire transfer format (if a bit inefficient!).
>
> Don't roll your own encoding logic though, System.IO provides a
> TextEncoding for UTF-32 you can use to do the job more reliably.
>
> Cheers,
> Max
>
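
For reference, the packInt idea quoted above can be sketched as
follows. This is an illustration only: the name encodeUtf32BE and the
choice of big-endian (network) byte order are my assumptions, since
nothing in the thread fixes a byte order. Applying packInt to the code
point of each Char yields the UTF-32BE form that Max describes.

```haskell
import Data.Bits (shiftR)
import Data.Char (ord)
import Data.Word (Word8, Word32)

-- Hypothetical packInt: split an Int into 4 bytes, most significant
-- byte first. Values outside the 32-bit range wrap around.
packInt :: Int -> [Word8]
packInt n = [ fromIntegral (w `shiftR` s) | s <- [24, 16, 8, 0] ]
  where
    w = fromIntegral n :: Word32

-- Encode every character as its code point in 4 bytes (UTF-32BE,
-- without a byte order mark).
encodeUtf32BE :: String -> [Word8]
encodeUtf32BE = concatMap (packInt . ord)

main :: IO ()
main = print (encodeUtf32BE "A\x7AF9")
-- prints [0,0,0,65,0,0,122,249]
```

For handle-based I/O, hSetEncoding with utf32be from System.IO does the
same job without hand-written logic, which is what Max recommends.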



-- 
竹密岂妨流水过
山高哪阻野云飞


