[Haskell-cafe] UTF-8 in Haskell.

Magicloud Magiclouds magicloud.magiclouds at gmail.com
Mon Dec 27 07:48:09 CET 2010


Sorry, I just noticed that I had misunderstood this.
With the encode and bytestring packages from Hackage, I think my requirement is covered.
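
For the record, a minimal sketch of the kind of thing I mean. It assumes the text package (Data.Text.Encoding) for the UTF-8 step, which is not necessarily the package named above, and toWire / fromWire are names made up for this example.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Encode a Haskell String as UTF-8 bytes, ready to be written to a socket.
-- (toWire is a hypothetical name, not part of any library.)
toWire :: String -> B.ByteString
toWire = TE.encodeUtf8 . T.pack

-- Decode bytes received from the peer, assuming they are valid UTF-8.
-- decodeUtf8 throws an exception on invalid input.
fromWire :: B.ByteString -> String
fromWire = T.unpack . TE.decodeUtf8
```

The ByteString is just bytes, so nothing on the way to the peer reinterprets it.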

On Mon, Dec 27, 2010 at 10:32 AM, Antoine Latter <aslatter at gmail.com> wrote:
> Hi,
>
> What would you be using the CString for? A CString is really a lot
> less useful than a ByteString for almost all purposes. If I already
> had a ByteString, the only reason I would want to convert it to a
> CString is to call a C function.
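
A minimal sketch of that one case, assuming the C function is strlen from string.h; c_strlen and byteLength are names chosen only for this example.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import qualified Data.ByteString as B
import Foreign.C.String (CString)
import Foreign.C.Types (CSize(..))

-- Bind the C function strlen from string.h.
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- useAsCString copies the bytes into a temporary NUL-terminated buffer
-- that is only valid inside the callback. No character conversion happens;
-- a NUL byte inside the ByteString would cut the C string short.
byteLength :: B.ByteString -> IO Int
byteLength bs = B.useAsCString bs (fmap fromIntegral . c_strlen)
```

The bytes handed to C are exactly the bytes in the ByteString, whatever encoding they happen to be in.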
>
> Take care,
> Antoine
>
> On Sun, Dec 26, 2010 at 7:40 PM, Magicloud Magiclouds
> <magicloud.magiclouds at gmail.com> wrote:
>> Thanks for the ideas.
>> In this case, ssh is a transport layer protocol, which means it
>> does not convert anything. For example, if the server and the client
>> both use ASCII, all is well. If the client uses UTF-8 instead, the
>> user might get a broken display, but ssh itself would not care.
>> My idea of using CString comes from C, where this is easy: "I" do not
>> have to pay attention to which encoding the given string uses.
>> But I am not sure how CString works. If it just converts things into
>> ASCII, then it is bad.
>>
>> On Thu, Dec 23, 2010 at 7:18 PM, Max Bolingbroke
>> <batterseapower at hotmail.com> wrote:
>>> On 23 December 2010 05:29, Magicloud Magiclouds
>>> <magicloud.magiclouds at gmail.com> wrote:
>>>>  If so, OK, then I think I could make a packInt which first turns
>>>> an Int into 4 Word8. Thus in every situation (ASCII, UTF-8, or even
>>>> UTF-32), my program always sends 4 bytes through the network. Is that
>>>> OK?
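
A rough sketch of the packInt described here. The big-endian byte order is an assumption, since the message does not say which order to use, and encodeChars is a hypothetical helper added for illustration.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Char (ord)
import Data.Word (Word8)

-- Split the low 32 bits of an Int into 4 bytes, most significant first.
-- (Big-endian order is an assumption of this sketch.)
packInt :: Int -> [Word8]
packInt n = [ fromIntegral ((n `shiftR` s) .&. 0xFF) | s <- [24, 16, 8, 0] ]

-- Hypothetical helper: 4 bytes per character, taken from its code point.
encodeChars :: String -> [Word8]
encodeChars = concatMap (packInt . ord)
```

For example, packInt 65 gives [0,0,0,65], so every character costs 4 bytes on the wire regardless of which character it is.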
>>>
>>> I think you are describing the UTF-32 encoding (under the assumption
>>> that fromEnum on Char returns the Unicode code point of that
>>> character, which I think is true). UTF-32 is capable of describing
>>> every Unicode code point so this is indeed non-lossy. UTF-32 is a
>>> reasonable wire transfer format (if a bit inefficient!).
>>>
>>> Don't roll your own encoding logic, though: System.IO provides a
>>> TextEncoding for UTF-32 that you can use to do the job more reliably.
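
A small sketch of that suggestion, assuming the big-endian variant utf32be (which writes no byte order mark) and a plain file handle standing in for the network handle. writeUtf32BE and rawBytes are names invented for this example.

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8)
import System.IO

-- Write a String through a Handle whose encoding is set to UTF-32 big-endian.
-- A file is used here only so the result can be inspected afterwards.
writeUtf32BE :: FilePath -> String -> IO ()
writeUtf32BE path s =
  withFile path WriteMode $ \h -> do
    hSetEncoding h utf32be
    hPutStr h s

-- Read the file back as raw bytes to see what was actually written.
rawBytes :: FilePath -> IO [Word8]
rawBytes = fmap B.unpack . B.readFile
```

The same hSetEncoding call works on any Handle, so the encoding step stays in the library instead of in hand-written byte shuffling.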
>>>
>>> Cheers,
>>> Max
>>>



-- 
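However dense the bamboo, it cannot stop the stream from flowing through;
however high the mountain, it cannot keep the wild clouds from flying past.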


