[Haskell-cafe] gbp sign showing as unknown character by GHC
Colin Paul Adams
colin at colina.demon.co.uk
Thu Aug 20 03:12:53 EDT 2009
>>>>> "Stuart" == Stuart Cook <scook0 at gmail.com> writes:
Stuart> On Thu, Aug 20, 2009 at 4:28 PM, Colin Paul
Stuart> Adams<colin at colina.demon.co.uk> wrote:
>> But how do you get Latin-1 bytes from a Unicode string? This
>> would need a transcoding process.
Stuart> The first 256 code-points of Unicode coincide with
Stuart> Latin-1. Therefore, if you truncate Unicode characters
Stuart> down to 8 bits you'll effectively end up with Latin-1 text
Stuart> (except that any code points above U+00FF will give
Stuart> strange results).
Stuart> If your terminal then interprets these bytes as UTF-8 (or
Stuart> anything else, really), the result will be gibberish or
Stuart> worse.
Yes, but surely this will work both ways. The same bytes on input
should come back on output, shouldn't they?
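
A minimal sketch of the round trip in question, assuming output keeps only the low 8 bits of each Char and input turns each byte into the Char with the same number, as described above. The names truncateToBytes, bytesToString and roundTrips are made up for illustration and are not part of any library.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Assumed output behaviour: keep only the low 8 bits of each code point.
-- fromIntegral from Int to Word8 wraps modulo 256.
truncateToBytes :: String -> [Word8]
truncateToBytes = map (fromIntegral . ord)

-- Assumed input behaviour: each byte becomes the Char with the same
-- number, which amounts to decoding the bytes as Latin-1.
bytesToString :: [Word8] -> String
bytesToString = map (chr . fromIntegral)

-- Does a string survive being written out and read back in?
roundTrips :: String -> Bool
roundTrips s = bytesToString (truncateToBytes s) == s
```

Under these assumptions, truncateToBytes (bytesToString bs) returns any byte list bs unchanged, so bytes read in do come back out the same. Going the other way, roundTrips holds only for strings whose code points are all at most U+00FF: the pound sign U+00A3 becomes the single byte 0xA3 and survives, while U+20AC would be cut down to the byte 0xAC.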
--
Colin Adams
Preston Lancashire