[Haskell-cafe] Re: String vs ByteString
Donn Cave
donn at avvanta.com
Sat Aug 14 11:49:02 EDT 2010
Quoth Brandon S Allbery KF8NH <allbery at ece.cmu.edu>,
> On 8/14/10 01:29 , Kevin Jardine wrote:
>> I think that this kind of programming detail should be handled
>> internally (even if necessary by switching automatically from UTF-8 to
>> UTF-16 depending upon the language).
It seems like the right thing, described in the wrong words. Wouldn't
it be a more sensible ideal to simply `switch' depending on the
character encoding of the data, rather than on the language?
I mean, to start with, you'd surely wish for some standardization,
so that the difference between UTF-8 and UTF-16 is essentially internal,
while you use the same API indifferently.
Second, a key requirement to effectively work with external data is
support for multiple character encodings. E.g., if Text is internally
UTF-16, it still must be able to input and output UTF-8, and presumably
also UTF-16 where appropriate.
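To make that requirement concrete, here is a minimal sketch using the
decode/encode functions from Data.Text.Encoding in the text package. The
`Encoding` type and the `decodeAs`, `encodeAs` and `transcode` helpers are
names made up for illustration, not part of any library, and only two
encodings are covered.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- | External encodings this sketch knows about (hypothetical type).
data Encoding = UTF8 | UTF16LE
  deriving (Eq, Show)

-- | Decode external bytes into Text.
-- Both library decoders throw an exception on malformed input.
decodeAs :: Encoding -> B.ByteString -> T.Text
decodeAs UTF8    = TE.decodeUtf8
decodeAs UTF16LE = TE.decodeUtf16LE

-- | Encode Text back into external bytes.
encodeAs :: Encoding -> T.Text -> B.ByteString
encodeAs UTF8    = TE.encodeUtf8
encodeAs UTF16LE = TE.encodeUtf16LE

-- | Convert bytes from one external encoding to another, going through Text.
transcode :: Encoding -> Encoding -> B.ByteString -> B.ByteString
transcode from to = encodeAs to . decodeAs from
```

Whatever Text uses internally, code written against `decodeAs` and
`encodeAs` sees the same API for either external encoding.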
So given full support for _both_ encodings (for example, a Text
implementation for `native' UTF-8), and support for input data of
_either_ encoding as encountered at run time ... then the internal
implementation choice should simply follow the external data. For
Chinese inputs you'd be running UTF-16 functions, for French UTF-8.
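A rough sketch of what "follow the external data" could look like: a
string type that keeps the bytes in whichever encoding they arrived in,
behind one API. Everything here (`Str`, `fromUtf8`, `fromUtf16LE`,
`strLength`, `append`, `storedBytes`) is hypothetical, and for brevity the
operations decode through Data.Text rather than working on the bytes
directly, which a real implementation would presumably avoid.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- | Hypothetical string type that keeps the encoding the data arrived in.
data Str
  = Utf8Str B.ByteString   -- ^ payload is UTF-8
  | Utf16Str B.ByteString  -- ^ payload is UTF-16, little-endian
  deriving Show

-- | Wrap external bytes without transcoding (and without validating) them.
fromUtf8, fromUtf16LE :: B.ByteString -> Str
fromUtf8    = Utf8Str
fromUtf16LE = Utf16Str

-- | Decode to Text; throws on malformed input.
toText :: Str -> T.Text
toText (Utf8Str bs)  = TE.decodeUtf8 bs
toText (Utf16Str bs) = TE.decodeUtf16LE bs

-- | Same API for both representations: length in characters.
strLength :: Str -> Int
strLength = T.length . toText

-- | Append keeps the left operand's encoding and transcodes the right
-- operand only when the two encodings differ.
append :: Str -> Str -> Str
append (Utf8Str a)  (Utf8Str b)  = Utf8Str  (B.append a b)
append (Utf16Str a) (Utf16Str b) = Utf16Str (B.append a b)
append (Utf8Str a)  r            = Utf8Str  (B.append a (TE.encodeUtf8 (toText r)))
append (Utf16Str a) r            = Utf16Str (B.append a (TE.encodeUtf16LE (toText r)))

-- | Bytes actually held, to compare the cost of each representation.
storedBytes :: Str -> Int
storedBytes (Utf8Str bs)  = B.length bs
storedBytes (Utf16Str bs) = B.length bs
```

Loaded in GHCi, `storedBytes` shows the trade-off: CJK characters in the
Basic Multilingual Plane take 3 bytes each in UTF-8 but 2 in UTF-16, while
mostly-ASCII French text is close to 1 byte per character in UTF-8 and 2
in UTF-16.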
Donn Cave, donn at avvanta.com