add utf8-string in haskell platform
marlowsd at gmail.com
Fri May 15 10:14:13 EDT 2009
On 15/05/2009 14:55, Edward Kmett wrote:
> Fortunately, the bytewise encoding of '\n' is sufficient to recognize a
> newline, any other attempted representation in UTF8 (i.e. as a 2-byte
> symbol starting with 0xc0) would be non-canonical and per RFC 3629
> should be rejected anyways.
> So if you view ByteString as a stream of bytes that may or may not be
> utf8 encoded, scanning for 0x0a gives you the correct behavior for both
The byte string can be in any encoding, not just UTF-8.
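
To make that concrete, here is a small sketch, assuming only the bytestring package's Data.ByteString.split; the name splitOnLF and the two sample values are made up for this illustration and are not part of any library. Scanning for 0x0a finds exactly the newline in UTF-8 input, but in UTF-16LE the same byte also occurs inside other characters, such as U+010A.

```haskell
import qualified Data.ByteString as B

-- Hypothetical helper: split raw bytes at every 0x0a, with no decoding step.
splitOnLF :: B.ByteString -> [B.ByteString]
splitOnLF = B.split 0x0a

-- The three characters U+00E9, U+000A, U+010A encoded as UTF-8:
--   c3 a9 | 0a | c4 8a
utf8Sample :: B.ByteString
utf8Sample = B.pack [0xc3, 0xa9, 0x0a, 0xc4, 0x8a]

-- The same three characters encoded as UTF-16LE:
--   e9 00 | 0a 00 | 0a 01
utf16Sample :: B.ByteString
utf16Sample = B.pack [0xe9, 0x00, 0x0a, 0x00, 0x0a, 0x01]

main :: IO ()
main = do
  -- UTF-8: one split, exactly at the newline.
  print (map B.unpack (splitOnLF utf8Sample))   -- [[195,169],[196,138]]
  -- UTF-16LE: two splits, the second one cuts U+010A in half.
  print (map B.unpack (splitOnLF utf16Sample))  -- [[233,0],[0],[1]]
```

In the UTF-8 case the two pieces are still valid UTF-8. In the UTF-16LE case none of the pieces decodes cleanly, which is why a bytewise scan only makes sense when the encoding is known to be ASCII-compatible.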
> -Edward Kmett
> On Fri, May 15, 2009 at 7:02 AM, Simon Marlow <marlowsd at gmail.com> wrote:
> On 15/05/2009 03:07, Bryan O'Sullivan wrote:
> On Thu, May 14, 2009 at 4:23 PM, Simon Michael <simon at joyful.com> wrote:
> I'd like to request that utf8-string be added to the haskell
> platform, so that HP users can work with non-ascii text.
> I'd rather this wasn't added. It's an acceptable crutch for the short
> term, but we shouldn't be using String for text manipulation, and
> bundling utf8-string implicitly blesses that approach. The text
> package needs a few weeks of polish and some more testing work for
> QA, but it'll be the right answer well before the end of this year.
> We ought to think about the interaction between text (and
> bytestring) and the new Unicode IO library. What does text have in
> the way of IO operations?
> I've been wondering about what bytestring's hGetLine should do.
> Right now I have it doing decoding and then taking the low 8 bits,
> but that's not right. OTOH, looking for '\n' in a stream of bytes
> doesn't seem right. Maybe it should just be deprecated.