add utf8-string in haskell platform

Edward Kmett ekmett at
Fri May 15 09:55:27 EDT 2009

Fortunately, the bytewise encoding of '\n' is sufficient to recognize a
newline: any other attempted representation in UTF-8 (i.e. as an overlong
2-byte sequence starting with 0xc0) would be non-canonical and, per RFC 3629,
should be rejected anyway.

So if you view ByteString as a stream of bytes that may or may not be UTF-8
encoded, scanning for 0x0a gives you the correct behavior in both cases.

-Edward Kmett
On Fri, May 15, 2009 at 7:02 AM, Simon Marlow <marlowsd at> wrote:

> On 15/05/2009 03:07, Bryan O'Sullivan wrote:
>> On Thu, May 14, 2009 at 4:23 PM, Simon Michael <simon at> wrote:
>>    I'd like to request that utf8-string be added to the haskell
>>    platform, so that HP users can work with non-ascii text.
>> I'd rather this wasn't added. It's an acceptable crutch for the short
>> term, but we shouldn't be using String for text manipulation, and
>> bundling utf8-string implicitly blesses that approach. The text library
>> needs a few weeks of polish and some more testing work for QA, but it'll
>> be the right answer well before the end of this year.
> We ought to think about the interaction between text (and bytestring) and
> the new Unicode IO library.  What does text have in the way of IO
> operations?
> I've been wondering about what bytestring's hGetLine should do.  Right now
> I have it doing decoding and then taking the low 8 bits, but that's not
> right.  OTOH, looking for '\n' in a stream of bytes doesn't seem right.
>  Maybe it should just be deprecated.
> Cheers,
>        Simon