[Haskell-cafe] Policy for taking over a package on Hackage

Ivan Lazar Miljenovic ivan.miljenovic at gmail.com
Thu May 26 01:02:56 CEST 2011


On 26 May 2011 08:49, wren ng thornton <wren at freegeek.org> wrote:
> On 5/25/11 1:03 PM, Bryan O'Sullivan wrote:
>>
>> On Wed, May 25, 2011 at 5:59 AM, Ivan Lazar Miljenovic
>> <ivan.miljenovic at gmail.com> wrote:
>>
>>> Well, using the Char8 version.
>>
>> Just because you *could* do that, it doesn't mean that you *should*. It's a
>> bad idea to use bytestrings for manipulating text, yet the only plausible
>> reason to have wl-pprint handle bytestrings is so that they can be used as
>> text.
>
> It's worth highlighting that even with the Char8 version of ByteStrings you
> still run into encoding issues. Remember the days before Unicode came about?
> True, 8-bit encodings are often ASCII-compatible and therefore the
> representations of digits and whitespace are consistent regardless of
> (ASCII-compatible) encoding, but that's still just begging for issues. What
> are the semantics of the byte 0xA0 with respect to pretty-printing issues
> like linewraps? Are they consistent among all extant 8-bit encodings? What
> about bytes in 0x80..0x9F? What about 0x7F for that matter?
>
> I won't say that ByteStrings should never be used for text (there are plenty
> of programs whose use of text involves only whitespace splitting and moving
> around the resultant opaque blobs of memory). But at a bare minimum, the use
> of ByteStrings for encoding text needs to be done via newtype wrapper(s)
> which keep track of the encoding. Especially for typeclass instances.
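
For illustration, the newtype-wrapper idea above could look roughly like the
sketch below. It is only a sketch under assumptions: the names Encoded, ASCII,
Latin1, mkASCII, asciiToLatin1 and asciiWords are hypothetical and do not come
from wl-pprint or bytestring; only the Data.ByteString functions are real.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as B8

-- Phantom tags naming an encoding (hypothetical names, for illustration).
data ASCII
data Latin1

-- | Raw bytes tagged at the type level with the encoding they are in.
newtype Encoded enc = Encoded { unEncoded :: B.ByteString }
  deriving (Eq, Show)

-- | Smart constructor: accept the bytes only if every one is below 0x80.
mkASCII :: B.ByteString -> Maybe (Encoded ASCII)
mkASCII bs
  | B.all (< 0x80) bs = Just (Encoded bs)
  | otherwise         = Nothing

-- | ASCII is a subset of Latin-1, so this retagging is always safe.
asciiToLatin1 :: Encoded ASCII -> Encoded Latin1
asciiToLatin1 (Encoded bs) = Encoded bs

-- | Whitespace splitting restricted to text known to be ASCII, so bytes
-- such as 0xA0 can never reach the splitter.
asciiWords :: Encoded ASCII -> [Encoded ASCII]
asciiWords (Encoded bs) = map Encoded (B8.words bs)
```

Because the encoding lives in the type, a typeclass instance can be written
for Encoded ASCII alone, and untagged ByteStrings cannot be passed to
text-handling code by accident.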

*shrug* this discussion on #haskell came about because lispy wanted to
generate textual ByteStrings (using just ASCII) and would prefer not
to have the overhead of Text.
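
As a rough illustration of that use case, ASCII-only output can be built
directly as bytes. The sketch below assumes the Data.ByteString.Builder module
from later versions of the bytestring package, which is newer than this
thread; renderField and renderFields are hypothetical helper names.

```haskell
import qualified Data.ByteString.Builder as BB
import qualified Data.ByteString.Lazy.Char8 as L8

-- | Render one "key = value" line. string7 and char7 keep only the low
-- 7 bits of each Char, so callers are assumed to pass ASCII keys.
renderField :: String -> Int -> BB.Builder
renderField key val =
  mconcat [BB.string7 key, BB.string7 " = ", BB.intDec val, BB.char7 '\n']

-- | Concatenate all fields and run the builder to a lazy ByteString.
renderFields :: [(String, Int)] -> L8.ByteString
renderFields = BB.toLazyByteString . mconcat . map (uncurry renderField)
```

No Text value is created along the way; the price is that nothing checks the
input really is ASCII.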

-- 
Ivan Lazar Miljenovic
Ivan.Miljenovic at gmail.com
IvanMiljenovic.wordpress.com


