[Haskell-cafe] Policy for taking over a package on Hackage
wren ng thornton
wren at freegeek.org
Thu May 26 00:49:28 CEST 2011
On 5/25/11 1:03 PM, Bryan O'Sullivan wrote:
> On Wed, May 25, 2011 at 5:59 AM, Ivan Lazar Miljenovic<
> ivan.miljenovic at gmail.com> wrote:
>
>> Well, using the Char8 version.
>
> Just because you *could* do that, it doesn't mean that you *should*. It's a
> bad idea to use bytestrings for manipulating text, yet the only plausible
> reason to have wl-pprint handle bytestrings is so that they can be used as
> text.

It's worth highlighting that even with the Char8 version of ByteStrings
you still run into encoding issues. Remember the days before Unicode
came about? True, 8-bit encodings are often ASCII-compatible, and
therefore the representation of digits and whitespace is consistent
regardless of (ASCII-compatible) encoding, but that's still just begging
for issues. What are the semantics of the byte 0xA0 with respect to
pretty-printing issues like linewraps? Are they consistent among all
extant 8-bit encodings? What about bytes in 0x80..0x9F? What about 0x7F
for that matter?
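To make the 0xA0 question concrete, here is a minimal sketch, assuming only the bytestring package that ships with GHC. The sample bytes and the names utf8Sample, char8Words and char8WordLengths are made up for illustration. It hands UTF-8 bytes to the Char8 word splitter, which classifies each byte as though it were a Latin-1 character.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as B8

-- "voilà tout" encoded as UTF-8 (illustrative sample).
-- The character U+00E0 becomes the two bytes 0xC3 0xA0.
utf8Sample :: B.ByteString
utf8Sample =
  B.pack [0x76, 0x6F, 0x69, 0x6C, 0xC3, 0xA0, 0x20, 0x74, 0x6F, 0x75, 0x74]

-- Char8 word splitting: every byte is judged on its own, with no
-- knowledge of which encoding the bytes are actually in.
char8Words :: [B.ByteString]
char8Words = B8.words utf8Sample

-- Byte length of each "word" the splitter produced.
char8WordLengths :: [Int]
char8WordLengths = map B.length char8Words
```

Since Char8's whitespace test counts 0xA0 as a space, char8WordLengths evaluates to [5,4]: the second byte of the two-byte sequence is consumed as a separator and the first word ends in a dangling 0xC3. Read as Latin-1, that same byte is a no-break space, which is arguably the one kind of space a pretty-printer should not wrap at.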

I won't say that ByteStrings should never be used for text (there are
plenty of programs whose use of text involves only whitespace splitting
and moving around the resultant opaque blobs of memory). But at a bare
minimum, ByteStrings that hold text need to go through newtype
wrapper(s) which keep track of the encoding. That goes especially for
typeclass instances.
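One way the newtype idea could look, as a rough sketch rather than a proposal for any existing package: a phantom type parameter records the claimed encoding, and a class lets each encoding give its own answer to "where are the word boundaries?". Every name here (Latin1, Utf8, Encoded, Encoding, wordsE, asciiSpace) is hypothetical, and the UTF-8 instance is deliberately simplified to ASCII whitespace only.

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8)

-- Hypothetical type-level tags; they have no values.
data Latin1
data Utf8

-- Bytes labelled with the encoding they are claimed to be in.
newtype Encoded enc = Encoded { rawBytes :: B.ByteString }
  deriving (Eq, Show)

-- Hypothetical class: each encoding decides how to split into words.
class Encoding enc where
  wordsE :: Encoded enc -> [Encoded enc]

-- Whitespace bytes shared by all ASCII-compatible encodings.
asciiSpace :: Word8 -> Bool
asciiSpace w = w == 0x20 || (w >= 0x09 && w <= 0x0D)

-- Split on a byte predicate, dropping empty pieces.
splitOn :: (Word8 -> Bool) -> B.ByteString -> [B.ByteString]
splitOn p = filter (not . B.null) . B.splitWith p

-- Assumption for this sketch: Latin-1 also treats 0xA0 (NBSP) as a
-- separator, as Data.Char.isSpace does for U+00A0. Whether a
-- pretty-printer should break there is a separate question.
instance Encoding Latin1 where
  wordsE = map Encoded . splitOn (\w -> asciiSpace w || w == 0xA0) . rawBytes

-- Simplification: bytes >= 0x80 belong to multi-byte sequences and are
-- never separators; multi-byte Unicode spaces are ignored entirely.
instance Encoding Utf8 where
  wordsE = map Encoded . splitOn asciiSpace . rawBytes
```

With this in place the same bytes split differently depending on their label: for B.pack [0x61, 0xC3, 0xA0, 0x20, 0x62], the Utf8 instance yields pieces of 3 and 1 bytes, while the Latin1 instance yields 2 and 1. The point is only that the instance, not the raw ByteString, carries the decision.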
--
Live well,
~wren