[Haskell-cafe] Unicode Haskell source -- Yippie!

Fri Apr 25 10:43:04 UTC 2014

On 2014-04-25 17:37, Christopher Allen wrote:
> http://lucumr.pocoo.org/2014/1/5/unicode-in-2-and-3/

Much of this article relates to what I wrote in my first reply:

On 2014-04-25 15:54, Travis Cardwell wrote:
> Python does indeed have great Unicode support, but using Unicode for
> everything is not efficient in cases where it is not needed.

Armin says that Python 3 is not appropriate for real-world applications
due to this issue.  He wants functionality in the standard library that
processes bytes directly (as in Python 2).  The problem is that processing
bytes directly is not safe.  The `urlparse` example is a good one: naively
parsing URLs as bytes can lead to major security vulnerabilities.  While
Armin would not parse things naively, people with less experience with
encodings are less likely to make mistakes in Python 3, at the expense of
performance.

I think that Haskell's support for byte-strings and Unicode strings (as
well as many other encodings via ICU transcoding) is quite nice because it
supports doing whatever needs to be done while giving the programmer the
control necessary to implement real-world applications.  Though one can
still manage to shoot oneself in the foot, Haskell's types make the
confusing subject of encoding more approachable and significantly reduce
the chance of error, IMHO.
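
To make the point about types concrete, here is a minimal sketch, assuming
the `bytestring` and `text` packages and a source file saved as UTF-8.  The
names `Raw`, `decodeField`, and `charCount` are hypothetical and only for
illustration.  Bytes and decoded text have different types, so decoding has
to happen explicitly and its failure case has to be handled.

```haskell
module Main where

import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Data.Text.Encoding.Error (UnicodeException)

-- Hypothetical alias: bytes as received from a socket or file.
-- No encoding is assumed yet.
type Raw = BS.ByteString

-- Decoding is explicit and may fail, so the caller has to deal with the
-- Left case before any Text-only function can be applied.
decodeField :: Raw -> Either UnicodeException T.Text
decodeField = TE.decodeUtf8'

-- Counts code points, not bytes; it only accepts decoded Text.
charCount :: T.Text -> Int
charCount = T.length

main :: IO ()
main = do
  let raw = TE.encodeUtf8 (T.pack "年月日")
      bad = BS.pack [0xff, 0xfe]  -- 0xff never occurs in valid UTF-8
  print (BS.length raw)                                                  -- 9
  print (either (const Nothing) (Just . charCount) (decodeField raw))    -- Just 3
  print (either (const "invalid UTF-8") (const "ok") (decodeField bad))  -- "invalid UTF-8"
  -- charCount raw  -- rejected by the type checker: ByteString is not Text
```

The commented-out last line is the interesting one: passing undecoded bytes
to a text function is caught at compile time rather than at run time.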

The article also talks about how Python's codec system is used for
non-character encodings (such as zlib!) in addition to character
encodings.  I do not think that it is a particularly good design.  Attempts
to clean up the design have resulted in compatibility issues with old
code: type errors!  As a Haskell programmer, I am clearly biased, but I
think that the design of such modules could be significantly improved by
using static types and type classes! ;)
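
As a rough sketch of what I mean (not an existing library; the classes
`CharEncoding` and `ByteTransform` and the `XorMask` type are hypothetical),
character encodings and byte-to-byte transforms could live in separate type
classes.  `XorMask` is only a toy stand-in for a real transform such as
zlib.  This assumes the `bytestring` and `text` packages.

```haskell
module Main where

import Data.Bits (xor)
import Data.Word (Word8)
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Hypothetical class: character encodings convert between Text and bytes.
class CharEncoding e where
  encodeText :: e -> T.Text -> BS.ByteString
  decodeText :: e -> BS.ByteString -> Either String T.Text

-- Hypothetical class: byte transforms (compression, hex, ...) map bytes
-- to bytes and never touch Text.
class ByteTransform t where
  forward  :: t -> BS.ByteString -> BS.ByteString
  backward :: t -> BS.ByteString -> Either String BS.ByteString

data Utf8 = Utf8

instance CharEncoding Utf8 where
  encodeText _ = TE.encodeUtf8
  decodeText _ = either (Left . show) Right . TE.decodeUtf8'

-- Toy stand-in for a real transform: XOR every byte with a key.
newtype XorMask = XorMask Word8

instance ByteTransform XorMask where
  forward  (XorMask k) = BS.map (xor k)
  backward (XorMask k) = Right . BS.map (xor k)

main :: IO ()
main = do
  let bytes  = encodeText Utf8 (T.pack "Yippie!")
      masked = forward (XorMask 0x2a) bytes
  -- Undo the byte transform, then decode the characters.
  print (backward (XorMask 0x2a) masked >>= decodeText Utf8)  -- Right "Yippie!"
  -- forward (XorMask 0x2a) (T.pack "Yippie!")  -- does not type check
  -- encodeText (XorMask 0x2a) (T.pack "x")     -- no CharEncoding instance
```

With this split, using a byte transform where a character encoding is
expected is a compile-time error instead of a type error at run time.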

Cheers,

Travis