[Haskell-cafe] [perl #129843] [LTA] Indexing on a Str throws generic “out of range” message which is less than awesome (“hello”)
jo at durchholz.org
Tue May 9 10:59:06 UTC 2017
On 08.05.2017 at 23:00, Brandon Allbery wrote:
> On Mon, May 8, 2017 at 4:49 PM, Joachim Durchholz <jo at durchholz.org
> <mailto:jo at durchholz.org>> wrote:
> If the mental model for Perl6 strings is "array of characters" though
> Perl has never had that mental model, is my point.
Right, I should have written "is supposed to evolve to" instead of "is".
Array of characters may be a useful abstraction to have in Perl6, or not
> It's generally
> imported by folks who come from languages where strings *are* "arrays of
> characters" --- and where that model has a strong tendency to cause
> problems. (See Python 3's struggles with Unicode as an example. And
> C/C++, well, don't even get me started.
Some of these struggles originate from equating bytes with characters.
Since Perl6 is more or less a clean slate, it can avoid these.
Other struggles originate from the structure of Unicode: it defines
multiple levels of sequences, each useful for different tasks:
- code points
- characters (various normalizations exist)
- word parts (for line breaking)
and possibly a few more.
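To make the difference between these levels concrete, here is a minimal sketch in Python (chosen only for illustration; it is not Perl6, and the helper name `levels` is made up for this example). It counts the same string as UTF-8 bytes, as code points, and as code points after NFC normalization. Grapheme and word-part segmentation are left out because, as far as I know, Python's standard library does not provide them.

```python
import unicodedata


def levels(text):
    """Measure one string at three levels: UTF-8 bytes, code points, NFC code points."""
    nfc = unicodedata.normalize("NFC", text)
    return {
        "utf8_bytes": len(text.encode("utf-8")),  # storage level
        "code_points": len(text),                 # Python str indexes by code point
        "nfc_code_points": len(nfc),              # after canonical composition
    }


# 'e' followed by U+0301 COMBINING ACUTE ACCENT: one visible character
decomposed = "e\u0301"
print(levels(decomposed))
# prints {'utf8_bytes': 3, 'code_points': 2, 'nfc_code_points': 1}
```

The same visible character yields three different lengths, so an "out of range" index means something different at each level.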
Ideally, developers will be able to use the same API structure at each
level, maybe with the exception of the grapheme level where Perl6 has its
native representation (the better the API, the fewer such
implementation details are visible and relevant to the programmer).
> Bytes stopped being the basis of
> characters even *before* Unicode. C and C++ are still struggling to
> understand that.
I think you're being unfair to them.
The issues are actually well-understood in the C++ arena, as
demonstrated by the ICU library.
It's just that language evolution is constrained by legacy, plus
possibly short-sighted decisions by compiler makers. Also, C++ (by
necessity) evolves more slowly than Unicode. Under these conditions, Unicode
support in a library is actually preferable to anything inside the
language; it's enough if the language can interoperate with the library.