<p dir="ltr">FWIW, C++ has:</p>

<p dir="ltr">- char* and const char*, inherited from C<br>

- wchar_t*, const wchar_t*<br>

- the above, but with an explicit length passed along as a separate argument<br>

- std::string<br>

- std::wstring (is that what it's called?)<br>

- various string implementations, provided by platform APIs and frameworks (QString, LPTSTR, and other nonsense)</p>

<p dir="ltr">And they all suck - most are really just byte arrays, some try to implement Unicode but fall short, and the ones that do it mostly right are specific to a sub-ecosystem. It's a mess.</p>
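<p dir="ltr">To make the "just byte arrays" point concrete, here is a minimal C++ sketch, assuming the input is valid UTF-8. The helper names count_code_points and sample are made up for illustration; only std::string and its size() come from the standard library.</p>

```cpp
#include <cstddef>
#include <string>

// Counts Unicode code points in a string that is ASSUMED to be valid UTF-8.
// Continuation bytes look like 10xxxxxx; every other byte starts a code point.
// (Hypothetical helper, not part of any standard library.)
std::size_t count_code_points(const std::string& utf8) {
    std::size_t n = 0;
    for (unsigned char unit : utf8) {
        if ((unit & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}

// "naïve" spelled with explicit UTF-8 bytes, so the encoding of the
// source file does not matter. The literal is split to keep the hex
// escapes from swallowing the letters that follow them.
std::string sample() {
    return "na" "\xC3\xAF" "ve";
}
```

<p dir="ltr">For the sample string, size() reports 6 while the helper counts 5 code points, so anything that indexes or truncates by size() can cut the two-byte "ï" in half.</p>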

<p dir="ltr">And do I need to mention PHP? That one doesn't have a useful string type at all, and it also lacks the language features to build one yourself. You're stuck with broken semantics either way; the best you can hope for is that they are only mildly broken and you can get away with it.</p>

<p dir="ltr">C, by the way, shares C++'s problem, except that it doesn't even come with a string type that does bounds checking.</p>
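<p dir="ltr">As a rough illustration of the bounds checking that C++ offers and plain C char arrays do not, here is a small sketch. The function name at_rejects is invented for this example; std::string::at and std::out_of_range are the standard pieces it relies on. Note that operator[] on the same string does no such check.</p>

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns true if the bounds-checked accessor refuses to read s at index i.
// (Hypothetical helper for illustration only.)
bool at_rejects(const std::string& s, std::size_t i) {
    try {
        (void)s.at(i);  // throws std::out_of_range when i >= s.size()
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

<p dir="ltr">With a three-character string, index 2 is accepted and index 3 is rejected. The equivalent read on a C char array would compile and run without complaint, at least until the index runs past the array itself.</p>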

<p dir="ltr">And finally: while Haskell makes you choose between "byte array", "string", and "list of code points", this isn't really awfully different from languages like Java or C#, where you make a similar choice (string? StringBuilder? byte[]?), except that the default is saner (for historical reasons). Well, that, and that there are lazy flavors of the packed string and bytestring types, which has nothing to do with string type choices and everything to do with defaulting to and leveraging non-strict semantics.</p>

<div class="gmail_extra"><br><div class="gmail_quote">On Sep 30, 2016 8:17 AM, "Joachim Durchholz" &lt;<a href="mailto:jo@durchholz.org">jo@durchholz.org</a>&gt; wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 30.09.2016 at 04:16, Richard A. O'Keefe wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

On 30/09/16 4:18 AM, Joachim Durchholz wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Each language does define its preferred string representation.<br>

</blockquote>

<br>

Java again:  it has *two* string representations baked into the<br>

language.<br>

</blockquote>

<br>

There is a single standard representation.<br>

I'm not even aware of a second one, and I've been programming Java for quite a while now.<br>

<br>

Unless you mean StringBuilder/StringBuffer (that would be three String types then). However, these classes are by no means "preferred" in practice: the vast majority of APIs demands and returns String objects.<br>

<br>

Even then, Java has its preferred string representation nailed down pretty strongly: a hidden array of 16-bit Unicode code points, referenced by a descriptor object (the actual String), immutable.<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

The Smalltalk system I use most has<br>

 - read-only strings (preferred)<br>

 - unique read-only strings<br>

 - mutable strings<br>

 - substrings (positionable read-only slices)<br>

 - extensible strings<br>

 - streams over strings<br>

 - lazy concatenations of strings<br>

 - read-only byte arrays viewed as strings<br>

 - mutable byte arrays viewed as strings<br>

</blockquote>

<br>

Ah, Smalltalk. I haven't looked at that in ages.<br>

I'll give you that these classes all exist, but I am not sure whether a Smalltalk programmer would consider them all equivalent or not.<br>

______________________________<wbr>_________________<br>

Haskell-Cafe mailing list<br>

To (un)subscribe, modify options or view archives go to:<br>

<a href="http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe" rel="noreferrer" target="_blank">http://mail.haskell.org/cgi-bi<wbr>n/mailman/listinfo/haskell-caf<wbr>e</a><br>

Only members subscribed via the mailman list are allowed to post.</blockquote></div></div>