UniCode

Andrew J Bromage andrew@bromage.org
Sun, 7 Oct 2001 01:09:26 +1000


G'day all.

On Fri, Oct 05, 2001 at 06:17:26PM +0000, Marcin 'Qrczak' Kowalczyk wrote:

> This information is out of date. AFAIR about 40000 of them are assigned.
> Most for Chinese (current, not historic).

I wasn't aware of this.  Last time I looked was Unicode 3.0.  Thanks
for the update.

> In Haskell String = [Char].

I'll concede that String and [Char] are identical as far as the
programmer is concerned. :-)

There was some research 10+ years ago about alternative representations
for lists which were semantically identical but a little more efficient
in memory use.  Even if you don't go that far (it is fiddly), constant
strings, for example, could be represented as UTF-16/UTF-8/whatever
along with some machinery to generate the list on demand.  Char objects
could be implemented as flyweights, i.e. one shared object per distinct
character rather than a fresh one per occurrence.  Lots of possibilities.
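
To make the "generate the list on demand" idea concrete, here is a rough
sketch of what I have in mind.  It is not how any existing compiler does
it; PackedString and unpack are names made up for illustration, the input
is assumed to be well-formed UTF-8 (no validation), and a real
implementation would keep the bytes in a flat array rather than a list
of Word8.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)

-- Hypothetical packed form of a constant string: its UTF-8 bytes.
-- A list of Word8 stands in for a flat byte array so that the sketch
-- needs nothing outside the base library.
newtype PackedString = PackedString [Word8]

-- Produce the ordinary [Char] view.  Each cons cell is built only when
-- the consumer demands it, so the programmer still sees a plain String.
-- Assumes well-formed UTF-8; malformed input is not detected.
unpack :: PackedString -> String
unpack (PackedString bytes) = go bytes
  where
    go :: [Word8] -> String
    go [] = []
    go (b:bs)
      | b < 0x80  = chr (fromIntegral b) : go bs   -- 1-byte (ASCII)
      | b < 0xE0  = multi 1 (b .&. 0x1F) bs        -- 2-byte sequence
      | b < 0xF0  = multi 2 (b .&. 0x0F) bs        -- 3-byte sequence
      | otherwise = multi 3 (b .&. 0x07) bs        -- 4-byte sequence

    -- Fold n continuation bytes into the payload bits of the lead byte.
    multi :: Int -> Word8 -> [Word8] -> String
    multi n lead bs =
      let (conts, rest) = splitAt n bs
          code = foldl step (fromIntegral lead) conts
      in chr code : go rest

    -- Each continuation byte contributes its low six bits.
    step :: Int -> Word8 -> Int
    step acc c = (acc `shiftL` 6) .|. fromIntegral (c .&. 0x3F)
```

For example, unpack (PackedString [0xE2, 0x82, 0xAC]) yields the
one-character string "\8364" (the euro sign), and taking a prefix of the
result never touches the bytes beyond that prefix.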

Cheers,
Andrew Bromage