Why are strings linked lists?

Glynn Clements glynn.clements at virgin.net
Sat Nov 29 10:01:46 EST 2003


John Meacham wrote:

> > What Unicode support?
> > 
> > Simply claiming that values of type Char are Unicode characters
> > doesn't make it so.
> > 
> > Actually supporting Unicode would require re-implementing toUpper,
> > toLower and the is* functions, as well as at least re-implementing the
> > I/O library (and, realistically, re-designing it; while you *could*
> > just force the use of a specific encoding, the result of doing so
> > would be an I/O system which was almost worthless for real use).
> > 
> > Right now, values of type Char are, in reality, ISO Latin-1 codepoints
> > padded out to 4 bytes per char.
> > 
> > It isn't possible to "drop" support which isn't there.
> 
> I use Unicode support with GHC all the time, using my CWString library
> and an alternate set of h* routines. It works quite well. A standard
> UTF-8 packed string type might be handy, though.

In other words, you've written your own Unicode support to get around
the fact that GHC doesn't provide any.

Unless I'm missing something, the only "support" that GHC provides is
that Char is 4 bytes. If you use Char to store anything other than ISO
Latin-1 characters, none of the Haskell functions with Char in their
signature will be of any use. You could just as easily have added
"type WChar = Word32", and made your library use that instead of Char.
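
To make the last point concrete, here is a minimal sketch of what such a
library could look like if it were built on a plain Word32 instead of Char.
The names encodeUtf8, toWChars and encodeString are hypothetical and are not
taken from CWString or any existing library; the encoder assumes valid code
points and does no validation.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Word (Word8, Word32)

-- The alias suggested in the message: a code point held in a 32-bit word.
type WChar = Word32

-- Encode one code point as UTF-8 bytes.
-- Assumption: the input is a valid scalar value (at most 0x10FFFF and
-- not a surrogate); this sketch does not check that.
encodeUtf8 :: WChar -> [Word8]
encodeUtf8 c
  | c < 0x80    = [fromIntegral c]
  | c < 0x800   = [0xC0 .|. hi 6, cont 0]
  | c < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    -- Leading bits of the code point, placed in the lead byte.
    hi n   = fromIntegral (c `shiftR` n)
    -- A continuation byte: 10xxxxxx carrying six payload bits.
    cont n = 0x80 .|. fromIntegral ((c `shiftR` n) .&. 0x3F)

-- Convert an ordinary String to WChars by code point value.
toWChars :: String -> [WChar]
toWChars = map (fromIntegral . fromEnum)

-- Encode a whole sequence of code points.
encodeString :: [WChar] -> [Word8]
encodeString = concatMap encodeUtf8
```

In GHCi, encodeUtf8 0x20AC gives [226,130,172], the three-byte encoding of
the euro sign. Nothing in the encoder depends on Char; only the convenience
function toWChars touches it, and only to read off the numeric value.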

-- 
Glynn Clements <glynn.clements at virgin.net>

