[Haskell-cafe] invalid character encoding

Glynn Clements glynn at gclements.plus.com
Sat Mar 19 09:32:01 EST 2005

Einar Karttunen wrote:

> > In what way is ISO-2022 non-reversible? Is it possible that an ISO-2022
> > file name that is converted to Unicode cannot be converted back any
> > more (assuming you know for sure that it was ISO-2022 in the first 
> > place)?
> I am no expert on ISO-2022 so the following may contain errors,
> please correct if it is wrong.
> ISO-2022 -> Unicode is always possible.
> Also Unicode -> ISO-2022 should always be possible, but it is a relation,
> not a function. This means there are infinitely many(?) ways of encoding a
> particular Unicode string in ISO-2022.
> ISO-2022 works by providing escape sequences to switch between different
> character sets. One can freely use these escapes in almost any way one
> wishes.


Moreover, while there are an infinite number of equivalent
representations in theory (you can add as many redundant switching
sequences as you wish), there are multiple "plausible" equivalent
representations in practice.
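
To make the point concrete, here is a minimal sketch in Python. It assumes
CPython's bundled "iso2022_jp" codec, which implements only one ISO-2022
profile rather than the full standard, and the sample character is an
arbitrary choice. It builds three different byte strings for the same
text: the codec's own output, one with a redundant switching sequence,
and one that designates the older JIS C 6226-1978 set (ESC $ @) instead
of JIS X 0208-1983 (ESC $ B). All three decode to the same Unicode
string, but re-encoding yields only the codec's preferred form, so the
original bytes are not recovered.

```python
# Sketch: distinct ISO-2022-JP byte strings that decode to the same text.
# Assumes CPython's built-in "iso2022_jp" codec; the sample character is
# an arbitrary illustration.
text = "\u3042"  # HIRAGANA LETTER A

# The representation this particular encoder happens to choose.
canonical = text.encode("iso2022_jp")

variants = [
    canonical,
    # Redundant switch: designate JIS X 0208-1983 twice in a row.
    b"\x1b$B" + canonical,
    # "Plausible" alternative: designate JIS C 6226-1978 instead.
    canonical.replace(b"\x1b$B", b"\x1b$@"),
]

decoded = [v.decode("iso2022_jp") for v in variants]
reencoded = [d.encode("iso2022_jp") for d in decoded]

for original, again in zip(variants, reencoded):
    status = "round-trips" if original == again else "bytes changed"
    print(original, "->", again, f"({status})")
```

Only the first variant survives the trip through Unicode unchanged; the
other two come back as different bytes, which is exactly the problem
when those bytes were a file name.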

Glynn Clements <glynn at gclements.plus.com>
