[Haskell-cafe] How to reverse ghc encoding of commandline arguments

Donn Cave donn at avvanta.com
Wed Nov 19 07:56:32 UTC 2014


quoth Donn Cave <donn at avvanta.com>
...
> Umlaut u turns up as 0xFC for UTF-8 users;  0xDCFC, for Latin-1 users. 
> This is an ordinary hello world type program, can't think of any
> unique environmental issues.

Well, I mischaracterized that problem, so to speak.

I find that GHC is not picking up on my "current locale" encoding,
and instead seems to be hard-wired to UTF-8.  On MacOS X, I can
select an encoding in Terminal Preferences, open a new window, and
for all intents and purposes it's an ISO8859-1 world, including
LANG=en_US.ISO8859-1, but GHC isn't going along with it.
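
One way to check what the runtime has actually chosen is to ask it
directly.  The following is only a sketch: encodingReport is a name
made up for illustration, while getLocaleEncoding and
getFileSystemEncoding are the functions exported by GHC.IO.Encoding
in base.  The second one is, as far as I know, the encoding applied
to command line arguments on non-Windows platforms.

```haskell
import GHC.IO.Encoding (getFileSystemEncoding, getLocaleEncoding)

-- | Report the encodings the GHC runtime has selected for this process.
-- (encodingReport is a hypothetical helper, not a library function.)
encodingReport :: IO [(String, String)]
encodingReport = do
  loc <- getLocaleEncoding      -- default encoding for text-mode Handles
  fs  <- getFileSystemEncoding  -- used for getArgs and file paths
  return [("locale", show loc), ("filesystem", show fs)]
```

Running encodingReport >>= mapM_ print from main or GHCi in the
ISO8859-1 Terminal window would show whether LANG is being honored.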

So the ISO8859-1 umlaut u (the single byte 0xFC) is undecodable if
GHC is stuck in UTF-8, which seems to explain what I'm seeing.  If I
understand this right, the undecodable byte is escaped as 0xDC00 +
byte, giving 0xDCFC;  in some circumstances that escape is recognized
on output, and the value is spared from UTF-8 encoding and instead
simply copied as the original byte.
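
To get back to the subject line, reversing that by hand might look
roughly like the sketch below.  It assumes the escape really is
0xDC00 + byte for bytes 0x80..0xFF, and that everything else in the
argument was decoded from UTF-8;  charToBytes and argToBytes are
names invented here, not anything from base.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- | Recover the bytes behind one Char of a decoded argument.
-- Assumption: an undecodable byte b was escaped as U+DC00 + b
-- (U+DC80..U+DCFF); any other character is re-encoded as UTF-8.
charToBytes :: Char -> [Word8]
charToBytes c
  | n >= 0xDC80 && n <= 0xDCFF = [fromIntegral (n - 0xDC00)]  -- escaped raw byte
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n      = ord c
    -- leading bits of the code point, for the first UTF-8 byte
    hi s   = fromIntegral (n `shiftR` s)
    -- a UTF-8 continuation byte carrying six bits
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

-- | Recover the original byte sequence of a whole argument.
argToBytes :: String -> [Word8]
argToBytes = concatMap charToBytes
```

With that, the 0xDCFC a Latin-1 user sees comes back as the single
byte 0xFC, while a properly decoded 0xFC becomes the two UTF-8 bytes
0xC3 0xBC.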

Hope that was interesting!

	Donn

