Output character encoding for ghc on OpenBSD

Matthias Kilian kili at outback.escape.de
Sun Apr 18 10:01:04 EDT 2010


Hi,

as some of you may know, I'm working on an update of OpenBSD's ghc
port to 6.12.2, currently chasing down the last remaining testsuite
failures. Yesterday, I ran into a problem which I have a fix for,
but only a really ugly fix, and I need some opinions on what users
would prefer.

The problem is that Haskell uses Unicode characters internally (ghc
itself uses UTF-32 internally, where the endianness depends on the
architecture it's running on), and that any Haskell program (including
ghc and ghci) has to convert between the internal representation
and the actual locale settings of the system it's running on.
Unfortunately, OpenBSD is really bad when it comes to locale support;
the only supported locales are the C and the POSIX locales, so even
if you set LC_ALL or LC_CTYPE to something like, for example,
de_DE.iso88591, this would have no effect on OpenBSD.

Anyway, the short story is that either I hard-code the character
set to something like utf-8, or ghc will start to behave really
strangely (for example, ghci would terminate immediately if you
just *type* a non-ASCII character).
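
For illustration only: the fix discussed here would live inside the
port itself, but the same effect can be sketched at the level of a
single Haskell program with the System.IO encoding API that ships
with ghc 6.12. The helper name forceUtf8 is made up for this
example; it simply ignores the locale and pins the standard handles
to UTF-8.

```haskell
import System.IO

-- Hypothetical helper (not part of ghc or the OpenBSD port):
-- ignore the locale and use UTF-8 on the three standard handles.
forceUtf8 :: IO ()
forceUtf8 = mapM_ (`hSetEncoding` utf8) [stdin, stdout, stderr]
```

A program would call forceUtf8 first thing in main, before any
reading or writing happens.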

So what would you prefer?

- Use utf-8 and only utf-8 (i.e. hardcoded)?

- Use something like iso-8859-15 (hardcoded)?

- Make it configurable via some non-standard environment variable
  (GHC_CODESET, for example). If so, what should be the default if
  the environment variable isn't set? Back to 7 bit (ASCII)? utf-8?
  One of the Latin variants?
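
To make the third option concrete, here is a rough sketch of how
such a lookup could behave, again written as an ordinary program
rather than as the actual change to the port. GHC_CODESET is only
the example name from above, chooseEncoding and applyCodeset are
hypothetical helpers, and the utf8 default is just one of the
possible answers to the question. lookupEnv needs a newer base
library than the one in 6.12.

```haskell
import Control.Exception (IOException, try)
import System.Environment (lookupEnv)
import System.IO

-- Hypothetical helper: turn an optional codeset name into a
-- TextEncoding, falling back to the given default when the name
-- is absent or mkTextEncoding rejects it.
chooseEncoding :: TextEncoding -> Maybe String -> IO TextEncoding
chooseEncoding def Nothing = return def
chooseEncoding def (Just name) = do
  r <- try (mkTextEncoding name) :: IO (Either IOException TextEncoding)
  return (either (const def) id r)

-- Hypothetical helper: read GHC_CODESET (an assumed, non-standard
-- variable) and apply the result to the standard handles.
-- The utf8 default here is an assumption, not a decision.
applyCodeset :: IO ()
applyCodeset = do
  name <- lookupEnv "GHC_CODESET"
  enc  <- chooseEncoding utf8 name
  mapM_ (`hSetEncoding` enc) [stdin, stdout, stderr]
```

Keeping the default as a parameter of chooseEncoding means that
switching the fallback to ASCII or a Latin variant is a one-word
change in applyCodeset.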

Your suggestions are appreciated.

Thanks in advance.

Ciao,
	Kili

