[Haskell-cafe] invalid character encoding

Glynn Clements glynn at gclements.plus.com
Wed Mar 16 12:13:25 EST 2005


Marcin 'Qrczak' Kowalczyk wrote:

> >> It doesn't affect functions added by the hierarchical libraries,
> >> i.e. those functions are safe only with the ASCII subset. (There is
> >> a vague plan to make Foreign.C.String conform to the FFI spec,
> >> which mandates locale-based encoding, and thus would change all
> >> those, but it's still up in the air.)
> >
> > Hmm. I'm not convinced that automatically converting to the current
> > locale is the ideal behaviour (it'd certainly break all my programs!).
> > Certainly a function for converting into the encoding of the current
> > locale would be useful for many users, but it's important to be able to
> > know the encoding with certainty.
> 
> It should only be the default, not the only option.

I'm not sure that it should be available at all.

> It should be possible to specify the encoding explicitly.

Conversely, it shouldn't be possible to avoid specifying the encoding
explicitly.

Personally, I wouldn't provide an all-in-one "convert String to
CString using locale's encoding" function, just in case anyone was
tempted to actually use it.

The decision as to the encoding belongs in application code, not in
(most) libraries, and definitely not in the language.

[Libraries dealing with file formats or communication protocols which
mandate a specific encoding are an exception. But they will be using a
fixed encoding, not the locale's encoding.]

If application code chooses to use the locale's encoding, it can
retrieve it and then pass it as the encoding argument to any applicable
functions.
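
As a rough illustration of that shape, here is a minimal sketch. It
assumes the GHC.Foreign and GHC.IO.Encoding modules found in current
versions of GHC's base library (they are not the Foreign.C.String
functions discussed above), and encodeWith is a hypothetical helper
invented for this example. The point is only that the encoding is always
an explicit argument, and that the locale's encoding is just one value
the application may choose to pass.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Array (peekArray)
import Foreign.Ptr (castPtr)
import qualified GHC.Foreign as GF
import GHC.IO.Encoding (TextEncoding, getLocaleEncoding, latin1, utf8)

-- Hypothetical helper: encode a String using the encoding supplied by
-- the caller and return the resulting bytes. There is deliberately no
-- variant that picks an encoding on the caller's behalf.
encodeWith :: TextEncoding -> String -> IO [Word8]
encodeWith enc s =
  GF.withCStringLen enc s $ \(ptr, len) ->
    peekArray len (castPtr ptr)

main :: IO ()
main = do
  -- A library bound to a format that mandates UTF-8 passes a fixed encoding.
  fixed <- encodeWith utf8 "\233"      -- U+00E9
  print fixed                          -- [195,169]

  -- The same text under a different, equally explicit, choice.
  other <- encodeWith latin1 "\233"
  print other                          -- [233]

  -- An application that wants the locale's encoding retrieves it itself
  -- and passes it along like any other encoding.
  localeEnc <- getLocaleEncoding
  viaLocale <- encodeWith localeEnc "hello"
  print viaLocale
```

With this arrangement, the bytes handed to C depend only on arguments
visible at the call site, so a reader of the application code can tell
which encoding is in use without knowing how the process was started.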

If application code doesn't want to use the locale's encoding, it
shouldn't be shoe-horned into doing so because a library developer
decided to duck the encoding issues by grabbing whatever encoding was
readily to hand (i.e. the locale's encoding).

-- 
Glynn Clements <glynn at gclements.plus.com>