[Haskell-cafe] invalid character encoding

Wed Mar 16 18:36:55 EST 2005

On Wed, Mar 16, 2005 at 05:13:25PM +0000, Glynn Clements wrote:
> 
> Marcin 'Qrczak' Kowalczyk wrote:
> 
> > >> It doesn't affect functions added by the hierarchical libraries,
> > >> i.e. those functions are safe only with the ASCII subset. (There is
> > >> a vague plan to make Foreign.C.String conform to the FFI spec,
> > >> which mandates locale-based encoding, and thus would change all
> > >> those, but it's still up in the air.)
> > >
> > > Hmm. I'm not convinced that automatically converting to the current
> > > locale is the ideal behaviour (it'd certainly break all my programs!).
> > > Certainly a function for converting into the encoding of the current
> > > locale would be useful for many users but it's important to be able to
> > > know the encoding with certainty.
> > 
> > It should only be the default, not the only option.
> 
> I'm not sure that it should be available at all.
> 
> > It should be possible to specify the encoding explicitly.
> 
> Conversely, it shouldn't be possible to avoid specifying the encoding
> explicitly.
> 
> Personally, I wouldn't provide an all-in-one "convert String to
> CString using locale's encoding" function, just in case anyone was
> tempted to actually use it.

But this is exactly what is needed for most C library bindings, which is
why I had to write my own and proposed it for the FFI. Most C libraries
expect char * to be in the standard encoding of the current locale.
When a binding explicitly uses another encoding, then great, we can use
different marshaling functions. In any case, we need tools to be able to
conform to the common cases of ASCII-only (withCAString) and current
locale (withCString).

withUTF8String would be a nice addition, but it is much less important
for it to come standard, as it can easily be written by end users, unlike
the locale-specific versions, which are necessarily system dependent.
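
To illustrate the "easily written by end users" point, here is a minimal
sketch of such a function using only the base libraries. The names
encodeChar and utf8ByteLength are hypothetical helpers made up for this
example, and withUTF8String here is just one possible shape for it, not
an existing library function. It assumes the input has no embedded NUL
(the C side would see a truncated string) and does not reject lone
surrogate code points.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)
import Foreign.C.String (CString)
import Foreign.C.Types (CChar)
import Foreign.Marshal.Array (lengthArray0, withArray0)

-- Encode a single code point as a list of UTF-8 bytes.
-- (Hypothetical helper; surrogates are not checked for.)
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n = ord c
    -- Leading-byte payload: the bits above the given shift.
    hi :: Int -> Word8
    hi s = fromIntegral (n `shiftR` s)
    -- Continuation byte: 10xxxxxx carrying six bits at the given shift.
    cont :: Int -> Word8
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

-- Marshal a String as a NUL-terminated UTF-8 C string for the duration
-- of the action. The buffer is freed when the action returns.
withUTF8String :: String -> (CString -> IO a) -> IO a
withUTF8String s = withArray0 0 bytes
  where
    bytes :: [CChar]
    bytes = map fromIntegral (concatMap encodeChar s)

-- Usage example: number of bytes the C side sees before the NUL.
utf8ByteLength :: String -> IO Int
utf8ByteLength s = withUTF8String s (lengthArray0 0)
```

Nothing in this depends on the system, which is the contrast with the
locale versions: those have to ask the platform what the current
encoding is and convert accordingly.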

        John

-- 
John Meacham - ⑆repetae.net⑆john⑈