CWString API

John Meacham john at repetae.net
Tue Nov 30 05:29:08 EST 2004


On Tue, Nov 30, 2004 at 10:17:13AM -0000, Simon Marlow wrote:
> On 30 November 2004 09:35, John Meacham wrote:
> 
> > On Tue, Nov 30, 2004 at 12:41:04AM -0800, Krasimir Angelov wrote:
> >>    Hello guys,
> >> 
> >> I am working on updated version of HDirect and now I
> >> am going to use CWString API to marshal (wchar_t *)
> >> type to String. I found some inconsistencies in the API.
> >>   - castCWcharToChar and castCharToCWchar functions
> >> are defined only for Posix systems and they aren't
> >> exported. In the same time castCCharToChar and
> >> castCharToCChar have the same meaning and they are
> >> defined and exported on all platforms.
> > 
> > The problem is that these operations are very unsafe, there is no
> > guaranteed isomorphism or even injection between wchar_ts and Chars.
> > If people really know what they are doing, they can do the conversion
> > themselves via fromIntegral/ord/chr, but I don't think we should
> > encourage such unsafe usage with functions when it is simple for the
> > user to work around it themselves.
> 
> That's right - castCWcharToChar and its dual are unlikely to be correct
> on Windows, where wchar_t is UTF-16.
> 
> However, AFAICS the whole Windows API works in terms of UTF-16, only
> dealing with surrogate pairs in the text output routines.  So it might
> sometimes be more convenient and efficient, but not strictly speaking
> correct, to do no conversion between a UTF-16 value and Haskell's Char
> in the FFI on Windows.  I think we want to provide an interface that
> lets you do this if you know what you're doing.

Yeah, the user is always free to use fromIntegral and ord/chr. For
Windows I imagine we can do something like what is done on glibc-based
systems (where wchar_t is guaranteed to be Unicode) and use simpler
specialized routines. On most common systems, it will never have to
fall back to the general-purpose character conversion libraries.
My autoconf scripts take care of this for unixy boxen, but I don't know
enough about Windows to do anything there. If it is straight-up UTF-16,
that should be easy enough to write specialized routines for.
        John
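
To make the two options discussed above concrete, here is a minimal
sketch, not the actual library code. It shows the unchecked
fromIntegral/ord/chr conversion and what a specialized UTF-16 routine
might look like. All function names (wcharToChar, charToWchar,
decodeUtf16, encodeUtf16) are hypothetical, and replacing unpaired
surrogates with U+FFFD is an assumption of this sketch rather than
anything decided in this thread.

```haskell
import Data.Bits (shiftL, shiftR, (.&.))
import Data.Char (chr, ord)
import Data.Word (Word16)
import Foreign.C.Types (CWchar)

-- Unchecked conversions via fromIntegral/ord/chr (hypothetical names).
-- Only meaningful when each wchar_t holds one whole Unicode code point,
-- as on glibc; on a UTF-16 platform they mishandle surrogate pairs.
wcharToChar :: CWchar -> Char
wcharToChar = chr . fromIntegral

charToWchar :: Char -> CWchar
charToWchar = fromIntegral . ord

isHigh, isLow :: Word16 -> Bool
isHigh u = u >= 0xD800 && u <= 0xDBFF
isLow  u = u >= 0xDC00 && u <= 0xDFFF

-- Decode UTF-16 code units into a String, combining surrogate pairs.
-- Assumption: an unpaired surrogate becomes U+FFFD.
decodeUtf16 :: [Word16] -> String
decodeUtf16 (hi : lo : rest)
  | isHigh hi && isLow lo = chr cp : decodeUtf16 rest
  where
    cp = 0x10000
       + ((fromIntegral hi .&. 0x3FF) `shiftL` 10)
       + (fromIntegral lo .&. 0x3FF)
decodeUtf16 (u : rest)
  | isHigh u || isLow u = '\xFFFD' : decodeUtf16 rest
  | otherwise           = chr (fromIntegral u) : decodeUtf16 rest
decodeUtf16 [] = []

-- Encode a String as UTF-16 code units, splitting code points above
-- U+FFFF into a high/low surrogate pair.
encodeUtf16 :: String -> [Word16]
encodeUtf16 = concatMap enc
  where
    enc c
      | n < 0x10000 = [fromIntegral n]
      | otherwise   = [ fromIntegral (0xD800 + (m `shiftR` 10))
                      , fromIntegral (0xDC00 + (m .&. 0x3FF)) ]
      where
        n = ord c
        m = n - 0x10000
```

For example, decodeUtf16 [0xD83D, 0xDE00] yields the single Char
U+1F600, and encodeUtf16 maps that Char back to the same two units.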

-- 
John Meacham - ⑆repetae.net⑆john⑈ 


More information about the Glasgow-haskell-users mailing list