Let's get this finished

Marcin 'Qrczak' Kowalczyk qrczak at knm.org.pl
Fri Jan 5 13:38:10 EST 2001


Sat, 06 Jan 2001 02:24:00 +1100, Manuel M. T. Chakravarty <chak at cse.unsw.edu.au> wrote:

> Actually, given this unicode mess, how are we supposed to
> handle individual `Char's?  A Haskell `Char' may expand to a
> sequence of 8bit `char's.  That's a problem when the C side
> only expects a `char' and not a `*char'.

And BTW: many languages have only strings and don't work on individual
characters at all. Libraries usually work on strings. IMHO it's not
a problem at all if we don't provide much support for individual
characters. Only strings are important in practice.

A conversion that changes the length might be a problem only if a
library works with positions within the text. For example, readline can
show the whole line being read together with the cursor position. But
converting the whole line to a Haskell string does not by itself tell us
the right cursor position in terms of Haskell characters.
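
To make the position problem concrete, here is a minimal sketch of how a
byte offset reported by a C library could be mapped to a Haskell character
index. It assumes the C side uses valid UTF-8 and that the offset falls on
a character boundary; neither assumption comes from readline itself, and
the name charIndexFromByteOffset is made up for illustration.

```haskell
import Data.Bits ((.&.))
import Data.Word (Word8)

-- A UTF-8 continuation byte has the bit pattern 10xxxxxx.
isContinuation :: Word8 -> Bool
isContinuation b = b .&. 0xC0 == 0x80

-- Hypothetical helper: number of characters encoded by the first n bytes.
-- Every character contributes exactly one non-continuation byte, so it is
-- enough to count those. Assumes valid UTF-8 and an offset that lands on
-- a character boundary.
charIndexFromByteOffset :: [Word8] -> Int -> Int
charIndexFromByteOffset bytes n =
  length (filter (not . isContinuation) (take n bytes))
```

For the four bytes 0x61 0xC3 0xA9 0x62 (an 'a', a two-byte character, then
a 'b'), byte offset 3 corresponds to character index 2. For pure ASCII text
the two kinds of position coincide.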

Well, readline does not support multibyte encodings anyway, and
it can be assumed that positions will match in practice. (I once
hacked bash to work on a UTF-8 terminal but I don't think the patch
is used anywhere.)

Handling strings in C is so "manual" that I don't think much fun
can be provided besides withCString / mallocCString / peekCString.
Somebody might wrap a string in ForeignPtr CChar to avoid a conversion
to a Haskell string, but it's already easy enough, and IMHO strings
don't need yet another type just for interfacing with C.
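
For reference, a small sketch of what using these functions looks like,
written against Foreign.C.String as found in current GHC. There the
allocating function is called newCString; I am assuming it plays the role
of the mallocCString named above. The wrappers cLength and roundTrip are
hypothetical names for illustration only.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.String (CString, newCString, peekCString, withCString)
import Foreign.C.Types (CSize (..))
import Foreign.Marshal.Alloc (free)

-- Plain C strlen, imported directly.
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- Hypothetical wrapper: marshal a Haskell string into a temporary C
-- string, call strlen on it, and let withCString release the buffer.
cLength :: String -> IO Int
cLength s = withCString s $ \p -> fromIntegral <$> c_strlen p

-- Hypothetical wrapper: allocate a C string that outlives the call,
-- read it back into Haskell, then free it by hand.
roundTrip :: String -> IO String
roundTrip s = do
  p <- newCString s
  r <- peekCString p
  free p
  return r
```

Note that cLength returns a count of C chars, so for non-ASCII input it
can differ from the length of the Haskell string, which is the mismatch
discussed above.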

-- 
 __("<  Marcin Kowalczyk * qrczak at knm.org.pl http://qrczak.ids.net.pl/
 \__/
  ^^                      SYGNATURA ZASTĘPCZA
QRCZAK




