String type in Socket I/O

Warrick Gray oinutter@hotmail.com
Mon, 08 Apr 2002 00:31:41 +1200


Hi all,

I am writing a client-side HTTP library on top of the SocketPrim library.  
While implementing Base64 encoding and decoding I began to have some 
doubts about the use of the Char type for socket I/O.

As far as I can tell, "sendTo" and "recvFrom" are thin wrappers around the 
underlying OS calls.  My winsock2.h file tells me the data passed into and 
received from these functions are C-style chars, 8 bits each.  In Unix these 
functions (sys/socket.h) appear to take a C void pointer.  Finally, I notice 
that the Haskell 98 report defines Haskell Char as a Unicode character (which 
I figure isn't guaranteed to fit in 8 bits).
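
To make the mismatch concrete (this is just an illustration, nothing from 
the library itself):

    import Data.Char (ord)

    -- A Char can carry a code point well beyond one byte:
    lambda :: Int
    lambda = ord '\x03BB'   -- 955, a Greek lambda, too big for a C char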

So I am curious: what happens when I send these Unicode Haskell Chars to the 
SocketPrim.sendTo function?  My current guess is that the low 8 bits of each 
Char become a C-style char.  My Base64 encoding and decoding functions more 
or less build on this assumption, by ripping out the low 8 bits of each Char 
before encoding that binary representation (three 8-bit chars become four 
6-bit Base64 chars), roughly as in the sketch below.
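
This is roughly the shape of my encoder for a single 3-char group (a sketch 
only - padding and the decoding direction are omitted, and the names are 
just illustrative):

    import Data.Bits ((.&.), (.|.), shiftL, shiftR)
    import Data.Char (ord)

    base64Alphabet :: String
    base64Alphabet = ['A'..'Z'] ++ ['a'..'z'] ++ ['0'..'9'] ++ "+/"

    -- Encode one group of three input chars into four Base64 chars,
    -- keeping only the low 8 bits of each input Char.
    encodeGroup :: Char -> Char -> Char -> String
    encodeGroup a b c = map (base64Alphabet !!) [i1, i2, i3, i4]
      where
        lowByte x = ord x .&. 0xFF      -- the assumption in question
        n  = (lowByte a `shiftL` 16) .|. (lowByte b `shiftL` 8) .|. lowByte c
        i1 = (n `shiftR` 18) .&. 0x3F
        i2 = (n `shiftR` 12) .&. 0x3F
        i3 = (n `shiftR`  6) .&. 0x3F
        i4 = n .&. 0x3F

If Char ever stopped behaving like a byte in the socket layer, that lowByte 
step is exactly where things would silently go wrong.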

So far everything works fine, but is this approach going to break in the 
near future?  What I would like to see are socket functions that do not 
introduce this extra level of binary indirection, perhaps using a ByteArray 
or Word8 (as I see has previously been discussed - 
http://www.haskell.org/pipermail/libraries/2001-August/000482.html).  How 
realistic is this?
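
Something of roughly this shape is what I have in mind (the socket 
signatures are wishful thinking, not an existing API; the two conversion 
helpers are only there to show how Char-based code could migrate):

    import Data.Word (Word8)
    import Data.Char (ord, chr)

    -- Wishful interface, kept as comments because it does not exist yet:
    --   sendOctets :: Socket -> [Word8] -> IO Int
    --   recvOctets :: Socket -> Int -> IO [Word8]

    -- Keep only the low 8 bits of each Char on the way out ...
    stringToOctets :: String -> [Word8]
    stringToOctets = map (fromIntegral . ord)

    -- ... and reinterpret each octet as a (Latin-1) Char on the way in.
    octetsToString :: [Word8] -> String
    octetsToString = map (chr . fromIntegral)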

Warrick.


