[Haskell-cafe] Re: hSetEncoding on socket handles
Simon Marlow
marlowsd at gmail.com
Wed May 12 09:14:45 EDT 2010
On 12/05/2010 01:56, David Powell wrote:
> Greetings,
>
> I am having trouble sending unicode characters as utf8 over a socket handle.
> Despite setting the encoding on the socket handle to utf8, it still seems to
> use some other encoding when writing to the socket. It works correctly when
> writing to stdout, but not to a socket handle. I am using ghc 6.12.1 and
> network-2.2.1.7. I can get it to work using System.IO.UTF8, but I was under
> the impression this was no longer necessary?
>
> I also don't seem to understand the interaction between hSetEncoding and
> hSetBinaryMode because if I set the binary mode to 'False' and the encoding
> to utf8 on the socket, then when writing to the socket the string seems to
> be truncated at the first non-ascii codepoint.
>
> Here is a test snippet, which can be used with netcat as a listening server
> (i.e. nc -l 1234).
>
> > import System.IO
> > import Network
> > main = do
> >   let a = "λ"
> >   s <- connectTo "127.0.0.1" (PortNumber 1234)
> >   hSetEncoding s utf8
> >   hSetEncoding stdout utf8
> >   hPutStrLn s a
> >   putStrLn a
> >   hClose s
You've found a bug, thanks. The bug is that a socket handle is
bidirectional, and hSetEncoding only sets the encoding for one side (the
read side), so the write side keeps whatever encoding it already had. We
should be setting it for both sides.
I just created a ticket:
http://hackage.haskell.org/trac/ghc/ticket/4066
Expect a fix in GHC 6.12.3. In the meantime you can work around it by
creating a write-only socket handle, which hSetEncoding works with. For
example, this worked for me:
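import Control.Exception (bracketOnError)
import Network (PortID(..))
import Network.BSD (getHostByName, getProtocolNumber, hostAddress)
import Network.Socket
import System.IO (Handle, IOMode(..))

-- Like Network.connectTo, but returns a write-only Handle.
connectTo :: String -> PortID -> IO Handle
connectTo hostname (PortNumber port) = do
  proto <- getProtocolNumber "tcp"
  bracketOnError
    (socket AF_INET Stream proto)
    sClose  -- only done if there's an error
    (\sock -> do
      he <- getHostByName hostname
      connect sock (SockAddrInet port (hostAddress he))
      socketToHandle sock WriteMode
    )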
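If you want to check what hSetEncoding should produce without involving a
socket or netcat, one option is to write through an ordinary file handle and
read the raw bytes back. The following is only a sketch, not part of the fix:
the helper name encodedBytes and the temp file template are made up for
illustration, and it assumes the bytestring and directory packages that ship
with GHC.

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8)
import System.Directory (getTemporaryDirectory, removeFile)
import System.IO

-- Hypothetical helper: write a String through a text-mode handle using the
-- given encoding, then return the raw bytes that ended up in the file.
encodedBytes :: TextEncoding -> String -> IO [Word8]
encodedBytes enc str = do
  tmpDir <- getTemporaryDirectory
  (path, h) <- openTempFile tmpDir "encoding-check.txt"
  hSetEncoding h enc
  hPutStr h str          -- no trailing newline, so only the payload is written
  hClose h
  bytes <- B.readFile path  -- strict read, so the file is closed before removal
  removeFile path
  return (B.unpack bytes)

main :: IO ()
main = do
  bytes <- encodedBytes utf8 "λ"
  print bytes
```

Running main should print [206,187], the two-byte UTF-8 encoding of U+03BB,
which is what a correctly configured socket handle should send for the test
string above.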
Cheers,
Simon