[Haskell-cafe] How to make asynchronous I/O composable and safe?
David Barbour
dmbarbour at gmail.com
Tue Jan 17 21:20:26 CET 2012
I'd say use of asynchronous exceptions should be a last resort. Developers
should be encouraged to explicitly model any event notification system they
use.
Regards,
Dave
On Tue, Jan 17, 2012 at 1:42 AM, Simon Marlow <marlowsd at gmail.com> wrote:
> This is an interesting problem, I think I might incorporate parts of it
> into the next revision of my Concurrent Haskell tutorial.
>
> It sounds like you're getting overwhelmed by several different problems,
> and dealing with them separately would probably help. e.g. you want some
> infrastructure to run two threads and send an exception to one whenever the
> other one dies. You also want to be able to avoid a thread being
> interrupted while performing an operation that should be atomic, like
> sending a message - this is slightly tricky, because there's a tradeoff
> between keeping the thread responsive and not interrupting an operation.
> The biggest hammer is maskUninterruptible, which can be used if all else
> fails.
>
> Whether Network.TLS supports simultaneous read and write I don't know, but
> you can examine the code or talk to the maintainer. If it doesn't, adding
> a layer of locking is straightforward, and doesn't increase overall
> complexity (it's localised).
>
> Cheers,
> Simon
>
>
> On 14/01/2012 05:24, Joey Adams wrote:
>
>> I'm not happy with asynchronous I/O in Haskell. It's hard to reason
>> about, and doesn't compose well. At least in my code.
>>
>> I'm currently trying to build a networking layer for my application
>> using Network.TLS. Here is a rather minimalist API:
>>
>> newtype Connection = Connection (TLSCtx Handle)
>>
>> connectClient :: Handle -- ^ Connection handle, as returned
>> by 'connectTo'
>> -> X509 -- ^ TLS certificate (i.e. public key)
>> -> IO Connection
>>
>> connectServer :: Handle -- ^ Connection handle, as returned
>> by 'accept'
>> -> X509 -- ^ TLS certificate (i.e. public key)
>> -> TLS.PrivateKey -- ^ TLS private key
>> -> IO Connection
>>
>> close :: Connection -> IO ()
>>
>> sendMessage :: Connection -> Message -> IO ()
>>
>> recvMessage :: Connection -> ByteString -> IO (Message, ByteString)
>>
>> The module provides little more than connection initialization and
>> message serialization. I don't try to use locks or STM to multiplex
>> the connection or, in the case of recvMessage, hide connection state.
>> I just be sure to only use sendMessage in one thread at a time, only
>> use recvMessage in one thread at a time, and marshal the "extra bytes"
>> parameter of recvMessage from call to call (with the help of StateT).
>>
>> I wrote a simple "chat server" to test it. The client turned out okay:
>>
>> main :: IO ()
>> main = do
>> cert<- getCertificate
>> handle<- connectTo "localhost" (PortNumber 1337)
>> conn<- connectClient handle cert
>> _<- forkIO $ forever $ do
>> s<- getLine
>> sendMessage conn $ TestMessage s
>> forever $ flip runStateT B.empty $ do
>> msg<- StateT $ recvMessage conn
>> case msg of
>> TestMessage s ->
>> liftIO $ putStrLn s
>> _ ->
>> liftIO $ hPrintf stderr
>> "Warning: unrecognized message from server: %s\n"
>> (messageTypeName msg)
>>
>> The only glaring problem is that, if the user presses Ctrl+D, the
>> forked (sending) thread dies, but the main (receiving) thread lingers.
>> I'd have to add exception handlers to ensure that when one thread
>> dies, the other thread dies too.
>>
>> However, the server is an abomination (see attachment).
>>
>> Unfortunately, it's not as simple as "spawn one thread per client".
>> We need at least two threads, one to listen for messages from the
>> client, and another to send messages to the client. GHC won't let us
>> simultaneously, in the same thread, wait for input from a connection
>> and wait for an STM transaction to succeed.
>>
>> Another source of complexity is: what if we throw an exception at a
>> thread while it is in the middle of sending a packet? Then we can't
>> shut down the connection properly (i.e. Network.TLS.bye), because the
>> receiver might think the close_notify packet is part of the
>> interrupted packet.
>>
>> Having a thread for each client is good, as it:
>>
>> * Lets us think about each client separately. No need to turn our
>> code inside out or write one big loop that juggles all the clients.
>>
>> * Isolates exceptions. If sendMessage or recvMessage throws an
>> exception, it doesn't bring the whole server down.
>>
>> On the other hand, having multiple threads interact with a single
>> client is hard to think about:
>>
>> * We have to synchronize the threads (e.g. when one dies, kill the other
>> one)
>>
>> * Multiple places where an exception can arise
>>
>> * Can multiple threads interact with the connection handle
>> simultaneously?
>>
>> So why don't I make my connection API handle some of this? Well, I
>> tried. There are so many ways to do it, and I couldn't find a way
>> that simplified usage much. The approach used by Handle and by
>> Network.TLS is to use MVars and IORefs to ensure that, if two threads
>> access the same connection, the connection doesn't become totally
>> corrupt. If I do the same, then I'll have *three* layers of locking
>> under the hood.
>>
>> Worse, the locking done by Handle and Network.TLS doesn't guarantee
>> much. I don't know if it's safe to have one thread sending and
>> another thread receiving. Especially in the case of Network.TLS,
>> where 'recvData' automatically handshakes in some cases, which sends
>> packets. Since I don't know how much thread safety to expect, I can't
>> write networking code and know for sure that it is safe.
>>
>> I'm certainly not protected from interleaved data if multiple threads
>> send on the same handle. For example:
>>
>> import Control.Concurrent
>> import System.IO
>>
>> main :: IO ()
>> main = do
>> hSetBuffering stdout NoBuffering
>> _<- forkIO $ putStrLn "One sentence."
>> putStrLn "Another sentence."
>>
>> produces:
>>
>> AnOonteh esre nsteenntceen.c
>> e.
>>
>> That is, I can't rely on putStrLn being "atomic". To produce
>> intelligible output (without changing the buffering mode), I have to
>> "lock" the output each time I write something. putStrLn doesn't do it
>> for me.
>>
>> === Summary ===
>>
>> In Haskell, sound logic and a great type system lead to elegant,
>> composable code in a variety of domains, such as:
>>
>> * Expression evaluation
>> * Parsing
>> * Concurrent programming (thanks to STM)
>>
>> Asynchronous I/O is tricky. However, Haskell currently does little to
>> alleviate the complexity (at least for me).
>>
>> How can we structure network protocol APIs so that they stack well
>> (e.g. only lock once, rather than locking each layer's connection
>> state)? How can we deal with I/O errors without having to think about
>> them at every turn?
>>
>> For now, how can I structure my application's communication API so
>> it's less messy to use?
>>
>> Thanks,
>> - Joey
>>
>>
>>
>> ______________________________**_________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/**mailman/listinfo/haskell-cafe<http://www.haskell.org/mailman/listinfo/haskell-cafe>
>>
>
>
> ______________________________**_________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/**mailman/listinfo/haskell-cafe<http://www.haskell.org/mailman/listinfo/haskell-cafe>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20120117/672509e2/attachment.htm>
More information about the Haskell-Cafe
mailing list