[Haskell-cafe] debugging a hanging program: where to start?

Aran Donohue aran.donohue at gmail.com
Thu May 13 15:28:21 EDT 2010


I have an accept-loop:

do (conn, _saddr) <- accept sock
     forkIO $ initializeConnection conn

Which allocates memory iff accept allocates, I suppose. To test the theory,
is there a way I can force an allocation that won't get optimized away?

According to the old print-statement debugging method, it is accept that
causes the problem. The accept is the last thing that happens before the
hang.

I created a test program to hammer on accept.

{-# LANGUAGE OverloadedStrings #-}
module Main where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Monad (forever, liftM)
import Network (listenOn, withSocketsDo, PortID(..))
import Network.BSD (getProtocolNumber)
import Network.Socket (Family(..), SockAddr(..), SocketType(..), Socket(..),
accept, connect, inet_addr, socket, sClose)
import Network.Socket.ByteString
import           Data.ByteString.Char8 ()



import System.Environment (getArgs)
import System.IO (stdout, hFlush)

port = 8080

main = do
    args <- getArgs
    if length args < 1 || (head args /= "client" && head args /= "server")
        then putStrLn "say client or server"
        else withSocketsDo $ case head args of
                                "client" -> client
                                "server" -> server

client = do
    putStrLn "Client mode"
    tcp <- getProtocolNumber "tcp"
    forever $ do sock <- socket AF_INET Stream tcp
                 localhost <- inet_addr "127.0.0.1"
                 putStr "Making a connection..."
                 connect sock (SockAddrInet port localhost)
                 send sock "a"
                 _ <- recv sock 1
                 sClose sock
                 putStrLn " done client connection"


server = do
    putStrLn "Server mode"
    sock <- listenOn (PortNumber port)
    count <- newMVar 0
    forever $ do count' <- modifyMVar count (\c -> return (c+1,c+1))
                 putStr $ (show count') ++ " About to accept..."
                 (conn, _saddr) <- accept sock
                 putStrLn " accepted."
                 hFlush stdout
                 forkIO $ handleServerConnection conn count'

handleServerConnection conn count = do
    putStr $ (show count) ++ " Handling a server connection..."
    rd <- recv conn 1
    send conn "a"
    sClose conn
    putStrLn " done server connection."
    hFlush stdout

While it doesn't seem to fully hang, it does go through mysterious "hiccups"
in which an accept call takes many seconds to return. I'm on Mac OS X Snow
Leopard, ghc 6.12.1.

I'm not quite sure how I ought to proceed. I'm still very open to debugging
tools and techniques I could use to approach the problem!

Aran



On Thu, May 13, 2010 at 12:10 PM, Jason Dagit <dagit at codersbase.com> wrote:

>
>
> On Thu, May 13, 2010 at 5:53 AM, Aran Donohue <aran.donohue at gmail.com>wrote:
>
>> Thanks folks! Forward progress is made...
>>
>> Unfortunately, programs don't seem to write out their threadscope event
>> logs until they terminate, and mine hangs until I kill it, so I can't get at
>> the event log.
>>
>> Tracing has taught me that before the hang-cause, my program splits its
>> time in pthread_cond_wait in two different threads, and select in a
>> third. After the hang, it no longer calls select and one of those
>> pthread_cond_waits  in the other. In the version without -threaded that
>> doesn't hang, it never does any pthread_cond_wait and never misses the
>> select.
>>
>> Now to go figure out what impossible condition it's waiting on, I guess.
>>
>
> The select sounds like the IO manager thread (a thread in the RTS not your
> code).  Is it possible that one of your threads does work but never
> allocates memory?  I've heard in some cases that can lead to starvation.  I
> think the explanation was that thread switching happens on allocation?
>
> Jason
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/haskell-cafe/attachments/20100513/598b1f94/attachment-0001.html


More information about the Haskell-Cafe mailing list