[Haskell-cafe] debugging a hanging program: where to start?
Aran Donohue
aran.donohue at gmail.com
Thu May 13 15:28:21 EDT 2010
I have an accept-loop:
do (conn, _saddr) <- accept sock
forkIO $ initializeConnection conn
Which allocates memory iff accept allocates, I suppose. To test the theory,
is there a way I can force an allocation that won't get optimized away?
According to the old print-statement debugging method, it is accept that
causes the problem. The accept is the last thing that happens before the
hang.
I created a test program to hammer on accept.
{-# LANGUAGE OverloadedStrings #-}
module Main where
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Monad (forever, liftM)
import Network (listenOn, withSocketsDo, PortID(..))
import Network.BSD (getProtocolNumber)
import Network.Socket (Family(..), SockAddr(..), SocketType(..), Socket(..),
accept, connect, inet_addr, socket, sClose)
import Network.Socket.ByteString
import Data.ByteString.Char8 ()
import System.Environment (getArgs)
import System.IO (stdout, hFlush)
port = 8080
main = do
args <- getArgs
if length args < 1 || (head args /= "client" && head args /= "server")
then putStrLn "say client or server"
else withSocketsDo $ case head args of
"client" -> client
"server" -> server
client = do
putStrLn "Client mode"
tcp <- getProtocolNumber "tcp"
forever $ do sock <- socket AF_INET Stream tcp
localhost <- inet_addr "127.0.0.1"
putStr "Making a connection..."
connect sock (SockAddrInet port localhost)
send sock "a"
_ <- recv sock 1
sClose sock
putStrLn " done client connection"
server = do
putStrLn "Server mode"
sock <- listenOn (PortNumber port)
count <- newMVar 0
forever $ do count' <- modifyMVar count (\c -> return (c+1,c+1))
putStr $ (show count') ++ " About to accept..."
(conn, _saddr) <- accept sock
putStrLn " accepted."
hFlush stdout
forkIO $ handleServerConnection conn count'
handleServerConnection conn count = do
putStr $ (show count) ++ " Handling a server connection..."
rd <- recv conn 1
send conn "a"
sClose conn
putStrLn " done server connection."
hFlush stdout
While it doesn't seem to fully hang, it does go through mysterious "hiccups"
in which an accept call takes many seconds to return. I'm on Mac OS X Snow
Leopard, ghc 6.12.1.
I'm not quite sure how I ought to proceed. I'm still very open to debugging
tools and techniques I could use to approach the problem!
Aran
On Thu, May 13, 2010 at 12:10 PM, Jason Dagit <dagit at codersbase.com> wrote:
>
>
> On Thu, May 13, 2010 at 5:53 AM, Aran Donohue <aran.donohue at gmail.com>wrote:
>
>> Thanks folks! Forward progress is made...
>>
>> Unfortunately, programs don't seem to write out their threadscope event
>> logs until they terminate, and mine hangs until I kill it, so I can't get at
>> the event log.
>>
>> Tracing has taught me that before the hang-cause, my program splits its
>> time in pthread_cond_wait in two different threads, and select in a
>> third. After the hang, it no longer calls select and one of those
>> pthread_cond_waits in the other. In the version without -threaded that
>> doesn't hang, it never does any pthread_cond_wait and never misses the
>> select.
>>
>> Now to go figure out what impossible condition it's waiting on, I guess.
>>
>
> The select sounds like the IO manager thread (a thread in the RTS not your
> code). Is it possible that one of your threads does work but never
> allocates memory? I've heard in some cases that can lead to starvation. I
> think the explanation was that thread switching happens on allocation?
>
> Jason
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/haskell-cafe/attachments/20100513/598b1f94/attachment-0001.html
More information about the Haskell-Cafe
mailing list