Help debugging a deadlock in salvia on GHC 6.10 i386
Corey O'Connor
coreyoconnor at gmail.com
Sat Jun 6 17:09:18 EDT 2009
I'm working on trimming down the test code and filing a real bug. I'm
going to list out what I know right now, and if anything jumps out,
please let me know. Thanks!
I'm running a web server built using salvia [1] on GHC 6.10 [2]. I've
trimmed the code down enough that there is no obvious source of a
deadlock in either salvia or the rest of the web server. I also don't
have any specific conditions that reproduce the issue: after some time,
anywhere from a few minutes to a few hours, the server deadlocks. No
particular request or number of requests seems to trigger it.
1) Salvia accepts connections on the main thread, then forkIOs a new
thread to actually handle the request. The new thread uses Handle-based
IO. (A rough sketch of this pattern is included after the backtraces
below.)
2) As I understand it, there are issues with forkProcess and Handle-based
IO. Although this is a web server, I'm avoiding "daemonize" code that
relies on forkProcess, so no forkProcess is occurring that I know of.
3) The thread state summary printed by calling printAllThreads() from GDB is:
all threads:
threads on capability 0:
other threads:
thread 2 @ 0xb7d66000 is blocked on an MVar @ 0xb7d670b4
thread 3 @ 0xb7d74214 is blocked on an MVar @ 0xb7da88f0
4) The thread states according to "thread apply all bt" from GDB are:
Thread 4 (Thread 0xb7cffb90 (LWP 30891)):
#0 0xb8080416 in __kernel_vsyscall ()
#1 0xb7fd0075 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib/tls/i686/cmov/libpthread.so.0
#2 0x083f4320 in waitCondition (pCond=0x9a7cc1c, pMut=0x9a7cc4c) at
posix/OSThreads.c:65
#3 0x0840de64 in yieldCapability (pCap=0xb7cff36c, task=0x9a7cc00) at
Capability.c:506
#4 0x083eb292 in schedule (initialCapability=0x8565aa0,
task=0x9a7cc00) at Schedule.c:293
#5 0x083ed5ff in workerStart (task=0x9a7cc00) at Schedule.c:1923
#6 0xb7fcc50f in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#7 0xb7f49a0e in clone () from /lib/tls/i686/cmov/libc.so.6
Thread 3 (Thread 0xb74feb90 (LWP 30892)):
#0 0xb8080416 in __kernel_vsyscall ()
#1 0xb7fd0075 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib/tls/i686/cmov/libpthread.so.0
#2 0x083f4320 in waitCondition (pCond=0x9a7ef3c, pMut=0x9a7ef6c) at
posix/OSThreads.c:65
#3 0x0840de64 in yieldCapability (pCap=0xb74fe36c, task=0x9a7ef20) at
Capability.c:506
#4 0x083eb292 in schedule (initialCapability=0x8565aa0,
task=0x9a7ef20) at Schedule.c:293
#5 0x083ed5ff in workerStart (task=0x9a7ef20) at Schedule.c:1923
#6 0xb7fcc50f in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#7 0xb7f49a0e in clone () from /lib/tls/i686/cmov/libc.so.6
Thread 2 (Thread 0xb6cfdb90 (LWP 30916)):
#0 0xb8080416 in __kernel_vsyscall ()
#1 0xb7fd0075 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib/tls/i686/cmov/libpthread.so.0
#2 0x083f4320 in waitCondition (pCond=0x9a7e12c, pMut=0x9a7e15c) at
posix/OSThreads.c:65
#3 0x0840de64 in yieldCapability (pCap=0xb6cfd36c, task=0x9a7e110) at
Capability.c:506
#4 0x083eb292 in schedule (initialCapability=0x8565aa0,
task=0x9a7e110) at Schedule.c:293
#5 0x083ed5ff in workerStart (task=0x9a7e110) at Schedule.c:1923
#6 0xb7fcc50f in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#7 0xb7f49a0e in clone () from /lib/tls/i686/cmov/libc.so.6
Thread 1 (Thread 0xb7e666b0 (LWP 30890)):
#0 0xb8080416 in __kernel_vsyscall ()
#1 0xb7fd0075 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib/tls/i686/cmov/libpthread.so.0
#2 0x083f4320 in waitCondition (pCond=0x9a7cb3c, pMut=0x9a7cb6c) at
posix/OSThreads.c:65
#3 0x0840de64 in yieldCapability (pCap=0xbfa822ac, task=0x9a7cb20) at
Capability.c:506
#4 0x083eb292 in schedule (initialCapability=0x8565aa0,
task=0x9a7cb20) at Schedule.c:293
#5 0x083ed463 in scheduleWaitThread (tso=0xb7d80800, ret=0x0,
cap=0x8565aa0) at Schedule.c:1895
#6 0x083e851a in rts_evalLazyIO (cap=0x8565aa0, p=0x8489478, ret=0x0)
at RtsAPI.c:517
#7 0x083e79d5 in real_main () at Main.c:111
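For reference, here is roughly the pattern described in (1). This is only
a minimal sketch of an accept/forkIO loop with Handle-based IO, not
salvia's actual code; the port number and handleRequest are placeholders.

module Main where

import Control.Concurrent (forkIO)
import Control.Exception (finally)
import Control.Monad (forever)
import Network (PortID (..), accept, listenOn, withSocketsDo)
import System.IO

main :: IO ()
main = withSocketsDo $ do
    sock <- listenOn (PortNumber 8080)      -- placeholder port
    forever $ do
        -- accept on the main thread ...
        (h, _host, _port) <- accept sock
        -- ... then fork a new thread to handle the request over the Handle
        forkIO (handleRequest h `finally` hClose h)

-- Placeholder per-request handler: the real server parses the request
-- and writes the response over the same Handle.
handleRequest :: Handle -> IO ()
handleRequest h = do
    hSetBuffering h LineBuffering
    _requestLine <- hGetLine h
    hPutStr h "HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nok\n"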
Anybody think of anything so far?
Cheers,
Corey O'Connor