[GHC] #7993: ghc 7.6 (not 7.4) sometimes hangs at child process exit on s390x
GHC
ghc-devs at haskell.org
Mon Jun 17 23:32:42 CEST 2013
#7993: ghc 7.6 (not 7.4) sometimes hangs at child process exit on s390x
---------------------+------------------------------------------------------
Reporter: cjwatson | Owner:
Type: bug | Status: new
Priority: normal | Component: Runtime System
Version: 7.6.3 | Keywords:
Os: Linux | Architecture: Other
Failure: Other | Blockedby:
Blocking: | Related:
---------------------+------------------------------------------------------
On Debian's s390x architecture (64-bit S/390, Linux kernel), builds of
several packages hang with GHC 7.6 where they did not hang with GHC 7.4.
In particular, ghc itself hangs during its own build when bootstrapping
with 7.6. This is quite easy to reproduce on affected systems, although
it doesn't hang in exactly the same place every time. It appears that the
runtime sometimes deadlocks when a subprocess exits; the strace looks like
this:
{{{
7523 exit_group(0) = ?
6680 <... futex resumed> ) = ? ERESTARTSYS (To be restarted)
6680 --- SIGCHLD (Child exited) @ 0 (0) ---
6680 futex(0x84fa86ac, FUTEX_WAIT_PRIVATE, 1143, NULL) = ? ERESTARTSYS (To be restarted)
6680 --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
6680 sigreturn() = ? (mask now [])
6680 futex(0x84fa86ac, FUTEX_WAIT_PRIVATE, 1143, NULL) = ? ERESTARTSYS (To be restarted)
6680 --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
6680 sigreturn() = ? (mask now [])
6680 futex(0x84fa86ac, FUTEX_WAIT_PRIVATE, 1143, NULL) = ? ERESTARTSYS (To be restarted)
[repeats forever]
}}}
ghc spawns enough subprocesses (gcc etc.) that it's essentially bound to
hit this sooner or later. I suspect perhaps a lack of signal-safety
somewhere - at an extremely wild guess, perhaps the type of an important
variable written in a signal handler happens to exceed the size of
sig_atomic_t on s390x and not elsewhere - but I haven't yet been able to
track this down in the time available to me.
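One cheap way to probe the sig_atomic_t guess is to print a few type sizes on the affected machine and on an unaffected one. The sketch below only reports sizes as seen through GHC's FFI types. It does not identify which runtime variable, if any, is at fault, and the set of types compared is an illustrative assumption.

```haskell
import Foreign.C.Types (CInt, CLong, CSigAtomic)
import Foreign.Ptr (Ptr)
import Foreign.Storable (sizeOf)

-- Sizes in bytes of a few C types as seen by GHC's FFI on this platform.
-- Which types are worth comparing is an assumption made for illustration.
typeSizes :: [(String, Int)]
typeSizes =
  [ ("sig_atomic_t", sizeOf (undefined :: CSigAtomic))
  , ("int",          sizeOf (undefined :: CInt))
  , ("long",         sizeOf (undefined :: CLong))
  , ("void *",       sizeOf (undefined :: Ptr ()))
  ]

main :: IO ()
main = mapM_ report typeSizes
  where
    report (name, size) = putStrLn (name ++ ": " ++ show size ++ " bytes")
```

If sig_atomic_t turns out narrower than long or a pointer on s390x, that would be consistent with the guess but would not confirm it.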
If you don't immediately recognise this as something obvious, then perhaps
somebody more fluent in Haskell than I would be good enough to suggest
test code that exercises this and is somewhat simpler than "build ghc"?
If my analysis is at all close to the mark, then something that sits in a
loop forking and reaping a trivial child process on each iteration should
be enough to reproduce this. On the assumption that most people who are
not Debian developers don't have convenient access to S/390 machines
(Debian developers can use zelenka.debian.org), I'd be happy to try things
out.
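A minimal sketch of such a fork-and-reap loop, assuming the `process` package is available and that a `true` executable is on the PATH. The iteration count and progress interval are arbitrary choices, and nothing here is confirmed to trigger the hang.

```haskell
import Control.Monad (forM_, when)
import System.Exit (ExitCode (..))
import System.IO (hFlush, stdout)
import System.Process (rawSystem)

-- Arbitrary number of children to spawn; raise it if no hang shows up.
iterations :: Int
iterations = 100000

main :: IO ()
main = do
  forM_ [1 .. iterations] $ \i -> do
    -- Spawn a trivial child and block until it has been reaped.
    code <- rawSystem "true" []
    case code of
      ExitSuccess   -> return ()
      ExitFailure n ->
        putStrLn ("iteration " ++ show i ++ ": child exited with status " ++ show n)
    -- Periodic progress output makes it obvious when the loop stops advancing.
    when (i `mod` 1000 == 0) $ do
      putStrLn ("completed " ++ show i ++ " iterations")
      hFlush stdout
  putStrLn "finished without hanging"
```

Since the futex waits in the trace suggest the threaded runtime is involved, it is probably worth building this both with and without `-threaded` (for example `ghc -threaded Loop.hs`, where `Loop.hs` is a hypothetical file name) and comparing the results.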
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/7993>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler