[GHC] #14538: forkprocess01 fails occasionally with multiple ACQUIRE_LOCK panic

GHC ghc-devs at haskell.org
Fri Mar 2 09:32:48 UTC 2018


#14538: forkprocess01 fails occasionally with multiple ACQUIRE_LOCK panic
-------------------------------------+-------------------------------------
        Reporter:  bgamari           |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  highest           |            Milestone:
       Component:  Runtime System    |              Version:  8.3
      Resolution:                    |             Keywords:
Operating System:  MacOS X           |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by osa1):

 Here's a code path that may be causing this:

 - rts/Schedule.c `forkProcess()` (called by the library) acquires
 `all_tasks_mutex` (in line 1987)

 - `forkProcess()` calls `fork()`

 - In the parent process (where all locks are still held), it releases a
 few locks (but not `all_tasks_mutex`) and calls `releaseCapability_` for
 all capabilities.

 - Consider rts/Capability.c `releaseCapability_()` when all of these
 conditions hold:

   1. `cap->n_returning_tasks == 0`
   2. There is not a pending sync
   3. Next thread in the run queue is not a bound one
   4. The capability has spare workers (`cap->spare_workers` is not `NULL`)
   5. The capability's run queue is not empty (`cap->n_run_queue != 0`) and
 we're not shutting down (`sched_state != SCHED_SHUTTING_DOWN`)

 When all of these hold, `releaseCapability_()` calls `startWorkerTask()`
 (rts/Task.c), which in turn calls `newTask()`, which tries to take
 `all_tasks_mutex` while the same thread still holds it from
 `forkProcess()`, causing this bug.

 Btw, if I'm reading this correctly there is at least one more bug. The
 fork(2) man page says the state of mutexes is also replicated in the
 child process, so `all_tasks_mutex` will still be held in the child
 process. However, in the "child" branch of `forkProcess()` we initialize
 `all_tasks_mutex` without releasing it, and the `pthread_mutex_init` man
 page says "Attempting to initialize an already initialized mutex results
 in undefined behavior".
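
 The claim that the child inherits the locked mutex can be checked with a
 small standalone program. This is a sketch only: `demo_mutex` and
 `child_sees_locked_mutex()` are hypothetical names, not RTS code. The
 parent locks a mutex, forks, and the child probes it with
 `pthread_mutex_trylock()`.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for all_tasks_mutex. */
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Returns 1 if the child found the mutex locked, 0 if the child could
 * take it, and -1 on any other failure. */
int child_sees_locked_mutex(void)
{
    if (pthread_mutex_lock(&demo_mutex) != 0) {
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        pthread_mutex_unlock(&demo_mutex);
        return -1;
    }

    if (pid == 0) {
        /* Child: the mutex memory was copied in its locked state, so
         * trylock is expected to fail with EBUSY. */
        int rc = pthread_mutex_trylock(&demo_mutex);
        _exit(rc == EBUSY ? 1 : 0);
    }

    /* Parent: release our copy of the lock and collect the child's
     * verdict from its exit status. */
    assert(pid > 0);
    pthread_mutex_unlock(&demo_mutex);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status);
}
```

 A return value of 1 means the child observed the mutex as locked, which
 is the situation the "child" branch is in when it re-initializes the
 mutex.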

 So far I've run this test more than 1500 times on ghc-mini and it passed
 every time. I'll try to reproduce locally based on the information above.

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14538#comment:8>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

