[GHC] #14707: setNumCapabilities can cause threads to get stuck in gcWorkerThread

GHC ghc-devs at haskell.org
Mon Jan 22 22:43:11 UTC 2018


#14707: setNumCapabilities can cause threads to get stuck in gcWorkerThread
-------------------------------------+-------------------------------------
        Reporter:  duog              |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Runtime System    |              Version:  8.5
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------
Description changed by duog:

Old description:

> I have a patch with some instrumentation that proves that sometimes
> threads do not leave `gcWorkerThread` until the following GC.
>
> I suspect it's caused by `idle_caps` being mutated in `scheduleDoGC`
> after the call to `requestSync`. A thread enters `yieldCapability`, sees
> that it is not idle, and so enters `gcWorkerThread`; `idle_caps` is then
> mutated so that the thread ''is'' idle, and its spin locks are not
> touched by the garbage collector.
>
> Potential fixes:
> * Don't look at `idle_caps` in the garbage collector when we're touching
> the spin locks; just do it for all capabilities. I don't ''think'' this
> does any harm.
> * Don't mutate `idle_caps` after the call to `requestSync`; move that
> logic to before the call.

New description:

 I have a patch with some instrumentation (Phab:D4339) that proves that
 sometimes threads do not leave `gcWorkerThread` until the following GC.

 I suspect it's caused by `idle_caps` being mutated in `scheduleDoGC` after
 the call to `requestSync`. A thread enters `yieldCapability`, sees that it
 is not idle, and so enters `gcWorkerThread`; `idle_caps` is then mutated
 so that the thread ''is'' idle, and its spin locks are not touched by the
 garbage collector.
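
 The suspected interleaving can be replayed in a toy, single-threaded C
 model. This is only a sketch of the ordering described here, not RTS code:
 the arrays `idle` and `parked`, the function `replay_suspected_race`, and
 the capability count are made up for illustration, and the real spin-lock
 handshake is reduced to a boolean flag.

```c
#include <assert.h>
#include <stdbool.h>

#define N_CAPS 4  /* hypothetical capability count */

/* Replays the suspected ordering and returns how many workers stay parked.
 * late_cap is the capability that gets marked idle too late. */
int replay_suspected_race(int late_cap)
{
    bool idle[N_CAPS]   = { false };  /* toy stand-in for idle_caps */
    bool parked[N_CAPS] = { false };  /* thread is waiting in the worker loop */
    int stuck = 0;

    assert(late_cap >= 0 && late_cap < N_CAPS);

    /* 1. After the sync request, each thread yields, sees that its
     *    capability is not idle, and parks itself as a GC worker. */
    for (int i = 0; i < N_CAPS; i++)
        parked[i] = !idle[i];

    /* 2. Only now is the idle set mutated. */
    idle[late_cap] = true;

    /* 3. The collector touches the locks of non-idle capabilities only. */
    for (int i = 0; i < N_CAPS; i++)
        if (!idle[i])
            parked[i] = false;

    /* 4. Anything still parked would wait until the next GC. */
    for (int i = 0; i < N_CAPS; i++)
        if (parked[i])
            stuck++;

    return stuck;
}
```

 In this model `replay_suspected_race(2)` returns 1: the thread on
 capability 2 parked itself before the idle set changed, so nothing
 releases it.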

 Potential fixes:
 * Don't look at `idle_caps` in the garbage collector when we're touching
 the spin locks; just do it for all capabilities. I don't ''think'' this
 does any harm.
 * Don't mutate `idle_caps` after the call to `requestSync`; move that
 logic to before the call.
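
 A rough way to compare the two options is a toy C model in which the
 ordering of the `idle_caps` update and the release policy are both
 parameters. Everything here is an assumption made for illustration (the
 enum names, `stuck_workers`, the capability count, and the boolean
 stand-in for the spin locks); it says nothing about whether either change
 is safe in the actual RTS.

```c
#include <assert.h>
#include <stdbool.h>

enum { N_CAPS = 4 };  /* hypothetical capability count */

typedef enum { MARK_AFTER_SYNC, MARK_BEFORE_SYNC } mark_order;
typedef enum { RELEASE_NON_IDLE, RELEASE_ALL } release_policy;

/* Returns how many toy workers are still parked after one collection.
 * late_cap is the capability whose idle flag is set by the leader. */
int stuck_workers(mark_order order, release_policy policy, int late_cap)
{
    bool idle[N_CAPS]   = { false };  /* toy stand-in for idle_caps */
    bool parked[N_CAPS] = { false };  /* thread is waiting in the worker loop */
    int stuck = 0;

    assert(late_cap >= 0 && late_cap < N_CAPS);

    /* Second option: settle the idle set before threads start yielding. */
    if (order == MARK_BEFORE_SYNC)
        idle[late_cap] = true;

    /* Threads yield and park only if their capability looks non-idle. */
    for (int i = 0; i < N_CAPS; i++)
        parked[i] = !idle[i];

    /* Current ordering as described in this ticket: mutate afterwards. */
    if (order == MARK_AFTER_SYNC)
        idle[late_cap] = true;

    /* First option: release every capability, ignoring the idle set. */
    for (int i = 0; i < N_CAPS; i++)
        if (policy == RELEASE_ALL || !idle[i])
            parked[i] = false;

    for (int i = 0; i < N_CAPS; i++)
        if (parked[i])
            stuck++;

    return stuck;
}
```

 In the model, only the combination `MARK_AFTER_SYNC` with
 `RELEASE_NON_IDLE` leaves a worker parked; changing either parameter
 brings the count to zero.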

 Of course, maybe I'm misunderstanding and this isn't a bug?

--

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14707#comment:2>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

