[GHC] #12038: Shutdown interacts badly with requestSync()

GHC ghc-devs at haskell.org
Tue Jan 10 19:21:41 UTC 2017


#12038: Shutdown interacts badly with requestSync()
-------------------------------------+-------------------------------------
        Reporter:  simonmar          |                Owner:
            Type:  bug               |               Status:  patch
        Priority:  normal            |            Milestone:  8.4.1
       Component:  Runtime System    |              Version:  7.10.3
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):  Phab:D2926
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by Ben Gamari <ben@…>):

 In [changeset:"6de7613604216f65fae92d8066a078bf9cd3c088/ghc"
 6de76136/ghc]:
 {{{
 #!CommitTicketReference repository="ghc"
 revision="6de7613604216f65fae92d8066a078bf9cd3c088"
 event manager: Don't worry if attempt to wake dead manager fails

 This fixes #12038, where the TimerManager would attempt to wake up a
 manager that was already dead, resulting in setnumcapabilities001
 occasionally failing during shutdown with unexpected output on stderr.

 I'm frankly still not entirely confident in this solution but perhaps it
 will help to get a few more eyes on this.

 My hypothesis is that the TimerManager is racing:

   thread                   TimerManager worker
   -------                  --------------------
   requests that thread
   manager shuts down

                            begins to clean up,
                            closing eventfd

   calls wakeManager,
   which tries to write
   to closed eventfd

 To prevent this, `wakeManager` would need to synchronize with the
 TimerManager worker to ensure that the worker doesn't clean up the
 `Control` while another thread is trying to send a wakeup. However, this
 would add a bit of overhead to every timer interaction, which feels
 rather costly for what is really a problem only at shutdown. Moreover,
 it seems that the event manager (e.g. `GHC.Event.Manager`) is also
 afflicted by a similar race.

 This patch instead simply tries to catch the write failure after it has
 happened and silence it in the case that the fd has vanished. It feels
 rather hacky but it seems to work.

 Test Plan: Run `setnumcapabilities001` repeatedly

 Reviewers: austin, hvr, simonmar

 Reviewed By: simonmar

 Subscribers: thomie

 Differential Revision: https://phabricator.haskell.org/D2926

 GHC Trac Issues: #12038
 }}}
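
The "catch the write failure and silence it" approach described in the commit message can be sketched roughly as follows. This is only an illustration, not the code from the patch: the `Control` record, its `controlIsDead` flag, the `controlWrite` action, and the `isBadFd` helper are hypothetical stand-ins, and the failing write is simulated with `errnoToIOError` rather than a real eventfd. It assumes the worker marks the control structure as dead before closing its file descriptors.

```haskell
import Control.Exception (throwIO, try)
import Control.Monad (unless)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import Foreign.C.Error (Errno (..), eBADF, errnoToIOError)
import GHC.IO.Exception (IOException (..))

-- Hypothetical stand-in for the manager's control structure.
data Control = Control
  { controlIsDead :: IORef Bool -- assumed: set once the worker has cleaned up
  , controlWrite  :: IO ()      -- performs the wakeup write; may throw
  }

-- True when the exception carries errno EBADF (write to a closed fd).
isBadFd :: IOException -> Bool
isBadFd e = case eBADF of
  Errno n -> ioe_errno e == Just n

-- Attempt the wakeup write. A failure is swallowed only when the manager
-- is already marked dead and the error is EBADF; anything else is rethrown.
sendWakeup :: Control -> IO ()
sendWakeup c = do
  r <- try (controlWrite c) :: IO (Either IOException ())
  case r of
    Right () -> return ()
    Left e -> do
      dead <- readIORef (controlIsDead c)
      unless (dead && isBadFd e) (throwIO e)

main :: IO ()
main = do
  deadFlag <- newIORef True
  -- Simulated write to a closed fd.
  let failing errno = throwIO (errnoToIOError "write" errno Nothing Nothing)
  -- Dead manager, EBADF: the failure is silenced.
  sendWakeup (Control deadFlag (failing eBADF))
  putStrLn "dead manager: EBADF ignored"
  -- Live manager, EBADF: the failure still propagates.
  writeIORef deadFlag False
  r <- try (sendWakeup (Control deadFlag (failing eBADF)))
         :: IO (Either IOException ())
  putStrLn (either (const "live manager: error propagated")
                   (const "live manager: unexpected success") r)
```

The check is deliberately narrow: it avoids adding synchronization to every timer interaction, but it only hides the one failure mode expected at shutdown, so a bad descriptor on a live manager is still reported.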

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/12038#comment:13>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

