[GHC] #13751: Runtime crash with <<loop>> after concurrent stressing of STM computations

GHC ghc-devs at haskell.org
Thu Jun 8 07:38:17 UTC 2017


#13751: Runtime crash with <<loop>> after concurrent stressing of STM computations
-------------------------------------+-------------------------------------
        Reporter:  literon           |                Owner:  simonmar
            Type:  bug               |               Status:  new
        Priority:  highest           |            Milestone:  8.2.1
       Component:  Runtime System    |              Version:  8.0.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  Runtime crash     |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:  10414             |  Differential Rev(s):  Phab:D3630
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by Simon Marlow <marlowsd@…>):

 In [changeset:"598472908ebb08f6811b892f285490554c290ae3/ghc" 5984729/ghc]:
 {{{
 #!CommitTicketReference repository="ghc"
 revision="598472908ebb08f6811b892f285490554c290ae3"
 Fix a lost-wakeup bug in BLACKHOLE handling (#13751)

 Summary:
 The problem occurred when
 * Threads A & B evaluate the same thunk
 * Thread A context-switches, so the thunk gets blackholed
 * Thread C enters the blackhole, creates a BLOCKING_QUEUE attached to
   the blackhole and thread A's `tso->bq` queue
 * Thread B updates the blackhole with a value, overwriting the
   BLOCKING_QUEUE
 * A GC runs, replacing A's update frame with stg_enter_checkbh
 * An exception is thrown in A, and unwinding discards the
   stg_enter_checkbh frame without running it

 Now we have C blocked on A's tso->bq queue, but we forgot to check the
 queue because the stg_enter_checkbh frame has been thrown away by the
 exception.

 The solution and alternative designs are discussed in
 Note [upd-black-hole].

 This also exposed a bug in the interpreter, whereby we were sometimes
 context-switching without calling `threadPaused()`.  I've fixed this
 and added some Notes.

 Test Plan:
 * `cd testsuite/tests/concurrent && make slow`
 * validate

 Reviewers: niteria, bgamari, austin, erikd

 Reviewed By: erikd

 Subscribers: rwbarton, thomie

 GHC Trac Issues: #13751

 Differential Revision: https://phabricator.haskell.org/D3630
 }}}
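
The interleaving described in the commit message can be replayed with a
small toy model. This is only a sketch for following the six steps: it is
not RTS code, and the `Thread` and `Closure` classes, `wake_waiters`, and
`run_scenario` are all hypothetical names invented for illustration. Plain
Python lists stand in for the stack, the BLOCKING_QUEUE, and `tso->bq`. The
one assumption taken from the text is that the stg_enter_checkbh frame,
when it actually runs, checks A's queue and wakes the waiters.

```python
from dataclasses import dataclass, field


@dataclass
class Thread:
    name: str
    blocked: bool = False
    stack: list = field(default_factory=list)  # frame names, innermost last
    bq: list = field(default_factory=list)     # stands in for tso->bq


@dataclass
class Closure:
    kind: str = "THUNK"
    payload: object = None


def wake_waiters(owner):
    # Hypothetical helper: wake every thread queued on `owner`, then
    # empty the owner's list of blocking queues.
    for queue in owner.bq:
        for waiter in queue:
            waiter.blocked = False
    owner.bq.clear()


def run_scenario(exception_skips_frame):
    """Replay the six steps; return True if thread C is still blocked."""
    a, b, c = Thread("A"), Thread("B"), Thread("C")
    thunk = Closure()

    # 1. A and B both start evaluating the thunk and push update frames.
    a.stack.append("update_frame")
    b.stack.append("update_frame")

    # 2. A context-switches, so the thunk is blackholed with A as owner.
    thunk.kind, thunk.payload = "BLACKHOLE", a

    # 3. C enters the blackhole. A blocking queue is attached both to
    #    the blackhole and to A's bq list.
    blocking_queue = [c]
    thunk.payload = blocking_queue
    a.bq.append(blocking_queue)
    c.blocked = True

    # 4. B updates the closure with a value. The closure no longer
    #    points at the blocking queue; only A's bq list still does.
    b.stack.pop()
    thunk.kind, thunk.payload = "VALUE", 42  # arbitrary stand-in value

    # 5. GC sees the thunk is already updated and swaps A's update
    #    frame for stg_enter_checkbh.
    a.stack[-1] = "stg_enter_checkbh"

    # 6. A's frame is removed. If an exception discards it, nothing
    #    looks at A's bq list. If the frame runs, the queue is checked.
    frame = a.stack.pop()
    if frame == "stg_enter_checkbh" and not exception_skips_frame:
        wake_waiters(a)

    return c.blocked


if __name__ == "__main__":
    print("frame discarded, C still blocked:", run_scenario(True))
    print("frame executed,  C still blocked:", run_scenario(False))
```

In this model the only remaining reference to C after step 4 is A's queue
list, so whichever code removes A's frame is the last chance to wake C.
That is why discarding the frame during exception handling loses the
wakeup.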

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/13751#comment:8>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

