[GHC] #4: -fext-core -fno-core behaves funny

GHC ghc-devs at haskell.org
Thu Sep 6 12:53:30 UTC 2018


#4: -fext-core -fno-core behaves funny
--------------------------------+--------------------
       Reporter:  josefs        |      Owner:  nobody
           Type:  bug           |     Status:  closed
       Priority:  normal        |  Milestone:
      Component:  Driver        |    Version:  None
     Resolution:  Fixed         |   Keywords:
Type of failure:  None/Unknown  |
--------------------------------+--------------------

Comment (by Ömer Sinan Ağacan <omeragacan@…>):

 In [changeset:"c6fbac6a6a69a2f4be89701b2c386ae53214f9a3/ghc" c6fbac6/ghc]:
 {{{
 #!CommitTicketReference repository="ghc"
 revision="c6fbac6a6a69a2f4be89701b2c386ae53214f9a3"
 Fix a race between GC threads in concurrent scavenging

 While debugging #15285 I realized that free block lists (free_list in
 BlockAlloc.c) get corrupted when multiple scavenge threads allocate and
 release blocks concurrently. Here's a picture of one such race:

     Thread 2 (Thread 32573.32601):
     #0  check_tail
         (bd=0x940d40 <stg_TSO_info>) at rts/sm/BlockAlloc.c:860
     #1  0x0000000000928ef7 in checkFreeListSanity
         () at rts/sm/BlockAlloc.c:896
     #2  0x0000000000928979 in freeGroup
         (p=0x7e998ce02880) at rts/sm/BlockAlloc.c:721
     #3  0x0000000000928a17 in freeChain
         (bd=0x7e998ce02880) at rts/sm/BlockAlloc.c:738
     #4  0x0000000000926911 in freeChain_sync
         (bd=0x7e998ce02880) at rts/sm/GCUtils.c:80
     #5  0x0000000000934720 in scavenge_capability_mut_lists
         (cap=0x1acae80) at rts/sm/Scav.c:1665
     #6  0x000000000092b411 in gcWorkerThread
         (cap=0x1acae80) at rts/sm/GC.c:1157
     #7  0x000000000090be9a in yieldCapability
         (pCap=0x7f9994e69e20, task=0x7e9984000b70, gcAllowed=true) at
 rts/Capability.c:861
     #8  0x0000000000906120 in scheduleYield
         (pcap=0x7f9994e69e50, task=0x7e9984000b70) at rts/Schedule.c:673
     #9  0x0000000000905500 in schedule
         (initialCapability=0x1acae80, task=0x7e9984000b70) at
 rts/Schedule.c:293
     #10 0x0000000000908d4f in scheduleWorker
         (cap=0x1acae80, task=0x7e9984000b70) at rts/Schedule.c:2554
     #11 0x000000000091a30a in workerStart
         (task=0x7e9984000b70) at rts/Task.c:444
     #12 0x00007f99937fa6db in start_thread
         (arg=0x7f9994e6a700) at pthread_create.c:463
     #13 0x000061654d59f88f in clone
         () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

     Thread 1 (Thread 32573.32573):
     #0  checkFreeListSanity
         () at rts/sm/BlockAlloc.c:887
     #1  0x0000000000928979 in freeGroup
         (p=0x7e998d303540) at rts/sm/BlockAlloc.c:721
     #2  0x0000000000926f23 in todo_block_full
         (size=513, ws=0x1aa8ce0) at rts/sm/GCUtils.c:264
     #3  0x00000000009583b9 in alloc_for_copy
         (size=513, gen_no=0) at rts/sm/Evac.c:80
     #4  0x000000000095850d in copy_tag_nolock
         (p=0x7e998c675f28, info=0x421d98 <Main_Large_con_info>,
 src=0x7e998d075d80, size=513,
         gen_no=0, tag=1) at rts/sm/Evac.c:153
     #5  0x0000000000959177 in evacuate
         (p=0x7e998c675f28) at rts/sm/Evac.c:715
     #6  0x0000000000932388 in scavenge_small_bitmap
         (p=0x7e998c675f28, size=1, bitmap=0) at rts/sm/Scav.c:271
     #7  0x0000000000934aaf in scavenge_stack
         (p=0x7e998c675f28, stack_end=0x7e998c676000) at rts/sm/Scav.c:1908
     #8  0x0000000000934295 in scavenge_one
         (p=0x7e998c66e000) at rts/sm/Scav.c:1466
     #9  0x0000000000934662 in scavenge_mutable_list
         (bd=0x7e998d300440, gen=0x1b1d880) at rts/sm/Scav.c:1643
     #10 0x0000000000934700 in scavenge_capability_mut_lists
         (cap=0x1aaa340) at rts/sm/Scav.c:1664
     #11 0x00000000009299b6 in GarbageCollect
         (collect_gen=0, do_heap_census=false, gc_type=2, cap=0x1aaa340,
 idle_cap=0x1b38aa0)
         at rts/sm/GC.c:378
     #12 0x0000000000907a4a in scheduleDoGC
         (pcap=0x7ffdec5b5310, task=0x1b36650, force_major=false) at
 rts/Schedule.c:1798
     #13 0x0000000000905de7 in schedule
         (initialCapability=0x1aaa340, task=0x1b36650) at
 rts/Schedule.c:546
     #14 0x0000000000908bc4 in scheduleWaitThread
         (tso=0x7e998c0067c8, ret=0x0, pcap=0x7ffdec5b5430) at
 rts/Schedule.c:2537
     #15 0x000000000091b5a0 in rts_evalLazyIO
         (cap=0x7ffdec5b5430, p=0x9c11f0, ret=0x0) at rts/RtsAPI.c:530
     #16 0x000000000091ca56 in hs_main
         (argc=1, argv=0x7ffdec5b5628, main_closure=0x9c11f0,
 rts_config=...) at rts/RtsMain.c:72
     #17 0x0000000000421ea0 in main
         ()

 In particular, dbl_link_onto() which is used to add a freed block to a
 doubly-linked free list is not thread safe and corrupts the list when
 called concurrently.

 Note that thread 1 is to blame here as thread 2 is properly taking the
 spinlock. With this patch we now take the spinlock when freeing a todo
 block in GC, avoiding this race.

 Test Plan:
 - Tried slow validate locally: this patch does not introduce new failures.
 - circleci: https://circleci.com/gh/ghc/ghc-diffs/283 The test got killed
   because it took 5 hours but T7919 (which was previously failing on
 circleci)
   passed.

 Reviewers: simonmar, bgamari, erikd

 Reviewed By: simonmar

 Subscribers: rwbarton, carter

 GHC Trac Issues: #15285

 Differential Revision: https://phabricator.haskell.org/D5115
 }}}

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/4#comment:5>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list