[GHC] #8134: ghc enters a loop while building 7.6.3 for powerpc64 platform.

GHC ghc-devs at haskell.org
Mon Oct 28 14:26:23 UTC 2013


#8134: ghc enters a loop while building 7.6.3 for powerpc64 platform.
-------------------------------------+-----------------------------
        Reporter:  k0da              |            Owner:
            Type:  bug               |           Status:  new
        Priority:  normal            |        Milestone:  7.6.3
       Component:  Compiler          |          Version:  7.6.3
      Resolution:                    |         Keywords:
Operating System:  Unknown/Multiple  |     Architecture:  powerpc64
 Type of failure:  None/Unknown      |       Difficulty:  Unknown
       Test Case:                    |       Blocked By:
        Blocking:                    |  Related Tickets:
-------------------------------------+-----------------------------

Comment (by gustavold):

 After building 7.6.3 with -debug (bootstrapped with 7.4.2), I was able to
 reproduce this issue running ghc under gdb and get a sane stack trace:


 {{{
 (gdb) bt full
 #0  cas (p=0x13ab01d0 <token_locked>, o=0, n=1) at includes/stg/SMP.h:230
         result = 1
 #1  0x0000000012c91144 in getTokenBatch (cap=0x13aaf980 <MainCapability>) at rts/STM.c:933
 No locals.
 #2  0x0000000012c9121c in getToken (cap=0x13aaf980 <MainCapability>) at rts/STM.c:942
 No locals.
 #3  0x0000000012c912b8 in stmStartTransaction (cap=0x13aaf980 <MainCapability>, outer=0x13715078 <stg_NO_TREC_closure>) at rts/STM.c:961
         t = 0x3f88fe530
 #4  0x0000000012cb8c0c in .stg_atomicallyzh ()
 No symbol table info available.
 #5  0x0000000012c80a68 in StgRun (f=0x0, basereg=0x1ffff2cb9790) at rts/StgCRun.c:81
 No locals.
 }}}

 The issue seems to be that cas() expects a pointer to StgWord (which
 translates to unsigned long), but it is passed a pointer to StgBool (which
 translates to int). The StgBool does not provide enough storage, so cas()
 corrupts memory on 64-bit platforms. Subsequently, getTokenBatch() tries
 to release the lock on token_locked, but overwrites only the first 32
 bits. On big-endian platforms this has no effect, because the value
 written by cas() sits in the last 32 bits of the word. The next time
 getTokenBatch() is called, it loops forever waiting for token_locked to
 be released.

 I can't tell why this didn't show up before, as the code in question
 doesn't seem to have changed recently.

 I changed token_locked to StgWord and it seems to have fixed this issue. I
 was able to get ghc 7.6.3 to build itself successfully on ppc64. Also,
 "make test" didn't show any regressions.

 Thanks a lot to the folks on IRC channel #ghc (rwbarton, thoughtpolice,
 carter, ezyang, leroux, hvr), who walked me through ghc's build system and
 gave me valuable hints on debugging ghc.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8134#comment:12>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler