[GHC] #12070: SMP primitives broken on power(pc)

GHC ghc-devs at haskell.org
Mon May 16 09:08:52 UTC 2016


#12070: SMP primitives broken on power(pc)
-------------------------------------+---------------------------------
        Reporter:  hvr               |                Owner:  trommler
            Type:  bug               |               Status:  new
        Priority:  highest           |            Milestone:  8.0.1
       Component:  Runtime System    |              Version:  8.0.1-rc4
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:  powerpc
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+---------------------------------
Description changed by hvr:

@@ -24,0 +24,1 @@
+ }
@@ -29,1 +30,7 @@
- (including in `ghc -j`).
+ (including in `ghc --make -j`) such as for instance
+
+ {{{
+ internal error: END_TSO_QUEUE object entered!
+ (GHC version 8.0.0.20160421 for powerpc64_unknown_linux)
+ }}}
+
@@ -33,3 +40,30 @@
- portable than inline-asm. I've been testing the patch already and it seems
- to have made all issues I experienced so far disappear, as well as fixing
- the `concprog01` test which was also failing infrequently.
+ portable than inline-asm. This would result in e.g.
+
+
+ {{{#!c
+ StgWord
+ cas(StgVolatilePtr p, StgWord o, StgWord n)
+ {
+     return __sync_val_compare_and_swap (p, o, n);
+ }
+ }}}
+
+ which then gets compiled as
+
+ {{{#!asm
+ 000000000000004c <.cas>:
+   4c:   7c 00 04 ac     sync
+   50:   7d 20 18 a8     ldarx   r9,0,r3
+   54:   7c 29 20 00     cmpd    r9,r4
+   58:   40 c2 00 0c     bne-    64 <.cas+0x18>
+   5c:   7c a0 19 ad     stdcx.  r5,0,r3
+   60:   40 c2 ff f0     bne-    50 <.cas+0x4>
+   64:   4c 00 01 2c     isync
+   68:   7d 23 4b 78     mr      r3,r9
+   6c:   4e 80 00 20     blr
+ }}}
+
+
+ I've been testing the patch already and it seems to have made all issues I
+ experienced so far disappear, as well as fixing the `concprog01` test
+ which was also failing infrequently.

New description:

 I originally noticed this when working on the AIX port (32-bit powerpc),
 and recently saw it on Linux/powerpc64 as well, which led to talking to
 Peter Trommler, who already had a suspicion:

 Here is, for example, the CAS definition (in `<stg/SMP.h>`):

 {{{#!c
 StgWord
 cas(StgVolatilePtr p, StgWord o, StgWord n)
 {
     StgWord result;
     __asm__ __volatile__ (
         "1:     ldarx     %0, 0, %3\n"
         "       cmpd      %0, %1\n"
         "       bne       2f\n"
         "       stdcx.    %2, 0, %3\n"
         "       bne-      1b\n"
         "2:"
         :"=&r" (result)
         :"r" (o), "r" (n), "r" (p)
         :"cc", "memory"
     );
     return result;
 }
 }}}

 The important detail is the lack of any barrier instructions, such as
 `isync` at the end. This results in infrequent heap corruption, which in
 turn results in all sorts of infrequent and hard-to-track-down runtime
 crashes (including in `ghc --make -j`), such as for instance

 {{{
 internal error: END_TSO_QUEUE object entered!
 (GHC version 8.0.0.20160421 for powerpc64_unknown_linux)
 }}}
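
 To illustrate why the missing barrier matters, here is a minimal sketch of
 a try-lock built on a single CAS, written with C11 `<stdatomic.h>`. It is
 not taken from the GHC RTS; all `demo_*` names and the two variables are
 hypothetical. The acquire ordering on the CAS plays roughly the role of
 the trailing `isync` on POWER: it keeps accesses in the critical section
 from being reordered before the lock is taken. A CAS with relaxed
 ordering, like the barrier-free inline asm above, gives no such guarantee.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical example state; none of these names come from the GHC RTS. */
static atomic_uint   demo_lock_word;  /* 0 = free, 1 = held */
static unsigned long demo_counter;    /* plain data protected by the lock */

/* Try to take the lock with one CAS.
 * memory_order_acquire on success prevents the critical section's loads
 * and stores from moving above the CAS. With memory_order_relaxed here,
 * the hardware would be free to perform them before the lock is held. */
bool demo_try_lock(void)
{
    unsigned expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &demo_lock_word, &expected, 1u,
        memory_order_acquire, memory_order_relaxed);
}

/* Release the lock. memory_order_release prevents the critical section's
 * accesses from moving below the store that frees the lock. */
void demo_unlock(void)
{
    atomic_store_explicit(&demo_lock_word, 0u, memory_order_release);
}

/* Increment the protected counter; returns false if the lock was busy. */
bool demo_locked_increment(void)
{
    if (!demo_try_lock())
        return false;
    demo_counter++;
    demo_unlock();
    return true;
}

/* Read the counter; only meaningful when no other thread holds the lock. */
unsigned long demo_read_counter(void)
{
    return demo_counter;
}
```

 With correct ordering, two threads calling `demo_locked_increment()` never
 lose an update. Without it, a weakly ordered machine such as POWER may let
 one thread read `demo_counter` before its CAS has taken effect, which is
 the kind of rare corruption described above.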


 Peter already has a patch in the works which simply replaces the atomic
 powerpc primitives with `__sync_*` intrinsics, which turn out to be more
 portable than inline asm. This would result in e.g.


 {{{#!c
 StgWord
 cas(StgVolatilePtr p, StgWord o, StgWord n)
 {
     return __sync_val_compare_and_swap (p, o, n);
 }
 }}}

 which then gets compiled as

 {{{#!asm
 000000000000004c <.cas>:
   4c:   7c 00 04 ac     sync
   50:   7d 20 18 a8     ldarx   r9,0,r3
   54:   7c 29 20 00     cmpd    r9,r4
   58:   40 c2 00 0c     bne-    64 <.cas+0x18>
   5c:   7c a0 19 ad     stdcx.  r5,0,r3
   60:   40 c2 ff f0     bne-    50 <.cas+0x4>
   64:   4c 00 01 2c     isync
   68:   7d 23 4b 78     mr      r3,r9
   6c:   4e 80 00 20     blr
 }}}
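
 As a small usage sketch of the intrinsic (assuming GCC or Clang; the
 helper name `demo_atomic_add` is hypothetical and not part of the patch),
 the following builds an atomic add from a CAS retry loop.
 `__sync_val_compare_and_swap` returns the value that was in memory before
 the operation, so the swap succeeded exactly when the returned value
 equals the expected one. GCC documents the `__sync_*` builtins as full
 barriers, which matches the `sync` before and `isync` after the
 `ldarx`/`stdcx.` loop in the disassembly above.

```c
#include <stdint.h>

/* Hypothetical helper: atomically add delta to *p using a CAS retry loop.
 * Returns the value *p held immediately before the addition. */
uintptr_t demo_atomic_add(volatile uintptr_t *p, uintptr_t delta)
{
    uintptr_t old, seen;
    do {
        old  = *p;  /* snapshot the current value */
        /* Store old + delta only if *p still equals old; 'seen' is the
         * value actually found in memory. */
        seen = __sync_val_compare_and_swap(p, old, old + delta);
    } while (seen != old);  /* another thread changed *p: retry */
    return old;
}
```

 The loop has the same shape as the hardware sequence: load, compare,
 conditional store, and retry on failure.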


 I've been testing the patch already and it seems to have made all issues I
 experienced so far disappear, as well as fixing the `concprog01` test
 which was also failing infrequently.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/12070#comment:2>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

