[GHC] #12070: SMP primitives broken on power(pc)
GHC
ghc-devs at haskell.org
Mon May 16 09:08:52 UTC 2016
#12070: SMP primitives broken on power(pc)
-------------------------------------+---------------------------------
Reporter: hvr | Owner: trommler
Type: bug | Status: new
Priority: highest | Milestone: 8.0.1
Component: Runtime System | Version: 8.0.1-rc4
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: powerpc
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+---------------------------------
Description changed by hvr:
@@ -24,0 +24,1 @@
+ }
@@ -29,1 +30,7 @@
- (including in `ghc -j`).
+ (including in `ghc --make -j`) such as for instance
+
+ {{{
+ internal error: END_TSO_QUEUE object entered!
+ (GHC version 8.0.0.20160421 for powerpc64_unknown_linux)
+ }}}
+
@@ -33,3 +40,30 @@
- portable than inline-asm. I've been testing the patch already and it seems
- to have made all issues I experienced so far disappear, as well as fixing
- the `concprog01` test which was also failing infrequently.
+ portable than inline-asm. This would result in e.g.
+
+
+ {{{#!c
+ StgWord
+ cas(StgVolatilePtr p, StgWord o, StgWord n)
+ {
+ return __sync_val_compare_and_swap (p, o, n);
+ }
+ }}}
+
+ which then gets compiled as
+
+ {{{#!asm
+ 000000000000004c <.cas>:
+ 4c: 7c 00 04 ac sync
+ 50: 7d 20 18 a8 ldarx r9,0,r3
+ 54: 7c 29 20 00 cmpd r9,r4
+ 58: 40 c2 00 0c bne- 64 <.cas+0x18>
+ 5c: 7c a0 19 ad stdcx. r5,0,r3
+ 60: 40 c2 ff f0 bne- 50 <.cas+0x4>
+ 64: 4c 00 01 2c isync
+ 68: 7d 23 4b 78 mr r3,r9
+ 6c: 4e 80 00 20 blr
+ }}}
+
+
+ I've been testing the patch already and it seems to have made all issues I
+ experienced so far disappear, as well as fixing the `concprog01` test
+ which was also failing infrequently.
New description:
I originally noticed this when working on the AIX port (32-bit powerpc),
and recently also saw this on Linux/powerpc64, which led me to talk to
Peter Trommler, who already had a suspicion:
Here's for example the CAS definition (in `<stg/SMP.h>`):
{{{#!c
StgWord
cas(StgVolatilePtr p, StgWord o, StgWord n)
{
    StgWord result;
    __asm__ __volatile__ (
        "1: ldarx %0, 0, %3\n"
        " cmpd %0, %1\n"
        " bne 2f\n"
        " stdcx. %2, 0, %3\n"
        " bne- 1b\n"
        "2:"
        :"=&r" (result)
        :"r" (o), "r" (n), "r" (p)
        :"cc", "memory"
    );
    return result;
}
}}}
The important detail is the lack of any barrier instructions, such as `isync`
at the end. This results in infrequent heap corruption, which in turn
results in all sorts of infrequent and hard-to-track-down runtime crashes
(including in `ghc --make -j`) such as for instance
{{{
internal error: END_TSO_QUEUE object entered!
(GHC version 8.0.0.20160421 for powerpc64_unknown_linux)
}}}
Peter already has a patch in the works which simply replaces the atomic
powerpc primitives with `__sync_*` intrinsics, which turn out to be more
portable than inline-asm. This would result in e.g.
{{{#!c
StgWord
cas(StgVolatilePtr p, StgWord o, StgWord n)
{
    return __sync_val_compare_and_swap (p, o, n);
}
}}}
which then gets compiled as
{{{#!asm
000000000000004c <.cas>:
4c: 7c 00 04 ac sync
50: 7d 20 18 a8 ldarx r9,0,r3
54: 7c 29 20 00 cmpd r9,r4
58: 40 c2 00 0c bne- 64 <.cas+0x18>
5c: 7c a0 19 ad stdcx. r5,0,r3
60: 40 c2 ff f0 bne- 50 <.cas+0x4>
64: 4c 00 01 2c isync
68: 7d 23 4b 78 mr r3,r9
6c: 4e 80 00 20 blr
}}}
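To make the difference from the original inline asm explicit, here is a hedged sketch of a hand-written `cas` that places a `sync` before the loop and an `isync` after it, following the listing above. It is not the patch under test: the name `cas_with_barriers` and the `typedef`s standing in for the RTS types are assumptions for illustration, and on targets other than 64-bit PowerPC the sketch simply falls back to the intrinsic.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-ins for the RTS types; not copied from the GHC headers. */
typedef uintptr_t StgWord;
typedef volatile StgWord *StgVolatilePtr;

/* Hypothetical name. Returns the value found at *p; the store of n
 * happens only if that value equals o. */
StgWord
cas_with_barriers(StgVolatilePtr p, StgWord o, StgWord n)
{
    /* Sanity check: the PowerPC reserve instructions expect a
     * naturally aligned operand. */
    assert(((uintptr_t)p % sizeof(StgWord)) == 0);
#if defined(__powerpc64__)
    StgWord result;
    __asm__ __volatile__ (
        "   sync\n"                 /* full barrier before the update */
        "1: ldarx   %0, 0, %3\n"    /* load doubleword and reserve */
        "   cmpd    %0, %1\n"       /* compare with the expected value */
        "   bne-    2f\n"           /* mismatch: skip the store */
        "   stdcx.  %2, 0, %3\n"    /* store only if still reserved */
        "   bne-    1b\n"           /* reservation lost: retry */
        "2: isync\n"                /* barrier on both exit paths */
        : "=&r" (result)
        : "r" (o), "r" (n), "r" (p)
        : "cc", "memory"
    );
    return result;
#else
    /* Any other target: let the compiler pick the sequence. */
    return __sync_val_compare_and_swap(p, o, n);
#endif
}
```

The only changes relative to the original asm are the two barrier instructions and the branch hint on the first `bne`; the operand constraints and clobbers are unchanged.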
I've been testing the patch already and it seems to have made all issues I
experienced so far disappear, as well as fixing the `concprog01` test
which was also failing infrequently.
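For readers less familiar with how such a primitive is consumed, a minimal sketch of the usual retry loop built on top of `cas` follows. The helper `atomic_add_via_cas` and the `typedef`s are hypothetical and only illustrate the pattern; they are not taken from the RTS.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-ins for the RTS types; not copied from the GHC headers. */
typedef uintptr_t StgWord;
typedef volatile StgWord *StgVolatilePtr;

/* Same shape as the intrinsic-based cas() quoted in the description. */
StgWord
cas(StgVolatilePtr p, StgWord o, StgWord n)
{
    return __sync_val_compare_and_swap(p, o, n);
}

/* Hypothetical helper: atomically add delta to *p and return the new
 * value, retrying whenever another thread changed *p in between. */
StgWord
atomic_add_via_cas(StgVolatilePtr p, StgWord delta)
{
    StgWord old, new_val;

    assert(p != 0);
    do {
        old = *p;                /* snapshot the current value */
        new_val = old + delta;   /* compute the desired value */
        /* cas returns what it found; anything other than old means
         * the swap did not happen and we must try again. */
    } while (cas(p, old, new_val) != old);

    return new_val;
}
```

If `*p` changes between the plain read and the `cas`, the returned value differs from `old` and the loop retries with the fresh value. Callers of this pattern rely on the ordering that the barriers provide, which is why a `cas` without them can corrupt shared state.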
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/12070#comment:2>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler