[GHC] #15449: Nondeterministic Failure on aarch64 with -jn, n > 1
GHC
ghc-devs at haskell.org
Tue Jul 31 20:28:22 UTC 2018
-------------------------------------+-------------------------------------
Reporter: tmobile | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone: 8.6.1
Component: Compiler | Version: 8.4.3
Resolution: | Keywords:
Operating System: Linux | Architecture: aarch64
Type of failure: Compile-time crash or panic | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by tmobile):
If this
[https://en.wikipedia.org/wiki/Memory_ordering#In_symmetric_multiprocessing_(SMP)_microprocessor_systems
table] is to be trusted, it looks like ARMv7 and PPC allow the same sorts
of load/store reorderings. As for the difference between 32-bit and 64-bit
ARM, the only thing I can guess is that the smaller ARM chips have much
simpler instruction pipelines and don't necessarily perform the allowed
reorderings in practice? But I might be missing something. As for x86, it
does seem like we're getting away with this because of its stricter memory
model.
If I'm reading [https://llvm.org/docs/Atomics.html#sequentiallyconsistent
this bit] correctly, then as a frontend we shouldn't have to emit any
fences to get the semantics of `seq_cst`. If the target machine's memory
model requires a fence to implement a `load atomic ... seq_cst`, then it's
on `llc` to emit it, not GHC, right? That seems to be contradicted by this
equation in `genCall`:
{{{
genCall (PrimTarget MO_WriteBarrier) _ _ = do
platform <- getLlvmPlatform
if platformArch platform `elem` [ArchX86, ArchX86_64, ArchSPARC]
then return (nilOL, [])
else barrier
}}}
Here we implement an arch-specific optimization on our own; I would've
expected LLVM to be responsible for that, not GHC.
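Concretely, as I read it, on the non-excluded architectures `barrier` emits a full fence, while on x86/x86_64/SPARC the equation emits nothing at all. In IR terms (my reconstruction, not actual GHC output):

{{{
; on e.g. aarch64, MO_WriteBarrier becomes a full fence:
fence seq_cst

; on x86/x86_64/SPARC, GHC emits no IR at all for MO_WriteBarrier,
; i.e. GHC itself decides the target's TSO model makes the fence redundant.
}}}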
I'm also a bit confused by this equation:
{{{
genCall (PrimTarget (MO_AtomicWrite _width)) [] [addr, val] =
runStmtsDecls $ do
addrVar <- exprToVarW addr
valVar <- exprToVarW val
let ptrTy = pLift $ getVarType valVar
ptrExpr = Cast LM_Inttoptr addrVar ptrTy
ptrVar <- doExprW ptrTy ptrExpr
statement $ Expr $ AtomicRMW LAO_Xchg ptrVar valVar SyncSeqCst
}}}
I must be missing some trick here; why isn't this implemented with `store
atomic`? There isn't even a constructor for `store atomic` in
`LlvmExpression` in `compiler/llvmGen/Llvm/AbsSyn.hs`.
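For comparison, here is the IR I would have expected for an atomic write next to what the `AtomicRMW` equation actually produces (my sketch of the two forms, going by the LLVM language reference; `%ptr`/`%val` are placeholders):

{{{
; what I'd expect genCall to emit for MO_AtomicWrite:
store atomic i64 %val, i64* %ptr seq_cst, align 8

; what the equation above actually emits:
%old = atomicrmw xchg i64* %ptr, i64 %val seq_cst
; same seq_cst ordering, but the exchange also performs a load
; whose result is simply discarded.
}}}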
I think I'll try the sledgehammer approach of sticking a fence before each
atomic read and after each atomic write; if the behavior improves, at
least that's some evidence that this is where the issue lies.
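For the write side, what I have in mind is roughly the following (an untested sketch against the quoted equation; I'm assuming the AST's fence statement constructor is `Fence`, as used by `barrier`):

{{{
genCall (PrimTarget (MO_AtomicWrite _width)) [] [addr, val] =
  runStmtsDecls $ do
    addrVar <- exprToVarW addr
    valVar  <- exprToVarW val
    let ptrTy   = pLift $ getVarType valVar
        ptrExpr = Cast LM_Inttoptr addrVar ptrTy
    ptrVar <- doExprW ptrTy ptrExpr
    statement $ Expr $ AtomicRMW LAO_Xchg ptrVar valVar SyncSeqCst
    -- sledgehammer: unconditional full fence after the atomic write
    statement $ Fence False SyncSeqCst
}}}

The read side would get the mirror-image treatment: a `Fence False SyncSeqCst` emitted immediately before the atomic load.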
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15449#comment:9>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler