[GHC] #15449: Nondeterministic Failure on aarch64 with -jn, n > 1
GHC
ghc-devs at haskell.org
Wed Sep 5 21:12:37 UTC 2018
-------------------------------------+-------------------------------------
Reporter: tmobile | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone: 8.8.1
Component: Compiler | Version: 8.4.3
Resolution: | Keywords:
Operating System: Linux | Architecture: aarch64
Type of failure: Compile-time crash or panic | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Thra11):
I have noticed something which might be relevant. I have two aarch64
machines:
1. Quad-core laptop: 4 x A53, 2GB RAM
2. Hex-core SBC: 2 x A72 + 4 x A53, 4GB RAM
Testing with trommler's test-package, GHC on the hex-core SBC with the A72
cores fails often (segmentation fault, illegal hardware instruction, or bus
error), while the quad-core laptop ''without'' A72 cores consistently
succeeds.
> As far as the difference between 32-bit and 64-bit ARM, the only thing I
> can guess is that perhaps the smaller ARM chips have much simpler
> instruction pipelines and don't necessarily perform the allowed
> reorderings in practice?
Following this line of thinking, I'm wondering if the A53s fall into the
'simpler instruction pipeline' bucket, while the A72s and Denver2s are
more complex. The other possibility that springs to mind is that having
faster cores simply changes timings so as to make certain race conditions
more likely. However, if this were the case, I think I would expect to see
at least ''some'' failures on the slower CPU.
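One way to narrow this down might be to rerun the failing build on the hex-core board with the process pinned to the A53 cores only. The following is a minimal Python sketch and not something taken from this ticket: it assumes a Linux kernel that reports a `CPU part` field per core in `/proc/cpuinfo` (0xd03 for Cortex-A53, 0xd08 for Cortex-A72), and the helper names `cores_by_part` and `run_pinned` are hypothetical.

```python
import os
import subprocess

# ARM "CPU part" values for the cores discussed here.
CORTEX_A53 = 0xD03
CORTEX_A72 = 0xD08


def cores_by_part(cpuinfo_text):
    """Group logical CPU numbers by the 'CPU part' value each one reports.

    Expects text in the layout of /proc/cpuinfo on aarch64 Linux, where each
    core has a 'processor : N' line followed by a 'CPU part : 0x...' line.
    """
    groups = {}
    current = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor" and value.isdigit():
            current = int(value)
        elif key == "CPU part" and current is not None:
            groups.setdefault(int(value, 16), []).append(current)
    return groups


def run_pinned(cpus, command):
    """Restrict this process to the given CPUs, then run the command.

    The affinity mask is inherited by the child process and its threads.
    os.sched_setaffinity is only available on Linux.
    """
    os.sched_setaffinity(0, set(cpus))
    return subprocess.call(command)
```

For instance, `run_pinned(cores_by_part(open("/proc/cpuinfo").read())[CORTEX_A53], ["make", "-j4"])` would run a parallel build on the A53 cores only (the build command here is just a placeholder). This would not separate the reordering explanation from the timing one, but it would at least show whether the A72 cores are needed to trigger the crashes on that board.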
trommler mentions that he was seeing the failures on an NVIDIA Jetson TX2,
which appears to be 2 x Denver2 + 4 x A57. I'm not familiar with these
cores, but I assume that at least the Denver2 is fairly complex.
I have found that the laptop's (quad-core A53) success isn't limited to
this little test case. Before I got the SBC (2 x A72 + 4 x A53, which I use
as a nix build server), I successfully built GHC and a range of Haskell
packages on the laptop (slowly: with 2GB of RAM it ends up swapping quite a
bit while building GHC). However, using the SBC, I haven't been able to
build GHC itself, and package building is inconsistent (some packages
sometimes succeed, others always fail).
Apologies if this is all rather speculative and anecdotal, but I'm hoping
it might give some ideas to someone more familiar with GHC, LLVM and CPUs.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15449#comment:13>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler