The curious case of #367: Infinite loops can hang Concurrent Haskell

Mon Aug 17 08:40:06 UTC 2020

Hi there!

While working on an NCG, I eventually came across #367[0], where GHC produces
code that looks similar to this:

```
label:
  [non-branch-instructions]*
  branch-instruction label
```

so essentially an uninterruptible loop. To get GHC to produce code that can
be interrupted, pass -fno-omit-yields.

So far so good. Out of curiosity, I added a small piece of code to my NCG
that detects this and complains if code like the above is generated[1].
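
To make the idea concrete, here is a minimal sketch of that kind of check.
It is not the code from [1]: the Instr and Block types and the names
isJumpish and isTrivialSelfLoop are hypothetical, made up for illustration.
The assumption is that a block counts as a trivial deadlock when nothing
before its first branch can leave the block, and that first branch is an
unconditional jump back to the block's own label.

```haskell
module Main where

-- Hypothetical, simplified instruction model; these are not GHC's types.
type Label = String

data Instr
  = Other String       -- any non-branch instruction
  | Branch Label       -- unconditional branch to a label
  | CondBranch Label   -- conditional branch to a label
  deriving (Eq, Show)

-- A basic block: its own label plus its instructions, in order.
data Block = Block Label [Instr]

-- Can control leave the straight-line sequence at this instruction?
isJumpish :: Instr -> Bool
isJumpish (Other _) = False
isJumpish _         = True

-- True when the block is a run of non-branch instructions followed by an
-- unconditional branch back to the block's own label.
isTrivialSelfLoop :: Block -> Bool
isTrivialSelfLoop (Block lbl instrs) =
  case break isJumpish instrs of
    (_, Branch target : _) -> target == lbl
    _                      -> False

main :: IO ()
main = do
  let spin  = Block "L1" [Other "add", Branch "L1"]
      yield = Block "L2" [Other "add", CondBranch "gc", Branch "L2"]
      exit  = Block "L3" [Other "mov", Branch "L4"]
  mapM_ (print . isTrivialSelfLoop) [spin, yield, exit]
```

Only the first block is flagged. The second has a conditional exit before
the back edge (roughly what a yield point would give you), and the third
jumps elsewhere.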

Three weeks ago, I kind of maneuvered myself into a memory blow-up corner,
and then life happened, but this weekend I managed to find some time to
revert some memory blow-up and continue working on the NCG.  Turns out I can
build a stage2 "quick" flavour of the NCG without dynamic support just fine.
I never saw the deadlock detection code fire.

Now, I did leave the test suite running last night, and when looking through
the results, there were quite a few failures. Curiously, a lot of them were
due to ghc missing dynamic support (doh!).  But quite a few also failed due
to the deadlock detection:

T12485, hs_try_putmvar003, ds-wildcard, ds001, read029, T2817, tc011,
tc021, T4524

So, my question then is this: are we fine with ghc generating this code? Or,
if we are not, do we want to figure out if we can eliminate it? Issue #367
goes into quite a bit of detail on why this is tricky to handle in general.

Or should we add -fno-omit-yields to the test-cases? The ultimate option is
to just turn off the detection, and I'm fine with doing so. However, I'd
rather ask whether anyone sees value in detecting this or not.

Cheers,
 Moritz

--
[0]: https://gitlab.haskell.org/ghc/ghc/-/issues/367
[1]: https://gitlab.haskell.org/ghc/ghc/-/blob/46fba2c91e1c4d23d46fa2d9b18dcd000c80363d/compiler/GHC/CmmToAsm/AArch64/Ppr.hs#L134-159