[GHC] #15427: Calling hs_try_putmvar from an unsafe foreign call can cause the RTS to hang
GHC
ghc-devs at haskell.org
Sat Jul 21 11:34:12 UTC 2018
-------------------------------------+-------------------------------------
Reporter: syntheorem | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone: 8.6.1
Component: Runtime System | Version: 8.4.3
Keywords: | Operating System: Unknown/Multiple
Architecture: Unknown/Multiple | Type of failure: Runtime crash
Test Case: | Blocked By:
Blocking: | Related Tickets:
Differential Rev(s): | Wiki Page:
-------------------------------------+-------------------------------------
An unsafe foreign call that calls `hs_try_putmvar` can cause the RTS to
hang, preventing any Haskell threads from making progress. When the
program is compiled with `-debug`, it instead fails an assertion in the
scheduler:
{{{
internal error: ASSERTION FAILED: file rts/Schedule.c, line 510
(GHC version 8.4.3 for x86_64_apple_darwin)
}}}
Here is a minimal test case which reproduces the assertion. It needs to be
built with `-debug -threaded` and run with `+RTS -N2` or higher.
{{{#!hs
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (MVar, newEmptyMVar, takeMVar)
import Control.Monad (forever)
import Foreign.C.Types (CInt(..))
import Foreign.StablePtr (StablePtr)
import GHC.Conc (PrimMVar, newStablePtrPrimMVar)
foreign import ccall unsafe hs_try_putmvar :: CInt -> StablePtr PrimMVar
  -> IO ()
main = do
  mvar <- newEmptyMVar
  forkIO $ forever $ do
    takeMVar mvar
  forkIO $ forever $ do
    sp <- newStablePtrPrimMVar mvar
    hs_try_putmvar (-1) sp
    threadDelay 1
  -- Let it spin a few times to trigger the bug
  threadDelay 500
}}}
I checked out GHC, added this as a test case, and did some debugging.
The specific assertion that fails is `ASSERT(task->cap == cap)`. This
seems to happen because of this code in `hs_try_putmvar`:
{{{#!c
Task *task = getTask();
// ...
ACQUIRE_LOCK(&cap->lock);
// If the capability is free, we can perform the tryPutMVar immediately
if (cap->running_task == NULL) {
    cap->running_task = task;
    task->cap = cap;
    RELEASE_LOCK(&cap->lock);
    // ...
    releaseCapability(cap);
} else {
    // ...
}
}}}
The code assumes that the calling thread's task does not currently hold a
capability, so it takes a new one and then releases it without restoring
the previous value of `task->cap`. During an unsafe foreign call, however,
the task still holds the capability it was running on.
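To make the failure mode concrete, here is a minimal stand-alone model of it. This is only a sketch: the `Task` and `Capability` structs are hypothetical, heavily simplified stand-ins for the RTS types, and `model_try_putmvar` and `scheduler_invariant_holds` are illustrative names, not RTS functions. Locking and the actual `tryPutMVar` are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the RTS types (assumption). */
typedef struct Capability Capability;
typedef struct Task Task;

struct Task       { Capability *cap; };     /* capability the task believes it owns */
struct Capability { Task *running_task; };  /* NULL when the capability is free */

/* Models the "capability is free" branch quoted above: take `other`,
 * do the work, release it, and leave task->cap pointing at `other`. */
void model_try_putmvar(Task *task, Capability *other)
{
    assert(other->running_task == NULL);  /* only the free-capability path */
    other->running_task = task;
    task->cap = other;
    /* ... the tryPutMVar itself would happen here ... */
    other->running_task = NULL;           /* stands in for releaseCapability() */
}

/* Models the scheduler check ASSERT(task->cap == cap). */
int scheduler_invariant_holds(const Task *task, const Capability *cap)
{
    return task->cap == cap;
}
```

If a task that is running on one capability calls `model_try_putmvar` with a second, free capability, `scheduler_invariant_holds` is false for the first capability afterwards, which is the situation the failing assertion appears to detect.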
Modifying the code to restore the value of `task->cap` after releasing the
capability fixes the assertion. But I don't know enough about the RTS to
be sure I'm not missing something here. In particular, is there a problem
with the task basically holding two capabilities for a short time?
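The save-and-restore change described here can be sketched on a simplified model. The structs below are hypothetical stand-ins for the RTS types, and `model_try_putmvar_restoring` and `saved_cap` are illustrative names. The sketch only shows the bookkeeping of `task->cap`; it says nothing about whether briefly holding two capabilities is safe in the real RTS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the RTS types (assumption). */
typedef struct Capability Capability;
typedef struct Task Task;

struct Task       { Capability *cap; };     /* capability the task believes it owns */
struct Capability { Task *running_task; };  /* NULL when the capability is free */

/* Same free-capability path, but remembering and restoring task->cap. */
void model_try_putmvar_restoring(Task *task, Capability *other)
{
    assert(other->running_task == NULL);  /* only the free-capability path */

    Capability *saved_cap = task->cap;    /* NULL if the task held nothing */

    other->running_task = task;
    task->cap = other;
    /* ... the tryPutMVar itself would happen here ... */
    other->running_task = NULL;           /* stands in for releaseCapability() */

    task->cap = saved_cap;                /* the restore step proposed above */
}
```

In this model the task's original capability is untouched, so a check of the form `task->cap == cap` made after the call sees the same value as before it.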
My other thought is that maybe it should check if its task is currently
running a capability, and in that case do something else. But I'm not sure
what.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15427>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler