[GHC] #8006: Asynchronous exception rethrown synchronously inside runStmt
GHC
ghc-devs at haskell.org
Fri Jun 21 14:30:22 CEST 2013
#8006: Asynchronous exception rethrown synchronously inside runStmt
-----------------------------+----------------------------------------------
Reporter: edsko | Owner:
Type: bug | Status: new
Priority: normal | Component: Compiler
Version: 7.6.3 | Keywords:
Os: Unknown/Multiple | Architecture: Unknown/Multiple
Failure: None/Unknown | Blockedby:
Blocking: | Related:
-----------------------------+----------------------------------------------
This bug is related to various bugs to do with asynchronous exceptions
[http://hackage.haskell.org/trac/ghc/ticket/3997],
[http://hackage.haskell.org/trac/ghc/ticket/5902], as well as the request
for a means to terminate a code snippet started with `runStmt`
[http://hackage.haskell.org/trac/ghc/ticket/1381].
The only means to terminate such a snippet is to send an asynchronous
exception to the thread that executed the `runStmt` (while, in versions of
ghc prior to Simon M's patch listed in 1381, making sure that sandboxing
is disabled). The problem with this approach is that this exception might
interrupt the *typechecker* instead of the snippet.
This is problematic, because a relatively large part of the typechecker
runs inside a call to `unsafeInterleaveIO`, through a call to `forkM` or
`forkM_maybe` from
* `loadDecl` in `compiler/iface/LoadIFace.lhs` [[BR]]
* `tc_iface_decl`, `tcIfaceDataCons`, `tcIfaceInst`, `tcIfaceFamInst`,
`tcIfaceRule`, `tcIfaceVectInfo`, `tcUnfolding`, `tcIfaceWrapper`,
`tcPragExpr`, `tcUnfolding`, `tcIfaceWrapper` and `tcPragExpr` in
`compiler/iface/TcIFace.lhs` [[BR]]
If the asynchronous exception is caught inside `unsafeInterleaveIO` and
rethrown synchronously (typically implicitly through a call to `try` or
`catch` that only catches exceptions of a certain type -- specifically
`IOEnvFailure` in this case) then this exception becomes the value of the
thunk and any subsequent attempt to poke that thunk will rethrow the
exception.
The attached test case illustrates this. After some minimal set up, we
repeatedly use `runStmt` to execute a snippet which simply waits for 1
second. Before we do this, we fork a thread which waits a random delay
between 0 and 0.1 seconds, and then sends an asynchronous exception to the
main thread. We catch this exception, make sure it's the exception we're
expecting, and repeat. On every iteration we throw a different exception;
however, when we run the code we will get
{{{
Unexpected exception CountedThreadKilled 4
Expected (CountedThreadKilled 8)
}}}
or similar output. In other words, we are catching an *old* exception,
that we threw and caught earlier. This must be because this exception has
become the value of a thunk, either inside `unsafePerformIO` or
`unsafeInterleaveIO` -- and so far the only candidate I have found is the
`unsafeInterleaveIO` inside `forkM_maybe`.
So this leaves us with two questions:
1. Are there other (relevant) places where `unsafePerformIO` or
`unsafeInterleaveIO` are used? I ''think'' the answer to this is is no,
but I am not sure.
2. Where do we catch and rethrow exceptions?
Unfortunately, question 2 is more difficult to answer. The function `tryM`
(`compiler/utils/IOEnv.hs`) wraps around `try`, but it in turn gets
wrapped in lots of places: `recoverM`, `recoverTR`, `tryTcErrs`,
`checkNoErrs` (this one is used a lot), `mapAndRecoverM` (also used
frequently), `tryTc` and `runPlans`.
There is a call to `tryM` directly inside the `unsafeInterleaveIO` in
`forkM_maybe`, so that's an obvious candidate. However, as an experiment,
I replaced the `tryM` with `tryAllM` (not really a solution, as discussed
below) and re-ran the tests, and it was still failing in a similar way.
This means that this `tryM` cannot be the only culprit, because with this
change `forkM` will "re"throw all exceptions as an `GhcException`, never
as the original exception (certainly not as the custom exception type that
the test is using). Since we were still getting our custom exception, this
must have been rethrown elsewhere.
Unfortunately `tryM` or one of its variants gets called in lots of places:
* `initDs` in `compiler/deSugar/DsMonad.lhs` [[BR]]
* `dataConInfoPtrToName` in `compiler/ghci/DebuggerUtils.hs` [[BR]]
* `addConstraint`, `cvObtainTerm`, `cvReconstructType` and
`congruenceNewtypes` in `compiler/ghci/RtClosureInspect.hs` [[BR]]
* `tcPolyBinds`, `tcPolyInfer`, `tcSpecPrags`, `tcImpPrags` and `tcTySigs`
in `compiler/typecheck/TcBinds.lhs` [[BR]]
* `tcClassDecl2` in `compiler/typecheck/TcClassDcl.lhs` [[BR]]
* `check_instance` in `compiler/typecheck/TcDefaults.lhs` [[BR]]
* `tcDeriving`, `makeDerivSpecs` and `inferInstanceContexts` in
`compiler/typecheck/TcDeriv.lhs` [[BR]]
* `tc_hs_type` in `compiler/typecheck/TcHsType.lhs`
* `tcInstDecls1`, `tcClsInstDecl` and `tcInstDecl2` in
`compiler/typecheck/TcInstDcls.lhs` [[BR]]
* `tcRnExtCore`, `tc_rn_src_decls`, `rnTopSrcDecls`, `tcUserStmt`,
`tcGhciStmts`, `lookup_rdr_name` in `compiler/typecheck/TcRnDriver.lhs`
[[BR]]
* `initTc` in `compiler/typecheck/TcRnMonad.lhs` [[BR]]
* `tcTopSplice`, `tcTopSpliceExpr`, `tcTopSpliceType`, `runMeta`,
`qRecover`, `reifyInstances` in `compiler/typecheck/TcSplice.lhs` [[BR]]
* `tcTyAndClassDecls`, `tcTyClGroup` and `checkValidTyCon` in
`compiler/typecheck/TcTyClsDecls.lhs`
Figuring out which of these may be called inside the `forkM` calls is
tricky. I tried running the test with a profiled built of `ghc` with
`-fprof-auto` enabled, but unfortunately I was unable to reproduce a test
failure with that approach, I'm not sure why.
But even if we did find the culprits, fixing it will not be easy because
there is no generic solution. In `forkM_maybe` we could return `Nothing`
when we catch an asynchronous exception, but then we would call `pgmError`
inside `forkM` thereby rethrowing the asynchronous exception as a
synchronous `GhcException`. That wouldn't solve anything.
Instead, we'd have to rethrow the asynchronous exception using `throwTo`,
but then we need to decide what to do after the `throwTo` returns, and
that might be a rather difficult question to answer generically for all of
the typechecker. Can we just re-run the `thing_inside`? Probably not,
there might be all kinds of side effects; and checking for all those side
effects would be quite a large task, esp given that there are so many
places (list above) where we are caling `tryM` (directly through `forkM`
or deeper down the call stack).
I think a more realistic solution is to provide a means to detect when the
snippet starts running. I outlined such a solution at
http://www.haskell.org/pipermail/ghc-devs/2013-June/001380.html, but Simon
M objected that this proposal might not work because there is no clear
point where type checking finishes and the snippet starts executing. So
far I think it would probably be easier to make sure that such a point
*does* exist than to fix the asynchronous exception problem. (Note that I
have adopted the solution outlined in my email and it seems to work for
me, but since this bug is non-deterministic that of course doesn't mean it
actually works..)
However, I've run out of time to work on this for now. Might be continued.
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/8006>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list