Proposal: Use uninterruptibleMask for cleanup actions in Control.Exception

Merijn Verstraaten merijn at inconsistent.nl
Wed Sep 24 16:58:03 UTC 2014


Hi Simon,

Well, another issue that was raised (I forget by whom!) was the fact that stack overflows are currently thrown at the executing thread from the outside, and that this should really be changed so that the thread throws the exception to itself, thereby avoiding the mask.
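
To make the distinction concrete, here is a minimal sketch of how masking treats the two kinds of throw. It is only an illustration, not how the RTS raises stack overflows: `selfThrow` and `externalThrow` are hypothetical names, and `killThread` merely stands in for any exception thrown at a thread from outside.

```haskell
import Control.Concurrent
import Control.Exception
import Data.IORef

-- A thread throwing to itself with throwIO is a synchronous throw:
-- masking does not defer it, so `try` sees the exception immediately.
selfThrow :: IO (Either AsyncException ())
selfThrow = try (uninterruptibleMask_ (throwIO StackOverflow))

-- An exception thrown from another thread is asynchronous: it stays
-- pending until the target leaves the uninterruptibly masked region.
-- Returns whether the masked block ran to completion.
externalThrow :: IO Bool
externalThrow = do
  ready    <- newEmptyMVar
  done     <- newEmptyMVar
  finished <- newIORef False
  t <- forkIO $
    (uninterruptibleMask_ $ do
        putMVar ready ()
        threadDelay 50000            -- cannot be interrupted in here
        writeIORef finished True)
      `finally` putMVar done ()
  takeMVar ready
  killThread t                       -- waits until the target can receive it
  takeMVar done
  readIORef finished

main :: IO ()
main = do
  selfThrow >>= print
  externalThrow >>= print
```

Under these assumptions the self-thrown exception escapes the masked block at once, while the external one waits for the block to finish.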

Cheers,
Merijn

On 24 Sep 2014, at 05:51 , Simon Marlow <marlowsd at gmail.com> wrote:
> Ok, sorry for the delay, we still need a resolution on this one.
> 
> So thanks to your persuasive comments I think I'm convinced.  What finally tipped me over the edge was this:
> 
> https://phabricator.haskell.org/diffusion/GHC/browse/master/libraries/base/Control/Concurrent/QSem.hs;165072b334ebb2ccbef38a963ac4d126f1e08c96$103-112
> 
> It turns out I've been a victim of this "bug" myself :-)  So let's fix it.
> 
> But what is the cost? Adding an uninterruptibleMask won't be free.
> 
> In the case of `catch`, since the mask is already built in to the primitive, we can just change it to be an uninterruptibleMask, and that applies to handle and onException too.  For `finally` we can replace the mask with an uninterruptibleMask, but for `bracket` we have to add a new layer of uninterruptibleMask.
> 
> Lots of documentation probably needs to be updated.  Any chance you could make a patch and upload it to Phabricator?
> 
> Cheers,
> Simon
> 
> On 05/09/2014 18:34, Eyal Lotem wrote:
>> Hey Simon, thanks for the reply!
>> 
>> 
>> On Fri, Sep 5, 2014 at 6:39 PM, Simon Marlow <marlowsd at gmail.com> wrote:
>> 
>>    Eyal, thanks for bringing up this issue.  It's been at the back of
>>    my mind for a while, but I've never really thought through the
>>    issues and consequences of changes.  So this is a good opportunity
>>    to do that.  You point out (in another email in the thread) that:
>> 
>>    A) Cases that were not interruptible will remain the same.
>>    B) Cases that were interruptible were bugs and will be fixed.
>> 
>>    However,
>> 
>>    C) Some bugs will turn into deadlocks (unkillable threads)
>> 
>>    Being able to recover from bugs is an important property in large
>>    long-running systems.  So this is a serious problem.  Hence why I
>>    always treat uninterruptibleMask with the deepest suspicion.
>> 
>> 
>> Recovering from various kinds of failures makes a lot of sense. But how
>> can you recover from arbitrary invariants of the program being broken?
>> 
>> For example, if you use a bracket on some semaphore monitoring a global
>> resource. How do you recover from a bug of leaking semaphore tokens?
>> 
>> Recovering from crashes of whole processes whose internal state can be
>> recovered to a fresh, usable state, is a great feature.
>> Recovering from thread crashes that share arbitrary mutable state with
>> other threads is not practical, I believe.
>> 
>>    Let's consider the case where we have an interruptible operation in
>>    the handler, and divide it into two (er three):
>> 
>>      1. It blocks for a short, bounded amount of time.
>>      2. It blocks for a long time.
>>      3. It blocks indefinitely.
>> 
>>    These are all buggy, but in different ways.  Only (1) is fixed by
>>    adding uninterruptibleMask.  (2) is "fixed", but in exchange for an
>>    unresponsive thread - also undesirable.  (3) was a bug in the
>>    application code, and turns into a deadlock with
>>    uninterruptibleMask, which is undesirable.
>> 
>> 
>> I think that (1) is by far the most common case and is very prevalent. The
>> most common cleanup handlers are interruptible operations that take
>> essentially zero time: they can block, but almost never do.
>> 
>> For (2) and (3), we need to choose the lesser evil:
>> 
>> A) Deadlocks and/or unresponsiveness
>> B) Arbitrary invariants being broken and leaks
>> 
>> In my experience, A tends to manifest exactly where the bug is, and is
>> therefore easy to debug and mostly a "performance bug".
>> B tends to manifest as difficult-to-explain behavior somewhere other than
>> where the bug actually is, and is usually a "correctness bug", which is
>> almost always worse.
>> 
>> Therefore, I think A is a far far lesser evil than B, when (2) and (3)
>> are involved.
>> 
>> I'd like to reemphasize that this change will almost always fix the
>> problem completely since the most common case is (1), and in rare cases,
>> it will convert B to A, which is also, IMO, very desirable.
>> 
>> 
>>    This is as far as I've got thinking through the issues so far.  I
>>    wonder to what extent the programmer can and should mitigate these
>>    cases, and how much we can help them.  I don't want unkillable
>>    threads, even when caused by buggy code.
>> 
>> 
>>    Cheers,
>>    Simon
>> 
>> 
>>    On 04/09/2014 16:46, Roman Cheplyaka wrote:
>> 
>>        I find your arguments quite convincing. Count that as +1 from me.
>> 
>>        Roman
>> 
>> 
>> 
>>        _______________________________________________
>>        Libraries mailing list
>>        Libraries at haskell.org
>>        http://www.haskell.org/mailman/listinfo/libraries
>> 
>> 
>> 
>> 
>> --
>> Eyal
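
For readers following the quoted discussion, the `bracket` change Simon describes and the interrupted-cleanup problem Eyal raises can be sketched roughly as follows. This is an illustration under stated assumptions, not a patch to base: `bracketUninterruptible`, `UnitBracket` and `releaseSurvives` are hypothetical names, and the release action that waits on an `MVar` stands in for a cleanup that blocks for a short, bounded time (case 1).

```haskell
import Control.Concurrent
import Control.Exception
import Data.IORef

-- Same shape as bracket, but the release action gets an extra layer of
-- uninterruptibleMask_, so a blocking release cannot be interrupted.
bracketUninterruptible :: IO a -> (a -> IO b) -> (a -> IO c) -> IO c
bracketUninterruptible acquire release use =
  mask $ \restore -> do
    a <- acquire
    r <- restore (use a) `onException` uninterruptibleMask_ (release a)
    _ <- uninterruptibleMask_ (release a)
    return r

-- Monomorphic bracket type, so either variant can be passed as an argument.
type UnitBracket = IO () -> (() -> IO ()) -> (() -> IO ()) -> IO ()

-- Kill a worker twice: the first exception aborts the body and starts the
-- release; the second arrives while the release is blocked on `lock`.
-- Returns whether the release action ran to completion.
releaseSurvives :: UnitBracket -> IO Bool
releaseSurvives br = do
  lock     <- newEmptyMVar
  inUse    <- newEmptyMVar
  done     <- newEmptyMVar
  released <- newIORef False
  worker <- forkIO $
    br (return ())
       (\_ -> takeMVar lock >> writeIORef released True)  -- blocks briefly
       (\_ -> putMVar inUse () >> threadDelay 10000000)
      `finally` putMVar done ()
  takeMVar inUse
  _ <- forkIO (threadDelay 100000 >> putMVar lock ())     -- unblocks release
  killThread worker   -- aborts the body, release begins
  killThread worker   -- lands during the release
  takeMVar done
  readIORef released

main :: IO ()
main = do
  releaseSurvives bracket >>= print
  releaseSurvives bracketUninterruptible >>= print
```

With `bracket` as currently defined in base (plain `mask` around the release), the second exception interrupts the blocked release and the resource is never marked as released; with the extra `uninterruptibleMask_` layer the release completes first. The cost is the one Simon notes: a release action that blocks indefinitely (case 3) would leave the thread unkillable.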
