Proposal: Use uninterruptibleMask for cleanup actions in Control.Exception

Fri Sep 5 17:34:50 UTC 2014

Hey Simon, thanks for the reply!

On Fri, Sep 5, 2014 at 6:39 PM, Simon Marlow <marlowsd at gmail.com> wrote:

> Eyal, thanks for bringing up this issue.  It's been at the back of my mind
> for a while, but I've never really thought through the issues and
> consequences of changes.  So this is a good opportunity to do that.  You
> point out (in another email in the thread) that:
>
> A) Cases that were not interruptible will remain the same.
> B) Cases that were interruptible were bugs and will be fixed.
>
> However,
>
> C) Some bugs will turn into deadlocks (unkillable threads)
>
> Being able to recover from bugs is an important property in large
> long-running systems.  So this is a serious problem.  Hence why I always
> treat uninterruptibleMask with the deepest suspicion.
>

Recovering from various kinds of failures makes a lot of sense. But how can
you recover from arbitrary invariants of the program being broken?

For example, suppose you use a bracket on a semaphore that guards a global
resource. How do you recover from a bug that leaks semaphore tokens?
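
To make the token-leak case concrete, here is a minimal sketch. Everything
in it is made up for illustration: tokenDemo, the MVar-based token counter
(a stand-in for a real semaphore, it does not block at zero), and the
threadDelay calls used to order the threads (so treat it as a demonstration,
not a rigorous test). The release action blocks on a contended MVar inside
the bracket cleanup; an async exception arriving at that moment aborts the
release unless the cleanup is wrapped in uninterruptibleMask_.

```haskell
import Control.Concurrent
import Control.Exception

-- Hypothetical demo: returns the number of free tokens once the worker
-- thread has finished. The 'protect' argument wraps the cleanup action.
tokenDemo :: (IO () -> IO ()) -> IO Int
tokenDemo protect = do
  pool   <- newMVar (1 :: Int)   -- one free token (assumed toy counter)
  inBody <- newEmptyMVar
  go     <- newEmptyMVar
  done   <- newEmptyMVar
  let acquire = modifyMVar_ pool (return . subtract 1)
      release = modifyMVar_ pool (return . (+ 1))
  worker <- forkFinally
    (bracket_ acquire (protect release) (putMVar inBody () >> takeMVar go))
    (\_ -> putMVar done ())
  takeMVar inBody                -- worker now holds the token
  n <- takeMVar pool             -- simulate contention on the pool
  putMVar go ()                  -- let the worker's body finish
  threadDelay 100000             -- worker should now be blocked in release
  _ <- forkIO (killThread worker)
  threadDelay 100000
  putMVar pool n                 -- end the contention
  takeMVar done                  -- wait for the worker to exit
  readMVar pool
```

As far as I can tell, tokenDemo id ends with 0 free tokens (the token is
leaked, and nothing points at the culprit), while tokenDemo
uninterruptibleMask_ ends with 1: the kill is simply delayed until the
release has completed.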

Recovering from crashes of whole processes, whose internal state can be
reset to a fresh, usable state, is a great feature.
Recovering from crashes of threads that share arbitrary mutable state with
other threads is not practical, I believe.

> Let's consider the case where we have an interruptible operation in the
> handler, and divide it into two (er three):
>
>  1. it blocks for a short bounded amount of time.
>  2. It blocks for a long time
>  3. It blocks indefinitely
>
> These are all buggy, but in different ways.  Only (1) is fixed by adding
> uninterruptibleMask.  (2) is "fixed", but in exchange for an unresponsive
> thread - also undesirable.  (3) was a bug in the application code, and
> turns into a deadlock with uninterruptibleMask, which is undesirable.
>

I think that (1) is by far the most common case and is very prevalent. The
most common cleanup handlers use operations that are interruptible in
principle (they can block) but in practice almost never do, so they take
effectively zero time.

For (2) and (3), we need to choose the lesser evil:

A) Deadlocks and/or unresponsiveness
B) Arbitrary invariants being broken and leaks

In my experience, A tends to manifest exactly where the bug is, so it is
easy to debug, and it is mostly a "performance bug".
B tends to manifest as hard-to-explain behavior far away from where the bug
actually is, and it is usually a "correctness bug", which is almost always
worse.

Therefore, I think A is by far the lesser evil compared to B when (2) and
(3) are involved.

I'd like to reemphasize that this change will almost always fix the problem
completely since the most common case is (1), and in rare cases, it will
convert B to A, which is also, IMO, very desirable.
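
For concreteness, the change I have in mind amounts to something like the
sketch below. It is modeled on how I understand bracket to be defined in
base, with the cleanup wrapped in uninterruptibleMask_. The names bracketU
and cleanupRuns are made up for illustration and are not part of any
library; this is not a patch.

```haskell
import Control.Exception
import Data.IORef

-- Hypothetical variant of bracket: the cleanup runs under
-- uninterruptibleMask_, so a blocking operation inside it cannot be
-- aborted by an asynchronous exception.
bracketU :: IO a -> (a -> IO b) -> (a -> IO c) -> IO c
bracketU before after thing =
  mask $ \restore -> do
    a <- before
    r <- restore (thing a) `onException` uninterruptibleMask_ (after a)
    _ <- uninterruptibleMask_ (after a)
    return r

-- Small usage check: count how often the cleanup runs, once after a
-- normal exit and once after the body throws.
cleanupRuns :: IO Int
cleanupRuns = do
  counter <- newIORef (0 :: Int)
  let cleanup () = modifyIORef' counter (+ 1)
  _ <- bracketU (return ()) cleanup (\() -> return "done")
  _ <- try (bracketU (return ()) cleanup (\() -> throwIO (userError "boom")))
         :: IO (Either IOException ())
  readIORef counter
```

The acquire step is left under the ordinary (interruptible) mask on
purpose: only the cleanup changes. The cost is the one Simon describes: a
cleanup that blocks forever now makes the thread unkillable.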

> This is as far as I've got thinking through the issues so far.  I wonder
> to what extent the programmer can and should mitigate these cases, and how
> much we can help them.  I don't want unkillable threads, even when caused
> by buggy code.

> Cheers,
> Simon
>
>
> On 04/09/2014 16:46, Roman Cheplyaka wrote:
>
>> I find your arguments quite convincing. Count that as +1 from me.
>>
>> Roman
>>

-- 
Eyal