Final bikeshedding call: Fixing Control.Exception.bracket

Thu Nov 13 10:58:07 UTC 2014

On 11/11/2014 23:28, Yuras Shumovich wrote:
> On Wed, 2014-11-12 at 02:02 +0300, Yuras Shumovich wrote:
>> On Tue, 2014-11-11 at 22:14 +0000, John Lato wrote:
>>> I don't follow your argument. You state that a user should assume their
>>> cleanup action can block.
>>
>> Lets use accurate terms. You probably mean "can be interrupted". If
>> action can block, then it is may (usual case) or may not (e.g. when
>> async exceptions are masked) be interruptible. Please let me know if you
>> disagree with this definitions -- I just want to ensure everybody uses
>> the same definitions.
>>
>>> Blocking implies that the function may receive an
>>> async exception. The only way to guarantee code runs in the presence of
>>> async exceptions is uninterruptibleMask (or something built on it).
>>
>> I don't understand that. The code definitely runs with or without
>> uninterruptibleMask. In the first case it can be interrupted by async
>> exception. But in both cases it can throw sync exception.
>
> Hmm, I reread your email, and now I probably see what you mean. You
> think about cleanup as atomic action that either runs or is interrupted.
> But that is not true. Consider the example I already used earlier:
>
> -- | database that uses two files
> data DB = DB Handle Handle
>
> closeDB :: DB -> IO ()
> closeDB (DB h1 h2) = hClose h1 >> hClose h2
>
> The cleanup action "closeDB" above is buggy because the first hClose can
> be interrupted. In that case the first handle will be closed, but the
> second will leak. Note: "closeDB" is not atomic -- it consists from two
> interruptible different actions. The same with hClose itself -- if can
> be interrupted somewhere in the middle, but it is able to handle that.
>
> The correct cleanup probably should look like the next:
>
> closeDB (DB h1 h2) = hClose h1 `finally` hClose h2
>
> Note: the initial version is buggy with respect to both async and sync
> exceptions, and uninterruptibleMask will fix it only with respect to
> async exceptions.

But in order to know that this is correct, you have to know how hClose 
works, namely that if an async exception is raised then it has safely 
closed the handle anyway.  If you didn't know that, then you would have 
to code it like this instead

closeDB (DB h1 h2) = cleanup1
   where
   cleanup1 = (hClose h1 `onException` cleanup1) >> cleanup2
   cleanup2 = hClose h2 `onException` cleanup2

Does this really work?  I think so, but I haven't tested it...

I think your point about async exceptions under mask just being a 
special case of sync exceptions is a good one, though.  I'm now 
wondering whether maybe the right solution is to make it easier to use 
the idiom in the above code.

Cheers,
Simon

> The second version is (I hope) exception-safe -- it handle both async
> and sync exceptions. That is important point -- if you need
> uninterruptibleMask, then probably you have issue with sync exceptions
> too. Lets fix the original issue and make code exception safe instead of
> hiding it behind uninterruptibleMask.
>
>>
>>> So
>>> doesn't that assumption imply that uninterruptibleMask should be the
>>> default?
>>
>> No, I don't see how it implies that.
>>
>>>
>>> Although you also seem to be saying that functions like hClose etc. should
>>> use uninterruptibleMask internally anyway. Possibly, but I'm less convinced
>>> that's a good idea. It's a pretty blunt hammer to have hidden from the
>>> user, and if it has to be used bracket seems like a better choice because
>>> it will fix many cases instead of just one and library authors won't have
>>> to struggle to get it right.
>>
>> No, I propose to write exception safe code in the first place. And use
>> uninterruptibleMask if it is necessary (usually it is not necessary.)
>>
>> Just adding uninterruptibleMask to bracket doesn't make is easer to
>> write exception safe code. hClose can throw (synchronous) exceptions
>> anyway, so you should be prepared. There is no other way except ensuring
>> that hClose is exception-safe *and* you are using it in exception-safe
>> manner.
>>
>>>
>>> If you can write an hClose implementation that does the right thing, is
>>> async-safe, and doesn't use uninterruptibleMask or the equivalent, I might
>>> be convinced the proposed solution is bad.
>>
>> It is already implemented in such the way. Let me explain.
>> There are two sources of possible interruptions in hClose:
>>    a) takeMVar
>>    b) flushing internal buffer
>>
>> a) is not an issue in practice -- it will not be interrupted unless
>> someone already uses the Handle (if it is the case, then you probably
>> has bigger issue -- you may use already closed handle.) But it probably
>> should be more careful and use uninterruptibleMask here... I don't have
>> strong opinion.
>> b) is handled correctly, see
>> https://github.com/ghc/ghc/blob/805ee118b823f271dfd8036d35b15eb3454a95ad/libraries/base/GHC/IO/Handle/Internals.hs#L734
>> Basically it catches all exceptions (including async,) closes the handle
>> and rethrows the exception.
>>
>> Let me state it again: hClose closes the handle in case of exception,
>> including async exception. And that is the only correct behavior --
>> every cleanup action should do that, otherwise it is not exception safe.
>> Masking async exception doesn't magically make code exception safe.
>>
>>>
>>> John L.
>>>
>>> On 13:03, Tue, Nov 11, 2014 Yuras Shumovich <shumovichy at gmail.com> wrote:
>>>
>>>> On Tue, 2014-11-11 at 12:17 -0800, Merijn Verstraaten wrote:
>>>>> Allocation should not use uninterruptibleMask as it is possible to
>>>> handle async exceptions during allocation by nesting bracketOnError
>>>>>
>>>>> Example:
>>>>> someFun mvar1 mvar2 = do
>>>>>      (val1, val2) <- bracketOnError
>>>>>          (takeMVar mvar1)
>>>>>          (putMVar mvar1)
>>>>>          (\x -> takeMVar mvar2 >>= \y -> return (x, y)))
>>>>>
>>>>> This can be made nicer using the Cont monad to hide the marching to the
>>>> left. The same cannot be done for cleanup, as there's no sane thing as
>>>> "half a cleanup".
>>>>
>>>> It definitely can be done for cleanups too. E.g. using
>>>> uninterruptibleMask as the last resort.
>>>>
>>>> And "half a cleanup" is valid thing. E.g. database that uses two files:
>>>>
>>>> data DB = DB Handle Handle
>>>>
>>>> The following cleanup action is buggy:
>>>>
>>>> close :: DB -> IO ()
>>>> close (DB h1 h2) = hClose h1 >> hClose h2
>>>>
>>>> Note: it is broken regardless async exceptions(!)
>>>> You can get half cleanup even under uninterruptibleMask
>>>>
>>>> The code is either exception-safe or buggy. You can't magically fix
>>>> buddy code using uninterruptibleMask.
>>>>
>>>>>
>>>>> I disagree that it should be left to the author of allocation operation
>>>> to ensure uninterruptibility as it is impossible to know whether a given IO
>>>> blocks internally and thus should be masked without inspecting the *entire*
>>>> code path potentially called by the cleanup handler.
>>>>
>>>> (You probably mean "the author of *cleanup* operation"? I'll assume
>>>> that)
>>>>
>>>> Hm... You have to inspect code if you expect it to be buggy. Otherwise
>>>> you should assume it is interruptible (unless listed in Cotrol.Exception
>>>> module in "Interruptible operations" section or explicitly stated in
>>>> other way), but all cleanup actions do full cleanup even if they throw
>>>> exception. If the code is buggy -- it should be fixed, not hidden.
>>>>
>>>>>
>>>>> Both Eyal and me have had trouble with this where we had to entire half
>>>> of base and part of the runtime, to figure out whether our code was async
>>>> exception safe. Auditing half the ecosystem to be able to write a safe
>>>> cleanup handler is *NOT* a viable option.
>>>>
>>>> You need to audit half of ecosystem anyway to ensure allocating actions
>>>> are not buggy.
>>>> And as the example above shows, even uninterruptibleMask doesn't
>>>> guaranty anything.
>>>>
>>>>
>>>> I agree that current situation is bad. I have two drafts in my mailbox
>>>> where I propose to use uninterruptibleMask in bracket, but I didn't send
>>>> them -- every time I found that it doesn't fix anything actually. I
>>>> don't know better solution, but the proposal is not even a solution.
>>>>
>>>> Thanks,
>>>> Yuras
>>>>
>>>>>
>>>>> Cheers,
>>>>> Merijn
>>>>>
>>>>>> On 11 Nov 2014, at 11:58, Yuras Shumovich <shumovichy at gmail.com>
>>>> wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Should we use `uninterrubtibleMask` for allocating action too?
>>>>>>
>>>>>>
>>>>>> I'm not sure my voice will be counted, but anyway,
>>>>>> I'm strong -1 because it fixes wrong issue.
>>>>>>
>>>>>> `hClose` is interruptible, but it closes the handle in any case. I'm
>>>>>> pretty sure. I ask that question (see
>>>>>> http://haskell.1045720.n5.nabble.com/Control-Exception-
>>>> bracket-is-broken-td5752251.html ) but didn't get any answer, so I read
>>>> code and made experiments. IIRC `hClose` wraps internal interruptible
>>>> action into `try` and handles everything correctly.
>>>>>>
>>>>>> I argue that cleanup action can be interruptible, but should ensure
>>>>>> cleanup is done. As the last resort, it should use
>>>> `uninterrubtibleMask`
>>>>>> internally.
>>>>>>
>>>>>> Other issue is that a lot of allocating action are broken because they
>>>>>> perform interruptible actions after allocating resource without
>>>> handling
>>>>>> async exceptions. So my point is that masking async exceptions solves
>>>>>> only one half of the issue while masking the other.
>>>>>>
>>>>>> Handling async exceptions is hard, and we can't make is easy using
>>>>>> `uninterrubtibleMask`. Instead we should educate ourselves to do it
>>>>>> correctly from the very beginning. There is only one alternative --
>>>>>> remove async exceptions from haskell.
>>>>>>
>>>>>> To summarize,
>>>>>> - allocating action should either allocate resource or throw exception;
>>>>>> it is a bug to allocate resource *and* throw exception
>>>>>> - cleanup action should release resource even if it throws an exception
>>>>>> Developer should ensure both properties holds.
>>>>>>
>>>>>> Sorry my poor English.
>>>>>>
>>>>>> Thanks,
>>>>>> Yuras
>>>>>>
>>>>>> On Tue, 2014-11-11 at 10:09 -0800, Merijn Verstraaten wrote:
>>>>>>> Ola!
>>>>>>>
>>>>>>> In September Eyal Lotem raised the issue of bracket's cleanup handler
>>>> not being uninterruptible [1]. This is a final bikeshedding email before I
>>>> submit a patch.
>>>>>>>
>>>>>>> The problem, summarised:
>>>>>>> Blocking cleanup actions can be interrupted, causing cleanup not to
>>>> happen and potentially leaking resources.
>>>>>>>
>>>>>>> Main objection to making the cleanup handler uninterruptible:
>>>>>>> Could cause deadlock if the code relies on async exceptions to
>>>> interrupt a blocked thread.
>>>>>>>
>>>>>>> I count only two objections in the previous thread, 1 on the grounds
>>>> that "deadlocks are NOT unlikely" and 1 that is conditioned on "I don't
>>>> believe this is a problem".
>>>>>>>
>>>>>>> The rest seems either +1, or at least agrees that the status quo is
>>>> *worse* than the proposed solution.
>>>>>>>
>>>>>>> My counter to these objections is:
>>>>>>> 1) No one has yet shown me any code that relies on the cleanup
>>>> handler being interruptible
>>>>>>>
>>>>>>> 2) There are plenty of examples of current code being broken, for
>>>> example every single 'bracket' using file handles is broken due to handle
>>>> operations using a potentially blocking MVar operation internally,
>>>> potentially leaking file descriptors/handles.
>>>>>>>
>>>>>>> 3) Even GHC-HQ can't use bracket correctly (see Simon's emails)
>>>>>>>
>>>>>>> Potential solution #1:
>>>>>>> Leave bracket as-is, add bracketUninterruptible with an
>>>> uninterruptible cleanup handler.
>>>>>>>
>>>>>>> Potential solution #2:
>>>>>>> Change bracket to use uninterruptible cleanup handler, add
>>>> bracketInterruptible for interruptible cleanups.
>>>>>>>
>>>>>>> Trade-offs:
>>>>>>> Solution 1 won't change the semantics of any existing code, however
>>>> this also means that any currently broken uses of bracket will remain
>>>> broken, possibly indefinitely.
>>>>>>>
>>>>>>> Solution 2 will change the semantics of bracket, which means any
>>>> currently broken uses of bracket will be fixed, at the cost of creating
>>>> potential deadlocks in code that relies on the interruptibility of cleanup.
>>>>>>>
>>>>>>> I will argue that solution #2 is preferable, since I have yet to see
>>>> any code that uses the interruptibility of the cleanup handler. Whereas
>>>> there's many broken assumption assuming the cleanup handler is not
>>>> interruptible.
>>>>>>>
>>>>>>> Secondly, it is easier to detect deadlocks caused by this problem
>>>> than it is to detect resource leaks which only happen in unlucky timings of
>>>> async exceptions. Especially since any deadlock caused by the change can be
>>>> fixed by replacing bracket with bracketInterruptible.
>>>>>>>
>>>>>>> [1] - https://www.haskell.org/pipermail/libraries/2014-
>>>> September/023675.html
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Merijn
>>>>>>> _______________________________________________
>>>>>>> Libraries mailing list
>>>>>>> Libraries at haskell.org
>>>>>>> http://www.haskell.org/mailman/listinfo/libraries
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Libraries mailing list
>>>> Libraries at haskell.org
>>>> http://www.haskell.org/mailman/listinfo/libraries
>>>>
>>
>>
>
>
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://www.haskell.org/mailman/listinfo/libraries
>