Async exceptions and delimited continuations

Ben Gamari ben at smart-cactus.org
Thu Jul 2 17:46:20 UTC 2020


Alexis King <lexi.lambda at gmail.com> writes:

> Hi all,
>
> As some of you are likely aware, I have an open GHC proposal[1] to add
> native support for delimited continuations to the RTS. I also have an
> incomplete implementation,[2] and the only major remaining obstacle
> concerns async exceptions. The issue is subtle, so I’ve tried to
> provide all the necessary context in this email. If you’re already
> familiar with the ideas at play, you can skip the context about how
> delimited continuations work.
>
...

> This tiny optimization actually allows not one but two savings:
>
>     1. The immediate benefit is that we save a stack frame.
>
>     2. The hidden benefit is that we never need to push the old
>        exception masking state onto the stack.
>
>        If we had multiple mask_ frames on the stack simultaneously, we
>        wouldn’t know what to do when returning: should we unmask them,
>        or should they stay masked? We’d need to push that information
>        onto the stack in the same way we must push a return address
>        when calling a function.
>
>        By skipping these redundant stack frames, we can always be
>        certain the right thing to do on return is to unmask async
>        exceptions. No need to store anything else.
>
Indeed. However, I don't think it would be that expensive to store the
masking state with a suitable implementation strategy. Specifically,
don't add a "masking state" field to the return frame. Rather, I think
you want to have `mask` push one of two possible return frame types:

 * in the case that we are already masked push an "already masked"
   frame. The entry code for this would be a no-op.

 * in the case that we aren't currently masked push the usual
   stg_maskAsyncExceptionszh_ret_info return frame.

This incurs minimal overhead while preserving the information that you
need. We avoid adding a word to the stack frame (likely saving an
instruction) and in cases where today the optimsation fires the user
would only pay for a return to a trivial entry frame. It seems to me
that this is as close to free as we will get.

Under this scheme `control` can simply look for "already masked" frames
when copying the stack.

> (This explanation is a slight simplification, especially once
> maskUninterruptible comes into play, but it’s close enough.)
>
> Now you may see the looming problem: this strategy completely breaks
> down in the presence of delimited continuations. With delimited
> continuations, we might have a program like this:
>
>     mask_ $ prompt $ mask_ $ ...
>
> If we capture a continuation up to this prompt, we expect the inner
> mask_ frame to be captured along with it. But that won’t happen if we
> never pushed a mask_ frame at all due to the aforementioned
> optimization.
>
> So now I can finally state my question: what is the right solution for
> this? I see three obvious possible ways forward:
>
>     1. Keep track of whether or not we’re inside a prompt and skip the
>        optimization in that case.
>
>     2. Keep some bookkeeping that tracks the modifications to the
>        async exception masking state since the most recently pushed
>        prompt.
>
>     3. Just don’t bother with the optimization at all.
>
> Option 3 certainly seems the most appealing from a simplicity point of
> view, and I suspect the optimization doesn’t matter much in practice.
> Why? Because the real `mask` implementation from Control.Exception
> already avoids re-masking exceptions if they’re masked! (And that’s
> okay, because prompt# and control0# are not intended to be used
> directly in IO, so code that uses them can provide its own version of
> `mask`.) However, it is admittedly possible for the restore action
> passed to the argument of `mask` to create redundant calls, as the
> check is only performed in `mask` itself.
>
> Is eliminating this optimization an acceptable compromise? Or is there
> reason to believe this is important for performance of real programs?
>
It never hurts to measure. Perhaps establish the worst-case performance
impact by looking at a simple program which just masks repeatedly.

For what it's worth, I rather doubt that this optimisation is going to
make or break any program. Exception operations in general are somewhat
rare in my experience and the cost of pushing a stack frame (or, perhaps
more importantly on modern hardware, returning through it) should be
reasonably cheap.

By the way, when you go to implement this do add a few comments to the
masking primitives. It took a bit of staring to work out what was going
on in there.

Cheers,

- Ben
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 487 bytes
Desc: not available
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20200702/9200d5c2/attachment.sig>


More information about the ghc-devs mailing list