Feasibility of native RTS support for continuations?

Thu Feb 6 08:28:23 UTC 2020

On Sat, 1 Feb 2020 at 00:23, Alexis King <lexi.lambda at gmail.com> wrote:

> > On Jan 30, 2020, at 02:35, Simon Marlow <marlowsd at gmail.com> wrote:
>
> > Also you might want to optimise the implementation so that it doesn't
> actually tear down the stack as it copies it into the heap, so that you
> could avoid the need to copy it back from the heap again in shift#.
>
> I don’t fully understand this point — do you mean “avoid the need to copy
> it back on continuation application”? shift# just copies the stack slice to
> the heap, so it doesn’t need to copy it back.
>

The issue here is that raiseAsync is destructive - it *moves* the stack to
the heap, rather than copying it. So if you want to continue execution
where you left off, for shift#, you would have to copy it back onto the
stack again. That's the point I was trying to highlight here.

 Cheers
Simon

> If I was right, and you were referring to continuation application rather
> than shift#, I agree with you there. It seems as though it ought to be
> possible to represent the stack slice as a stack itself, so if the stack
> looks like
>
>     ┌───────────┐
>     │ RET_SMALL │
>     ├───────────┤
>     │ CATCH     │
>     ├───────────┤
>     │ RESET     │
>     ├───────────┤
>
> then the captured continuation could itself essentially be a stack like
>
>     ┌───────────┐
>     │ RET_SMALL │
>     ├───────────┤
>     │ CATCH     │
>     ├───────────┤
>     │ UNDERFLOW │
>     └───────────┘
>
> where restoring a continuation just copies the captured stack and updates
> its underflow frame to point at the top of the current stack. If the caller
> promises not to use the continuation again after applying it, the copying
> could be skipped entirely, and the captured stack could just become the new
> stack.
>
> However, I don’t understand enough about the way the RTS currently works
> to know if this is viable. For example, what if I write this:
>
>     reset (mask_ (shift f))
>
> Now presumably I want to ensure the masking state is restored when I
> invoke the continuation. But it can’t just be unconditionally restored,
> since if I write
>
>     mask_ (reset (shift f >>= g))
>
> then the mask frame isn’t included on the stack, so applying the
> continuation shouldn’t affect the masking state. Presumably this means a
> continuation restore can’t be as simple as copying the captured stack
> frames onto the current stack, since restoring certain frames affects other
> parts of the RTS state.
>
> (Tangentially, it seems like this particular example is not handled
> properly in the existing capture/restore code, as this comment in
> Exception.cmm notes:
>
>      NB. there's a bug in here.  If a thread is inside an
>      unsafePerformIO, and inside maskAsyncExceptions# (there is an
>      unmaskAsyncExceptions_ret on the stack), and it is blocked in an
>      interruptible operation, and it receives an exception, then the
>      unsafePerformIO thunk will be updated with a stack object
>      containing the unmaskAsyncExceptions_ret frame.  Later, when
>      someone else evaluates this thunk, the original masking state is
>      not restored.
>
> I think getting this right probably matters more if continuations are
> added, so that’s something else to worry about.)
>
> > So that's shift#. What about reset#? I expect it's something like
> `unsafeInterleaveIO`, that is it creates a thunk to name the continuation.
> You probably also want a `catch` in there, so that we don't tear down more
> of the stack than we need to.
>
> It would be nice to be able to capture slices of the stack that include
> catch frames, though theoretically it isn’t necessary — it would be
> possible to arrange for a continuation that wants to capture through a
> catch to do something like
>
>     reset (... (catch (reset ...) ...))
>
> instead, then call shift twice and compose the two continuations by hand
> (inserting another call to catch in between). I don’t know enough yet to
> understand all the tradeoffs involved, but I’ll see if it’s possible to get
> away with the userland emulation, since I figure the less code in the RTS
> the better!
>
> > Hope this is helpful.
>
> Very much so, thank you!
>
> > On Jan 30, 2020, at 10:31, Ben Gamari <ben at smart-cactus.org> wrote:
> >
> > For the record, runtime system captures the stack state in an AP_STACK
> > closure. This is done in rts/RaiseAsync.c:raiseAsync and some of this is
> > described in the comment attached to that function.
> >
> > As Simon PJ points out, this is all very tricky stuff, especially in a
> > concurrent context. If you make any changes in this area do be sure to
> > keep in mind the considerations described in Note [AP_STACKs must be
> > eagerly blackholed], which arose out of the very-nast #13615.
>
> Thank you for the pointers — I did manage to find raiseAsync, but I hadn’t
> seen that Note. I will try my best to be suitably wary, though I imagine
> I’m in far enough over my head that I don’t yet know half the things I need
> to be wary of. :)
>
> Alexis
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20200206/e21c80ab/attachment.html>