Feasibility of native RTS support for continuations?
Simon Peyton Jones
simonpj at microsoft.com
Tue Jan 28 10:09:20 UTC 2020
Alexis
I've thought about this quite a bit in the past, but got stalled for lack of cycles to think about it more. But there's a paper or two:
https://www.microsoft.com/en-us/research/publication/composable-scheduler-activations-haskell/
On that link you can also see a link to an earlier, shorter, conference version (rejected š).
Also this earlier (2007) work
https://www.microsoft.com/en-us/research/publication/lightweight-concurrency-primitives-for-ghc/
On the effects front I think Daan Leijen is doing interesting stuff, although I'm not very up to date:
https://www.microsoft.com/en-us/research/people/daan/publications/
One interesting dimension is whether or not the continuations you capture are one-shot. If so, particularly efficient implementations are possible.
Also: much of the "capture stack chunk" stuff is *already* implemented, because it is (I think) what happens when a thread receives an asynchronous exception, and just abandon its evaluation of thunks that it has started work on.
Simon
| -----Original Message-----
| From: ghc-devs <ghc-devs-bounces at haskell.org> On Behalf Of Alexis King
| Sent: 28 January 2020 08:20
| To: ghc-devs <ghc-devs at haskell.org>
| Subject: Feasibility of native RTS support for continuations?
|
| Hi all,
|
| tl;dr: I want to try to implement native support for capturing slices of
| the RTS stack as a personal experiment; please tell me the obstacles I
| am likely to run into. Much more context follows.
|
| ---
|
| I have been working on an implementation of an algebraic effect system
| that uses unsafe primops to be as performant as possible. However, the
| unavoidable need to CPS the entire program balloons heap allocation.
| Consider the following definition:
|
| f a b = g a >>= \c -> h (b + c)
|
| Assume `g` and `h` are not inlined. If the monad used is IO, this will
| be compiled efficiently: the result of `g` is returned on the stack, and
| no closure needs to be allocated for the lambda. However, if the monad
| supports capturing the continuation, the above definition must be CPSād.
| After inlining, we end up with
|
| f a b = \k -> let lvl = \c -> h (b + c) k in g a lvl
|
| which must allocate a closure on the heap. This is frustrating, as it
| must happen for every call to a non-inlined monadic operation, even if
| that operation never captures the continuation. In an algebraic effect
| system, there are many shortcuts that avoid the need to capture the
| continuation, and under my implementation, large swaths of code never do
| so. Iāve managed to exploit that to get some savings, but I canāt escape
| the need to allocate all these closures.
|
| This motivates my question: how difficult would it be to allow capturing
| a portion of the RTS call stack directly? My requirements are fairly
| minimal, as continuations go:
|
| 1. Capturing a continuation is only legal from within a strict state
| thread (i.e. IO or strict ST).
|
| 2. The continuation is captured up to a prompt, which would be a new
| kind of RTS stack frame. Prompts are not tagged, so there is only
| ever exactly one prompt active at any time (which may be the root
| prompt).
|
| 3. Capturing a continuation is unsafe. The behavior of capturing a
| continuation is undefined if the current prompt was not created by
| the current state thread (and it is never legal to capture up to
| the root prompt).
|
| 4. Applying a continuation is unsafe. Captured continuations return
| `Any`, and type safety is the callerās obligation.
|
| 5. Continuations are āfunctional,ā which is to say applying them does
| not trigger any additional stack unwinding.
|
| This minimal support for primitive continuation capturing would be
| enough to support my efficient, safe delimited control implementation.
| In my ignorant mind, implementing this ought to be as simple as defining
| two new primops,
|
| reset# :: (State# s -> (# State# s, a #))
| -> State# s -> (# State# s, a #)
|
| shift# :: ((a -> State# s -> (# State# s, Any #))
| -> State# s -> (# State# s, Any #))
| -> State# s -> (# State# s, a #)
|
| where reset# pushes a new prompt frame and shift# captures a slice of
| the RTS stack up to that frame and copies it into the heap. Restoring a
| continuation would copy all the captured frames onto the end of the
| current stack. Sounds simple enough!
|
| I would like to experiment with implementing something like this myself,
| just to see if it would really work, but somehow I doubt it is actually
| as simple as it sounds. Minor complications are fine, but what worries
| me are major obstacles I havenāt found in my limited attempts to learn
| about the RTS.
|
| So far, Iāve read the old āThe New GHC/Hugs Runtime Systemā paper, which
| still seems mostly accurate from a high level, though I imagine many
| details have changed since then. Iāve also read the
| Commentary/RTS/Storage/Stack wiki page, and Iāve peeked at some of the
| RTS source code. Iāve also skimmed a handful of other papers and wiki
| pages, but itās hard to know what Iām looking for. Can anyone point me
| to better resources or give me some insight into what will be hard?
|
| Thanks in advance,
| Alexis
| _______________________________________________
| ghc-devs mailing list
| ghc-devs at haskell.org
| https://nam06.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmail.hask
| ell.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fghc-
| devs&data=02%7C01%7Csimonpj%40microsoft.com%7C4ee2fda49b5346e0b2b508d7
| a3cadd55%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637157964004023983&a
| mp;sdata=3iyrmA6Ie5yLSkZkBz3Wcj5LS7GvE3zwivDEcxv2Y%2B4%3D&reserved=0
More information about the ghc-devs
mailing list