Thoughts on async RTS API?

Wed Dec 15 16:05:15 UTC 2021

Cheng Shao <cheng.shao at tweag.io> writes:

> Hi devs,
>
> To invoke Haskell computation in C, we need to call one of rts_eval*
> functions, which enters the scheduler loop, and returns only when the
> specified Haskell thread is finished or killed. We'd like to enhance
> the scheduler and add async variants of the rts_eval* functions, which
> take C callbacks to consume the Haskell thread result, kick off the
> scheduler loop, and the loop is allowed to exit when the Haskell
> thread is blocked. Sync variants of RTS API will continue to work with
> unchanged behavior.
>
> The main intended use case is async foreign calls for the WebAssembly
> target. When an async foreign call is made, the Haskell thread will
> block on an MVar to be fulfilled with the call result. But the
> scheduler will eventually fail to find work due to empty run queue and
> exit with error! We need a way to gracefully exit the scheduler, so
> the RTS API caller can process the async foreign call, fulfill that
> MVar and resume Haskell computation later.
>
> Question I: does the idea of adding an async RTS API sound acceptable to
> GHC HQ? To be honest, it's not impossible to work around the lack of an
> async RTS API: reuse the awaitEvent() logic in the non-threaded RTS, pretend
> each async foreign call reads from a file descriptor and can be
> handled by the POSIX select() function in awaitEvent(). But it'd
> surely be nice to avoid such hacks and do things the principled way.
>
While the idea here sounds reasonable, I'm not sure I quite understand
how this will be used in Asterius's case. Specifically, I would be
worried about the lack of fairness in this scheme: no progress will be
made on any foreign call until all Haskell evaluation has blocked.
Is this really the semantics that you want?

> Question II: how to modify the scheduler loop to implement this
> feature? Straightforward answer seems to be: check some RTS API
> non-blocking flag, if present, allow early exit due to empty run
> queue.
>
`schedule` is already a very large function with loops, gotos,
mutability, and quite complex control flow. I would be reluctant
to add to this complexity without first carrying out some
simplification. Instead of adding yet another bail-out case to the loop,
I would probably rather try to extract the loop body into a new
function. That is, currently `schedule` is of the form:

    // Perform work until we are asked to shut down.
    Capability *schedule (Capability *initialCapability, Task *task) {
        Capability *cap = initialCapability;
        while (1) {
            scheduleYield(&cap, task);

            if (emptyRunQueue(cap)) {
                continue;
            }

            if (shutting_down) {
                return cap;
            }

            StgTSO *t = popRunQueue(cap);

            if (! can_run_on_capability(t, cap)) {
                // Push back on the run queue and loop around again to
                // yield the capability to the appropriate task
                pushOnRunQueue(cap, t);
                continue;
            }

            runMutator(t);

            if (needs_gc) {
                scheduleDoGC();
            }
        }
    }

I might rather extract this into something like:

    typedef enum {
        NoWork,          // There was no work to do
        PerformedWork,   // Ran precisely one thread
        Yield,           // The next thread scheduled to run cannot run on the
                         // given capability; yield.
        ShuttingDown,    // We were asked to shut down
    } ScheduleResult;

    // Run at most one thread, then report what happened.
    ScheduleResult scheduleOnce (Capability **cap, Task *task) {
        if (emptyRunQueue(*cap)) {
            return NoWork;
        }

        if (shutting_down) {
            return ShuttingDown;
        }

        StgTSO *t = popRunQueue(*cap);

        if (! can_run_on_capability(t, *cap)) {
            // Push back on the run queue; the caller should yield the
            // capability to the appropriate task.
            pushOnRunQueue(*cap, t);
            return Yield;
        }

        runMutator(t);

        if (needs_gc) {
            scheduleDoGC();
        }

        return PerformedWork;
    }

This is just a sketch, but I hope it's clear that with something like
this you can easily implement the existing `schedule` function, as
well as your asynchronous variant.
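
To make that concrete, here is a small self-contained toy model of the
two drivers. Nothing in it is real RTS code: `Thread`, the fixed-size
run queue, `scheduleSync`, `scheduleAsync` and `requestShutdown` are all
hypothetical names made up for illustration, the `Yield` case is left
out, and the toy checks the shutdown flag before the run queue so that
the blocking driver can terminate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for RTS types; hypothetical, not the GHC RTS API. */
typedef struct { int steps_left; } Thread;

#define QUEUE_CAP 8
typedef struct {
    Thread *run_queue[QUEUE_CAP];   /* circular buffer of runnable threads */
    size_t head, len;
    bool shutting_down;
} Capability;

typedef enum { NoWork, PerformedWork, ShuttingDown } ScheduleResult;

static bool emptyRunQueue(const Capability *cap) { return cap->len == 0; }

static void pushOnRunQueue(Capability *cap, Thread *t) {
    assert(cap->len < QUEUE_CAP);
    cap->run_queue[(cap->head + cap->len) % QUEUE_CAP] = t;
    cap->len++;
}

static Thread *popRunQueue(Capability *cap) {
    assert(cap->len > 0);
    Thread *t = cap->run_queue[cap->head];
    cap->head = (cap->head + 1) % QUEUE_CAP;
    cap->len--;
    return t;
}

/* Run one thread for one time slice; requeue it if it has more to do. */
static ScheduleResult scheduleOnce(Capability *cap) {
    if (cap->shutting_down) return ShuttingDown;
    if (emptyRunQueue(cap)) return NoWork;
    Thread *t = popRunQueue(cap);
    if (--t->steps_left > 0) pushOnRunQueue(cap, t);
    return PerformedWork;
}

/* Blocking driver: loops until shutdown. The waitForWork hook stands in
   for whatever the real scheduler does when it has nothing to run. */
static void scheduleSync(Capability *cap, void (*waitForWork)(Capability *)) {
    for (;;) {
        switch (scheduleOnce(cap)) {
        case ShuttingDown:  return;
        case NoWork:        waitForWork(cap); break;
        case PerformedWork: break;
        }
    }
}

/* Non-blocking driver: returns as soon as nothing is runnable, so the
   caller can go and service pending foreign calls. */
static ScheduleResult scheduleAsync(Capability *cap) {
    ScheduleResult r;
    while ((r = scheduleOnce(cap)) == PerformedWork) { }
    return r;
}

/* Trivial waitForWork hook for the toy: just ask for shutdown. */
static void requestShutdown(Capability *cap) { cap->shutting_down = true; }
```

With a couple of threads queued, `scheduleAsync` runs them until the run
queue drains and hands back `NoWork`, at which point the caller could
fulfill the MVar and re-enter later; `scheduleSync` keeps the current
loop-until-shutdown behaviour.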

Cheers,

- Ben