Thoughts on async RTS API?

Cheng Shao cheng.shao at tweag.io
Wed Dec 15 17:06:19 UTC 2021


> While the idea here sounds reasonable, I'm not sure I quite understand
> how this will be used in Asterius's case. Specifically, I would be
> worried about the lack of fairness in this scheme: no progress will be
> made on any foreign call until all Haskell evaluation has blocked.
> Is this really the semantics that you want?

The Asterius runtime scheduler divides work into individual "ticks".
Each tick does some work, much like a single iteration of the
while(1) scheduler loop. Ticks are not invoked synchronously by
previous ticks; instead they are started asynchronously and placed in
the host event loop, fully interleaved with other host events. This
way, Haskell concurrency cooperates with host concurrency without
requiring host multi-threading.
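
To make that concrete, here is a minimal C sketch of the tick shape.
hostEnqueue() and runOneTick() are hypothetical stand-ins for the
host event loop hook and for one bounded unit of scheduler work;
neither is a real Asterius or RTS API.

    #include <stdbool.h>

    typedef void (*HostCallback)(void *data);

    /* Hypothetical host API: run cb(data) on a later event-loop turn. */
    extern void hostEnqueue(HostCallback cb, void *data);

    typedef struct SchedState SchedState;

    /* Hypothetical: do one bounded unit of Haskell work, e.g. run one
       thread until it yields; returns false once fully idle. */
    extern bool runOneTick(SchedState *st);

    static void tick(void *data) {
        SchedState *st = data;
        if (runOneTick(st)) {
            /* Re-arm asynchronously instead of looping, so other host
               events can interleave between ticks. */
            hostEnqueue(tick, st);
        }
    }

    void startScheduler(SchedState *st) {
        hostEnqueue(tick, st);
    }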

It's possible to wait for the run queue to be emptied, then process
all blocking foreign calls in one batch, similar to the awaitEvent()
logic in the non-threaded RTS. It's also possible to exit the
scheduler and resume it many more times, similar to the current
Asterius scheduler. Both semantics can be implemented; to guarantee
fairness, the latter sounds preferable. The key issue is finding a
way to break up the current while(1) loop in schedule() in a
principled way.
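
For the batch variant, the driver would look roughly like this (all
function names below other than emptyRunQueue() are placeholders, not
the real RTS API):

    /* Sketch of the batch semantics: drain the run queue, then block
       on all pending foreign calls at once, like awaitEvent() in the
       non-threaded RTS. */
    void scheduleBatched(Capability *cap, Task *task) {
        while (1) {
            while (!emptyRunQueue(cap)) {
                runNextThread(cap, task);       /* hypothetical */
            }
            if (!anyPendingForeignCalls(cap)) { /* hypothetical */
                break;                          /* fully idle */
            }
            /* Block until at least one foreign call completes,
               moving the unblocked threads back onto the run queue. */
            awaitPendingForeignCalls(cap);      /* hypothetical */
        }
    }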

> `schedule` is already a very large function with loops, gotos,
> mutability, and quite complex control flow. I would be reluctant
> to add to this complexity without first carrying out some
> simplification. Instead of adding yet another bail-out case to the loop,
> I would probably rather try to extract the loop body into a new
> function. That is, currently `schedule` is of the form:
>
>     // Perform work until we are asked to shut down.
>     Capability *schedule (Capability *initialCapability, Task *task) {
>         Capability *cap = initialCapability;
>         while (1) {
>             scheduleYield(&cap, task);
>
>             if (emptyRunQueue(cap)) {
>                 continue;
>             }
>
>             if (shutting_down) {
>                 return cap;
>             }
>
>             StgTSO *t = popRunQueue(cap);
>
>             if (! t->can_run_on_capability(cap)) {
>                 // Push back on the run queue and loop around again to
>                 // yield the capability to the appropriate task
>                 pushOnRunQueue(cap, t);
>                 continue;
>             }
>
>             runMutator(t);
>
>             if (needs_gc) {
>                 scheduleDoGC();
>             }
>         }
>     }
>
> I might rather extract this into something like:
>
>     typedef enum {
>         NoWork,          // There was no work to do
>         PerformedWork,   // Ran precisely one thread
>         Yield,           // The next thread scheduled to run cannot run on the
>                          // given capability; yield.
>         ShuttingDown,    // We were asked to shut down
>     } ScheduleResult;
>
>     // Schedule at most one thread once
>     ScheduleResult scheduleOnce (Capability **cap, Task *task) {
>         if (emptyRunQueue(*cap)) {
>             return NoWork;
>         }
>
>         if (shutting_down) {
>             return ShuttingDown;
>         }
>
>         StgTSO *t = popRunQueue(*cap);
>
>         if (! t->can_run_on_capability(*cap)) {
>             pushOnRunQueue(*cap, t);
>             return Yield;
>         }
>
>         runMutator(t);
>
>         if (needs_gc) {
>             scheduleDoGC();
>         }
>
>         return PerformedWork;
>     }
>
> This is just a sketch but I hope it's clear that with something like
> this you can easily implement the existing `schedule` function, as
> well as your asynchronous variant.
>

Thanks for the sketch! I definitely agree we should simplify
schedule() in some way instead of adding an ad-hoc bail-out case. The
ScheduleResult type and the scheduleOnce() function look good to me,
although I need to run a lot more experiments to confirm.
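
For instance, assuming your sketch above plus the same hypothetical
hostEnqueue() hook into the host event loop, both drivers could sit
on top of scheduleOnce() roughly like this:

    typedef void (*HostCallback)(void *data);
    extern void hostEnqueue(HostCallback cb, void *data); /* hypothetical */

    /* Synchronous driver, same shape as the current schedule(). */
    Capability *schedule(Capability *initialCapability, Task *task) {
        Capability *cap = initialCapability;
        while (1) {
            scheduleYield(&cap, task);
            if (scheduleOnce(&cap, task) == ShuttingDown) {
                return cap;
            }
            /* NoWork, Yield and PerformedWork all loop around, as
               the old while(1) body did. */
        }
    }

    /* Asynchronous driver: one scheduling step per host event-loop
       turn, so foreign calls and other host events interleave. */
    typedef struct { Capability *cap; Task *task; } SchedStep; /* hypothetical */

    static void scheduleStep(void *data) {
        SchedStep *st = data;
        if (scheduleOnce(&st->cap, st->task) != ShuttingDown) {
            hostEnqueue(scheduleStep, st); /* re-arm for the next turn */
        }
    }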

Cheers,
Cheng

