[Git][ghc/ghc][wip/ghc-debug_pause_and_resume] RtsAPI: pause and resume the RTS
David Eichmann
gitlab at gitlab.haskell.org
Tue Oct 20 15:44:10 UTC 2020
David Eichmann pushed to branch wip/ghc-debug_pause_and_resume at Glasgow Haskell Compiler / GHC
Commits:
12aec424 by David Eichmann at 2020-10-20T16:43:25+01:00
RtsAPI: pause and resume the RTS
The `rts_pause` and `rts_resume` functions have been added to `RtsAPI.h` and
allow an external process to completely pause and resume the RTS.
Co-authored-by: Sven Tennie <sven.tennie at gmail.com>
Co-authored-by: Matthew Pickering <matthewtpickering at gmail.com>
Co-authored-by: Ben Gamari <bgamari.foss at gmail.com>
- - - - -
30 changed files:
- includes/RtsAPI.h
- rts/Capability.c
- rts/RtsAPI.c
- rts/Schedule.c
- rts/Task.c
- rts/Task.h
- rts/sm/NonMoving.c
- + testsuite/tests/rts/pause-resume/all.T
- + testsuite/tests/rts/pause-resume/pause_and_use_rts_api.hs
- + testsuite/tests/rts/pause-resume/pause_and_use_rts_api.stdout
- + testsuite/tests/rts/pause-resume/pause_resume.c
- + testsuite/tests/rts/pause-resume/pause_resume.h
- + testsuite/tests/rts/pause-resume/pause_resume_via_pthread.hs
- + testsuite/tests/rts/pause-resume/pause_resume_via_safe_ffi.hs
- + testsuite/tests/rts/pause-resume/pause_resume_via_safe_ffi_concurrent.hs
- + testsuite/tests/rts/pause-resume/pause_resume_via_safe_ffi_concurrent.stdout
- + testsuite/tests/rts/pause-resume/shouldfail/all.T
- + testsuite/tests/rts/pause-resume/shouldfail/rts_double_pause.hs
- + testsuite/tests/rts/pause-resume/shouldfail/rts_double_pause.stderr
- + testsuite/tests/rts/pause-resume/shouldfail/rts_double_pause.stdout
- + testsuite/tests/rts/pause-resume/shouldfail/rts_lock_when_paused.hs
- + testsuite/tests/rts/pause-resume/shouldfail/rts_lock_when_paused.stderr
- + testsuite/tests/rts/pause-resume/shouldfail/rts_lock_when_paused.stdout
- + testsuite/tests/rts/pause-resume/shouldfail/rts_pause_lock.c
- + testsuite/tests/rts/pause-resume/shouldfail/rts_pause_lock.h
- + testsuite/tests/rts/pause-resume/shouldfail/rts_pause_when_locked.hs
- + testsuite/tests/rts/pause-resume/shouldfail/rts_pause_when_locked.stderr
- + testsuite/tests/rts/pause-resume/shouldfail/rts_pause_when_locked.stdout
- + testsuite/tests/rts/pause-resume/shouldfail/unsafe_rts_pause.hs
- + testsuite/tests/rts/pause-resume/shouldfail/unsafe_rts_pause.stderr
Changes:
=====================================
includes/RtsAPI.h
=====================================
@@ -330,14 +330,72 @@ extern void freeFullProgArgv ( void ) ;
/* exit() override */
extern void (*exitFn)(int);
-/* ----------------------------------------------------------------------------
- Locking.
-
- You have to surround all access to the RtsAPI with these calls.
- ------------------------------------------------------------------------- */
+/* Note [Locking and Pausing the RTS]
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * You have to surround all access to the RtsAPI with rts_lock/rts_unlock or
+ * with rts_pause/rts_resume.
+ *
+ *
+ * # rts_lock / rts_unlock
+ *
+ * Use `rts_lock` to acquire a token which may be used to call other RtsAPI
+ * functions. This is achieved by acquiring a capability. Hence, at most n locks
+ * can be held simultaneously, where n is the number of capabilities. It is an
+ * error to call `rts_lock` when the rts is already paused by the current OS
+ * thread (see rts_pause/rts_resume below).
+ *
+ *
+ * # rts_pause / rts_resume
+ *
+ * Use `rts_pause` to halt execution of all Haskell threads and `rts_resume` to
+ * continue them. This is achieved by acquiring all capabilities. `rts_resume`
+ * must be called on the same thread as `rts_pause`. `rts_pause`, much like
+ * rts_lock, Returns a token which may be used to call other RtsAPI functions.
+ *
+ * * With the RTS paused, garbage collections will not occur and haskell threads
+ * will not execute, allocate nor mutate their stack.
+ * * Non-Haskell (i.e. non-worker) threads such as those running safe FFI calls
+ * will NOT be paused and can still mutate pinned mutable data such as pinned
+ * `MuttableByteArray#`s.
+ * * You may call `rts_pause` from within a non-worker OS thread.
+ * * You may call `rts_pause` from within a safe FFI call. In this case, make
+ * sure to call `rts_resume` within the same FFI call, or the RTS will likely
+ * deadlock.
+ * * Calling `rts_pause` from an unsafe FFI call will cause an error.
+ * * `rts_pause` does not pause threads running safe FFI calls.
+ * * On return, the rts will be fully paused: all haskell threads are stopped
+ * and all capabilities are acquired by the current thread.
+ * * `rts_pause` is different to rts_lock which only pauses a single capability.
+ * * Calling `rts_pause` in between rts_lock/rts_unlock on the same thread will
+ * cause an error.
+ * * Calling `rts_pause` results in an error if the RTS is already paused by the
+ * current OS thread.
+ * * Only one OS thread at a time can keep the rts paused.
+ * * `rts_pause` will block while another thread is pausing the RTS, and
+ * continue when the current thread is given exclusive permission to pause the
+ * RTS.
+ *
+ *
+ * ## Note on implementation.
+ *
+ * Although it's likely sufficient for many use cases to call RtsAPI.h functions
+ * from a single thread, we still want to ensure that the API is thread safe.
+ * This is achieved almost entirely by the mechanism of acquiring and releasing
+ * Capabilities, resulting in a sort of mutex / critical section pattern.
+ * This has the following consequences:
+ *
+ * * There are at most `n_capabilities` threads currently in a
+ * rts_lock/rts_unlock section.
+ * * There is at most 1 threads in a rts_pause/rts_resume section. In that case
+ * there will be no threads in a rts_lock/rts_unlock section.
+ * * rts_pause and rts_lock may block in order to enforce the above 2
+ * invariants.
+ */
-// acquires a token which may be used to create new objects and
-// evaluate them.
+// acquires a token which may be used to create new objects and evaluate them.
+// Calling rts_lock in between rts_pause/rts_resume on the same thread will
+// cause an error.
Capability *rts_lock (void);
// releases the token acquired with rts_lock().
@@ -483,6 +541,18 @@ void rts_checkSchedStatus (char* site, Capability *);
SchedulerStatus rts_getSchedStatus (Capability *cap);
+// Halt execution of all Haskell threads.
+// See Note [Locking and Pausing the RTS] for correct usage.
+Capability * rts_pause (void);
+
+// Counterpart of rts_pause: Continue from a pause.
+// See Note [Locking and Pausing the RTS] for correct usage.
+// [in] cap: the token returned by rts_pause.
+void rts_resume (Capability * cap);
+
+// Returns true if the rts is paused. See rts_pause() and rts_resume().
+bool rts_isPaused(void);
+
/*
* The RTS allocates some thread-local data when you make a call into
* Haskell using one of the rts_eval() functions. This data is not
=====================================
rts/Capability.c
=====================================
@@ -858,7 +858,14 @@ void waitForCapability (Capability **pCap, Task *task)
/* See Note [GC livelock] in Schedule.c for why we have gcAllowed
and return the bool */
bool /* Did we GC? */
-yieldCapability (Capability** pCap, Task *task, bool gcAllowed)
+yieldCapability
+ ( Capability** pCap // [in/out] Task's owned capability. Set to the
+ // newly owned capability on return.
+ // pCap != NULL
+ // && *pCap != NULL
+ , Task *task // [in] This thread's task.
+ , bool gcAllowed
+ )
{
Capability *cap = *pCap;
=====================================
rts/RtsAPI.c
=====================================
@@ -577,6 +577,12 @@ rts_getSchedStatus (Capability *cap)
return cap->running_task->incall->rstat;
}
+#if defined(THREADED_RTS)
+// The task that paused the RTS. The rts_pausing_task variable is owned by the
+// task that owns all capabilities (there is at most one such task).
+Task * rts_pausing_task = NULL;
+#endif
+
Capability *
rts_lock (void)
{
@@ -593,6 +599,14 @@ rts_lock (void)
stg_exit(EXIT_FAILURE);
}
+#if defined(THREADED_RTS)
+ if (rts_pausing_task == task) {
+ errorBelch("error: rts_lock: The RTS is already paused by this thread.\n"
+ " There is no need to call rts_lock if you have already called rts_pause.");
+ stg_exit(EXIT_FAILURE);
+ }
+#endif
+
cap = NULL;
waitForCapability(&cap, task);
@@ -620,21 +634,21 @@ rts_unlock (Capability *cap)
task = cap->running_task;
ASSERT_FULL_CAPABILITY_INVARIANTS(cap,task);
- // Now release the Capability. With the capability released, GC
- // may happen. NB. does not try to put the current Task on the
+ // Now release the Capability. With the capability released, GC
+ // may happen. NB. does not try to put the current Task on the
// worker queue.
- // NB. keep cap->lock held while we call boundTaskExiting(). This
+ // NB. keep cap->lock held while we call exitMyTask(). This
// is necessary during shutdown, where we want the invariant that
// after shutdownCapability(), all the Tasks associated with the
- // Capability have completed their shutdown too. Otherwise we
- // could have boundTaskExiting()/workerTaskStop() running at some
+ // Capability have completed their shutdown too. Otherwise we
+ // could have exitMyTask()/workerTaskStop() running at some
// random point in the future, which causes problems for
// freeTaskManager().
ACQUIRE_LOCK(&cap->lock);
releaseCapability_(cap,false);
// Finally, we can release the Task to the free list.
- boundTaskExiting(task);
+ exitMyTask();
RELEASE_LOCK(&cap->lock);
if (task->incall == NULL) {
@@ -645,6 +659,141 @@ rts_unlock (Capability *cap)
}
}
+#if defined(THREADED_RTS)
+
+// See Note [Locking and Pausing the RTS]
+Capability * rts_pause (void)
+{
+ // It is an error if this thread already paused the RTS. If another
+ // thread has paused the RTS, then rts_pause will block until rts_resume is
+ // called (and compete with other threads calling rts_pause). The blocking
+ // behavior is implied by the use of `stopAllCapabilities`.
+ Task * task = getMyTask();
+ if (rts_pausing_task == task)
+ {
+ // This task already pased the RTS.
+ errorBelch("error: rts_pause: This thread has already paused the RTS.");
+ stg_exit(EXIT_FAILURE);
+ }
+
+ // The current task must not own a capability. This is true for non-worker
+ // threads e.g. when making a safe FFI call. We allow pausing when
+ // `task->cap->running_task != task` because the capability can be taken by
+ // other capabilities. Doing this check is justified because rts_pause is a
+ // user facing function and we want good error reporting. We also don't
+ // expect rts_pause to be performance critical.
+ if (task->cap && task->cap->running_task == task)
+ {
+ // This task owns a capability (at it can't be taken by other capabilities).
+ errorBelch(task->cap->in_haskell
+ ? ("error: rts_pause: attempting to pause via an unsafe FFI call.\n"
+ " Perhaps a 'foreign import unsafe' should be 'safe'?")
+ : ("error: rts_pause: attempting to pause from a Task that owns a capability.\n"
+ " Have you already acquired a capability e.g. with rts_lock?"));
+ stg_exit(EXIT_FAILURE);
+ }
+
+ task = newBoundTask();
+ stopAllCapabilities(NULL, task);
+
+ // Now we own all capabilities so we own rts_pausing_task and may set it.
+ rts_pausing_task = task;
+
+ return task->cap;
+}
+
+static void assert_isPausedOnMyTask(const char *functionName);
+
+// See Note [Locking and Pausing the RTS]. The cap argument is here just for
+// symmetry with rts_pause and to match the pattern of rts_lock/rts_unlock.
+void rts_resume (Capability * cap STG_UNUSED)
+{
+ assert_isPausedOnMyTask("rts_resume");
+ Task * task = getMyTask();
+
+ // Now we own all capabilities so we own rts_pausing_task and may write to
+ // it.
+ rts_pausing_task = NULL;
+
+ // releaseAllCapabilities will not block because the current task owns all
+ // capabilities.
+ releaseAllCapabilities(n_capabilities, NULL, task);
+ exitMyTask();
+}
+
+// See RtsAPI.h
+bool rts_isPaused(void)
+{
+ return rts_pausing_task != NULL;
+}
+
+// Check that the rts_pause was called on this thread/task and this thread owns
+// all capabilities. If not, outputs an error and exits with EXIT_FAILURE.
+static void assert_isPausedOnMyTask(const char *functionName)
+{
+ Task * task = getMyTask();
+ if (rts_pausing_task == NULL)
+ {
+ errorBelch (
+ "error: %s: the rts is not paused. Did you forget to call rts_pause?",
+ functionName);
+ stg_exit(EXIT_FAILURE);
+ }
+
+ if (task != rts_pausing_task)
+ {
+ // We don't have ownership of rts_pausing_task, so it may have changed
+ // just after the above read. Still, we are garanteed that
+ // rts_pausing_task won't be set to the current task (because the
+ // current task is here now!), so the error messages are still correct.
+ errorBelch (
+ "error: %s: called from a different OS thread than rts_pause.",
+ functionName);
+
+ stg_exit(EXIT_FAILURE);
+ }
+
+ // Check that we own all capabilities.
+ for (unsigned int i = 0; i < n_capabilities; i++)
+ {
+ Capability *cap = capabilities[i];
+ if (cap->running_task != task)
+ {
+ errorBelch (
+ "error: %s: the pausing thread does not own all capabilities.\n"
+ " Have you manually released a capability after calling rts_pause?",
+ functionName);
+ stg_exit(EXIT_FAILURE);
+ }
+ }
+}
+
+
+#else
+Capability * GNU_ATTRIBUTE(__noreturn__)
+rts_pause (void)
+{
+ errorBelch("Warning: Pausing the RTS is only possible for "
+ "multithreaded RTS.");
+ stg_exit(EXIT_FAILURE);
+}
+
+void GNU_ATTRIBUTE(__noreturn__)
+rts_resume (Capability * cap STG_UNUSED)
+{
+ errorBelch("Warning: Resuming the RTS is only possible for "
+ "multithreaded RTS.");
+ stg_exit(EXIT_FAILURE);
+}
+
+bool rts_isPaused()
+{
+ errorBelch("Warning: Pausing/Resuming the RTS is only possible for "
+ "multithreaded RTS.");
+ return false;
+}
+#endif
+
void rts_done (void)
{
freeMyTask();
@@ -680,7 +829,7 @@ void rts_done (void)
void hs_try_putmvar (/* in */ int capability,
/* in */ HsStablePtr mvar)
{
- Task *task = getTask();
+ Task *task = getMyTask();
Capability *cap;
Capability *task_old_cap USED_IF_THREADS;
=====================================
rts/Schedule.c
=====================================
@@ -1411,7 +1411,15 @@ scheduleNeedHeapProfile( bool ready_to_gc )
* -------------------------------------------------------------------------- */
#if defined(THREADED_RTS)
-void stopAllCapabilities (Capability **pCap, Task *task)
+void stopAllCapabilities
+ ( Capability **pCap // [in/out] This thread's task's owned capability.
+ // pCap may be NULL if no capability is owned.
+ // Else *pCap != NULL
+ // On return, set to the task's newly owned
+ // capability (task->cap). Though, the Task will
+ // technically own all capabilities.
+ , Task *task // [in] This thread's task.
+ )
{
stopAllCapabilitiesWith(pCap, task, SYNC_OTHER);
}
@@ -1463,9 +1471,15 @@ void stopAllCapabilitiesWith (Capability **pCap, Task *task, SyncType sync_type)
* -------------------------------------------------------------------------- */
#if defined(THREADED_RTS)
-static bool requestSync (
- Capability **pcap, Task *task, PendingSync *new_sync,
- SyncType *prev_sync_type)
+static bool requestSync
+ ( Capability **pcap // [in/out] This thread's task's owned capability.
+ // May change if there is an existing sync (true is returned).
+ // pcap may be NULL
+ // *pcap != NULL
+ , Task *task // [in] This thread's task.
+ , PendingSync *new_sync // [in] The new requested sync.
+ , SyncType *prev_sync_type // [out] Only set if there is an existing sync (true is returned).
+ )
{
PendingSync *sync;
@@ -1559,7 +1573,7 @@ static void acquireAllCapabilities(Capability *cap, Task *task)
void releaseAllCapabilities(uint32_t n, Capability *keep_cap, Task *task)
{
uint32_t i;
-
+ ASSERT( task != NULL);
for (i = 0; i < n; i++) {
Capability *tmpcap = capabilities[i];
if (keep_cap != tmpcap) {
@@ -2082,7 +2096,7 @@ forkProcess(HsStablePtr *entry
RELEASE_LOCK(&capabilities[i]->lock);
}
- boundTaskExiting(task);
+ exitMyTask();
// just return the pid
return pid;
@@ -2762,7 +2776,7 @@ exitScheduler (bool wait_foreign USED_IF_THREADS)
// debugBelch("n_failed_trygrab_idles = %d, n_idle_caps = %d\n",
// n_failed_trygrab_idles, n_idle_caps);
- boundTaskExiting(task);
+ exitMyTask();
}
void
@@ -2821,7 +2835,7 @@ performGC_(bool force_major)
waitForCapability(&cap,task);
scheduleDoGC(&cap,task,force_major,false);
releaseCapability(cap);
- boundTaskExiting(task);
+ exitMyTask();
}
void
=====================================
rts/Task.c
=====================================
@@ -118,7 +118,7 @@ freeTaskManager (void)
return tasksRunning;
}
-Task* getTask (void)
+Task* getMyTask (void)
{
Task *task;
@@ -306,7 +306,7 @@ newBoundTask (void)
stg_exit(EXIT_FAILURE);
}
- task = getTask();
+ task = getMyTask();
task->stopped = false;
@@ -317,13 +317,12 @@ newBoundTask (void)
}
void
-boundTaskExiting (Task *task)
+exitMyTask (void)
{
+ Task* task = myTask();
#if defined(THREADED_RTS)
ASSERT(osThreadId() == task->id);
#endif
- ASSERT(myTask() == task);
-
endInCall(task);
// Set task->stopped, but only if this is the last call (#4850).
@@ -524,7 +523,7 @@ void rts_setInCallCapability (
int preferred_capability,
int affinity USED_IF_THREADS)
{
- Task *task = getTask();
+ Task *task = getMyTask();
task->preferred_capability = preferred_capability;
#if defined(THREADED_RTS)
@@ -541,7 +540,7 @@ void rts_pinThreadToNumaNode (
{
#if defined(THREADED_RTS)
if (RtsFlags.GcFlags.numa) {
- Task *task = getTask();
+ Task *task = getMyTask();
task->node = capNoToNumaNode(node);
if (!DEBUG_IS_ON || !RtsFlags.DebugFlags.numa) { // faking NUMA
setThreadNode(numa_map[task->node]);
=====================================
rts/Task.h
=====================================
@@ -149,8 +149,8 @@ typedef struct Task_ {
struct InCall_ *spare_incalls;
bool worker; // == true if this is a worker Task
- bool stopped; // == true between newBoundTask and
- // boundTaskExiting, or in a worker Task.
+ bool stopped; // == false between newBoundTask and
+ // exitMyTask, or in a worker Task.
// So that we can detect when a finalizer illegally calls back into Haskell
bool running_finalizers;
@@ -200,9 +200,9 @@ extern Mutex all_tasks_mutex;
void initTaskManager (void);
uint32_t freeTaskManager (void);
-// Create a new Task for a bound thread. This Task must be released
-// by calling boundTaskExiting. The Task is cached in
-// thread-local storage and will remain even after boundTaskExiting()
+// Create a new Task for a bound thread. This Task must be released
+// by calling exitMyTask(). The Task is cached in
+// thread-local storage and will remain even after exitMyTask()
// has been called; to free the memory, see freeMyTask().
//
Task* newBoundTask (void);
@@ -210,11 +210,10 @@ Task* newBoundTask (void);
// Return the current OS thread's Task, which is created if it doesn't already
// exist. After you have finished using RTS APIs, you should call freeMyTask()
// to release this thread's Task.
-Task* getTask (void);
+Task* getMyTask (void);
-// The current task is a bound task that is exiting.
-//
-void boundTaskExiting (Task *task);
+// Exit myTask - This is the counterpart of newBoundTask().
+void exitMyTask (void);
// Free a Task if one was previously allocated by newBoundTask().
// This is not necessary unless the thread that called newBoundTask()
=====================================
rts/sm/NonMoving.c
=====================================
@@ -1215,7 +1215,7 @@ static void nonmovingMark_(MarkQueue *mark_queue, StgWeak **dead_weaks, StgTSO *
#if defined(THREADED_RTS)
finish:
- boundTaskExiting(task);
+ exitMyTask();
// We are done...
mark_thread = 0;
=====================================
testsuite/tests/rts/pause-resume/all.T
=====================================
@@ -0,0 +1,12 @@
+test('pause_resume_via_safe_ffi', [only_ways(threaded_ways),
+ extra_files(['pause_resume.c','pause_resume.h'])],
+ multi_compile_and_run, ['pause_resume_via_safe_ffi', [('pause_resume.c','')], ''])
+test('pause_resume_via_pthread', [only_ways(threaded_ways),
+ extra_files(['pause_resume.c','pause_resume.h'])],
+ multi_compile_and_run, ['pause_resume_via_pthread', [('pause_resume.c','')], ''])
+test('pause_resume_via_safe_ffi_concurrent', [only_ways(threaded_ways),
+ extra_files(['pause_resume.c','pause_resume.h'])],
+ multi_compile_and_run, ['pause_resume_via_safe_ffi_concurrent', [('pause_resume.c','')], ''])
+test('pause_and_use_rts_api', [only_ways(threaded_ways),
+ extra_files(['pause_resume.c','pause_resume.h'])],
+ multi_compile_and_run, ['pause_and_use_rts_api', [('pause_resume.c','')], ''])
=====================================
testsuite/tests/rts/pause-resume/pause_and_use_rts_api.hs
=====================================
@@ -0,0 +1,28 @@
+{-# LANGUAGE ForeignFunctionInterface #-}
+
+import Foreign
+import System.Exit
+import System.Timeout
+
+foreign import ccall safe "pause_resume.h pauseAndUseRtsAPIAndResume"
+ pauseAndUseRtsAPIAndResume
+ :: (StablePtr (Int -> Int))
+ -> Int
+ -> Int
+ -> Int
+ -> (StablePtr (IO Int))
+ -> IO ()
+
+main :: IO ()
+main = do
+ addOne <- newStablePtr ((+1) :: Int -> Int)
+ ioOne <- newStablePtr (return 1 :: IO Int)
+ successMay <- timeout 5000000 $ pauseAndUseRtsAPIAndResume
+ addOne
+ 1
+ 2
+ 3
+ ioOne
+ case successMay of
+ Nothing -> exitFailure
+ Just () -> exitSuccess
=====================================
testsuite/tests/rts/pause-resume/pause_and_use_rts_api.stdout
=====================================
@@ -0,0 +1,34 @@
+Pause the RTS...Paused
+getRTSStats...
+getRTSStatsEnabled...
+getAllocations...
+rts_getSchedStatus...
+rts_getChar, rts_mkChar...
+rts_getInt, rts_mkInt...
+rts_getInt8, rts_mkInt8...
+rts_getInt16, rts_mkInt16...
+rts_getInt32, rts_mkInt32...
+rts_getInt64, rts_mkInt64...
+rts_getWord, rts_mkWord...
+rts_getWord8, rts_mkWord8...
+rts_getWord16, rts_mkWord16...
+rts_getWord32, rts_mkWord32...
+rts_getWord64, rts_mkWord64...
+rts_getPtr, rts_mkPtr...
+rts_getFunPtr, rts_mkFunPtr...
+rts_getFloat, rts_mkFloat...
+rts_getDouble, rts_mkDouble...
+rts_getStablePtr, rts_mkStablePtr...
+rts_getBool, rts_mkBool...
+rts_mkString...
+rts_apply...
+rts_eval...
+rts_eval_...
+rts_evalIO...
+rts_evalStableIOMain...
+rts_evalStableIO...
+rts_evalLazyIO...
+rts_evalLazyIO_...
+rts_setInCallCapability...
+rts_pinThreadToNumaNode...
+Resume the RTS...Resumed
=====================================
testsuite/tests/rts/pause-resume/pause_resume.c
=====================================
@@ -0,0 +1,250 @@
+#include <assert.h>
+#include <stdio.h>
+#include <unistd.h>
+
+#include "Rts.h"
+#include "RtsAPI.h"
+
+#include "pause_resume.h"
+
+void expectNoChange(const char * msg, volatile unsigned int * count);
+void expectChange(const char * msg, volatile unsigned int * count);
+
+// Test rts_pause/rts_resume by observing a count that we expect to be
+// incremented by concurrent Haskell thread(s). We expect rts_pause to stop
+// those threads and hence stop incrementing the count.
+void pauseAndResume
+ ( bool assertNotPaused // [in] True to enable assertions before rts_pause and after rts_resume.
+ // Often disabled when calling this concurrently.
+ , volatile unsigned int * count // [in] Haskell threads should be forever incrementing this.
+ )
+{
+ // Assert the RTS is resumed.
+ if (assertNotPaused)
+ {
+ expectChange("RTS should be running", count);
+ if(rts_isPaused()) {
+ errorBelch("Expected the RTS to be resumed.");
+ exit(1);
+ }
+ }
+
+ // Pause and assert.
+ Capability * cap = rts_pause();
+ if(cap == NULL) {
+ errorBelch("rts_pause() returned NULL.");
+ exit(1);
+ }
+ Capability * cap2 = rts_pause(); // This should have no effect and return immediately.
+ if(cap != cap2) {
+ errorBelch("A second call to rts_pause() returned a different Capability.");
+ exit(1);
+ }
+
+ if(!rts_isPaused()) {
+ errorBelch("Expected the RTS to be paused.");
+ exit(1);
+ }
+
+ expectNoChange("RTS should be paused", count);
+
+ // Resume.
+ rts_resume(cap);
+
+ // Assert the RTS is resumed.
+ if (assertNotPaused)
+ {
+ expectChange("RTS should be resumed", count);
+ if(rts_isPaused()) {
+ errorBelch("Expected the RTS to be resumed.");
+ exit(1);
+ }
+ }
+}
+
+int addOne(int a)
+{
+ return a + 1;
+}
+
+// Pause tht RTS and call all RtsAPI.h functions.
+void pauseAndUseRtsAPIAndResume
+ ( HaskellObj haskellFn // [in] A Haskell function (StablePtr (a -> a))
+ , HaskellObj haskellFnArgument // [in] An argument to apply to haskellFn (a)
+ , HaskellObj obj1 // [in] arbitrary haskell value to evaluate of arbitrary type.
+ , HaskellObj obj2 // [in] arbitrary haskell value to evaluate of arbitrary type.
+ , HsStablePtr stablePtrIO // [in] arbitrary haskell IO action to execute (StablePtr (IO t))
+ )
+{
+ // Pause the RTS.
+ printf("Pause the RTS...");
+ Capability * cap = rts_pause();
+ printf("Paused\n");
+
+ // Note the original capability. We assert that cap is not changed by
+ // functions that take &cap.
+ Capability *const cap0 = cap;
+
+ // Call RtsAPI.h functions
+
+ // TODO print out what funciton is running to give better debug output if one of these deadlocks
+
+ printf("getRTSStats...\n");
+ RTSStats s;
+ getRTSStats (&s);
+ printf("getRTSStatsEnabled...\n");
+ getRTSStatsEnabled();
+ printf("getAllocations...\n");
+ getAllocations();
+ printf("rts_getSchedStatus...\n");
+ rts_getSchedStatus(cap);
+ printf("rts_getChar, rts_mkChar...\n");
+ rts_getChar (rts_mkChar ( cap, 0 ));
+ printf("rts_getInt, rts_mkInt...\n");
+ rts_getInt (rts_mkInt ( cap, 0 ));
+ printf("rts_getInt8, rts_mkInt8...\n");
+ rts_getInt8 (rts_mkInt8 ( cap, 0 ));
+ printf("rts_getInt16, rts_mkInt16...\n");
+ rts_getInt16 (rts_mkInt16 ( cap, 0 ));
+ printf("rts_getInt32, rts_mkInt32...\n");
+ rts_getInt32 (rts_mkInt32 ( cap, 0 ));
+ printf("rts_getInt64, rts_mkInt64...\n");
+ rts_getInt64 (rts_mkInt64 ( cap, 0 ));
+ printf("rts_getWord, rts_mkWord...\n");
+ rts_getWord (rts_mkWord ( cap, 0 ));
+ printf("rts_getWord8, rts_mkWord8...\n");
+ rts_getWord8 (rts_mkWord8 ( cap, 0 ));
+ printf("rts_getWord16, rts_mkWord16...\n");
+ rts_getWord16 (rts_mkWord16 ( cap, 0 ));
+ printf("rts_getWord32, rts_mkWord32...\n");
+ rts_getWord32 (rts_mkWord32 ( cap, 0 ));
+ printf("rts_getWord64, rts_mkWord64...\n");
+ rts_getWord64 (rts_mkWord64 ( cap, 0 ));
+ printf("rts_getPtr, rts_mkPtr...\n");
+ int x = 0;
+ rts_getPtr (rts_mkPtr ( cap, &x));
+ printf("rts_getFunPtr, rts_mkFunPtr...\n");
+ rts_getFunPtr (rts_mkFunPtr ( cap, &addOne ));
+ printf("rts_getFloat, rts_mkFloat...\n");
+ rts_getFloat (rts_mkFloat ( cap, 0.0 ));
+ printf("rts_getDouble, rts_mkDouble...\n");
+ rts_getDouble (rts_mkDouble ( cap, 0.0 ));
+ printf("rts_getStablePtr, rts_mkStablePtr...\n");
+ rts_getStablePtr (rts_mkStablePtr ( cap, &x ));
+ printf("rts_getBool, rts_mkBool...\n");
+ rts_getBool (rts_mkBool ( cap, 0 ));
+ printf("rts_mkString...\n");
+ rts_mkString ( cap, "Hello ghc-debug!" );
+ printf("rts_apply...\n");
+ rts_apply ( cap, deRefStablePtr(haskellFn), haskellFnArgument );
+
+ printf("rts_eval...\n");
+ HaskellObj ret;
+ rts_eval(&cap, obj1, &ret);
+ assert(cap == cap0);
+
+ printf("rts_eval_...\n");
+ rts_eval_ (&cap, obj2, 50, &ret);
+ assert(cap == cap0);
+
+ printf("rts_evalIO...\n");
+ HaskellObj io = deRefStablePtr(stablePtrIO);
+ rts_evalIO (&cap, io, &ret);
+ assert(cap == cap0);
+
+ printf("rts_evalStableIOMain...\n");
+ HsStablePtr retStablePtr;
+ rts_evalStableIOMain (&cap, stablePtrIO, &retStablePtr);
+ assert(cap == cap0);
+
+ printf("rts_evalStableIO...\n");
+ rts_evalStableIO (&cap, stablePtrIO, &retStablePtr);
+ assert(cap == cap0);
+
+ printf("rts_evalLazyIO...\n");
+ rts_evalLazyIO (&cap, io, &ret);
+ assert(cap == cap0);
+
+ printf("rts_evalLazyIO_...\n");
+ rts_evalLazyIO_ (&cap, io, 50, &ret);
+ assert(cap == cap0);
+
+ printf("rts_setInCallCapability...\n");
+ rts_setInCallCapability (0, 1);
+ printf("rts_pinThreadToNumaNode...\n");
+ rts_pinThreadToNumaNode (0);
+
+ // Resume the RTS.
+ printf("Resume the RTS...");
+ rts_resume(cap);
+ assert(cap == cap0);
+ printf("Resumed\n");
+}
+
+void* pauseAndResumeViaThread_helper(volatile unsigned int * count)
+{
+ pauseAndResume(false, count);
+ return NULL;
+}
+
+// Call pauseAndResume via a new thread and return the thread ID.
+unsigned long pauseAndResumeViaThread
+ ( volatile unsigned int * count // [in] Haskell threads should be forever incrementing this.
+ )
+{
+ pthread_t threadId;
+ pthread_create(&threadId, NULL, &pauseAndResumeViaThread_helper, count);
+ return threadId;
+}
+
+const int TIMEOUT = 1000000; // 1 second
+
+// Wait for &count to change (else exit(1) after TIMEOUT).
+void expectChange(const char * msg, volatile unsigned int * count)
+{
+ unsigned int count_0 = *count;
+ int microSecondsLeft = TIMEOUT;
+ unsigned int sleepTime = 10000;
+ while (true)
+ {
+ usleep(sleepTime);
+ microSecondsLeft -= sleepTime;
+
+ if (count_0 != *count)
+ {
+ // Change detected.
+ return;
+ }
+
+ if (microSecondsLeft < 0)
+ {
+ printf("Expected: %s\n", msg);
+ exit(1);
+ }
+ }
+}
+
+// Ensure &count does NOT change (for TIMEOUT else exit(1)).
+void expectNoChange(const char * msg, volatile unsigned int * count)
+{
+ unsigned int count_0 = *count;
+ int microSecondsLeft = TIMEOUT;
+ unsigned int sleepTime = 10000;
+ while (true)
+ {
+ usleep(sleepTime);
+ microSecondsLeft -= sleepTime;
+
+ if (count_0 != *count)
+ {
+ // Change detected.
+ printf("Expected: %s\n", msg);
+ exit(1);
+ }
+
+ if (microSecondsLeft < 0)
+ {
+ return;
+ }
+ }
+}
=====================================
testsuite/tests/rts/pause-resume/pause_resume.h
=====================================
@@ -0,0 +1,10 @@
+
+void pauseAndResume(bool assertNotPaused, volatile unsigned int * count);
+unsigned long pauseAndResumeViaThread(volatile unsigned int * count);
+void pauseAndUseRtsAPIAndResume
+ ( HaskellObj haskellFn
+ , HaskellObj haskellFnArgument
+ , HaskellObj obj1
+ , HaskellObj obj2
+ , HsStablePtr stablePtrIO
+ );
=====================================
testsuite/tests/rts/pause-resume/pause_resume_via_pthread.hs
=====================================
@@ -0,0 +1,38 @@
+{-# LANGUAGE ForeignFunctionInterface #-}
+
+import Control.Concurrent
+import Control.Concurrent.MVar
+import Control.Monad
+import Foreign.C.Types
+import Foreign.Marshal.Alloc
+import Foreign.Ptr
+import Foreign.Storable
+
+foreign import ccall safe "pause_resume.h pauseAndResumeViaThread"
+ safe_pauseAndResumeViaThread_c :: Ptr CUInt -> IO CULong
+
+foreign import ccall safe "pthread.h pthread_join"
+ safe_pthread_join_c :: CULong -> IO ()
+
+-- Simple test of rts_pause() followed by rts_resume() via a new thread created
+-- in c code.
+main :: IO ()
+main = do
+ alloca $ \countPtr -> do
+ poke countPtr 0
+
+ -- forever increment count. Changes will be observed from the c code.
+ sequence_ $ replicate 4 $ forkIO $ forever $ do
+ count <- peek countPtr
+ poke countPtr (count + 1)
+ threadDelay 10000 -- 10 milliseconds
+
+ -- Test rts_pause/rts_resume.
+ safe_pthread_join_c =<< safe_pauseAndResumeViaThread_c countPtr
+
+ -- Test rts_pause/rts_resume from a unbound (worker) thread.
+ mvar <- newEmptyMVar
+ forkIO $ do
+ safe_pthread_join_c =<< safe_pauseAndResumeViaThread_c countPtr
+ putMVar mvar ()
+ takeMVar mvar
=====================================
testsuite/tests/rts/pause-resume/pause_resume_via_safe_ffi.hs
=====================================
@@ -0,0 +1,38 @@
+{-# LANGUAGE ForeignFunctionInterface #-}
+
+import Control.Concurrent
+import Control.Concurrent.MVar
+import Control.Monad
+import Foreign.C.Types
+import Foreign.Marshal.Alloc
+import Foreign.Ptr
+import Foreign.Storable
+import GHC.Stack
+
+foreign import ccall safe "pause_resume.h pauseAndResume"
+ safe_pauseAndResume_c :: CBool -> Ptr CUInt -> IO ()
+
+-- Simple test of rts_pause() followed by rts_resume()
+main :: IO ()
+main = do
+ alloca $ \countPtr -> do
+ poke countPtr 0
+
+ -- forever increment count. Changes will be observed from the c code.
+ sequence_ $ replicate 4 $ forkIO $ forever $ do
+ count <- peek countPtr
+ poke countPtr (count + 1)
+ threadDelay 10000 -- 10 milliseconds
+
+ -- Test rts_pause/rts_resume.
+ safe_pauseAndResume_c cTrue countPtr
+
+ -- Test rts_pause/rts_resume from a unbound (worker) thread.
+ mvar <- newEmptyMVar
+ forkIO $ do
+ safe_pauseAndResume_c cTrue countPtr
+ putMVar mvar ()
+ takeMVar mvar
+
+cTrue :: CBool
+cTrue = 1
=====================================
testsuite/tests/rts/pause-resume/pause_resume_via_safe_ffi_concurrent.hs
=====================================
@@ -0,0 +1,52 @@
+{-# LANGUAGE ForeignFunctionInterface #-}
+
+import Control.Concurrent
+import Control.Concurrent.MVar
+import Control.Monad
+import Foreign.C.Types
+import Foreign.Marshal.Alloc
+import Foreign.Ptr
+import Foreign.Storable
+import System.Exit
+import System.Timeout
+
+foreign import ccall safe "pause_resume.h pauseAndResume"
+ safe_pauseAndResume_c :: CBool -> Ptr CUInt -> IO ()
+
+-- Test that concurrent calls to rts_pause()/rts_resume() doesn't cause deadlock.
+main :: IO ()
+main = do
+ alloca $ \countPtr -> do
+ poke countPtr 0
+
+ -- forever increment count. Changes will be observed from the c code.
+ sequence_ $ replicate 4 $ forkIO $ forever $ do
+ count <- peek countPtr
+ poke countPtr (count + 1)
+ threadDelay 10000 -- 10 milliseconds
+
+ -- Note that each call blocks for about a second, so this will take 5
+ -- seconds to complete.
+ let n = 5
+ mvars <- sequence $ replicate n newEmptyMVar
+ forM_ mvars $ \mvar -> forkIO $ do
+ safe_pauseAndResume_c
+ -- Don't check rts_isPaused() before rts_pause nore after rts_resume
+ -- because we're doing this concurrently so that would introduce a race
+ -- condition.
+ cFalse
+ countPtr
+ putMVar mvar ()
+
+ -- Wait (at least 2n seconds to be safe) for all threads to finish.
+ result <- timeout (2 * n * 1000000) (mapM_ takeMVar mvars)
+ case result of
+ Nothing -> do
+ putStrLn "Not all rts_pause/rts_resume threads have finished. Assuming deadlocked and failing test."
+ exitFailure
+ Just () -> do
+ putStrLn "All threads finished"
+ exitSuccess
+
+cFalse :: CBool
+cFalse = 0
=====================================
testsuite/tests/rts/pause-resume/pause_resume_via_safe_ffi_concurrent.stdout
=====================================
@@ -0,0 +1 @@
+All threads finished
=====================================
testsuite/tests/rts/pause-resume/shouldfail/all.T
=====================================
@@ -0,0 +1,20 @@
+
+test('unsafe_rts_pause', [only_ways(threaded_ways), exit_code(1)], compile_and_run, [''])
+test('rts_lock_when_paused',
+ [ only_ways(threaded_ways),
+ exit_code(1),
+ extra_files(['rts_pause_lock.c','rts_pause_lock.h'])
+ ],
+ multi_compile_and_run, ['rts_lock_when_paused', [('rts_pause_lock.c','')], ''])
+test('rts_pause_when_locked',
+ [ only_ways(threaded_ways),
+ exit_code(1),
+ extra_files(['rts_pause_lock.c','rts_pause_lock.h'])
+ ],
+ multi_compile_and_run, ['rts_pause_when_locked', [('rts_pause_lock.c','')], ''])
+test('rts_double_pause',
+ [ only_ways(threaded_ways),
+ exit_code(1),
+ extra_files(['rts_pause_lock.c','rts_pause_lock.h'])
+ ],
+ multi_compile_and_run, ['rts_double_pause', [('rts_pause_lock.c','')], ''])
=====================================
testsuite/tests/rts/pause-resume/shouldfail/rts_double_pause.hs
=====================================
@@ -0,0 +1,23 @@
+{-# LANGUAGE ForeignFunctionInterface #-}
+
+import Control.Concurrent
+import Foreign
+import Foreign.C
+import System.Exit
+import System.Timeout
+
+foreign import ccall safe "rts_pause_lock.h assertDoneAfterOneSecond"
+ safe_assertDoneAfterOneSecond_c :: Ptr CInt -> IO ()
+
+foreign import ccall safe "rts_pause_lock.h doublePause"
+ safe_doublePause_c :: Ptr CInt -> IO ()
+
+main :: IO ()
+main = alloca $ \donePtr -> do
+ -- We don't expect a deadlock, but we want to avoid one in the case of a
+ -- failed test.
+ poke donePtr 0
+ forkIO $ safe_assertDoneAfterOneSecond_c donePtr
+
+ -- The actual test.
+ safe_doublePause_c donePtr
=====================================
testsuite/tests/rts/pause-resume/shouldfail/rts_double_pause.stderr
=====================================
@@ -0,0 +1 @@
+rts_double_pause: error: rts_pause: This thread has already paused the RTS.
=====================================
testsuite/tests/rts/pause-resume/shouldfail/rts_double_pause.stdout
=====================================
@@ -0,0 +1,2 @@
+Pausing...Paused
+Pausing...
\ No newline at end of file
=====================================
testsuite/tests/rts/pause-resume/shouldfail/rts_lock_when_paused.hs
=====================================
@@ -0,0 +1,23 @@
+{-# LANGUAGE ForeignFunctionInterface #-}
+
+import Control.Concurrent
+import Foreign
+import Foreign.C
+import System.Exit
+import System.Timeout
+
+foreign import ccall safe "rts_pause_lock.h assertDoneAfterOneSecond"
+ safe_assertDoneAfterOneSecond_c :: Ptr CInt -> IO ()
+
+foreign import ccall safe "rts_pause_lock.h lockThenPause"
+ safe_lockThenPause_c :: Ptr CInt -> IO ()
+
+main :: IO ()
+main = alloca $ \donePtr -> do
+ -- We don't expect a deadlock, but we want to avoid one in the case of a
+ -- failed test.
+ poke donePtr 0
+ forkIO $ safe_assertDoneAfterOneSecond_c donePtr
+
+ -- The actual test.
+ safe_lockThenPause_c donePtr
=====================================
testsuite/tests/rts/pause-resume/shouldfail/rts_lock_when_paused.stderr
=====================================
@@ -0,0 +1,2 @@
+rts_lock_when_paused: error: rts_pause: attempting to pause from a Task that owns a capability.
+ Have you already acquired a capability e.g. with rts_lock?
=====================================
testsuite/tests/rts/pause-resume/shouldfail/rts_lock_when_paused.stdout
=====================================
@@ -0,0 +1,2 @@
+Locking...Locked
+Pausing...
=====================================
testsuite/tests/rts/pause-resume/shouldfail/rts_pause_lock.c
=====================================
@@ -0,0 +1,79 @@
+#include <stdio.h>
+#include <unistd.h>
+
+#include "Rts.h"
+#include "RtsAPI.h"
+
+#include "rts_pause_lock.h"
+
+// Although we expect errors rather than deadlock, we don't want a failed test
+// to be a deadlocked test. Hence we use this as a 1 second timeout mechanism.
+void assertDoneAfterOneSecond(int * done)
+{
+ sleep(1);
+ if (!*done)
+ {
+ printf("Deadlock detected.");
+ exit(1);
+ }
+}
+
+void lockThenPause (int * done) {
+ printf("Locking...");
+ Capability * lockCap = rts_lock();
+ printf("Locked\n");
+
+ printf("Pausing...");
+ Capability * pauseCap = rts_pause();
+ printf("Paused\n");
+
+ printf("Resuming...");
+ rts_resume(pauseCap);
+ printf("Resumed\n");
+
+ printf("Unlocking...");
+ rts_unlock(lockCap);
+ printf("Unlocked\n");
+
+ *done = 1;
+}
+
+void pauseThenLock (int * done) {
+ printf("Pausing...");
+ Capability * pauseCap = rts_pause();
+ printf("Paused\n");
+
+ printf("Locking...");
+ Capability * lockCap = rts_lock();
+ printf("Locked\n");
+
+ printf("Unlocking...");
+ rts_unlock(lockCap);
+ printf("Unlocked\n");
+
+ printf("Resuming...");
+ rts_resume(pauseCap);
+ printf("Resumed\n");
+
+ *done = 1;
+}
+
+void doublePause (int * done) {
+ printf("Pausing...");
+ Capability * pauseCapA = rts_pause();
+ printf("Paused\n");
+
+ printf("Pausing...");
+ Capability * pauseCapB = rts_pause();
+ printf("Paused\n");
+
+ printf("Resuming...");
+ rts_unlock(pauseCapA);
+ printf("Resuming\n");
+
+ printf("Resuming...");
+ rts_resume(pauseCapB);
+ printf("Resumed\n");
+
+ *done = 1;
+}
=====================================
testsuite/tests/rts/pause-resume/shouldfail/rts_pause_lock.h
=====================================
@@ -0,0 +1,5 @@
+
+void assertDoneAfterOneSecond(int * done);
+void lockThenPause (int * done);
+void pauseThenLock (int * done);
+void doublePause (int * done);
=====================================
testsuite/tests/rts/pause-resume/shouldfail/rts_pause_when_locked.hs
=====================================
@@ -0,0 +1,23 @@
+{-# LANGUAGE ForeignFunctionInterface #-}
+
+import Control.Concurrent
+import Foreign
+import Foreign.C
+import System.Exit
+import System.Timeout
+
+foreign import ccall safe "rts_pause_lock.h assertDoneAfterOneSecond"
+ safe_assertDoneAfterOneSecond_c :: Ptr CInt -> IO ()
+
+foreign import ccall safe "rts_pause_lock.h pauseThenLock"
+ safe_pauseThenLock_c :: Ptr CInt -> IO ()
+
+main :: IO ()
+main = alloca $ \donePtr -> do
+ -- We don't expect a deadlock, but we want to avoid one in the case of a
+ -- failed test.
+ poke donePtr 0
+ forkIO $ safe_assertDoneAfterOneSecond_c donePtr
+
+ -- The actual test.
+ safe_pauseThenLock_c donePtr
=====================================
testsuite/tests/rts/pause-resume/shouldfail/rts_pause_when_locked.stderr
=====================================
@@ -0,0 +1,2 @@
+rts_pause_when_locked: error: rts_lock: The RTS is already paused by this thread.
+ There is no need to call rts_lock if you have already call rts_pause.
=====================================
testsuite/tests/rts/pause-resume/shouldfail/rts_pause_when_locked.stdout
=====================================
@@ -0,0 +1,2 @@
+Pausing...Paused
+Locking...
=====================================
testsuite/tests/rts/pause-resume/shouldfail/unsafe_rts_pause.hs
=====================================
@@ -0,0 +1,21 @@
+{-# LANGUAGE ForeignFunctionInterface #-}
+
+import Data.Word
+import Data.IORef
+import GHC.Clock
+import Control.Concurrent
+import Foreign.Ptr
+import System.Mem
+import Control.Monad
+
+data Capability
+
+foreign import ccall unsafe "RtsAPI.h rts_pause"
+ unsafe_rts_pause_c :: IO (Ptr Capability)
+
+main :: IO ()
+main = do
+ -- Making a unsafe call to rts_pause() should fail. We cannot allow this
+ -- haskell thread to continue if the RTS is paused.
+ _ <- unsafe_rts_pause_c
+ putStrLn "Oops! Haskell thread has continued even though RTS was paused."
=====================================
testsuite/tests/rts/pause-resume/shouldfail/unsafe_rts_pause.stderr
=====================================
@@ -0,0 +1,2 @@
+unsafe_rts_pause: error: rts_pause: attempting to pause via an unsafe FFI call.
+ Perhaps a 'foreign import unsafe' should be 'safe'?
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/12aec424f9f89ea2ff1dbd330686bd51240df92f
--
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/12aec424f9f89ea2ff1dbd330686bd51240df92f
You're receiving this email because of your account on gitlab.haskell.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20201020/37cdd157/attachment-0001.html>
More information about the ghc-commits
mailing list