Native Threads in the RTS

Wolfgang Thaller wolfgang.thaller@gmx.net
Wed, 20 Nov 2002 00:19:38 +0100


I've now written up a slightly more formal proposal for native threads. 
(OK, it's only a tiny bit more formal...)
I doubt I have explained everything clearly, so please tell me which 
points are unclear. And of course please tell me what you like/don't 
like about it.
I have some rough ideas on how to implement the proposal. I would be 
ready to invest some time, but I don't have enough free time to make 
any promises here. The discussion has to be finished first, anyway.

Cheers,
Wolfgang

*******************
Native Threads Proposal, version 1

Some "foreign" libraries (for example OpenGL) rely on a mechanism 
called thread-local storage, so the meaning of an OpenGL call usually 
depends on which OS thread it is called from. Some kind of direct 
mapping from Haskell threads to OS threads is therefore necessary in 
order to use the affected foreign libraries.
Executing every Haskell thread in its own OS thread is not feasible for 
performance reasons. However, the performance of native OS threads is 
not too bad as long as there aren't too many, so I propose that some 
threads get their own OS threads, and some don't:

Every Haskell thread can be either a "green" thread or a "native" 
thread.
For each "native" thread, there is exactly one OS thread created by the 
RTS. For a green thread, it is unspecified which OS thread it is 
executed in.
The main program and all Haskell threads forked using forkIO are green 
threads. Threads forked using forkNativeThread :: IO () -> IO () are 
native threads.
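
To make the intended use a bit more concrete, here is a rough sketch. 
forkNativeThread does not exist; the definition below is only a 
stand-in built on forkOS, isCurrentThreadBound and 
rtsSupportsBoundThreads from GHC's Control.Concurrent, on the 
assumption that a "bound" thread there behaves like a "native" thread 
here. reportBound is a made-up helper for the illustration.

```haskell
import Control.Concurrent
  ( forkIO, forkOS, isCurrentThreadBound, rtsSupportsBoundThreads )
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (void)

-- Hypothetical stand-in for the proposed primitive.
-- Assumption: forkOS approximates "native"; the ThreadId is dropped
-- to match the proposed type IO () -> IO ().
forkNativeThread :: IO () -> IO ()
forkNativeThread act
  | rtsSupportsBoundThreads = void (forkOS act)
  | otherwise               = void (forkIO act) -- no dedicated OS thread available

-- Fork an action with the given fork function and report whether the
-- new Haskell thread is tied to one particular OS thread.
reportBound :: (IO () -> IO ()) -> IO Bool
reportBound fork = do
  box <- newEmptyMVar
  fork (isCurrentThreadBound >>= putMVar box)
  takeMVar box
```

With this, reportBound (void . forkIO) asks the question for a green 
thread and reportBound forkNativeThread asks it for a native one. The 
second answer depends on whether the program is linked against a 
runtime that supports multiple OS threads.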

Execution of a green thread might move from one OS thread to another at 
any time. A "green" thread is never executed in an OS thread that is 
reserved for a "native" thread.
A "native" Haskell thread and all foreign imported functions that it 
calls are executed in its associated OS thread. A foreign exported 
callback that is called from C code executing in that OS thread is 
executed in the native Haskell thread.
A foreign exported callback that is called from C code executing in an 
OS thread that is not associated with a "native" Haskell thread is 
executed in a new green Haskell thread.
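
The placement rules above can be written down as a small pure model. 
This is only a sketch of the rules as stated, not RTS code; OSThread, 
HaskellThread, callbackThread and greenMayRunOn are names made up for 
the illustration.

```haskell
-- Hypothetical model types; none of these names exist in the RTS.
newtype OSThread = OSThread Int
  deriving (Eq, Show)

data HaskellThread
  = Green            -- OS thread unspecified, may change at any time
  | Native OSThread  -- always executes in this one OS thread
  deriving (Eq, Show)

-- Which Haskell thread executes a foreign exported callback, given the
-- OS threads reserved for native threads and the OS thread the C code
-- is calling from.
callbackThread :: [OSThread] -> OSThread -> HaskellThread
callbackThread reserved caller
  | caller `elem` reserved = Native caller -- runs in the existing native thread
  | otherwise              = Green         -- a new green thread is created

-- A green thread is never executed in an OS thread that is reserved
-- for a native thread.
greenMayRunOn :: [OSThread] -> OSThread -> Bool
greenMayRunOn reserved os = os `notElem` reserved
```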

Only one OS thread can execute Haskell code at any given time.
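
One way to picture this restriction (purely an assumption about a 
possible implementation, nothing in the proposal requires it) is a 
single lock that an OS thread must hold while it runs Haskell code and 
gives up for the duration of a foreign call that lets other threads 
run. HaskellLock, newHaskellLock, runHaskell and releasingFor are 
made-up names.

```haskell
import Control.Concurrent.MVar (MVar, newMVar, putMVar, takeMVar, withMVar)
import Control.Exception (bracket_)

-- Hypothetical lock: a full MVar means no OS thread is running Haskell code.
newtype HaskellLock = HaskellLock (MVar ())

newHaskellLock :: IO HaskellLock
newHaskellLock = HaskellLock <$> newMVar ()

-- Run Haskell code while holding the lock, so at most one OS thread
-- is inside runHaskell at any given time.
runHaskell :: HaskellLock -> IO a -> IO a
runHaskell (HaskellLock m) act = withMVar m (const act)

-- Give up the lock around a foreign call and take it back afterwards.
-- Must only be used inside runHaskell with the same lock.
releasingFor :: HaskellLock -> IO a -> IO a
releasingFor (HaskellLock m) call = bracket_ (putMVar m ()) (takeMVar m) call
```

A thread would wrap its Haskell work in runHaskell and wrap each 
foreign call that lets other threads continue in releasingFor.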

If a "native" Haskell thread enters a foreign imported function that is 
marked as "safe" or "threadsafe", all other Haskell threads keep 
running. If the imported function is marked as "unsafe", no other 
threads are executed until the call finishes.

If a "green" Haskell thread enters a foreign imported function marked 
as "threadsafe", a new OS thread is spawned that keeps executing other 
green Haskell threads while the foreign function executes. Native 
Haskell threads continue to run in their own OS threads.
If a "green" Haskell thread enters a foreign imported function marked 
as "safe", all other green threads are blocked. Native Haskell threads 
continue to run in their own OS threads. If the imported function is 
marked as "unsafe", no other threads are executed until the call 
finishes.
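
The six cases above fit into one small table, sketched here as a pure 
function. All names (Kind, Safety, Effect, duringForeignCall) are made 
up for the illustration. spawnsOSThread is set only where the text 
explicitly says a new OS thread is spawned; setting it to False 
elsewhere is my assumption.

```haskell
data Kind = Green | Native
  deriving (Eq, Show)

data Safety = Unsafe | Safe | Threadsafe
  deriving (Eq, Show)

-- What happens to the rest of the program while one thread is inside
-- a foreign imported function.
data Effect = Effect
  { otherGreensRun  :: Bool -- do other green threads keep running?
  , otherNativesRun :: Bool -- do other native threads keep running?
  , spawnsOSThread  :: Bool -- is a new OS thread spawned for the green threads?
  } deriving (Eq, Show)

duringForeignCall :: Kind -> Safety -> Effect
duringForeignCall _      Unsafe     = Effect False False False -- everything waits
duringForeignCall Native _          = Effect True  True  False -- safe or threadsafe
duringForeignCall Green  Threadsafe = Effect True  True  True
duringForeignCall Green  Safe       = Effect False True  False -- only green threads block
```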

Finalizers are always run in green threads.

Issues deliberately not addressed in this proposal:
Some people may want to run several Haskell threads in a dedicated OS 
thread (this is what has been called "thread groups" before).
Some people may want to run finalizers in specific OS threads (are 
finalizers predictable enough for this to be useful?).
Everyone would want SMP if it came for free (but SMP seems to be too 
hard to do at the moment...)

Other things I'm not sure about:
What should we do if a foreign function spawns a new OS thread and 
executes a Haskell callback in that OS thread? Should a new native 
Haskell thread that executes in the OS thread be created? Should the 
new OS thread be blocked and the callback executed in a green thread? 
What does the current threaded RTS do? (I assume the non-threaded RTS 
will just crash?)