Native Threads in the RTS

Wolfgang Thaller wolfgang.thaller@gmx.net
Wed, 27 Nov 2002 18:31:24 +0100


Simon Marlow wrote:

> I don't see the problem with forking a new Haskell thread for each
> foreign export, and associating it with the current native thread if
> the foreign export is marked "bound".  It does mean we can get multiple
> Haskell threads bound to the same native thread, but only one can be
> runnable at any one time (this is an important invariant from the point
> of view of the implementation, I believe).

Of course, you're right.

Simon Peyton-Jones wrote:

> I offer myself as such a guinea pig.  I'm afraid I don't understand it
> yet.  It's hard to describe precisely.   Comments below.

Yes, I do need a guinea pig ;-) . I really have trouble expressing my 
ideas accurately, and I keep changing them.

> Better start with some definitions, for
>     'Haskell thread'
> and    'native thread'
>
> I think I know what you mean, but better to be sure.

Well, yes. I haven't managed to come up with a decent definition yet.
I intend "native thread" to be the thing you get using pthread_create
on Unix, and "Haskell thread" to be the same thing as it currently is
in the GHC RTS. Can anyone think of a definition that is accurate,
general and understandable?

> | If there is one, this Haskell thread is used to execute the callback.
>
> OK, so this is where I get completely confused.  A Haskell thread is
> not (currently) an execution platform.

Oops... The best way to confuse other people is to be confused yourself 
;-) --- I had a slight misconception about the RTS there --- I think 
I've corrected that.

===============================

Threads Proposal, version 4

Goals
~~~~~

Since foreign libraries sometimes exploit thread-local state, it is
necessary to provide some control over which thread is used to execute
foreign code.  In particular, it should be possible for Haskell code
to arrange that a sequence of calls to a given library is performed by
the same native thread, and that if an external library calls into
Haskell, any outgoing calls from Haskell are performed by that same
native thread.

This specification is intended to be implementable both by
multithreaded Haskell implementations and by single-threaded
implementations and so it does not comment on which particular OS
thread is used to execute Haskell code.

Definitions
~~~~~~~~~~~
A native thread (aka OS thread) is a thread as defined by the operating 
system.
A Haskell thread is [*** FIXME - How shall I put this? ***] the thing
you see from Haskell land.

Design
~~~~~~

Haskell threads may be associated at thread creation time with either
zero or one native threads. Each native thread is associated with zero
or more Haskell threads.

If a native thread is associated with one or more Haskell threads, 
exactly one of the following must be true:
*) Exactly one Haskell thread associated with the native thread is 
executing.
*) The native thread is executing foreign code.
*) The native thread and all Haskell threads associated with it are 
blocked.
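
To make the three cases concrete, here is a small executable model of the
invariant. It is only a sketch: the types and the function invariantHolds
are invented for illustration and are not RTS code. It also assumes that
while the native thread runs foreign code, none of its Haskell threads is
executing, which the text above implies but does not spell out.

```haskell
-- Toy model of the invariant; every name here is hypothetical.

-- What the native thread itself is doing.
data NativeState = RunningHaskell | RunningForeign | NativeBlocked
  deriving (Eq, Show)

-- What a Haskell thread associated with that native thread is doing.
data HaskellState = Executing | Blocked
  deriving (Eq, Show)

-- True when exactly one of the three cases applies.
-- A native thread with no associated Haskell threads is unconstrained.
invariantHolds :: NativeState -> [HaskellState] -> Bool
invariantHolds _ [] = True
invariantHolds native hs = length (filter id [case1, case2, case3]) == 1
  where
    executing = length (filter (== Executing) hs)
    -- Exactly one associated Haskell thread is executing.
    case1 = native == RunningHaskell && executing == 1
    -- The native thread is executing foreign code (assumed: no Haskell
    -- thread of this native thread executes at the same time).
    case2 = native == RunningForeign && executing == 0
    -- The native thread and all its Haskell threads are blocked.
    case3 = native == NativeBlocked && all (== Blocked) hs

main :: IO ()
main = mapM_ (print . uncurry invariantHolds)
  [ (RunningHaskell, [Executing, Blocked])
  , (RunningForeign, [Blocked, Blocked])
  , (NativeBlocked,  [Blocked])
  , (RunningHaskell, [Executing, Executing])
  ]
```

The last sample state has two executing Haskell threads on one native
thread, which is exactly what the invariant rules out.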

The thread that main runs in, threads created using forkIO and threads
created for running finalizers or signal handlers are not necessarily
associated with a native thread. However, an implementation may choose
to associate them with one.

There are now two kinds of foreign exported [and foreign import 
wrapped] functions: bound and free. The FFI syntax should be extended 
appropriately [which of the two should be the default, if any?].

When a "bound" foreign exported function is invoked [by foreign code],
a new Haskell thread is created and associated with the calling native
thread. The new associated Haskell thread is then used to execute the
callback.

When a "free" foreign exported function is invoked, the implementation 
may freely choose what kind of Haskell thread the function is executed 
in. It is not specified whether this thread is associated with a 
particular OS thread or not.

When a foreign imported function is invoked [by Haskell code], the 
foreign code is executed in the native thread associated with the 
current Haskell thread, if an association exists. If the current
Haskell thread is not associated with a native thread, the
implementation may freely decide which thread to run the foreign
function in.
The existing distinction between unsafe, safe and threadsafe calls 
remains unchanged.
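
The rule for foreign imports can be written down as a tiny decision
function. This is a sketch under assumptions: NativeThread, Association,
CallSite and foreignCallSite are hypothetical names made up for the
example, and a real implementation would of course act on the decision
rather than just return it.

```haskell
-- Toy model of where a foreign import runs; all names are hypothetical.

-- Stand-in for an OS thread identity.
newtype NativeThread = NativeThread Int
  deriving (Eq, Show)

-- A Haskell thread is associated with zero or one native threads.
type Association = Maybe NativeThread

-- Where the foreign code ends up running.
data CallSite
  = OnAssociatedThread NativeThread  -- must run in this native thread
  | ImplementationChoice             -- left unspecified by the proposal
  deriving (Eq, Show)

-- Decide the call site from the calling Haskell thread's association.
foreignCallSite :: Association -> CallSite
foreignCallSite (Just native) = OnAssociatedThread native
foreignCallSite Nothing       = ImplementationChoice

main :: IO ()
main = mapM_ (print . foreignCallSite) [Just (NativeThread 1), Nothing]
```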

A new library routine, forkNativeThread :: IO () -> IO ThreadId, should
spawn a new Haskell thread (like forkIO) and associate it with a new
native thread (forkIO is not guaranteed to do this). It may be
implemented using the FFI and an OS-specific thread creation routine. 
It would just pass a "bound" callback as an entry point for a new OS 
thread.
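
For readers who want to experiment: GHC's Control.Concurrent module
exports forkOS, isCurrentThreadBound and rtsSupportsBoundThreads, and
forkOS has the same type as the routine proposed here. The sketch below
uses them to stand in for forkNativeThread; it is an illustration, not
the implementation described above. The names forkNativeThread (as
defined here) and boundInside are hypothetical, and the fallback to
forkIO on an RTS without bound-thread support (for example, one not
linked with -threaded) is an assumption made so the example runs
everywhere; in that case no native thread is actually associated.

```haskell
import Control.Concurrent
  ( ThreadId, forkIO, forkOS, isCurrentThreadBound
  , rtsSupportsBoundThreads, newEmptyMVar, putMVar, takeMVar )

-- Hypothetical stand-in for the proposed forkNativeThread.
-- Falls back to forkIO when the RTS cannot create bound threads.
forkNativeThread :: IO () -> IO ThreadId
forkNativeThread action
  | rtsSupportsBoundThreads = forkOS action
  | otherwise               = forkIO action

-- Ask, from inside a thread created by the given fork function,
-- whether that thread is bound to a native thread.
boundInside :: (IO () -> IO ThreadId) -> IO Bool
boundInside fork = do
  result <- newEmptyMVar
  _ <- fork (isCurrentThreadBound >>= putMVar result)
  takeMVar result

main :: IO ()
main = do
  viaNative <- boundInside forkNativeThread
  viaForkIO <- boundInside forkIO
  putStrLn ("forkNativeThread bound: " ++ show viaNative)
  putStrLn ("forkIO bound:           " ++ show viaForkIO)
```

Compile with ghc -threaded to get a real bound thread from the first
call; a thread created with forkIO is never bound.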

Issues
~~~~~~

Finalizers and signal handlers cannot be associated with a particular 
native thread. If they have to trigger an action in a particular native 
thread, a message has to be sent manually (via MVars and friends) to 
the Haskell thread associated with the native thread in question.

This introduces a change in the syntax for foreign export and foreign 
import "wrapper" declarations (a bound/free specifier is added). I 
think we should have a default option here. I'm not sure which, 
however. Also, the objection that "bound" and "free" can be confused 
with the lambda calculus terms still holds.