Native Threads in the RTS
Wolfgang Thaller
wolfgang.thaller@gmx.net
Wed, 27 Nov 2002 18:31:24 +0100
Simon Marlow wrote:
> I don't see the problem with forking a new Haskell thread for each
> foreign export, and associating it with the current native thread if
> the foreign export is marked "bound". It does mean we can get multiple
> Haskell threads bound to the same native thread, but only one can be
> runnable at any one time (this is an important invariant from the point
> of view of the implementation, I believe).
Of course, you're right.
Simon Peyton-Jones wrote:
> I offer myself as such a guinea pig. I'm afraid I don't understand it
> yet. It's hard to describe precisely. Comments below.
Yes, I do need a guinea pig ;-) . I really have trouble expressing my
ideas accurately, and I keep changing them.
> Better start with some definitions, for
> 'Haskell thread'
> and 'native thread'
>
> I think I know what you mean, but better to be sure.
Well, yes. I haven't managed to come up with a decent definition for
them yet.
I intend "native thread" to be the thing you get using pthread_create
on unix, and "Haskell thread" to be the same thing as it currently is
in the GHC RTS. Can anyone think of a way of defining that in a way
that is accurate, general and understandable?
> | If there is one, this Haskell thread is used to execute the callback.
>
> OK, so this is where I get completely confused. A Haskell thread is
> not (currently) an execution platform.
Oops... The best way to confuse other people is to be confused yourself
;-) --- I had a slight misconception about the RTS there --- I think
I've corrected that.
===============================
Threads Proposal, version 4
Goals
~~~~~
Since foreign libraries sometimes exploit thread local state, it is
necessary to provide some control over which thread is used to execute
foreign code. In particular, Haskell code must be able to arrange that
(a) a sequence of calls to a given library is performed by the same
native thread, and (b) if an external library calls into Haskell, any
outgoing calls from Haskell are performed by the same native thread.
This specification is intended to be implementable both by
multithreaded Haskell implementations and by single-threaded
implementations and so it does not comment on which particular OS
thread is used to execute Haskell code.
Definitions
~~~~~~~~~~~
A native thread (aka OS thread) is a thread as defined by the operating
system.
A Haskell thread is [*** FIXME - How shall I put this? ***] the thing
you see from Haskell land.
Design
~~~~~~
Haskell threads may be associated at thread creation time with either
zero or one native threads. Each native thread is associated with zero
or more Haskell threads.
If a native thread is associated with one or more Haskell threads,
exactly one of the following must be true:
*) Exactly one Haskell thread associated with the native thread is
executing.
*) The native thread is executing foreign code.
*) The native thread and all Haskell threads associated with it are
blocked.
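
To make the invariant above easier to discuss, here is a minimal sketch
of it as a pure check. All names (NativeThread, NativeActivity,
invariantHolds and so on) are hypothetical and are not part of the RTS;
in particular, the three thread states are an assumption chosen only to
mirror the three cases listed above.

```haskell
module Main where

-- Hypothetical model of the proposal; nothing here is RTS API.

-- State of one Haskell thread associated with a native thread.
data HaskellThreadState = Executing | Runnable | Blocked
  deriving (Eq, Show)

-- What the native thread itself is currently doing.
data NativeActivity = RunningHaskell | RunningForeign | NativeBlocked
  deriving (Eq, Show)

data NativeThread = NativeThread
  { activity     :: NativeActivity
  , boundThreads :: [HaskellThreadState]  -- associated Haskell threads
  } deriving Show

-- The invariant only constrains native threads that have at least one
-- associated Haskell thread; exactly one of the three cases must hold.
invariantHolds :: NativeThread -> Bool
invariantHolds (NativeThread act ts)
  | null ts   = True
  | otherwise = length (filter id [oneExecuting, inForeign, allBlocked]) == 1
  where
    executing    = length (filter (== Executing) ts)
    oneExecuting = act == RunningHaskell && executing == 1
    inForeign    = act == RunningForeign && executing == 0
    allBlocked   = act == NativeBlocked && all (== Blocked) ts

main :: IO ()
main = do
  -- One associated Haskell thread executing, another blocked: allowed.
  print (invariantHolds (NativeThread RunningHaskell [Executing, Blocked]))
  -- Two associated Haskell threads executing at once: not allowed.
  print (invariantHolds (NativeThread RunningHaskell [Executing, Executing]))
```

The first call prints True and the second prints False, which matches
the point that only one Haskell thread bound to a given native thread
can run at any one time.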
The thread that main runs in, threads created using forkIO and threads
created for running finalizers or signal handlers are not necessarily
associated with a native thread. However, an implementation might
choose to associate them with one.
There are now two kinds of foreign exported [and foreign import
wrapped] functions: bound and free. The FFI syntax should be extended
appropriately [which of the two should be the default, if any?].
When a "bound" foreign exported function is invoked [by foreign code],
a new Haskell thread is created and associated with the native thread.
The new associated Haskell thread is then used to execute the callback.
When a "free" foreign exported function is invoked, the implementation
may freely choose what kind of Haskell thread the function is executed
in. It is not specified whether this thread is associated with a
particular OS thread or not.
When a foreign imported function is invoked [by Haskell code], the
foreign code is executed in the native thread associated with the
current Haskell thread, if an association exists. If the current
Haskell thread is not associated with a native thread, the implementation
may freely decide which thread to run the foreign function in.
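
The rules for callbacks and call-outs can be summarised as two small
pure functions. This is only a sketch of the rules as stated above, not
of any implementation; every name in it (NativeThreadId, Binding,
callbackAssociation, calloutThread) is made up for illustration, and
the "implChoice" arguments stand in for whatever the implementation
freely decides.

```haskell
module Main where

-- Hypothetical names throughout; nothing here is RTS API.
newtype NativeThreadId = NativeThreadId Int
  deriving (Eq, Show)

-- The two proposed kinds of foreign exported function.
data Binding = Bound | Free
  deriving (Eq, Show)

-- Association of a Haskell thread: Nothing means "not associated".
type Association = Maybe NativeThreadId

-- A foreign exported function is invoked by foreign code running in
-- native thread `caller`. Result: the association of the Haskell
-- thread that executes the callback.
callbackAssociation :: Binding -> NativeThreadId -> Association -> Association
callbackAssociation Bound caller _          = Just caller
callbackAssociation Free  _      implChoice = implChoice

-- A foreign imported function is invoked by a Haskell thread with the
-- given association. Result: the native thread that runs the foreign code.
calloutThread :: Association -> NativeThreadId -> NativeThreadId
calloutThread (Just bound) _          = bound
calloutThread Nothing      implChoice = implChoice

main :: IO ()
main = do
  let t1 = NativeThreadId 1
      t2 = NativeThreadId 2
      -- A library running in t1 calls a "bound" export ...
      cb = callbackAssociation Bound t1 Nothing
  -- ... so call-outs made by that callback run in t1 again, even if
  -- the implementation would otherwise have picked t2.
  print (calloutThread cb t2)
```

With a "free" export and an implementation choice of Nothing, the same
call-out would instead run wherever the implementation decides.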
The existing distinction between unsafe, safe and threadsafe calls
remains unchanged.
A new library routine, forkNativeThread :: IO () -> IO ThreadId, should
spawn a new Haskell thread (like forkIO) and associate it with a new
native thread (forkIO is not guaranteed to do this). It may be
implemented using the FFI and an OS-specific thread creation routine.
It would just pass a "bound" callback as an entry point for a new OS
thread.
Issues
~~~~~~
Finalizers and signal handlers cannot be associated with a particular
native thread. If they have to trigger an action in a particular native
thread, a message has to be sent manually (via MVars and friends) to
the Haskell thread associated with the native thread in question.
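
As a rough illustration of sending such a message manually, the sketch
below has an "owner" thread that runs whatever actions it receives
through an MVar. The helper names startOwner and callOwner are
hypothetical. The owner is created with plain forkIO here so that the
example is self-contained; under this proposal it would be a Haskell
thread associated with the native thread in question, and main stands
in for the finalizer or signal handler.

```haskell
module Main where

import Control.Concurrent (forkIO, myThreadId)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forever, join)

-- Start an owner thread that runs each action sent to it, in order.
-- (Hypothetical helper; forkIO stands in for a bound thread.)
startOwner :: IO (MVar (IO ()))
startOwner = do
  requests <- newEmptyMVar
  _ <- forkIO (forever (join (takeMVar requests)))
  return requests

-- Ask the owner thread to run an action and wait for its result.
-- Note: if the action throws, the caller waits forever in this sketch.
callOwner :: MVar (IO ()) -> IO a -> IO a
callOwner requests action = do
  reply <- newEmptyMVar
  putMVar requests (action >>= putMVar reply)
  takeMVar reply

main :: IO ()
main = do
  owner    <- startOwner
  -- This call stands in for a finalizer or signal handler that needs
  -- something done by the owner thread rather than by itself.
  ownerTid <- callOwner owner myThreadId
  mainTid  <- myThreadId
  -- The action ran in the owner thread, not in the requesting thread.
  print (ownerTid /= mainTid)
```

Running this prints True. A real version would also need to deal with
exceptions in the requested action and with shutting the owner down.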
This introduces a change in the syntax for foreign export and foreign
import "wrapper" declarations (a bound/free specifier is added). I
think we should have a default option here. I'm not sure which,
however. Also, the objection that "bound" and "free" can be confused
with the lambda calculus terms still holds.