Native Threads in the RTS

Dean Herington heringto@cs.unc.edu
Mon, 02 Dec 2002 10:16:27 -0500


Simon Marlow wrote:

> > | 2. Calling from foreign code into Haskell to a bound foreign import will
> > | require some special handling to ensure that a subsequent call out to
> > | foreign code will use the same native thread.  Why couldn't this special
> > | handling select the same Haskell thread instead of creating a new one?
> >
> > This is just an efficiency issue, right?   If creating a Haskell thread
> > from scratch is very cheap, then it's easier to do that each time rather
> > than to try to find the carcass of a completed Haskell thread.   If you
> > do the latter, you need to get into carcass management.
> >
> > But maybe there is more to it than efficiency in your mind?
>
> To recap, the suggestion was that a Haskell thread which makes a foreign
> call, which in turn calls back into Haskell, should use the same Haskell
> thread for the callback.  So the Haskell thread is not a carcass, it is
> still running, but blocked waiting for the result of the foreign call.

Yes, exactly.

> I'm not sure I've quite got my head around all the implications of doing
> this, but it sounds possible.  However, I'm not completely convinced
> it's desirable: the gain seems to be in efficiency only, and a fairly
> small one (creating threads is quite cheap).  I imagine you could
> demonstrate a performance gain by doing this for an application which
> does a lot of callbacks, though.

My concern is not for efficiency.  (In fact, I assumed the efficiency to be
better with the current scheme because it was preferred to the (to me) more
natural scheme.  In any case, it appears any difference in
efficiency--either way--is likely not to be large.)  Rather, I find it
nonintuitive that calling from Haskell to foreign code and back into Haskell
should create a new Haskell thread, when these two Haskell threads really
are just different portions of a single "thread of computation"
(deliberately vague term).  Off the top of my head I can think of two
situations in which having separate threads is bothersome.

1. Consider a computation that requires Haskell-side per-thread state and
associates it with a thread via the thread's ThreadId.  Then some handle to
this state currently needs to be passed out to foreign code and then back
into Haskell.  (This passage could be achieved explicitly as arguments in
the call-out and callback, or implicitly in the closure representing the
callback.)  It is unfortunate that mere inclusion of a foreign call in such
a chain of calls necessitates such additional complexity in per-thread state
maintenance.
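
To make the first situation concrete, here is a minimal sketch, not a
description of any existing library.  It assumes GHC's FFI "wrapper" and
"dynamic" imports as a stand-in for real foreign code (the call goes out
through a C function pointer and straight back into Haskell), and the names
PerThread, setState, getState, mkCallback and callOut are made up for
illustration.  Under the current scheme I would expect the implicit lookup
in the callback to miss, because the callback runs under a different
ThreadId, while the lookup that carries the caller's ThreadId in the
closure finds the state.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Control.Concurrent (ThreadId, myThreadId)
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)
import qualified Data.Map as Map
import Foreign.Ptr (FunPtr, freeHaskellFunPtr)

-- Turn a Haskell action into a C function pointer (hypothetical name).
foreign import ccall "wrapper"
  mkCallback :: IO () -> IO (FunPtr (IO ()))

-- Call through a C function pointer, i.e. leave Haskell (hypothetical name).
foreign import ccall safe "dynamic"
  callOut :: FunPtr (IO ()) -> IO ()

-- Per-thread state, keyed by the ThreadId of whoever stored it.
type PerThread = IORef (Map.Map ThreadId String)

setState :: PerThread -> String -> IO ()
setState table v = do
  tid <- myThreadId
  modifyIORef table (Map.insert tid v)

getState :: PerThread -> IO (Maybe String)
getState table = do
  tid <- myThreadId
  Map.lookup tid <$> readIORef table

main :: IO ()
main = do
  table <- newIORef Map.empty
  setState table "caller's state"

  -- Implicit lookup: keyed by whichever Haskell thread runs the callback.
  implicitCb <- mkCallback $ do
    r <- getState table
    putStrLn ("implicit lookup in callback: " ++ show r)
  callOut implicitCb
  freeHaskellFunPtr implicitCb

  -- Explicit handle: the caller's ThreadId travels in the closure.
  caller <- myThreadId
  explicitCb <- mkCallback $ do
    m <- readIORef table
    putStrLn ("explicit lookup in callback: " ++ show (Map.lookup caller m))
  callOut explicitCb
  freeHaskellFunPtr explicitCb
```

The second callback is the "additional complexity" I mean: the state is
reachable only because the caller's identity was smuggled across the
foreign call by hand.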

2. It seems perfectly reasonable to want to have the Haskell called-back
code throw an exception that is caught by the Haskell code that called out
to foreign code.  "Reusing" the Haskell thread is necessary (though not
sufficient) to achieve such behavior.
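
As an illustration of the second situation, here is a sketch of how the
effect can be approximated by hand when the callback does not share the
caller's Haskell thread.  It is an assumption-laden workaround, not a
proposal for the RTS: the callback traps its own exception, parks it in an
IORef, and the caller rethrows it once the foreign call has returned.  The
names callOutRethrowing, mkCallback and callOut are hypothetical, and the
"wrapper"/"dynamic" pair again stands in for real foreign code.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Control.Exception (ErrorCall (..), SomeException, throwIO, try)
import Data.IORef (newIORef, readIORef, writeIORef)
import Foreign.Ptr (FunPtr, freeHaskellFunPtr)

-- Turn a Haskell action into a C function pointer (hypothetical name).
foreign import ccall "wrapper"
  mkCallback :: IO () -> IO (FunPtr (IO ()))

-- Call through a C function pointer, i.e. leave Haskell (hypothetical name).
foreign import ccall safe "dynamic"
  callOut :: FunPtr (IO ()) -> IO ()

-- Run 'body' as a callback reached via foreign code.  Any exception it
-- raises is caught inside the callback, stored, and rethrown in the caller
-- after the foreign call returns.
callOutRethrowing :: IO () -> IO ()
callOutRethrowing body = do
  slot <- newIORef (Nothing :: Maybe SomeException)
  fp <- mkCallback $ do
    r <- try body
    case r of
      Left e   -> writeIORef slot (Just e)
      Right () -> return ()
  callOut fp
  freeHaskellFunPtr fp
  pending <- readIORef slot
  case pending of
    Just e  -> throwIO e
    Nothing -> return ()

main :: IO ()
main = do
  r <- try (callOutRethrowing (throwIO (ErrorCall "raised in callback")))
  case r of
    Left e   -> putStrLn ("caught in caller: " ++ show (e :: SomeException))
    Right () -> putStrLn "no exception reached the caller"
```

Note that this does nothing about the foreign frames in between, which is
part of why I say reuse of the thread is necessary but not sufficient.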

> The current situation has the advantage of simplicity:
>
>   * for each 'foreign export' we create a single Haskell thread which
>     lives until completion of the IO action.  We can consider a standalone
>     program as a call to 'foreign export main :: IO a' from a simple C
>     wrapper (in fact, that's almost exactly how it works).

Dean