New Bound Threads Proposal

Sat Apr 26 07:58:23 EDT 2003

> Shame on me to not have written a proposal yet, but I am still
> too busy with other stuff right now.

I understand that... I'll probably be too busy again in a little more 
than a week...

> If you start with the prototype, I urge you to build it completely in 
> Haskell;

I was not talking about a "simulator", I was talking about a first 
draft of a working implementation for GHC. I would implement it in C, 
because it would have to be part of the RTS. After all, it's part of 
the scheduler, and the scheduler is already implemented in C.

> using the *minimal* amount of primitive operations required. Those
> operations will probably be:
> - fork a native thread
> - attach a haskell thread to a native thread
> - moving unbound threads to other native threads
> - getting the current threads and their mapping to native threads

Question: Do you want them as part of a "low-level" public interface, 
i.e. should they be the same across all Haskell implementations that 
implement your proposal?

If so, that contradicts two of the goals I've started from:
(*)	The specification shouldn’t explicitly require lightweight “green”
	threads to exist. The specification should be implementable in a
	simple and obvious way in Haskell systems that always use a 1:1
	correspondence between Haskell threads and OS threads.
(*)	The specification shouldn’t specify which particular OS thread
	should be used to execute Haskell code. It should be possible to
	implement it with e.g. a Haskell interpreter running in one OS
	thread that just uses other OS threads for foreign calls.
(These are the fourth and fifth points of section 2, "Requirements", of 
the threads document)

This is why I oppose having these, or similar operations, as part of 
any standard public interface.

Now if you mean using these primitives as an interface between the GHC 
RTS and GHC-specific libraries... well at least I'm less opposed to it.
I think it's more work to expose all these primitives. These primitives 
actually provide more features than we need (we don't need to 
explicitly map unbound threads to native threads; a scheduler loop 
running in an [unassociated] native thread can just execute any unbound 
thread without much ado).
Also, it feels like splitting the scheduler logic in half between 
the RTS and the libraries. We don't have primitive Haskell actions 
for "add to run queue", "remove from run queue" etc. so that we can 
implement all the scheduling and MVars and so on in Haskell - so why 
should we implement just this part of the scheduler in Haskell?
Thirdly, I fear that it would preclude several kinds of optimizations 
that could be done within the scheduler (the proposal intentionally 
leaves some things unspecified to provide room for optimizations).
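
To make the remark about the scheduler loop concrete, here is a toy 
model in C of a loop run by an unassociated native thread. It is only a 
sketch under my own assumptions: ToyThread, ToyRunQueue and the toy_* 
functions are made-up names, not anything in the GHC RTS, "running" a 
thread is reduced to recording which native thread picked it up, and 
real OS threads and locking are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the GHC RTS. */
#define TOY_UNBOUND (-1)

typedef struct ToyThread {
    int id;
    int bound_to;            /* native thread id, or TOY_UNBOUND */
    int ran_on;              /* native thread that last ran it, or -1 */
    struct ToyThread *next;
} ToyThread;

typedef struct {
    ToyThread *head;         /* singly linked run queue */
} ToyRunQueue;

/* Append a Haskell thread to the end of the run queue. */
void toy_enqueue(ToyRunQueue *q, ToyThread *t)
{
    ToyThread **link;
    assert(q != NULL && t != NULL);
    link = &q->head;
    while (*link != NULL)
        link = &(*link)->next;
    t->next = NULL;
    *link = t;
}

/* Unlink and return the first unbound thread; bound threads stay queued. */
ToyThread *toy_take_unbound(ToyRunQueue *q)
{
    ToyThread **link;
    assert(q != NULL);
    link = &q->head;
    while (*link != NULL) {
        ToyThread *t = *link;
        if (t->bound_to == TOY_UNBOUND) {
            *link = t->next;
            t->next = NULL;
            return t;
        }
        link = &t->next;
    }
    return NULL;
}

/* Scheduler loop of an unassociated native thread `worker`: it runs
 * whatever unbound thread it finds, with no explicit mapping step.
 * Returns the number of threads it ran. */
int toy_run_unbound(ToyRunQueue *q, int worker)
{
    int count = 0;
    ToyThread *t;
    while ((t = toy_take_unbound(q)) != NULL) {
        t->ran_on = worker;  /* stand-in for executing Haskell code */
        count++;
    }
    return count;
}
```

Note that nothing here resembles an "attach this unbound thread to that 
native thread" primitive; the worker just takes what is runnable, and 
bound threads are left for their own native thread.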

> Secondly, try to remove the whole safe/unsafe/threadsafe business and
> make "safe" and "threadsafe" combinators too.

No ;-).
I don't think it's at all possible to implement the distinction between 
unsafe and the others via a combinator, because unsafe is about things 
that just have to be in the C language RTS.
In GHC, "unsafe" calls are compiled to just plain calls; "safe" and 
"threadsafe" calls first do some cleanup of RTS structures that makes 
sure that other (Haskell) threads or callbacks can run.
After this cleanup is performed (function suspendThread()), no Haskell 
code can execute in the calling thread until resumeThread() is called. 
Therefore, it is impossible to do this using a combinator, which would 
itself be Haskell code.
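
As a rough illustration of the difference, here is a toy model in C of 
the two calling conventions. This is a sketch under my own assumptions, 
not GHC's actual code: only the names suspendThread() and 
resumeThread() come from the RTS, while ToyRts, the toy_* functions and 
their signatures are invented for the example, and nested callbacks 
into Haskell are not modelled.

```c
#include <assert.h>

/* Toy stand-in for RTS state; not GHC's real data structures. */
typedef struct {
    int released;       /* 1 while the calling thread has let go of the
                           RTS, so other Haskell threads may run */
    int suspend_calls;  /* how often the cleanup was performed */
} ToyRts;

ToyRts toy_rts;                          /* zero-initialised */
int toy_released_during_last_call = -1;  /* what the foreign code saw */

typedef int (*ToyForeignFn)(int);

/* Example foreign function: records the RTS state it was called in. */
int toy_foreign_double(int x)
{
    toy_released_during_last_call = toy_rts.released;
    return 2 * x;
}

/* Hypothetical analogue of suspendThread(): the cleanup before a call. */
void toy_suspend_thread(void)
{
    assert(!toy_rts.released);
    toy_rts.released = 1;
    toy_rts.suspend_calls++;
}

/* Hypothetical analogue of resumeThread(): take the RTS back. */
void toy_resume_thread(void)
{
    assert(toy_rts.released);
    toy_rts.released = 0;
}

/* "unsafe": just a plain call, no bookkeeping at all. */
int toy_call_unsafe(ToyForeignFn f, int arg)
{
    return f(arg);
}

/* "safe"/"threadsafe": between suspend and resume this thread must not
 * run Haskell code, so this wrapper has to live outside Haskell. */
int toy_call_safe(ToyForeignFn f, int arg)
{
    int result;
    toy_suspend_thread();
    result = f(arg);
    toy_resume_thread();
    return result;
}
```

The wrapper in toy_call_safe() is exactly the part that a Haskell-level 
combinator could not provide, since it would have to keep running while 
the thread is suspended.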

So we're stuck with one optional specialid (not a keyword, "unsafe" is 
still a legal variable name) in foreign import declarations. Also, it's 
already widely used.

Now what about the distinction between "safe" and "threadsafe"?
_Maybe_ it's possible to implement "threadsafe" on top of "safe" and a 
few primitives (though I still doubt it).
My position is to simply remove "safe" from the language altogether, 
because:
*) It is poorly specified. There is no written document that specifies 
what exactly should block when, or what is required to block and what 
is just "allowed" to block.
*) Most "informal" specifications that require "safe" calls to block 
other threads rely on specific details of present Haskell 
implementations. Future implementations could, and are allowed to, 
implement threading differently. Green threads are a feature of Haskell 
implementations, not a feature of the Haskell language.
*) "safe" calls are dangerous. A "safe" call influences totally 
unrelated parts of a program in hard-to-predict ways.
*) It seems to be relatively hard to actually achieve the performance 
benefit of "safe" calls (which was the original reason for having the 
distinction).

If we remove "safe" from the language (and, consequently, rename 
"threadsafe" to "safe" and make it the default), we're left with only 
the distinction between "unsafe" and "safe" foreign imports (which we 
had from the start). If you don't specify anything, you get "safe" 
(a.k.a. threadsafe), which is the version that automatically "does the 
right thing". Unsafe foreign imports are provided as a performance 
optimization only (and it's not possible to achieve that using a 
combinator).
There would be no unnecessary keywords or language extensions left.

> Success,
>   Daan.

Qapla',
	Wolfgang