safe and threadsafe

Mon Feb 10 12:24:06 EST 2003

This has gotten rather long... sorry in advance.

I'll be rambling on the following topics:
*) Performance of the threaded RTS
*) I am not convinced that thread synchronization should be done by the 
FFI
*) "safe" and "threadsafe" are misleading names
*) "unsafe" shouldn't have to guarantee blocking
*) "safe" is not completely specified, and it's hard to implement 
correctly

========

*) Performance of the threaded RTS

Manuel M T Chakravarty wrote:
> How much more expensive than a
> vanilla function call is an unsafe, a safe, and a threadsafe
> call in the threaded RTS at the moment?

Unsafe is as fast as ever.
Safe call-outs are currently treated as threadsafe, but they can 
perhaps be made (almost) as fast as in the non-threaded RTS.
Threadsafe call-out is slightly slower. A background worker thread will 
execute some tens of useless lines of C code before it realizes 
that there's really nothing that needs to be done. When the threadsafe 
call returns, the worker thread has to finish what it is doing.
Call-ins to the RTS are slower than in the non-threaded RTS because the 
scheduler has to run in a separate thread (not in the call-in thread). 
There is room for optimization, but it won't get as fast as the 
non-threaded RTS, and "safe" calls don't help (we still need a separate 
thread).
I haven't done any measurements. Are there any existing benchmarks?
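
To make the measurement question a bit more concrete, here is a rough 
sketch of a micro-benchmark. It imports the same C function once as 
"unsafe" and once as "safe" and times a loop over each. The names 
sinUnsafe, sinSafe, callMany, timed and report are made up for this 
example, sin from math.h is just a cheap stand-in for a real library 
call, and I left out "threadsafe" because not every compiler may accept 
that keyword. CPU time from a simple loop is a crude measure, so treat 
any numbers as indicative only.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Exception (evaluate)
import Foreign.C.Types (CDouble(..))
import System.CPUTime (getCPUTime)

-- The same C function imported twice, differing only in the safety level.
-- An IO result type keeps the compiler from sharing or dropping the calls.
foreign import ccall unsafe "math.h sin" sinUnsafe :: CDouble -> IO CDouble
foreign import ccall safe   "math.h sin" sinSafe   :: CDouble -> IO CDouble

-- Call the given import n times and sum the results, forcing the
-- accumulator at every step so no thunks build up.
callMany :: (CDouble -> IO CDouble) -> Int -> IO CDouble
callMany f n = go 0 0
  where
    go acc i
      | i >= n    = return acc
      | otherwise = do
          y <- f (fromIntegral i)
          let acc' = acc + y
          acc' `seq` go acc' (i + 1)

-- Run an action and return its result with the CPU time used, in seconds.
-- getCPUTime reports picoseconds.
timed :: IO a -> IO (a, Double)
timed act = do
  t0 <- getCPUTime
  x  <- act >>= evaluate
  t1 <- getCPUTime
  return (x, fromIntegral (t1 - t0) / 1.0e12)

-- Print the time taken by n calls through each import.
report :: Int -> IO ()
report n = do
  (_, tu) <- timed (callMany sinUnsafe n)
  (_, ts) <- timed (callMany sinSafe n)
  putStrLn ("unsafe: " ++ show tu ++ " s, safe: " ++ show ts ++ " s")
```

Running something like report 1000000 under both the threaded and the 
non-threaded RTS would give a first impression of the per-call 
overhead.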

*) I am not convinced that thread synchronization should be done by the 
FFI

Manuel M T Chakravarty wrote:
> With the MVar solution, I am worried that it will add a lot
> of extra code to large libraries like Gtk+HS, where every
> single of the hundreds of functions would need to be
> protected by an MVar.

I didn't know that Gtk+HS was attempting to be threadsafe where gtk+ 
isn't. I was thinking that in most cases, preventing multiple entry 
into the library wouldn't be enough - you would need a lock across 
multiple gtk calls to achieve meaningful synchronisation (but I don't 
know GTK well enough).
I'd expect that the Gtk.main function and other functions that run an 
event loop (or do anything else that might take a long time) are 
imported "threadsafe".
And what about potential problems with too much synchronization? It 
would be a pity if I couldn't use two different library bindings at the 
same time because both insist on using "safe" calls.
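
For reference, this is roughly what I understand by the MVar solution, 
as a minimal sketch. Everything here is an assumption for illustration: 
abs from stdlib.h stands in for a non-reentrant library function, and 
libLock, absLocked, withLibrary and absSum are made-up names, not 
anything taken from Gtk+HS. The first wrapper protects a single call; 
the second holds the lock across several calls, which is what I think 
meaningful synchronisation usually needs.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Concurrent.MVar (MVar, newMVar, withMVar)
import Foreign.C.Types (CInt(..))
import System.IO.Unsafe (unsafePerformIO)

-- Stand-in for a library function that must not be entered concurrently.
foreign import ccall unsafe "stdlib.h abs" c_abs :: CInt -> IO CInt

-- One global lock for the whole (hypothetical) library binding.
-- NOINLINE makes sure only a single MVar is ever created.
{-# NOINLINE libLock #-}
libLock :: MVar ()
libLock = unsafePerformIO (newMVar ())

-- Per-call locking: every imported function gets a wrapper like this.
absLocked :: CInt -> IO CInt
absLocked x = withMVar libLock (\_ -> c_abs x)

-- Coarse locking: hold the lock across a whole sequence of calls.
-- The action must use the raw imports; calling absLocked inside it
-- would deadlock, because an MVar lock is not reentrant.
withLibrary :: IO a -> IO a
withLibrary act = withMVar libLock (\_ -> act)

-- Two library calls that other threads cannot interleave with.
absSum :: CInt -> CInt -> IO CInt
absSum a b = withLibrary $ do
  x <- c_abs a
  y <- c_abs b
  return (x + y)
```

The per-call wrappers are the extra code Manuel is worried about. The 
coarse variant needs only one wrapper, but the programmer has to decide 
where the lock belongs.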

*) "safe" and "threadsafe" are misleading names

"Threadsafe" is simply the normal way of doing things. No unexpected 
blocking, no special synchronisation, no surprises. It should be the 
default, and perhaps renamed to "safe" or something else.

If you want to use "safe" for performance, then it's not really "safe" 
- you have to be sure that blocking won't hurt you. It should be 
renamed to something else.

If you want to use "safe" for synchronization, then why not rename it 
to "synchronized"?

[Don't worry, I'll keep using the old terms for the rest of this mail]

*) "unsafe" shouldn't have to guarantee blocking

Sometime in the future, there might be a Haskell implementation where 
it is faster to implement "unsafe" without blocking other threads. As 
"unsafe" was made for speed, this hypothetical implementation should be 
free to use the fastest variant and omit blocking. In all current 
implementations, blocking is the fastest option, but that smells like 
an implementation detail.

*) "safe" is not completely specified, and it's hard to implement 
correctly

Let's assume that all Haskell threads are blocked because one thread is 
calling a "safe" foreign imported routine.
What happens if a foreign thread running other foreign code comes along 
and tries to call a "foreign exported" Haskell function/action?

a) The run-time system segfaults (aka "undefined behaviour")
b) The foreign thread is blocked until the "safe" call is finished
c) The call-in is handled immediately
    c.1) all other Haskell threads remain blocked
    c.2) all other Haskell threads are resumed

Depending on whether we are aiming for synchronization or for 
optimization, I'd recommend b) or c.2 (which are both relatively easy 
to implement).

Also, if a foreign function called through a "safe" import calls back 
into Haskell land, all other Haskell threads are temporarily resumed. 
They are suspended again when the callback exits and returns to the 
foreign function. Things get really nasty if one of those "background" 
Haskell threads calls a "threadsafe" foreign import. I have no idea 
what the intended behaviour in that case is.
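
The callback situation can be reproduced without writing any C, as the 
following sketch tries to show. A "wrapper" import turns a Haskell 
function into a C function pointer, and a "safe" "dynamic" import calls 
through that pointer, so the call-out immediately calls back into 
Haskell. The callback then waits for a value supplied by a background 
Haskell thread, which only works if background threads get to run while 
the callback is active. The names Callback, mkCallback, callOut and 
callbackDemo are made up for this example, and I am assuming an 
implementation that behaves as described above.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Exception (finally)
import Foreign.C.Types (CInt(..))
import Foreign.Ptr (FunPtr, freeHaskellFunPtr)

type Callback = CInt -> IO CInt

-- Turn a Haskell function into a C function pointer (a call-in entry point).
foreign import ccall "wrapper"
  mkCallback :: Callback -> IO (FunPtr Callback)

-- Call through a C function pointer. This must be "safe", because the
-- callee re-enters Haskell.
foreign import ccall safe "dynamic"
  callOut :: FunPtr Callback -> Callback

callbackDemo :: CInt -> IO CInt
callbackDemo x = do
  box <- newEmptyMVar :: IO (MVar CInt)
  -- A "background" Haskell thread that supplies a value.
  _ <- forkIO (putMVar box 1)
  -- The callback blocks until the background thread has delivered.
  fp <- mkCallback (\n -> do
          inc <- takeMVar box
          return (n + inc))
  -- Safe call-out whose target calls straight back into Haskell.
  -- The function pointer is freed even if the call fails.
  callOut fp x `finally` freeHaskellFunPtr fp
```

If the background thread calls a "threadsafe" import of its own instead 
of just filling an MVar, we are in exactly the case I have no answer 
for.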

Finally, I've said everything that I could possibly have to say. :-)

Cheers,

Wolfgang