FFI, safe vs unsafe

Fri Mar 31 18:41:18 EST 2006

John Meacham wrote:

> first of all, a quick note, for GHC, the answers will be "the same thing
> it does now with -threaded". but I will try to answer with what a simple
> cooperative system would do.

Sure. Unless someone dares answer "yes" to question 4, GHC will stay  
as it is.

>> 2.) Assume the same situation as in 1, and assume that the answer to
>> 1 is yes. While 'foo' is running, (Haskell) thread B makes a non-
>> concurrent, reentrant foreign call. The foreign function calls back
>> to the foreign-exported Haskell function 'bar'. Because the answer to
>> 1 was yes, 'foo' will resume executing concurrently with 'bar'.
>> If 'foo' finishes executing before 'bar' does, what will happen?
>
> I am confused, why would anything in particular need to happen at all?
>
> the threads are completely independent.  The non-concurrent calls could
> just be haskell code that happens to not contain any pre-emption points
> for all it cares. in particular, in jhc, non-concurrent foreign imports
> and exports are just C function calls. no boilerplate at all in either
> direction.  calling an imported foreign function is no different than
> calling one written in haskell so the fact that threads A and B are
> calling foreign functions doesn't really change anything.

In an implementation which runs more than one Haskell thread inside
one OS thread, like GHC without -threaded or Hugs, the threads are
NOT completely independent, because they share one C stack. So while
bar executes, the stack frames of both foreign functions are on that
stack, with the frames belonging to thread B above those belonging to
thread A. It is therefore impossible to return from foo to its C
caller before bar and the foreign function that called bar have both
completed. I think this kind of semantics is seriously scary and has
no place as default behaviour in the language definition.
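
To make the stack argument concrete, here is a minimal sketch in C.
It is only a model under stated assumptions: all names (c_function_a,
c_function_b, note, shared_stack_trace) are made up for illustration,
and a plain nested call stands in for the scheduler switching from
thread A to thread B on the one shared C stack. No real Haskell
runtime is involved.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical event log; records the order in which frames are
   entered and left on the single shared C stack. */
static char trace[128];

static void note(const char *event) {
    assert(strlen(trace) + strlen(event) + 2 < sizeof trace);
    if (trace[0] != '\0') strcat(trace, " ");
    strcat(trace, event);
}

/* Stand-in for the foreign-exported Haskell function 'bar'. */
static void bar(void) { note("bar"); }

/* Thread B's foreign function: calls back into 'bar'. */
static void c_function_b(void) {
    note(">cB");
    bar();
    note("<cB");
}

/* Stand-in for the foreign-exported Haskell function 'foo'. */
static void foo(void) {
    note("foo");
    /* Assumption: at this point the scheduler switches to thread B,
       which makes its foreign call on the SAME C stack. The nested
       call models that: B's frames now sit above foo's frame. */
    c_function_b();
    /* foo can only return to its C caller from here on. */
}

/* Thread A's foreign function: calls back into 'foo'. */
static void c_function_a(void) {
    note(">cA");
    foo();
    note("<cA");
}

/* Runs the scenario and returns the recorded order of events. */
const char *shared_stack_trace(void) {
    trace[0] = '\0';
    c_function_a();
    return trace;
}
```

In this model the frame of thread A's foreign function is always the
last one to be left, no matter how early foo's own work is done,
which is the ordering constraint described above.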

If you implement concurrency by using the pthreads library, you need  
to either make sure that only one thread mutates the heap at a time,  
or deal with SMP. In either case, concurrent foreign calls would be  
trivial.
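
A rough sketch of the "only one thread mutates the heap at a time"
variant, assuming a single global lock. The names heap_lock,
heap_counter, concurrent_foreign_call and haskell_thread_body are
hypothetical, and heap_counter merely stands in for the Haskell heap.
The point is only that a concurrent foreign call amounts to releasing
the lock around the C call:

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical global lock: whoever holds it may mutate the heap. */
static pthread_mutex_t heap_lock = PTHREAD_MUTEX_INITIALIZER;
static int heap_counter = 0;   /* stand-in for the shared heap */

typedef int (*c_fn)(int);

/* Precondition: the caller holds heap_lock.
   The lock is released for the duration of the foreign call, so
   other Haskell threads can run, and re-acquired afterwards. */
int concurrent_foreign_call(c_fn f, int arg) {
    int rc = pthread_mutex_unlock(&heap_lock);
    assert(rc == 0);
    int result = f(arg);            /* runs without the heap lock */
    rc = pthread_mutex_lock(&heap_lock);
    assert(rc == 0);
    (void)rc;
    return result;
}

/* Model of a Haskell thread: touch the heap, make a foreign call,
   then touch the heap again with the result. */
int haskell_thread_body(c_fn f, int arg) {
    pthread_mutex_lock(&heap_lock);
    heap_counter += 1;
    int r = concurrent_foreign_call(f, arg);
    heap_counter += r;
    int snapshot = heap_counter;
    pthread_mutex_unlock(&heap_lock);
    return snapshot;
}

/* Example foreign function (imagine it takes a long time). */
static int slow_square(int x) { return x * x; }
```

For instance, haskell_thread_body(slow_square, 3) called on a fresh
program touches the "heap" only while holding the lock, and the lock
is free while slow_square runs.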

>> 4.) Should there be any guarantee about (Haskell) threads not making
>> any progress while another (Haskell) thread is executing a non-
>> concurrent call?
>
> I don't understand why we would need that at all.

Good. Neither do I, but in the discussions about this issue that we  
had three years ago several people seemed to argue for that.

>> 5.) [...] So what
>> should the poor library programmer A do?
>
> He should say just 'reentrant' since concurrent isn't needed for
> correctness because the tessellation routines are basic calculations and
> will return.

Let's say they will return after a few minutes. So having them block
the GUI is a show-stopper for programmer C.
And if programmer C happens to use a Haskell implementation that
supports "concurrent reentrant" but also a more efficient
"non-concurrent reentrant", he will not be able to use the library.

> everyone wins. in the absolute worst case there are always #ifdefs but I
> doubt they will be needed.

Except for programmer C on some Haskell implementations. I don't buy
it yet :-).

>> 6.) Why do people consider it too hard to do interthread messaging
>> for handling a "foreign export" from arbitrary OS threads, when they
>> already agree to spend the same effort on interthread messaging for
>> handling a "foreign import concurrent"? Are there any problems that I
>> am not aware of?
>
> it is not that it is hard (well it is sort of), it is just absurdly
> inefficient and you would have no choice but to pay that price for
> _every_ foreign export. even when not needed which it mostly won't be.
> the cost of a foreign export should be a simple 'call' instruction
> (potentially) when an implementation supports that.

As we seem to agree that the performance issue is nonexistent for
implementations that use one OS thread for every Haskell thread, and
that we don't want to change how GHC works, the following refers to a
system like Hugs where all Haskell code and the entire runtime system
always run in a single OS thread.

It might not be absolutely easy to implement "concurrent reentrant",
but it's no harder than concurrent non-reentrant calls. If a Haskell
implementation has a hacker on its team who is able to implement
concurrent non-reentrant calls, then concurrent reentrant calls are
no problem either.
As for the efficiency argument: if it is sufficiently slow, then that  
is an argument for including "nonconcurrent reentrant" as an option.  
It is not an argument for making it the default, or for leaving out  
"concurrent reentrant".

> the cost of a foreign import concurrent nonreentrant is only paid when
> actually using such a function, and quite cheap. on linux at least, a
> single futex, a cached pthread and it gets rolled into the main event
> loop. so a couple system calls max overhead.

Sure. But what gives you the idea that the cost of a foreign export  
or a foreign import concurrent reentrant would be paid when you are  
not using them?
If we include nonconcurrent reentrant foreign imports in the system,  
or if we just optimise foreign imports for the single-threaded case,  
all that the foreign export would have to do is to check a flag (NO  
system calls involved). If the callback is from a foreign import  
concurrent reentrant or if it is from an entirely Haskell-free C  
thread, then we will have to do an inter-thread RPC to the runtime  
thread. Unavoidable.
For Hugs, I guess that overhead would be absorbed in its general  
slowness. For Yhc, it might be an issue.
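
Roughly what I have in mind for the flag check, as a C sketch. All
names here are hypothetical, and the RPC is only a stand-in that
counts how often the slow path is taken; a real runtime would queue
the request for the runtime thread and block until it is answered.
The flag is thread-local so that a Haskell-free C thread never sees
it set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*haskell_fn)(int);  /* hypothetical exported function */
typedef int (*c_fn)(int);        /* hypothetical imported function */

/* True only on the runtime thread, and only while it is inside a
   non-concurrent reentrant foreign import. */
static _Thread_local bool in_nonconcurrent_call = false;

static int direct_calls = 0;     /* fast path taken */
static int rpc_calls = 0;        /* slow path taken */

/* Stand-in for the inter-thread RPC to the runtime thread. */
static int rpc_to_runtime_thread(haskell_fn f, int arg) {
    rpc_calls++;
    return f(arg);
}

/* Entry stub for a foreign export: one flag test, no system calls,
   when the callback comes from a non-concurrent reentrant import. */
int export_entry(haskell_fn f, int arg) {
    assert(f != NULL);
    if (in_nonconcurrent_call) {
        direct_calls++;
        return f(arg);           /* plain call on the runtime thread */
    }
    return rpc_to_runtime_thread(f, arg);
}

/* Wrapper the runtime would use for a non-concurrent reentrant
   import: set the flag, call out, restore the previous value. */
int call_nonconcurrent_reentrant(c_fn f, int arg) {
    bool saved = in_nonconcurrent_call;
    in_nonconcurrent_call = true;
    int result = f(arg);
    in_nonconcurrent_call = saved;
    return result;
}

/* Example: an "exported Haskell function" and a C library function
   that calls back into it. */
static int haskell_double(int x) { return 2 * x; }
static int c_library_function(int x) {
    return export_entry(haskell_double, x) + 1;
}
```

Calling c_library_function through call_nonconcurrent_reentrant
takes the direct path; calling export_entry from anywhere else takes
the RPC path.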

A related performance sink is every foreign import, if such an
implementation supports bound threads (which are, after all, needed
to use some libraries, like OpenGL and Carbon/Cocoa, from a
multi-threaded program). If the foreign function needs to be executed
in a dedicated thread, then even a nonconcurrent nonreentrant call would
involve inter-thread messaging (in this hypothetical Hugs + bound
threads). We should consider adding a "nothreadlocal" attribute to
foreign imports - when it is known that the foreign function does not
access thread-local state, we can use the traditional, more efficient
implementation for "foreign import nonconcurrent nonreentrant".

Cheers,

Wolfgang