FFI, safe vs unsafe
Wolfgang Thaller
wolfgang.thaller at gmx.net
Tue Apr 4 00:03:56 EDT 2006
John Meacham wrote (... but I've reordered things):
> My only real 'must-have' is that the 4 modes all can be explicitly and
> unambiguously specified. I have opinions on the syntax/hints but
> that is
> more flexable.
I basically agree (the syntax discussion will take place in the years
after the semantics discussion), but...
I want programmers to have a way of saying "this function might spend
a lot of time in foreign lands". These calls should be concurrent on
all implementations that support it (because some separately
developed/maintained piece of Haskell code might expect to run a
computation in the background), but if there are implementations that
don't support it shouldn't flag an error, because that would
encourage library writers to specify nonconcurrent when they can't
prove that it's safe, or make code needlessly nonportable.
Another way to look at it: You cannot decide whether the call
actually has to be done concurrently by just looking at the call site
- you'd need to look at the entire program, and asking people
(especially library writers) to state and guarantee global properties
of a program that might not even be finished yet is a Bad Thing.
Therefore, the concurrency annotation on the foreign import can only
be a hint on whether the foreign function is guaranteed to return
quickly or not; the actual requirement for the call to be
"concurrent" is hidden in the other code that expects to run at the
same time. Therefore, it would be wrong for an implementation that
doesn't support concurrent calls (reentrant or nonreentrant, I don't
care) to flag an error; the foreign import declaration correctly
refuses to give a guarantee that the function will return quickly.
The error is in the code that expects to run concurrently with a
foreign import on an implementation that doesn't support that (but of
course, a compiler can't detect such an error).
> Another nice minor thing would be if haskell implementations were
> required to ignore annotations starting with 'x-' for implementation
> specific hints.
Sounds good. Syntax discussion postponed again ('x-' looks so mime-
typish. Could we add a meaningless 'application/' to the front? Just
kidding).
> In my survey of when 'reentrant concurrent' was needed, I looked at
> all
> the standard libraries and didn't find anywhere it was actually
> needed.
> Are there some compelling examples of when it is really needed in a
> setting that doesn't have OS threads to begin with? (I am not
> asserting
> they don't exist, I just want to see some example uses of this feature
> to get a better handle on the implementation cost)
In my experience, reentrant calls are rare in non-GUI code, but they
become quite common in GUI code (OK, in some GUI libraries, there is
only one, called something like RunMainEventLoop, but then it runs
almost all of the time and is absolutely crucial). And with most GUI
libraries, the GUI's main event loop will refuse to cooperate well
with a Haskell's implementation's scheduler, so it will need to be
called as a "concurrent" foreign import if your application is to do
any background processing while waiting for events.
Other libraries that rely on callbacks would include the GLU
Tesselator that I already mentioned, as well as several packages for
solving optimisation problems. For those, concurrency would probably
only become an issue when they are used with a GUI (even if it's only
to display a progress bar).
Another reason why you don't see them in Haskell standard library
code might be that everyone prefers Data.List.sort to foreign import
ccall qsort.
> Any particular reason hugs and GHC didn't use the state-threads
> approach
> out of curiosity? did it conflict with the push-enter model? (jhc
> uses
> the eval-apply model so I am more familier with that)
It was before my time. I guess it's because GHC uses a separate heap-
allocated Haskell thread, so it made sense not to bother to allocate
a separate C stack for every one of them. Don't know about Hugs.
> It also implys that a function call will run on the same OS thread as
> the OS thread the current haskell thread is running on.
This shouldn't be put into a standard, as the bound threads proposal
already gives a different guarantee about that, and both guarantees
taken together probably guarantee too much - taken together, they
probably mean every Haskell thread has to be an OS thread. It might
be an implementation-specific guarantee, unless the bound threads
become a part of the standard in their entirety.
> 'OS thread the current haskell
> thread is running on' (GHC already doesn't when bound threads arn't
> used
> I am led to believe?)
There should be no such thing as the 'OS thread the current haskell
thread is running on' in any standard; OS thread identity is only
observed through the FFI.
> this means that things like 'log' and 'sin' and
> every basic operation goes through the FFI mechanism so it needs to be
> _fast_ _fast_. A neat side effect is that jhcs implementation of the
> prelude is mostly portable to different compilers.
I, too, want foreign import nonconcurrent nonreentrant to compile to
a plain call without any extras. GHC achieves that goal, even in the
presence of bound threads; I'm optimistic about jhc + state threads,
too.
==== Bound Threads / Implementation methods ====
> I am not quite sure whether you are saying something different from
> what
> I plan for jhc or not, my current thinking for jhc is,
> [...]
> An alternate mode I'd like to experiment with one day is the complete
> oposite side of the spectrum:
>
> one OS thread per haskell thread, no guarentees about duplicated work
> between threads.
Let me add a third one:
One OS thread per haskell thread, locks & condition variables used to
make them behave just like state threads.
Creating threads and switching from one thread to another would be
slower than with state threads by about the time it takes to do a
trip to the kernel and back. Nonconcurrent foreign imports are just
plain calls, concurrent foreign imports have to release the lock (and
maybe signal another thread) and then re-acquire it afterwards.
Foreign exports need to check whether the current OS thread owns the
lock (TLS access), and wait for the lock if it doesn't.
And a fourth one:
Your "single true OS thread" runs all *unbound* Haskell threads and
your scheduler (on separate stacks). Bound threads run in their own
OS threads, and the scheduler will use OS thread primitives (foreign
imported nonconcurrent) to run them and to wait for their time slice
to finish (if you're doing preemption) or for them to block. It's
like "if target thread is bound, use OS thread primitives, else use
State Thread primitives to pass control to it". Foreign import
nonconcurrent is a plain call. Foreign import concurrent from a bound
thread releases the lock before calling (and waits for it
afterwards). From an unbound thread you could implement it the way
you planned to.
Call-ins would need that one stack pointer range check, and wait for
a lock if it fails.
> well, the cost of bound threads is not the cost of the call itself, it
> is that they must be serialized. foreign concurrent calls can run
> concurrently to haskell code and each other. but 'bound' foreign calls
> must wait for their dedicated thread to become available. I think
> there
> needs to be an annotation as to which functions require boundness so
> suddenly all foreign calls arn't serialized just because they are in a
> 'forkOS'ed thread.
I think you're mistaken here. For every given OS thread, *at most
one* Haskell thread will ever be bound to it. So when you make a
foreign call, you can be sure that the OS thread that is supposed to
execute it is available *right now*, because the only Haskell thread
that has the right to cause code to be executed in the bound OS
thread has obviously finished any previous foreign calls. No need to
wait. Ever.
In the bound threads proposal, the only basic method to create a
bound thread is a call to a foreign export (or to main). The
resulting thread of Haskell execution is bonund to the OS thread that
made the call. The library function forkOS is just a call to
pthread_create or it's Windows equivalent.
It would be possible to add an annotation to "foreign export" that
states that "the Haskell thread that results from a call-in to this
function does not need to be bound" (if that would improve
performance), but back when "bound threads" were born we decided
against it to keep things simple; instead, we added
"runInUnboundThread :: IO a -> IO a" to keep the people who were
concerned about performance happy, and I have yet to see it used by
anyone.
>> ==== On the implementability of "concurrent reentrant" ====
>> [...] b) The runtime thread will need to periodically check
>> whether an
>> interthread message has arrived, and if there is no work, block
>> waiting for it. The fast path of checking whether something has been
>> posted to the message queue is fast indeed - you just have to check a
>> global flag.
> I'd integrate it into the EDSM loop somehow (futex maybe) as I have a
> moral adversion for periodic checking of anything.
Yes, I was thinking of checking the flag just as a way to avoid
calling epoll if a foreign call is already waiting. Basically, you'll
want to write a message to a pipe so that it gets picked up by the
EDSM loop. After all, a pipe is nothing but a traditional unix-style
message queue :-).
> the main thing is that it is a cost paid by every foreign export.
> perhaps a flag saying "this will only be called nonconcurrently on
> exports" though, perhaps that can be an x-flag if other compilers
> can't
> take advantage of it.
Yes, such an x-flag would be entirely reasonable if those few
instructions for a stack pointer range test become significant. Those
functions will, of course, be very rare.
> how prevelant is support for __thread BTW? is it required by any
> standards or an ELFism?
It is an ELFism by birth; __declspec(thread) is the equivalent
MSVCism, and I don't know of any other equivalent features (but then
my horizon doesn't extend much beyond the Mac/Linux/Windows triad). I
hope that Apple and the mingw32 team soon implement it, too, but
that's just an unfounded hope.
That's all for now,
Cheers,
Wolfgang
More information about the Haskell-prime
mailing list