FFI, safe vs unsafe

Thu Apr 13 00:26:22 EDT 2006

On Wed, Apr 12, 2006 at 11:37:57PM -0400, Wolfgang Thaller wrote:
> John Meacham wrote:
> 
> >However, in order to achieve that we would have to annotate the  
> >foreign
> >functions with whether they use thread local state.
> 
> I am not opposed to that; however, you might not like that here  
> again, there would be the safe, possibly inefficient default choice,  
> which means "might access thread local data", and the possibly more  
> efficient annotation that comes with a proof obligation, which says  
> "guaranteed not to access thread local data".
> The main counterargument is that some libraries, like OpenGL require  
> many *fast* nonconcurrent, nonreentrant but tls-using calls (and,  
> most likely, one reentrant and possibly concurrent call for the GLUT  
> main event loop). Using OpenGL would probably be infeasible from an  
> implementation which requires a "notls" annotation to make foreign  
> imports fast.

this is getting absurd. 95% of foreign imports are going to be
nonreentrant, nonconcurrent, nonthreadlocalusing. Worrying about the
minor inconvenience of the small chance that someone might accidentally
write buggy code is silly when you have 'peek' and 'poke' and the
ability to just deadlock right out there in the open.

The FFI is inherently unsafe. We do not need to coddle the programmer who
is writing raw FFI code.

_any_ time you use the FFI there are a large number of proof obligations
you are committing to that aren't necessarily apparent, so why make these
_very rare_ cases so visible? There is a reason they aren't named
'unsafePoke' and 'unsafePeek': the convenience of using the names poke
and peek outweighs the unsafety concern because you are already using
the FFI and already know everything is unsafe and you need to be
careful. these problems can't even crash the runtime, which makes them
way safer than a lot of the unannotated routines in the FFI.
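
For illustration only (a sketch, not part of the original message): the
existing FFI already attaches a proof obligation to the 'unsafe' keyword,
while 'peek' and 'poke' carry none in their names. The example below uses
only the standard 'safe'/'unsafe' annotations; the concurrency and
thread-local annotations discussed in this thread were proposals and are
not shown. The names c_sin, c_abs and roundTrip are made up for the example.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.Types (CDouble (..), CInt (..))
import Foreign.Marshal.Alloc (alloca)
import Foreign.Storable (peek, poke)

-- 'unsafe' is a promise by the programmer: this C function never calls
-- back into Haskell. The compiler cannot check that promise.
foreign import ccall unsafe "math.h sin"
  c_sin :: CDouble -> CDouble

-- 'safe' is the default: the call may re-enter Haskell, so the
-- implementation has to do more bookkeeping around it.
foreign import ccall safe "stdlib.h abs"
  c_abs :: CInt -> IO CInt

-- peek and poke are not called "unsafe", yet nothing stops a caller from
-- handing them a bad pointer. Here the pointer comes from alloca, so the
-- round trip is fine.
roundTrip :: CInt -> IO CInt
roundTrip x = alloca $ \p -> do
  poke p x
  peek p

main :: IO ()
main = do
  print (c_sin 0)
  c_abs (-3) >>= print
  roundTrip 42 >>= print
```

Swapping 'unsafe' for 'safe' on c_sin changes only the cost of the call,
not its result, which is the trade-off being argued about here.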

> >it would pretty much
> >be vital for implementing them efficiently on a non OS-threaded
> >implementation of the language.
> 
> True, with the implementation plan you've outlined so far.
> Have you considered hybrid models where most threads are state  
> threads (all running in one OS thread) and a few threads (=the bound  
> threads) are OS threads which are prevented from actually executing  
> in parallel by a few well-placed locks and condition variables? You  
> could basically write a wrapper around the state threads and  
> pthreads libraries, and you'd get the best of both worlds. I feel it  
> wouldn't be that hard to implement, either.

well, I plan a hybrid model of some sort, simply because it is needed to
support foreign concurrent calls. exactly where I will draw the line
between them is still up in the air.

but in any case, I really like explicit annotations on everything: we
can't predict what future implementations might come about, so we should
play it safe in the standard.

> >Oddly enough, depending on the implementation it might actually be
> >easier to just make every 'threadlocal' function fully concurrent. you
> >have already paid the cost of dealing with OS threads.
> 
> Depending on the implementation, yes. This is the case for the  
> inefficient implementation we recommended for interpreters like Hugs  
> in our bound threads paper; there, the implementation might be  
> constrained by the fact that Hugs implements cooperative threading in  
> Haskell using continuation passing in the IO monad; the interpreter  
> itself doesn't even really know about threads. For jhc, I feel that a  
> hybrid implementation would be better.

yeah, what I am planning is just providing a 'create new stack' and a
'jump to a different stack' (longjmp) primitive, with everything else
being implemented in haskell as part of the standard libraries (with
liberal use of the FFI to call things like pthread_create and epoll).

so actually fairly close to the hugs implementation in that it is mostly
haskell based, but with some better primitives to work with. (from what
I understand of how hugs works)
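
For illustration only (a sketch, not jhc's or hugs's actual code): the
continuation-passing style of cooperative threading mentioned above can be
written in plain haskell. Each thread is a step that does some IO and then
either finishes or returns the rest of itself; a round-robin scheduler runs
one step at a time. The names Thread, worker, schedule and runDemo are
hypothetical.

```haskell
module Main where

import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- One step of a cooperative thread: run some IO, then either finish
-- (Nothing) or hand back the continuation of the thread (Just).
newtype Thread = Thread (IO (Maybe Thread))

-- A thread that records n steps in a shared log, yielding after each one.
worker :: IORef [String] -> String -> Int -> Thread
worker logRef name n = go 1
  where
    go i
      | i > n = Thread (return Nothing)
      | otherwise = Thread $ do
          modifyIORef' logRef ((name ++ show i) :)
          return (Just (go (i + 1)))

-- Round-robin scheduler: run the first thread for one step, then move
-- its continuation (if any) to the back of the queue.
schedule :: [Thread] -> IO ()
schedule [] = return ()
schedule (Thread step : rest) = do
  next <- step
  case next of
    Nothing -> schedule rest
    Just t -> schedule (rest ++ [t])

-- Run two workers and return the log in the order the steps happened.
runDemo :: IO [String]
runDemo = do
  logRef <- newIORef []
  schedule [worker logRef "a" 2, worker logRef "b" 2]
  reverse <$> readIORef logRef

main :: IO ()
main = runDemo >>= mapM_ putStrLn
```

The two workers interleave one step at a time. A stack-switching primitive
would let a thread yield from anywhere instead of only between steps.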

> >you seem to be contradicting yourself, above you say a performance
> >penalty is vitally important in the GUI case if a call takes too  
> >long, [...]
> 
> I am not. What I was talking about above was not performance, but  
> responsiveness; it's somewhat related to fairness in scheduling.
> If a foreign call takes 10 microseconds instead of 10 nanoseconds,  
> that is a performance penalty that will matter in some circumstances,  
> and not in others (after all, people are writing real programs in  
> Python...). If a GUI does not respond to events for more than two  
> seconds, it is badly written. If the computer or the programming  
> language implementation are just too slow (performance) to achieve a  
> certain task in that time, the Right Thing To Do is to put up a  
> progress bar and keep processing screen update events while doing it,  
> or even do it entirely "in the background".
> Of course, responsiveness is not an issue for non-interactive  
> processes, but for GUIs it is very important.

at some point, people might just decide that their program requires an
OS threaded implementation and that is fine. that is why I want it as an
explicit option so the manual can say "this needs a compiler that
supports the OS threading option" rather than "this needs GHC". 

> >Who is to say whether an app that
> >muddles along is better or worse than one that is generally snappy but
> >has an occasional delay.
> 
> I am ;-). Apart from that, I feel that is a false dichotomy, as even  
> a factor 1000 slowdown in foreign calls is no excuse to make a GUI  
> "generally muddle along".

jhc has no concept of primitive. everything you can do is imported via
the FFI from arithmetic routines to IO. it is pretty nice actually, my
standard libraries can almost be compiled and tested on other haskell
compilers and everything available is right there in the source with
haddock comments (well, working on those). but a 1000x slowdown in FFI
calls would mean a 1000x slowdown in pretty much everything compiled
with jhc. 

I subscribe to the idea that nothing should be built into the compiler.
if a user wants to define their own datatypes, they should optimize
just as well as the isomorphic built-in types, and if they want their own
primitives, the implementation should have no advantage.

> As for your claim about the relative rarity of such calls, I see that  
> your bias is very different from mine. My world consists mostly of  
> console-based programs that compute something (compilers, etc.) and  
> don't need any FFI or concurrency to speak of, and of interactive  
> graphical applications (GUIs + games) for which the standard  
> libraries are only a tiny fragment of the FFI world. In your world,  
> network servers (or applications with a similar structure) seem to  
> figure more prominently.

I am also thinking of ginsu (my chat client) and other interactive stuff
that responds to events like that, for which the basic cooperative model
is more than good enough. Some programs will require the OS threading
option (where everything can be pretty much concurrent whether you like it
or not) but there is a large and rich body of programs which won't.

FWIW (pretty much) every gtk app follows the cooperative model, they
have a single event loop with cooperative scheduling where you must
explicitly code in your continuations via wacky idle callbacks and
whatnot. and many many things have been implemented with it just fine,
while even just the bare minimum cooperative haskell system is worlds
better off.

        John

-- 
John Meacham - ⑆repetae.net⑆john⑈