[Haskell-cafe] How to ensure code executes in the context of a specific OS thread?

Wed Jul 6 21:58:16 CEST 2011

On 06/07/11 17:19, Gábor Lehel wrote:
> On Wed, Jul 6, 2011 at 5:24 PM, Jason Dagit<dagitj at gmail.com>  wrote:
>> On Wed, Jul 6, 2011 at 8:09 AM, Simon Marlow<marlowsd at gmail.com>  wrote:
>>> On 06/07/2011 15:42, Jason Dagit wrote:
>>>>
>>>> On Wed, Jul 6, 2011 at 2:23 AM, Simon Marlow<marlowsd at gmail.com>    wrote:
>>>>>
>>>>> On 06/07/2011 07:37, Jason Dagit wrote:
>>>>>>
>>>>>> On Jul 5, 2011 1:04 PM, "Jason Dagit"<dagitj at gmail.com> wrote:
>>>>>> >
>>>>>> > On Tue, Jul 5, 2011 at 12:33 PM, Ian Lynagh<igloo at earth.li> wrote:
>>>>>> > > On Tue, Jul 05, 2011 at 08:11:21PM +0100, Simon Marlow wrote:
>>>>>> > >>
>>>>>> > >> In GHCi it's a different matter, because the main thread is running
>>>>>> > >> GHCi itself, and all the expressions/statements typed at the prompt
>>>>>> > >> are run in forkIO'd threads (a new one for each statement, in fact).
>>>>>> > >> If you want a way to run command-line operations in the main thread,
>>>>>> > >> please submit a feature request.  I'm not sure it can be done, but
>>>>>> > >> I'll look into it.
>>>>>> > >
>>>>>> > > We already have a way: -fno-ghci-sandbox
>>>>>> >
>>>>>> > I've removed all my explicit attempts to forkIO/forkOS and passed the
>>>>>> > command line flag you mention.  I just tried this but it doesn't
>>>>>> > change the behavior in my example.
>>>>>>
>>>>>> I tried it again and discovered that, due to an argument parsing bug in
>>>>>> cabal-dev, the flag was not passed correctly. I explicitly passed it
>>>>>> and verified that it works. Thanks for the workaround. By the way, I did
>>>>>> look at the user guide for options like this and didn't see it. Which
>>>>>> part of the manual is it in?
>>>>>>
>>>>>> Can I still make a feature request for a function to make code run on
>>>>>> the original thread? My reasoning is that the code which needs to run on
>>>>>> the main thread may appear in a library in which case the developer has
>>>>>> no control over how ghc is invoked.
>>>>>
>>>>> I'm not sure how that would work.  The programmer is in control of what
>>>>> the main thread does, not GHC.  So in order to implement some mechanism
>>>>> to run code on the main thread, we would need some cooperation from the
>>>>> main thread itself.  For example, in gtk2hs the main thread runs an event
>>>>> handler loop which occasionally checks a queue for requests from other
>>>>> threads (at least, I think that's how it works).
>>>>
>>>> What I'm wrestling with is the following.  Say I make a GUI library.
>>>> As author of the GUI library I discover issues like this where the
>>>> library code needs to execute on the "main" thread.  Users of the
>>>> library expect the typical Haskell environment where you can't tell
>>>> the difference between threads, and you fork at will.  How can I make
>>>> sure my library works from GHC (with arbitrary user threads) and from
>>>> GHCI?
>>>>
>>>> As John Lato points out in his email lots of people bump into this
>>>> without realizing it and don't understand what the problem is.  We can
>>>> try our best to educate everyone, but I have this sense that we could
>>>> also do a better job of providing primitives to make it so that code
>>>> will run on the main thread regardless of how people invoke the
>>>> library.
>>>>
>>>> In my specific case (Cocoa on OSX), it is possible for me to use some
>>>> Cocoa functions to force things to run on the main thread.  From what
>>>> I've read Cocoa uses pthreads to implement this. I was hoping we could
>>>> expose something from the RTS code in Control.Concurrent so that it's
>>>> part of an "official" Haskell API that library writers can assume.
>>>>
>>>> Judging by this SO question, it's easier to implement this in Haskell
>>>> on top of pthreads than to implement it in C (here I'm assuming GHC's
>>>> RTS uses pthreads, but I've never checked):
>>>>
>>>> http://stackoverflow.com/questions/6130823/pthreads-perform-function-on-main-thread
>>>>
>>>> In fact, it sounds like what Gtk2hs is doing with the postGUI
>>>> functions.
>>>
>>> Right, but usually the way this is implemented is with some cooperation from
>>> the main thread.  That SO answer explains it - the main thread runs some
>>> kind of loop that periodically checks for requests from other threads and
>>> services them.  I expect that's how it works on Cocoa.
>>> So you can't just do this from a library - the main thread has to be in on
>>> the game.
>>
>> Yes.  From my perspective (that of a library writer) that's what makes
>> this tricky in GHCi.  I need GHCi's cooperation.  From GHCi's
>> perspective it's tricky too.
>>
>>> I suppose you might wonder whether the GHC RTS could implement
>>> runInMainThread by preempting the main thread and running some different
>>> code on it.
>>
>> Yes, that's roughly what I was wondering about.
>
>
> There's more than one reason why a (GUI) library might require
> functions to be called only from the main thread. One is if the
> library uses thread-local storage, in which case the code needs to run
> in the right thread to see the right data. I've heard that OpenGL is
> like this. Another (more common, as far as I know) reason is if (parts
> of) the library aren't thread safe, and can't handle more than one
> thread at a time simultaneously calling its functions and mutating its
> members. I'm not sure if there are other reasons.

Yes, this we know (see the Concurrency/FFI paper I linked earlier).

> In the second (thread safety) case, if you preempt the main thread in
> the middle of whatever it was doing to use it to call some function
> from the library, the effect would, I think, be the same as if the OS
> had preempted it to execute some other thread which then called the
> function, and you would be violating the library's
> one-thread-at-a-time expectation in pretty much the same exact way. So
> I don't think you would gain anything useful by doing this. The main
> thread needs to be interrupted at 'safe points', which is what the
> event loop lets you do, but the event loop is part of the GUI library,
> and not part of the GHC runtime, so GHC doesn't know about it and
> can't tell it what to do - only the library bindings can.
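
(A minimal sketch of the "safe points" idea in the paragraph above, and not
gtk2hs's actual postGUI implementation.  The names OwnerQueue, runOnOwner,
stopOwner and serviceRequests are made up for illustration.  One thread owns
the library and services a queue of requests at points of its own choosing;
every other thread hands its action to the owner and waits for the result.)

```haskell
import Control.Concurrent   -- also re-exports the MVar and Chan modules
import Control.Exception (SomeException, throwIO, try)

-- | Hypothetical handle: a queue of actions waiting for the owner thread.
-- 'Nothing' is the request to stop servicing.
newtype OwnerQueue = OwnerQueue (Chan (Maybe (IO ())))

newOwnerQueue :: IO OwnerQueue
newOwnerQueue = OwnerQueue <$> newChan

-- | Called from any thread: enqueue an action and block until the owner
-- thread has run it.  Exceptions raised by the action are rethrown here.
runOnOwner :: OwnerQueue -> IO a -> IO a
runOnOwner (OwnerQueue q) action = do
  reply <- newEmptyMVar
  writeChan q (Just (try action >>= putMVar reply))
  takeMVar reply >>= rethrow

rethrow :: Either SomeException a -> IO a
rethrow = either throwIO return

-- | Called from any thread: ask the owner loop to return.
stopOwner :: OwnerQueue -> IO ()
stopOwner (OwnerQueue q) = writeChan q Nothing

-- | Run by the owner thread: execute queued actions one at a time, in
-- order, until 'stopOwner' is requested.
serviceRequests :: OwnerQueue -> IO ()
serviceRequests (OwnerQueue q) = loop
  where
    loop = do
      next <- readChan q
      case next of
        Nothing  -> return ()
        Just job -> job >> loop
```

A real binding would usually not block in serviceRequests, because the
toolkit's own event loop occupies the main thread.  It would instead drain
the queue from an idle or timer callback that the toolkit invokes, which is
exactly the cooperation from the main thread that this approach depends on.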

I think you misunderstand what I meant by preemption.  I was talking 
about preempting the Haskell thread, not the OS thread.  Haskell threads 
are preempted by other Haskell threads all the time.

> Stated another way: I suspect most GUI libraries don't really actually
> care that you only execute GUI code from the main OS thread, as much
> as they care that only one (thread-unsafe) GUI function is being
> called at any given time.

No, this is not true.  Some GUI libraries really have to be called from 
one specific thread (e.g. Win32).
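
For reference, Control.Concurrent already covers the "one specific OS
thread" case through bound threads: forkOS and runInBoundThread tie a
Haskell thread to a dedicated OS thread, and isCurrentThreadBound reports
whether the caller is bound.  A small sketch follows; boundWhenForkedWith
and callFromBoundThread are hypothetical helper names, not library
functions.  Note that this gives you *a* fixed OS thread, not the process's
main OS thread, so on its own it does not solve the GHCi/Cocoa problem
discussed in this thread.

```haskell
import Control.Concurrent

-- | Fork a thread with the given fork function (forkIO or forkOS) and
-- report whether that new thread is bound to a dedicated OS thread.
boundWhenForkedWith :: (IO () -> IO ThreadId) -> IO Bool
boundWhenForkedWith fork = do
  result <- newEmptyMVar
  _ <- fork (isCurrentThreadBound >>= putMVar result)
  takeMVar result

-- | Run an action that uses thread-local foreign state on a bound thread.
-- Assumption: with the non-threaded RTS all Haskell code already runs on a
-- single OS thread, so the action is simply run in place there.
callFromBoundThread :: IO a -> IO a
callFromBoundThread action
  | rtsSupportsBoundThreads = runInBoundThread action
  | otherwise               = action
```

forkOS and runInBoundThread need a program linked with -threaded, which is
why the sketch checks rtsSupportsBoundThreads before using them.  In a
compiled program linked that way, main itself runs in a bound thread.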

Cheers,
	Simon

> If you only ever call GUI code from the same
> (main) OS thread, that fulfills this requirement, because an OS thread
> is only capable of running one library function at a time;
> alternately, if you only ever call GUI code from the same Haskell
> thread, that also fulfills this requirement, because one Haskell
> thread is also only capable of running one library function at a time,
> even if its execution might jump between different OS threads along
> the way. (If you were writing code in the library's native language,
> and as part of your own code for processing an event in the main
> thread, stopped the main thread, used a different thread to execute
> some GUI functions, and then returned control to the main thread, I
> suspect that would also be safe, though there tends not to be any
> reason to want to do this.)
>
> Basically: In the context of GHC/Haskell, I think you need to separate
> the concept of "thread of execution", which is what the GUI libraries
> care about, from the concept of "OS threads", which nearly all of the
> time correspond to the threads of execution, but in this case, don't.
> (Or rather, do, but in a very different way from the usual.)
>
> These are impressions I've gained from reading the docs (such as
> the paper Simon just linked) and thinking about it. If anyone more
> knowledgeable sees that I'm mistaken, please correct me.
>
>
>>
>>>   In theory that's possible, but whether it's a good idea or not
>>> is a different matter!  I think it amounts to the same thing as the gtk2hs
>>> folks have been asking for - multiple Haskell threads bound to the same OS
>>> thread.
>>
>> I'm starting to realize that I don't understand the GHC threading
>> model very well :)  I thought that was already possible.  I may be
>> mixing GHC's thread model up with other language implementations, but
>> I thought that it had a pool of OS threads and that Haskell threads
>> ran on them as needed.  I think what you're saying is that the RTS has
>> bound threads and it has thread pooling, but what it doesn't have is
>> "bound thread pooling" (that is, the combination of being bound and
>> pooled).
>>
>>>   runInMainThread then becomes the same as forking a temporary new
>>> thread bound to the main OS thread, or temporarily binding the current
>>> thread to the main OS thread.  If the main OS thread is off making a foreign
>>> call (e.g. in the GUI library's main loop) then it can't run any other
>>> Haskell threads anyway, and then I have to figure out what to do with all
>>> these Haskell threads waiting for their bound OS thread to come back from
>>> the foreign call.  My guess is that all this would be pretty complex to
>>> implement.
>>
>> Yes it does sound complex.  I'd really like help as much as possible.
>> I know very little about GHC internals but perhaps I could take a look
>> at some of the RTS code.  Is there some background reading I could do?
>>   Perhaps a specific reference to a paper or wiki page?
>>
>>> Still, I'm all for making things easier somehow.  At the least, we should
>>> have good diagnostics when you're using GHCi and this goes wrong.  Although
>>> I'm not sure how to do that, I think it's really something the gtk2hs or
>>> Cocoa binding needs to implement.  Do you have a way to check whether you're
>>> on the main thread or not?
>>
>> pthread_main_np is the only way I've stumbled across:
>> https://www.mirbsd.org/htman/i386/man3/pthread_main_np.htm
>>
>> Jason