[Haskell-cafe] Re: Bound threads

Thu Mar 3 02:38:14 EST 2005

Marcin Kowalczyk wrote:

> Indeed, my brain is melting, but I did it :-)

Congratulations. How about we found a "Bound-thread-induced brain melt 
victims' support group"?

> [...] I have added some optimizations:

I think we had thought of most of these optimizations, but things were 
already very complex, so I kept putting that off until we decided to do 
IO differently, at which point they were no longer necessary.

Besides simplicity, one of the main reasons for moving our select() 
call from the run-time system to the libraries was to avoid the 
performance hit of having to call select() every time through the 
scheduler loop rather than only once per IO operation.
Imagine having one or more (unbound) threads that spend most of their 
time waiting for IO, and a bunch of (also unbound) threads that do some 
computation. If select() is part of the scheduler loop, you will get a 
select() call whenever a thread-switch between the computation threads 
happens.
If, on the other hand, the select() call happens in a separate OS 
thread, you will get extra inter-OS-thread messaging once select() 
returns, but that happens far less often than a thread switch between 
the computation threads.
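
From the Haskell side, this looks roughly like the sketch below. It is only an illustration: it assumes the unix package is available, and the name waitWhileComputing is made up for the example. One thread blocks in threadWaitRead on a pipe while the main thread computes, and the waiting thread is only woken once the pipe becomes readable.

```haskell
import Control.Concurrent (forkIO, threadWaitRead)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import System.Posix.IO (closeFd, createPipe, fdWrite)

-- Hypothetical helper: compute a sum while another Haskell thread
-- is blocked waiting for a pipe to become readable.
waitWhileComputing :: IO Integer
waitWhileComputing = do
  (readEnd, writeEnd) <- createPipe
  woken <- newEmptyMVar
  -- The IO-bound thread: blocks only itself until readEnd is readable.
  _ <- forkIO $ do
    threadWaitRead readEnd
    putMVar woken ()
  -- The "computation" part, forced before we make the pipe readable.
  let total = sum [1 .. 100000 :: Integer]
  total `seq` return ()
  -- Make the pipe readable, then wait until the IO thread was woken.
  _ <- fdWrite writeEnd "x"
  takeMVar woken
  closeFd readEnd
  closeFd writeEnd
  return total
```

How the wait is implemented underneath (in the scheduler loop or in a separate OS thread) is invisible to this code; only the cost differs.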

> Bound threads introduced problems. They can partially be solved,
> e.g. the worker pool, the wakeup pipe, epoll descriptor are correctly
> recreated. But there is simply no way to return from callbacks because
> the corresponding C contexts no longer exist. So I made them as
> follows:
>
> All threads except the thread performing the fork become unbound.
> [...]

What happens when fork is called from an unbound thread? Does it become 
bound in the child process?
When does the child process terminate? Does the thread that called fork 
gain a "main thread" status so that the process will exit as soon as 
the thread exits?
For GHC, we side-stepped all those issues by only providing a 
simplified version of forkProcess:

System.Posix.forkProcess :: IO () -> IO System.Posix.Types.ProcessID

In the child process, only the IO action given as a parameter will run, 
and once it returns, the child process will terminate. This covers most 
use cases of fork and is, to the best of my knowledge, the most general 
version that can be implemented with the same semantics for both GHC's 
threaded RTS and GHC's non-threaded RTS.
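
A minimal usage sketch, assuming the unix package; runInChild and childSucceeded are names invented for this example, not part of any library. The parent forks a child that runs only the given action, then waits for the child to terminate.

```haskell
import System.Exit (ExitCode (..))
import System.Posix.Process
  ( ProcessStatus (..), forkProcess, getProcessStatus )

-- Hypothetical helper: run an IO action in a child process and
-- block in the parent until the child has terminated.
runInChild :: IO () -> IO (Maybe ProcessStatus)
runInChild action = do
  pid <- forkProcess action
  -- First argument True: block until the child changes state.
  -- Second argument False: do not report stopped children.
  getProcessStatus True False pid

-- Hypothetical helper: did the child exit normally with success?
childSucceeded :: Maybe ProcessStatus -> Bool
childSucceeded (Just (Exited ExitSuccess)) = True
childSucceeded _                           = False
```

For example, `runInChild (return ()) >>= print . childSucceeded` forks a child whose action returns immediately and then reports whether it exited cleanly.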

> I measured the speed of some syscalls on my system, to see what is
> worth optimizing:
>
> - pthread_mutex_lock + unlock (NPTL)   0.1 us

pthread_mutex_* is not necessarily a syscall. When there is no 
contention, the NPTL is able to do it entirely in user space without a 
context switch. The kernel only gets involved when a thread is actually 
suspended waiting for a lock. The situation is the same on Mac OS X, 
but Microsoft's multithreading API is rumored to be a lot slower 
(kernel calls for every lock/unlock - I haven't checked this, though).

Cheers,
	Wolfgang