[Yhc] some initial questions

Fri Mar 3 05:29:31 EST 2006

> That is very good to hear.  It seems to me that the two biggest issues are:
> 
> -- supporting multiple execution stacks
> -- blocking I/O and FFI calls
> 
> How did you decide to handle these problems (mostly for my curiousity)?

You're quite right, those were two of the biggest issues :-)

The first thing to note is that concurrency is implemented in Yhc by the 
interpretter running some instructions for one process, then some more 
instructions for the next process and so on.

This is rather than using one OS thread per Haskell process because all 
processes can access the global heap, so if using OS threads it would be 
necessary to either lock every heap access (incredibly slow) or use some 
of the truely nasty tricks that concurrent GHC uses.

The first problem, as you note, is that originally we only had one stack 
but now every process needs its own stack. After considering possible 
options I decided the easiest way was:
    - process stacks become normal nodes in the heap, and are garbage
      collected like other heap nodes. This preserves the property that
      we can't run out of heap but not stack (and vice versa).

    - a process is initially allocated a small stack space in the heap
      and if that runs out we simply allocate a new stack space (twice as
      large as before) and let the old stack be garbage collected.

    - I considered using stacks spread out over multiple 'pages', but in
      practice that I found that
        a) it's really hard to implement correctly, especially when you
           consider garbage collection can shift all your addresses
           around (don't forget frames on the stack reference the
           previous frame, possibly in a different stack page!)

        b) most stacks don't grow very large so there's little penalty 
to
           just throwing an old stack away if it gets too small.

    - it would also be easy to recycle old stacks i.e. perhaps one
      process's discarded stack would be the right size for another
      process. This would be especially true if stacks came in standard
      sizes (64,128,256 etc). I haven't implemented this yet.

Now going back, the approach of running some instructions from one 
thread then some more from another is fine providing every instruction 
runs quickly. They all do, with one exception, PRIMITIVE. This is 
because PRIMITIVE is used for IO actions/FFI calls.

The solution? Well the basic idea is "just spawn an OS thread to do the 
IO action while the main thread continues running Haskell".

In detail, when PRIMITIVE is called for an IO action/FFI call:

    - an OS thread is taken from a pool of threads

    - the main thread tells the OS thread to run the IO action/FFI call

    - the main thread then reschedules so another haskell process can run

    - eventually the OS thread complete it's action and is put back in
      the pool. It adds the process to the list of processes that have
      completed their IO actions and are waiting to be rescheduled.

    - when the main thread next reschedules it checks the list of
      completed IO actions and makes the corresponding blocked processes
      ready to run again.

There is a simple optimisation used which is if there is only one ready 
process in the program then it doesn't bother using a seperate thread, 
it just does the IO action directly.

Hope that answers your questions :-)

Cheers

Tom Shackell