Proposed change to ForeignPtr

Tue Sep 10 18:01:22 EDT 2002

> [snip] No, you do not really need separate threads for this problem
> to occur.  All you need is, say, Hugs to call a GHC-exported
> function as a finalizer, in the same OS thread, GHC to run a garbage
> collection during this function, and the garbage collection in turn
> to want to run a Hugs finalizer, before the finalizer called from
> Hugs has finished. 

The Hugs and GHC runtimes can talk to each other just fine (or, if
they can't, it's a simple oversight and we'll fix it).

There's no problem with GHC and Hugs each telling each other that some
object they own has one less pointer to it.  Next time it is
convenient for the runtime, it can run a GC, perhaps recognize that
there are a few GHC objects it can release, and tell the GHC runtime
that it can release them.

We can do this because it has limited scope: just a few
data structures have to be tweaked to avoid GHC coming in when Hugs'
data structures are in an inconsistent state.
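
As a rough illustration of how narrow such an interface can be, here
is a sketch in C.  It is not code from Hugs or GHC; SharedObj,
note_pointer_dropped and release_unreferenced are names made up for
the example.  The call made by the other runtime touches one counter
and nothing else, and the owner only acts on it later, at a time of
its own choosing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for an object owned by one runtime and
   referenced from the other. */
typedef struct {
    int foreign_refs;   /* pointers held by the other runtime */
    int released;       /* set once the owner has let go of it */
} SharedObj;

/* Called by the other runtime, possibly in the middle of its GC.
   It only adjusts one counter: no allocation, no Haskell code. */
void note_pointer_dropped(SharedObj *o) {
    assert(o != NULL);
    if (o->foreign_refs > 0)
        o->foreign_refs--;
}

/* Called by the owning runtime when convenient (say, at its next GC).
   Returns how many objects were released on this pass. */
size_t release_unreferenced(SharedObj *objs[], size_t n) {
    size_t freed = 0;
    for (size_t i = 0; i < n; i++) {
        if (!objs[i]->released && objs[i]->foreign_refs == 0) {
            objs[i]->released = 1;   /* stand-in for the real release */
            freed++;
        }
    }
    return freed;
}
```

The only state shared between the two sides is the counter, so that is
the only thing that has to be kept consistent across the call.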

It's quite a different matter to allow arbitrary Haskell code to be
run - that means the entire runtime system and libraries have to be
made reentrant.

> Of course one wouldn't normally want to link GHC
> from Hugs, but if even these two cannot be made to meet, I don't
> know how you expect Haskell to call anything else with a reasonably
> flexible GC system; it puts the kybosh on Java for example, which I
> am fairly sure makes plenty of use of both callbacks and finalizers.

That's fine, they can have all the finalizers they want.
The finalizers can fiddle with things in the runtime system
to tell the GC whatever they want.

> In any case it seems to me just as dangerous to assume that the
> implementation does not use OS threads, as to assume it does.

The internal structure of your apps is up to you - use locks to avoid
using single-threaded code in a multithreaded manner.

> You are effectively writing on top of the FFI document "If your
> program does this perfectly reasonable combination of finalizers, it
> will fall over in an undefined way should the implementation use OS
> threads; furthermore there is no way around this".  Basically the
> fact that there is only one OS thread is an implementation detail,
> not something that the user should have to think about.

Programmers are used to dealing with code which is single threaded or
not reentrant.  It's quite common.

> Is it really the case that neither NHC nor Hugs can implement a list
> of actions to be taken at the first convenient point after GC has
> finished without implementing the whole machinery of preemptive
> concurrency?  I take Malcolm Wallace's word for it that it isn't
> trivial, but why do you need for example asynchronous interruption
> of Haskell threads, wait queues, or time slices?  Surely what you
> need is some way of backing up the state upon return from GC in such
> a way that you can run the queued IO actions, which may be hard but
> is a long way off preemptive concurrency.

The way GHC implements preemption is an optimized form of: set a bit
when preemption is needed; and make sure that generated code will test
that bit whenever it is in a position to perform a context switch.
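
The unoptimized form of that scheme might look roughly like the
following C sketch.  The names (preempt_requested, safe_point,
run_steps) are invented for the illustration and the "context switch"
is just a counter; GHC's actual mechanism is more involved.

```c
#include <assert.h>
#include <signal.h>

/* Set asynchronously (e.g. from a timer signal handler), tested
   synchronously by the running code. */
static volatile sig_atomic_t preempt_requested = 0;
static int context_switches = 0;

/* Asks for a switch; does not perform one. */
void request_preemption(void) {
    preempt_requested = 1;
}

/* Placed by the code generator only where a switch is safe. */
void safe_point(void) {
    if (preempt_requested) {
        preempt_requested = 0;
        context_switches++;   /* stand-in for the real context switch */
    }
}

/* A toy mutator: n steps of work with a safe point after each one. */
int run_steps(int n) {
    assert(n >= 0);
    int done = 0;
    for (int i = 0; i < n; i++) {
        done++;               /* work that must not be interrupted */
        safe_point();
    }
    return done;
}
```

Requests made between safe points are coalesced: the flag records that
a switch is wanted, not how many times it was asked for.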

What you're asking Hugs and NHC to do is: add a function to a list
whenever you have a finalizer to run; make sure the interpreter will
test that list whenever it is in a position to perform a context
switch.
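
A minimal C sketch of such a list, under the assumption of a
fixed-size table and with all names (schedule_finalizer,
run_pending_finalizers, count_finalizer, MAX_PENDING) invented for the
example.  It is not how Hugs or NHC are actually structured.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*Finalizer)(void *);

#define MAX_PENDING 64   /* arbitrary limit for the sketch */

static struct { Finalizer fn; void *arg; } pending[MAX_PENDING];
static size_t n_pending = 0;

/* Called by the GC when it finds a dead object with a finalizer.
   Records the work only.  Returns 0 on success, -1 if the list is full. */
int schedule_finalizer(Finalizer fn, void *arg) {
    assert(fn != NULL);
    if (n_pending == MAX_PENDING)
        return -1;
    pending[n_pending].fn = fn;
    pending[n_pending].arg = arg;
    n_pending++;
    return 0;
}

/* Called by the interpreter wherever a context switch would be safe.
   Returns the number of finalizers run (most recently queued first). */
size_t run_pending_finalizers(void) {
    size_t ran = 0;
    while (n_pending > 0) {
        /* Take the entry off the list before calling it, so that a
           finalizer which schedules another sees a consistent list. */
        n_pending--;
        Finalizer fn = pending[n_pending].fn;
        void *arg = pending[n_pending].arg;
        fn(arg);
        ran++;
    }
    return ran;
}

/* Example finalizer: bumps the int counter passed as its argument. */
void count_finalizer(void *arg) {
    (*(int *)arg)++;
}
```

The list itself is the easy part.  What costs is that
run_pending_finalizers may now run arbitrary code at every point where
the interpreter calls it.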

It's basically the same.  We don't have to mess around with signals to
provide a regular timer interrupt but that's the easy bit of the code.

We can probably avoid messing around with multiple C stacks.  That's a
significant saving, but the complexity of that is fairly
self-contained - we could probably steal some code from some other
language implementation.

The cost is going over all data structures in the system making sure
that operations on them are suitably atomic.  One of the issues I
remember from old versions of GHC was that some of the primops would
do some work, allocate some memory, then finish the job.  The classic
error to make in that code was for the second half of the code to
assume almost anything about what happened in the first half of the
code: how much space is on the stack, does a table have free space in
it, is this pointer into the middle of an object ok?
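
To make that kind of error concrete, here is a toy C version of a
two-phase operation.  It is not a real GHC primop: Table, reserve and
dup_entry are hypothetical, and realloc stands in for "allocate some
memory", since it may also move the data the first half was looking at.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int   *slots;
    size_t used, cap;
} Table;

/* Stand-in for the allocation in the middle of a primop: it may move
   t->slots.  Returns 0 on success, -1 if out of memory. */
int reserve(Table *t, size_t extra) {
    if (t->used + extra <= t->cap)
        return 0;
    size_t cap = (t->used + extra) * 2;
    int *p = realloc(t->slots, cap * sizeof *p);
    if (p == NULL)
        return -1;
    t->slots = p;
    t->cap = cap;
    return 0;
}

/* Two-phase operation: append a copy of entry i to the table. */
int dup_entry(Table *t, size_t i) {
    assert(i < t->used);
    /* First half: remember an index, NOT a pointer into t->slots. */
    size_t src = i;
    /* Middle: allocate.  After this, old pointers may be stale. */
    if (reserve(t, 1) != 0)
        return -1;
    /* Second half: re-derive everything from the current state. */
    t->slots[t->used] = t->slots[src];
    t->used++;
    return 0;
}
```

The buggy variant would take &t->slots[i] before the call to reserve
and dereference it afterwards.  Once arbitrary code can run at the
allocation point, every assumption of that kind has to be rechecked.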

The problem is the scope: every single data structure and every bit of
code that accesses it has to be vetted and we have to keep it in mind
as we maintain the code.  It's a high price to pay and I don't think
it's necessary (because all you really need is for the runtime systems
to talk to each other in very limited ways).

> If it's really impossible for NHC or Hugs to implement this, I think
> I would still rather it was left to the NHC and Hugs documentation
> to admit that exported Haskell functions basically don't work in
> some circumstances, rather than to the GHC documentation to say that
> actually they do.

It's a matter of taste how you do these things.

--
Alastair Reid                 alastair at reid-consulting-uk.ltd.uk  
Reid Consulting (UK) Limited  http://www.reid-consulting-uk.ltd.uk/alastair/