[Haskell-cafe] Re: Threads vs. processes [Was: Re: Re: Python's big challenges, Haskell's big advantages?]

Wed Sep 17 19:08:46 EDT 2008

On 2008-09-17, Jonathan Cast <jonathanccast at fastmail.fm> wrote:
> On Wed, 2008-09-17 at 21:20 +0000, Aaron Denney wrote:
>> On 2008-09-17, Jonathan Cast <jonathanccast at fastmail.fm> wrote:
>> >> In my mind pooling vs new-creation is only relevant to process vs
>> >> thread in the performance aspects.
>> >
>> > Say what?  This discussion is entirely about performance --- does
>> > CPython actually have the ability to scale concurrent programs to
>> > multiple processors?  The only reason you would ever want to do that is
>> > for performance.
>> 
>> I entered the discussion on the question of which model is a workaround for the other --
>
> Well, I thought the discussion was about implementations, not models.  I
> also assumed remarks would be made in the context of the entire thread.
> I shall have to remember that in the future.
>
>> someone said processes were a workaround for the lack of good threading
>> in e.g. standard CPython.
>
>> I replied that most languages' thread support
>
> Using a definition of `thread' which, apparently, excludes Concurrent
> Haskell.

Can't I exclude it based on "most languages"?  CSP models are still the
minority.

> Different enough we're talking past each other.  The idea that the thing
> you make with forkIO doesn't count as a thread never crossed my mind,
> sorry.

I think it's fair to consider it a thread interface, because there's
still a huge amount of shared state.  Mostly immutable, but not
completely as you later point out, even discounting updates of
lazy-evaluation thunks.  It is a lot less pure CSP than Erlang and
Occam, which both call them processes (though I see "thread" being
used more and more these days for Erlang).  Then there's apparently a
tradition in mainstream languages of calling language-level parallelism
"threads".  Of course, most of those really are thread models.
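
To make the shared-state point concrete, here is a minimal sketch (not
taken from the thread) of four forkIO threads updating one MVar.  The
names `counter' and `done' are purely illustrative; the point is only
that every thread sees the same mutable cell, which is what makes this
look like a thread interface rather than pure CSP.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Monad (forM_, replicateM_)

main :: IO ()
main = do
  counter <- newMVar (0 :: Int)   -- mutable state visible to every thread
  done    <- newEmptyMVar         -- used only to signal worker completion
  forM_ [1 .. 4 :: Int] $ \_ -> forkIO $ do
    -- Each worker increments the shared counter 1000 times.
    replicateM_ 1000 (modifyMVar_ counter (return . (+ 1)))
    putMVar done ()
  replicateM_ 4 (takeMVar done)   -- wait for all four workers
  readMVar counter >>= print      -- prints 4000
```

In a purer process model the workers would instead send their results
over a channel and no cell would be shared.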

> and use completely different definitions for key terms and make
> statements which, substituting in the definitions I was using, are (as
> I hope you grant) nonsensical

Yes, I can see how my rants sounded bizarre, even though I think we're
mostly in agreement.

> Not any more.  I just think your definition of `thread' is unexpected in
> this context (without rather more elaboration).

>> These are, in fact, process models.
>
> OK.  I think that perspective is rather unique, but OK.

Well, what's the P in CSP stand for?

>> They are implemented on top of thread models,
>> but that's a performance hack.
>
> Maybe.  It's done for performance, but I don't see why you call it a
> hack.  Does it sacrifice some important advantage I'm missing?  (Vs.
> kernel-scheduled threads).

Vs kernel threads, not much -- just parallelism on SMP systems, which is
often regained by multiplexing on top of kernel threads.

Vs kernel processes, yes, I think some is lost.  Privilege separation,
isolation in the event of crashes, larger memory spaces, the ability
to span multiple machines (necessary for true fault tolerance).  How
important are these vs raw speed?  Well, it depends on the domain
and problem.  Take Postfix, for instance -- different parts of Postfix are
implemented in different processes, with different OS privileges.
Subverting one doesn't give you carte blanche with the others, as it
would if these were all threads in one process.
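
As a rough sketch of the crash-isolation half of that argument (it does
not show privilege separation), the following assumes a POSIX `sh' on
the PATH and the `process' package that ships with GHC.  The child
shell kills itself; the parent merely observes the exit status and
carries on.

```haskell
import System.Process (readProcessWithExitCode)

main :: IO ()
main = do
  -- The child is a separate OS process that sends SIGKILL to itself.
  (code, _out, _err) <- readProcessWithExitCode "sh" ["-c", "kill -KILL $$"] ""
  -- The parent is unaffected and can inspect how the child ended.
  putStrLn ("child ended with: " ++ show code)
  putStrLn "parent still running"
```

Had the same failure been a fatal signal delivered to one thread of a
single process, every other thread would have gone down with it.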

-- 
Aaron Denney
-><-