Threads vs. processes [Was: Re: [Haskell-cafe] Re: Python's big
challenges, Haskell's big advantages?]
jonathanccast at fastmail.fm
Wed Sep 17 17:50:55 EDT 2008
On Wed, 2008-09-17 at 21:20 +0000, Aaron Denney wrote:
> On 2008-09-17, Jonathan Cast <jonathanccast at fastmail.fm> wrote:
> >> In my mind pooling vs new-creation is only relevant to process vs
> >> thread in the performance aspects.
> > Say what? This discussion is entirely about performance --- does
> > CPython actually have the ability to scale concurrent programs to
> > multiple processors? The only reason you would ever want to do that is
> > for performance.
> I entered the discussion as which model is a workaround for the other --
Well, I thought the discussion was about implementations, not models. I
also assumed remarks would be made in the context of the entire thread.
I shall have to remember that in the future.
> someone said processes were a workaround for the lack of good threading
> in e.g. standard CPython.
> I replied that most languages' thread support
Using a definition of `thread' which, apparently, excludes Concurrent
> can be
> seen as a workaround for the poor performance of communicating processes.
Meaning kernel-switched processes.
> (creation in particular is usually cited, but that cost can often be reduced
> by process pools; the cost of context switching, alas, is harder to reduce.)
> > Kernel threads /are/ expensive. Which is why all the cool kids use
> > user-space threads.
> Often muxed on top of kernel threads, because user-threads can't use
> multiple CPUs at once.
Well, a single kernel thread can't use multiple CPUs at once. (So you
need more than one).
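
A minimal sketch of that multiplexing, assuming a current GHC whose `Control.Concurrent` exports `getNumCapabilities` and `threadCapability`. The helper name `capabilitiesUsed` is made up for illustration. Each `forkIO` thread is a user-space thread, and the runtime schedules it on one of its capabilities, which are backed by kernel threads:

```haskell
import Control.Concurrent
  (forkIO, getNumCapabilities, myThreadId, threadCapability)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

-- Hypothetical helper: fork n lightweight threads and collect the
-- index of the capability each one was running on when it asked.
capabilitiesUsed :: Int -> IO [Int]
capabilitiesUsed n = do
  boxes <- forM [1 .. n] $ \_ -> do
    box <- newEmptyMVar
    _ <- forkIO $ do
      (cap, _pinned) <- threadCapability =<< myThreadId
      putMVar box cap
    return box
  -- Wait for every thread to report.
  mapM takeMVar boxes

main :: IO ()
main = do
  caps <- getNumCapabilities
  putStrLn ("capabilities available: " ++ show caps)
  used <- capabilitiesUsed 8
  putStrLn ("capabilities used by 8 threads: " ++ show used)
```

Built without `-threaded`, or run without `+RTS -N`, there is typically one capability and every thread reports index 0. With more capabilities the same program can spread its threads over several kernel threads without any change to the code.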
> >> The central aspect in my mind is a default share-everything, or
> >> default share-nothing.
> > I really don't think you understand Concurrent Haskell, then. (Or
> > Concurrent ML, or stackless Python, or libthread, or any other CSP-based
> > set-up).
> Or Erlang, Occam, or heck, even jcsp. Because I'm coming at this from a
> slightly different perspective
Different enough we're talking past each other. The idea that the thing
you make with forkIO doesn't count as a thread never crossed my mind,
> and place a different emphasis on things
and use completely different definitions for key terms and make
statements which, substituting in the definitions I was using, are (as I
hope you grant) nonsensical
> you think I don't understand?
Not any more. I just think your definition of `thread' is unexpected in
this context (without rather more elaboration).
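
For concreteness, here is a hedged sketch of the kind of thing `forkIO` makes, used in a CSP-like style. The names `summer` and `sumViaThread` are illustrative only. The forked thread shares the heap with its parent, but in this example the two sides communicate only through channels:

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)

-- Worker: read numbers until Nothing arrives, then send back the total.
summer :: Chan (Maybe Int) -> Chan Int -> IO ()
summer input output = go 0
  where
    go acc = do
      msg <- readChan input
      case msg of
        Just n  -> go $! acc + n
        Nothing -> writeChan output acc

-- Fork the worker, feed it the list over a channel, and wait for the reply.
sumViaThread :: [Int] -> IO Int
sumViaThread xs = do
  input  <- newChan
  output <- newChan
  _ <- forkIO (summer input output)
  mapM_ (writeChan input . Just) xs
  writeChan input Nothing
  readChan output

main :: IO ()
main = sumViaThread [1 .. 10] >>= print  -- prints 55
```

Whether one calls `summer` a thread or a process is exactly the terminology question above. Mechanically it is a lightweight thread in a shared address space that happens to be programmed in a message-passing style.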
> No, trust me, I do understand them,
> and think CSP and actor models (the difference in nondeterminism is a
> minor detail that doesn't much matter here) are extremely nice ways of
> implementing parallel systems.
I'm glad to hear that...
> These are, in fact, process models.
OK. I think that perspective is rather unique, but OK.
> They are implemented on top of thread models,
> but that's a performance hack.
Maybe. It's done for performance, but I don't see why you call it a
hack. Does it sacrifice some important advantage I'm missing? (Vs.
> And while putting this model on top
> restores much of the programming sanity, in languages with mutable
> variables and references that can be passed, you still need a fair
> bit of discipline to keep that sanity. There, the implementation detail
> of thread, rather than process, allows and even encourages shortcuts that
> violate the process model. In languages that are immutable, taking
> advantage of the shared memory space really can gain efficiency without
> any noticeable downside.
Nice clarification. Thanks.
 I am, btw., painfully aware that Haskell has mutable references that
can be passed between threads. Just as I am painfully aware of Unix's,
um, interesting ideas on maintaining file system consistency in the
presence of concurrent access to *that* shared resource...
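
A small sketch of the mutable-reference case, with a made-up helper name `countWithSharedRef`. One `IORef` is handed to several `forkIO` threads, so the result is right only because every update goes through `atomicModifyIORef'`. That is the discipline the type system does not enforce here:

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM, replicateM_)
import Data.IORef (atomicModifyIORef', newIORef, readIORef)

-- Several threads increment one shared counter, then the parent reads it.
countWithSharedRef :: Int -> Int -> IO Int
countWithSharedRef workers perWorker = do
  counter <- newIORef 0
  dones <- forM [1 .. workers] $ \_ -> do
    done <- newEmptyMVar
    _ <- forkIO $ do
      -- Atomic read-modify-write; a separate readIORef followed by
      -- writeIORef could lose updates when threads interleave.
      replicateM_ perWorker (atomicModifyIORef' counter (\n -> (n + 1, ())))
      putMVar done ()
    return done
  -- Wait for every worker before reading the final value.
  mapM_ takeMVar dones
  readIORef counter

main :: IO ()
main = countWithSharedRef 4 1000 >>= print  -- prints 4000
```

Nothing stops a thread from using plain `readIORef` and `writeIORef` on the same reference, which is the kind of shortcut the shared address space allows.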