[Haskell-cafe] Re: Python's big challenges, Haskell's big advantages?

Arnar Birgisson arnarbi at gmail.com
Wed Sep 17 17:42:22 EDT 2008


Hi Aaron,

On Wed, Sep 17, 2008 at 23:20, Aaron Denney <wnoise at ofb.net> wrote:
> I entered the discussion as which model is a workaround for the other --
> someone said processes were a workaround for the lack of good threading
> in e.g. standard CPython.  I replied that most languages' thread support can be
> seen as a workaround for the poor performance of communicating processes.
> (Creation cost in particular is usually cited, but it can often be reduced
> by process pools; context-switching costs, alas, are harder to reduce.)

That someone was probably me, but this is not what I meant. I meant
that the "processing" [1] Python module is a workaround for CPython's
performance problems with threads. For those who don't know it, the
processing module exposes a nearly identical interface to the standard
threading module in Python, but runs each "thread" in a separate OS
process. The processing module emulates shared memory between these
"threads", as well as locking and blocking primitives. That is what I
meant when I said "processing" (the module) was a workaround for
CPython's threading issues.

[1] http://www.python.org/dev/peps/pep-0371/
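
To make this concrete, here is a rough sketch of the idea, using the
stdlib multiprocessing module (the direct descendant of processing);
the worker function and the counts are made up for illustration:

import multiprocessing

def worker(n, counter, lock):
    # Bump a counter that multiprocessing shares across OS processes;
    # the lock is likewise a cross-process primitive.
    for _ in range(n):
        with lock:
            counter.value += 1

if __name__ == '__main__':
    counter = multiprocessing.Value('i', 0)    # emulated shared memory
    lock = multiprocessing.Lock()              # cross-process lock
    ps = [multiprocessing.Process(target=worker,
                                  args=(1000, counter, lock))
          for _ in range(4)]
    for p in ps:
        p.start()
    for p in ps:
        p.join()
    print(counter.value)                       # 4000

With threading you would write essentially the same code using
threading.Thread and threading.Lock and mutating an ordinary shared
object; that near-identical surface is exactly the point.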

The processes vs. threads debate depends on definitions. There seem to be two
sets floating around here. One is that processes and threads are
essentially the same, the only difference being that processes don't
share memory but threads do. With this view it doesn't really matter
if "processes" are implemented as proper OS processes or OS threads.
Discussion based on this definition can be interesting: one model fits
some problems better than the other, and vice versa.
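
To illustrate the first reading: in a share-nothing style the workers
talk only over explicit channels and never touch common memory. A
rough sketch (the names are made up for illustration):

import multiprocessing

def squarer(inbox, outbox):
    # Receive work over one channel, send results back over another;
    # no mutable state is shared with the parent.
    for x in iter(inbox.get, None):        # None is the stop sentinel
        outbox.put(x * x)

if __name__ == '__main__':
    inbox, outbox = multiprocessing.Queue(), multiprocessing.Queue()
    w = multiprocessing.Process(target=squarer, args=(inbox, outbox))
    w.start()
    for x in range(5):
        inbox.put(x)
    inbox.put(None)                        # tell the worker to stop
    print([outbox.get() for _ in range(5)])    # [0, 1, 4, 9, 16]
    w.join()

Whether squarer happens to run as an OS process or an OS thread is,
under that definition, an implementation detail.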

The other one is the systems view of OS processes vs. OS threads.
Discussion about the difference between these two is only mildly
interesting, IMO, as I think most people agree on things here and they
are well covered in textbooks as old as dirt.

>>> The central aspect in my mind is a default share-everything, or
>>> default share-nothing.
>>
[..snip...]
> These are, in fact, process models.  They are implemented on top of thread models,
> but that's a performance hack.  And while putting this model on top
> restores much of the programming sanity, in languages with mutable
> variables and references that can be passed, you still need a fair
> bit of discipline to keep that sanity.  There, the implementation detail
> of threads, rather than processes, allows and even encourages shortcuts
> that violate the process model.

Well, this is a viewpoint I don't totally agree with. Correct me if
I'm misunderstanding you, but you seem to be making the point that OS
processes are often preferred because with threads, you *can* get
yourself in trouble by using shared memory.
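
For the record, the kind of trouble I assume you mean is the classic
unsynchronised read-modify-write -- a contrived sketch:

import threading

counter = {'n': 0}      # shared, mutable, visible to every thread

def bump():
    for _ in range(100000):
        counter['n'] += 1   # read-modify-write, no lock

ts = [threading.Thread(target=bump) for _ in range(2)]
for t in ts:
    t.start()
for t in ts:
    t.join()

# May print less than 200000: the increments can interleave, because
# nothing forces the read-add-store sequence to be atomic.
print(counter['n'])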

The thing I don't agree with is "let's use A because B has dangerous
features". This is sort of like the design mantra of languages like
Java. Now, you may say that indeed Java has been wildly successful,
but I think (or hope) that is because we don't give people
(programmers) enough credit. Literature, culture and training in the
current practice of programming would do well, IMO, to produce fewer
but _good_ programmers rather than a lot of mediocre ones. And _good_
programmers don't need to be handcuffed just because otherwise they
*could* poke themselves in the eye.

I.e. if you need to sacrifice the efficiency of threads for full-blown
OS processes because people can't stay away from shared memory, then
something is fundamentally wrong.

I'll stop here, this is starting to sound like a very OT rant.

cheers,
Arnar

