Threads and memory management
Jost Berthold
berthold at Mathematik.Uni-Marburg.de
Mon Apr 27 05:41:12 EDT 2009
> Message: 8
> Date: Fri, 24 Apr 2009 19:20:46 +0200
> From: Johannes Waldmann <waldmann at imn.htwk-leipzig.de>
> Subject: Threads and memory management
> To: "glasgow-haskell-users at haskell.org"
> <glasgow-haskell-users at haskell.org>
> Message-ID: <49F1F4EE.8070306 at imn.htwk-leipzig.de>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Dear all,
>
> I was wondering what is the current status of the ghc RTS
> with respect to threading. Is it true that the allocator
> and deallocator (garbage collector) are still single-threaded?
>
>
> I made this example:
> ...
> Well, then, if the two Haskell threads are (nearly) completely
> independent like the above, it would be better to compile and run
> two separate executables and have them communicate via the OS (pipe or
> port). But that shouldn't be! (the OS being better than Haskell)
>
> Is there a way of partitioning the memory (managed by the ghc RTS)
> into totally independent parts that each have their stand-alone
> memory management? Of course then all communication
> would have to go via some Control.Concurrent.Chan,
> but that should be fine, if there is little of it.
>
> Well, just some thought. This idea can't be new?
> Tell me why it couldn't possibly work ...
>
> J.W.
Hello everybody,
Since I did not see any other replies, I thought I might give you some
pointers to more information. Other people will perhaps follow up with
more details.
Overall, quite a lot has been going on recently around threading in the
GHC world, and it is difficult to keep an overview.
- GHC has undergone a big overhaul with respect to threading support
since last September. I started this work at Microsoft Research
last summer, and a lot more has been done by Simon Marlow since.
There are forthcoming papers (mainly one submitted to ICFP) about this
work. The main focus here was the shared-heap implementation of the
Glasgow-parallel Haskell programming model (see
http://www.macs.hw.ac.uk/~dsg/gph/ , mainly the paper here:
http://www.macs.hw.ac.uk/~dsg/gph/papers/html/Strategies/strategies.html).
This programming model has been supported in GHC since 2004,
but should now deliver much better performance. That said, the vast
majority of the recent optimisations in GHC HEAD aim
at improving thread performance (threads being the explicit basis of the
more implicit GpH programming model), and thus relate directly to what
you observe. The version you tested with is GHC-6.10.2, which does not
include these threading optimisations.
- A related question is tool support for parallel performance tuning.
GHC (HEAD) has very recently gained support for visual post-mortem
analysis, and a graphical tool, "ThreadScope", has been developed by
Satnam Singh and others: http://raintown.org/?page_id=132
There is a related posting on glasgow-haskell-users (11 March).
- GHC has had parallel garbage collection since 2007. However,
recent work showed that this parallel GC sometimes hampers performance,
in particular on Linux systems. See this thread for more:
http://www.haskell.org/pipermail/glasgow-haskell-users/2009-April/017050.html
- About your remark that separate OS processes should communicate,
rather than having the Haskell RTS manage all threads: yes, there are
programming models which aim at parallel Haskell with a distributed
heap. Aside from the older GpH cluster implementations, you might have a
look at the language Eden:
http://www.mathematik.uni-marburg.de/~eden/?content=paper
(and I should personally apologise for the suboptimal web presence...)
Using Eden (and its implementation) would be a way to realise the model
you propose: communicating processes with separate heaps, but managed by
a common parallel runtime system and written as one single program.
Let me add that Eden also provides a tool with features very similar to
those of what recently became "ThreadScope".
Interestingly, a number of recent experiments have shown that Eden's
performance is competitive with the shared-heap GpH implementation when
executed on multicores, delegating all communication to the underlying
middleware (most commonly PVM). Early papers discussing similar results
have been published recently or are in preparation:
http://www-fp.cs.st-and.ac.uk/~kh/mainICPP09.pdf
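
To make the communication pattern concrete, here is a minimal sketch in
plain Concurrent Haskell (not Eden): two nearly independent workers that
talk to the rest of the program only through a Control.Concurrent.Chan.
Note that this only mimics the message-passing style. In GHC both threads
still share one heap and one garbage collector, which is exactly the
limitation discussed above. The names worker and runWorkers and the
workload are made up for illustration.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Exception (evaluate)

-- Hypothetical worker: sums a range and sends back only the final result.
worker :: Chan Integer -> Integer -> Integer -> IO ()
worker out lo hi = do
  -- Force the sum inside the worker thread, so the channel carries an
  -- evaluated number rather than an unevaluated thunk.
  total <- evaluate (sum [lo .. hi])
  writeChan out total

-- Start two independent workers and combine their results.
runWorkers :: IO Integer
runWorkers = do
  results <- newChan
  _ <- forkIO (worker results 1 500000)
  _ <- forkIO (worker results 500001 1000000)
  a <- readChan results
  b <- readChan results
  return (a + b)

main :: IO ()
main = runWorkers >>= print
```

To let the two workers run on separate cores, the program would be
compiled with -threaded and run with +RTS -N2.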
I strongly support the idea of collecting all this related information
systematically! A starting point already exists here:
http://haskell.org/haskellwiki/Performance/Parallel
However, the question is where to start... the field is indeed very
broad, and the page above will surely focus on GHC rather than general
ideas. So I hesitate to drop all this information into the wiki and
am sending a mail instead for now.
Cheers
Jost Berthold