Threads and memory management

Jost Berthold berthold at Mathematik.Uni-Marburg.de
Mon Apr 27 05:41:12 EDT 2009


> Date: Fri, 24 Apr 2009 19:20:46 +0200
> From: Johannes Waldmann <waldmann at imn.htwk-leipzig.de>
> Subject: Threads and memory management
> To: "glasgow-haskell-users at haskell.org"
> 	<glasgow-haskell-users at haskell.org>
> 
> Dear all,
> 
> I was wondering what is the current status of the ghc RTS
> with respect to threading. Is it true that the allocator
> and deallocator (garbage collector) are still single-threaded?
> 
> 
> I made this example:
> ...
> Well, then, if the two Haskell threads are (nearly) completely
> independent like the above, it would be better to compile and run
> two separate executables and have them communicate via the OS  (pipe or
> port). But that shouldn't be!  (the OS being better than Haskell)
> 
> Is there a way of partitioning the memory (managed by the ghc RTS)
> into totally independent parts that each have their stand-alone
> memory management? Of course then all communication
> would have to go via some Control.Concurrent.Chan,
> but that should be fine, if there is little of it.
> 
> Well, just some thought. This idea can't be new?
> Tell me why it couldn't possibly work ...
> 
> J.W.



Hello everybody,

Since I did not see any other replies, I thought I might give you some
pointers to more information. Other people will perhaps follow up with
more details.
Overall, quite a lot has been happening recently around threading in
the GHC world, and it is hard to keep an overview.

- GHC has undergone a big overhaul with respect to threading support
since last September. I started this work at Microsoft Research last
summer, and a lot more has been done by Simon Marlow since.
There are forthcoming papers (mainly one submitted to ICFP) about this
work. The main focus here was the shared-heap implementation of the
Glasgow parallel Haskell (GpH) programming model (see
http://www.macs.hw.ac.uk/~dsg/gph/ , mainly the paper here:
http://www.macs.hw.ac.uk/~dsg/gph/papers/html/Strategies/strategies.html).
This programming model has been supported in GHC since 2004, but should
now deliver much better performance. That said, the vast majority of
the latest optimisations carried out in the GHC HEAD aim at improving
thread performance (the explicit basis of the more implicit programming
model), and thus relate directly to what you observe.
The version you tested with is GHC-6.10.2, which does not include these
substantial threading optimisations.
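
For readers who have not seen the GpH model, here is a minimal sketch of
what it looks like in a program. It is only an illustration, not code
from the papers above: the names nfib and parFib and the threshold of 20
are made up for this example. The combinators par and pseq are taken
from GHC.Conc in base; the module Control.Parallel (package "parallel")
offers the same two combinators. I assume a build with "ghc -threaded"
and a run with "+RTS -N2" (newer GHC releases may also need -rtsopts).

```haskell
module Main where

import GHC.Conc (par, pseq)

-- Plain sequential Fibonacci, used below the threshold.
-- (Hypothetical example function, chosen only because it is easy to split.)
nfib :: Int -> Integer
nfib n
  | n < 2     = toInteger n
  | otherwise = nfib (n - 1) + nfib (n - 2)

-- GpH style: `par` creates a spark for x (it MAY be evaluated by another
-- core), `pseq` evaluates y first and then combines both results.
-- The threshold 20 is an arbitrary assumption to keep sparks coarse.
parFib :: Int -> Integer
parFib n
  | n < 20    = nfib n
  | otherwise = x `par` (y `pseq` (x + y))
  where
    x = parFib (n - 1)
    y = parFib (n - 2)

main :: IO ()
main = print (parFib 27)
```

Note that par only hints at parallelism: whether a spark is actually
picked up depends on the runtime system, which is exactly the part that
has been reworked in the GHC HEAD.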

- A related question is tool support for parallel performance tuning.
GHC (HEAD) has very recently gained support for visual post-mortem
analysis, and a graphical tool, "ThreadScope", has been developed by
Satnam Singh and others: http://raintown.org/?page_id=132
There is a related posting in glasgow-haskell-users (11 March).

- GHC has had parallel garbage collection since 2007. However, recent
work showed that this parallel GC sometimes hampers performance, in
particular on Linux systems. See this thread for more:
http://www.haskell.org/pipermail/glasgow-haskell-users/2009-April/017050.html

- About your remark that separate OS processes should communicate,
rather than having the Haskell RTS manage all threads: yes, there are
programming models around which aim at parallel Haskell with a
distributed heap. Aside from the older GpH cluster implementations, you
might have a look at the language Eden:
http://www.mathematik.uni-marburg.de/~eden/?content=paper
(and I should personally apologise for the suboptimal web presence...)
Using Eden (and its implementation) would be a way to realise the model
you propose: communicating processes with separate heaps, but managed by
a common parallel runtime system and programmed in one single program.
Let me add that Eden also provides a tool with features very similar to
what recently became "ThreadScope".
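
To make the communication discipline concrete, here is a small sketch in
plain GHC Haskell using Control.Concurrent.Chan, as suggested in the
original question. Please note that this is NOT Eden code, and that in
stock GHC the two threads still share one heap; the sketch only shows
the style in which independent workers exchange nothing but final
results. The names worker and runWorkers are hypothetical.

```haskell
module Main where

import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)

-- Hypothetical worker: sums the squares 1..n on its own and sends only
-- the final result over the channel.
worker :: Chan Integer -> Integer -> IO ()
worker out n = do
  let s = sum [k * k | k <- [1 .. n]]
  -- Force the sum in the worker thread, so that no unevaluated
  -- computation is handed over to the receiver.
  s `seq` writeChan out s

-- Start one worker thread per input and collect one result per worker.
-- Results come back in arrival order, not necessarily in input order.
runWorkers :: [Integer] -> IO [Integer]
runWorkers ns = do
  results <- newChan
  mapM_ (forkIO . worker results) ns
  mapM (const (readChan results)) ns

main :: IO ()
main = do
  rs <- runWorkers [100000, 200000]
  print (sum rs)
```

In a distributed-heap setting the forcing step matters even more, since
data has to be fully evaluated before it can be sent to another heap.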

Interestingly, a number of recent experiments have shown that Eden's
performance is competitive with the shared-heap GpH implementation when
executed on multicores, delegating all communication to the underlying
middleware (most commonly PVM). Early papers discussing similar results
have been published recently or are in preparation:
http://www-fp.cs.st-and.ac.uk/~kh/mainICPP09.pdf

I strongly support the idea of collecting all this related information
systematically! A start has already been made here:
	http://haskell.org/haskellwiki/Performance/Parallel
However, the question is where to start... the field is indeed very
broad, and the page above will surely focus on GHC rather than general
ideas. So I hesitate to drop all this information into the wiki and am
sending a mail instead for now.

Cheers
Jost Berthold

