[Haskell-cafe] Parallel Haskell Digest 2

Wed May 11 22:50:16 CEST 2011

Parallel Haskell Digest
=======================
Edition 2
2011-05-11

Hello Haskellers!

Welcome to the second edition of the Parallel Haskell Digest, bringing you
news, discussions and tasters of parallelism and concurrency in Haskell.

The digest is made possible by the Parallel GHC project.
More news about how we're doing below.

Note that this edition of the digest may be best viewed in blog form at
http://www.well-typed.com/blog/52 (there's an inline picture).

News
----------------------------------------------------------------------
* Workshop of the 2nd SICSA MultiCore Challenge
  (27 May, Glasgow; registration deadline 18 May)

  The goal of the SICSA  MultiCore challenge is to bring together
  researchers in the  area of parallel  programming, by implementing one
  or more challenge applications on (networks of) multi-core machines.
  The goal is to learn about the strengths and weaknesses of current
  systems for parallel programming by  comparing them on a common
  application. The final workshop will feature presentations of the
  individual implementations, assessing both raw performance as well as
  ease of parallelisation and other aspects in the development of this
  parallel application.

  https://groups.google.com/d/topic/parallel-haskell/K-IJ2roA59I/discussion

* Parallel Haskell Portal

  One of things I've been working on at Well-Typed is to help clean up
  the Haskell wiki documentation on parallelism and concurrency.
  There's still a bit of gardening to do, but we're ready to unveil what
  we have now.  The Parallel Haskell portal is targeted primarily at
  people getting started with parallelism and concurrency in Haskell.
  It places a lot of emphasis on steering readers towards mature
  technologies, while still offering a path for more advanced users to
  dig deeper into current research.

  http://www.haskell.org/haskellwiki/Parallel

Word of the Month
----------------------------------------------------------------------
This edition of the digest is brought to you by threads, threads and
more threads. In the last digest, we took a look at Haskell sparks and
particularly at how they differ from threads.  Now let's delve a little
bit deeper into threads.

A "thread of execution" is a programming abstraction that helps the
programmer separate concerns within a program. Consider a web server
serving many documents to many clients simultaneously. The programmer
may wish to use threads, using one thread per client making it easier to
manage concurrent action. [1]

In the Haskell world, this programming abstraction is provided by
Haskell threads. To implement Haskell threads, the GHC runtime system
uses what is known as a M:N threading model, where many Haskell threads
are scheduled across a much smaller number of operating system threads.
This model has two key benefits:

* it can make use of multiple CPU cores
* it allows Haskell threads to be very cheap

Don Stewart illustrated this on StackOverflow with the following
diagram, citing 2011 figures of a handful of CPUs, a dozen or so
OS threads and tens of thousands of Haskell threads (plus for those of
us interested in pure parallelism, millions of sparks).

  http://i.imgur.com/u53Uk.png

If you want to try generating some figures for yourself, have a look
at the benchmark utility at

  http://darcs.haskell.org/nofib/smp/threads005/Main.hs

The benchmark tool creates a large "pipeline" of simultaneous Haskell
threads, and tries to send a message through the pipeline, passing it
from one thread to the next. On my laptop, I can create 10000 Haskell
threads in 0.074 seconds and pump "Hello World" through them in 0.07
seconds.  That works out to 7.4 microseconds per thread (0.7
microseconds for pumping).  How about giving it a shot on your
computer?  Haskell threads are very lightweight.

For most concurrency needs, you can generally forget that operating
system threads exist.  Indeed, when the documentation for modules like
Control.Concurrent refers to "threads", or when Haskell hackers discuss
threads in Haskell code it's a good bet that they're referring to
Haskell threads rather than OS threads.

That said, there are a few occasions where you may want to be aware
about OS threads, in order of importance, if you

* want your Haskell threads to be running in parallel
  on multiple cores (for better performance) instead of
  just being interleaved on a single core, or

* need to make foreign calls concurrently with doing
  other things (eg. running Haskell code or making other
  foreign calls)

Doing any of these things requires that you use GHC's multi-threaded
runtime system.  GHC currently uses a single-threaded runtime system by
default, but until this changes, you will have to explicitly enable the
multi-threaded one by linking your program with the flag `-threaded`.
With the multi-threaded runtime system all Haskell threads are
scheduled across multiple OS threads as opposed to being interleaved on
a single one.  This allows for parallelism in the first case, for
concurrent foreign code in the second case.

For the first case, you may want to note that on top of enabling the
multi-threaded RTS, you will also need to tell the runtime system to let
some of those run some of those OS threads simultaneously (which in
run requires that you have multiple cores).  You can do this by passing
`+RTS -N` to your program during runtime (with GHC 6.12.1 and higher, the
bare -N flag will cause the RTS to use a sensible default based on the
number of CPU cores on your machine).  This is only if you are
specifically interested in parallelism.

If you are only concerned about making foreign calls, just enabling the
multi-threaded RTS is enough.  The issue with the single-threaded
runtime system is that Haskell threads that make foreign calls to the
operating system or C libraries will block all other Haskell threads.
This means that if a foreign call should take a long time, or worse, if
it should block in its own right, all other Haskell threads will be
stuck.  With the multi-threaded runtime system, Haskell that make
foreign calls do not block other Haskell threads, because they can be
handed over to another OS thread while the one making the foreign call
is churning away.

But be careful! Concurrency with foreign calls can be a tricky business.
If you are only using one OS thread at a time, you are effectively
shielded from having to worry about concurrency issues in foreign code
(by not being able to run foreign code concurrently).
Enabling multi-threaded mode comes with extra responsibility of making
sure any foreign libraries you use are thread-safe, or that you have
adequate mechanisms to deal with the ones that are not.  Thread-safety
isn't an issue with Haskell-only code because shared state is always
managed with locks or atomicity.  But when concurrency and foreign calls
mix, you will need to take care.

There's another issue to watch out for when mixing foreign calls with
multiple OS threads.  The nice thing thing about the M:N threading model
in Haskell is that the GHC RTS scheduler will automatically move Haskell
threads between OS thread to achieve a good balance of work.  But this
introduces a new problem for foreign libraries that use thread-local
state: the Haskell thread may calling the library from any number of OS
threads, so if the foreign library uses OS-thread-local state then this
state could be different from one call to the next... what a mess!  To
solve this problem we have to use a feature called "bound threads".
Bound threads are Haskell threads that are bound to a single OS thread;
they are not moved around by the scheduler. Bound threads are more
expensive than unbound ones because they tie up a whole OS thread, but
they are necessary for working with certain foreign libraries like
OpenGL that make essential use of thread local state.

Summing up threads in Haskell:
* When writing concurrent programs in Haskell, we deal primarily
  with Haskell threads, paying attention only to OS threads when
  we have to.
* Haskell threads run on one or more OS threads, but you get
  concurrency either way.  It is likely that in the future,
  GHC will use multiple OS threads by default but for now you
  have to explicitly enable it by linking your program using
  `-threaded`.
* Foreign calls are potentially tricky!  Make sure you either
  use thread-safe libraries or know how to manage non thread-safe
  ones.
* If you are using foreign libraries that use thread-local state, use
  bound threads, that is, Haskell threads tied to an OS thread.

[1] Thanks to Paul Bone for this paragraph
    https://groups.google.com/d/topic/parallel-haskell/fd6B9zbOvwM/discussion
    and also to Andres and Duncan from Well-Typed for extensive help
    revising this word of the month!

Parallel GHC project news
----------------------------------------------------------------------
The Parallel GHC Project is an MSR-funded project to push the real-world
use of parallel Haskell.  Part of this project involves effort by
Well-Typed to provide tools for use by the general community:

We have begun work on making the "Modified Additive Lagged Fibonacci" and
perhaps other random number generators from the SPRNG library available in
Haskell.  As a first step to the Haskell SPRNG reimplementation, we have
developed a [binding](https://github.com/bjpop/haskell-sprng) to the existing
C++ library. The binding will serve as a refernence implementation to test
against, but it also ready to be used now.

To complement our work on extending GHC eventlog and Threadscope to
support multi-process or distributed Haskell systems, we have begun work
on developing new visualisations for ThreadScope, including the rate of
parallel spark creation, and the distribution of spark evaluation times.

Meanwhile, work continues on our project partners' side. We hope to
be say more about it in the next edition of the digest :-)

For more information on the Parallel GHC project, see
http://www.haskell.org/haskellwiki/Parallel_GHC_Project

Talks
----------------------------------------------------------------------
1. High-Performance Web Applications in Haskell

   Gregory Collins gave a talk at QCon London (11 March).  Greg delivered
   this message to an an enterprise software development crowd: "Haskell is a
   really good choice for web programming".  Check out his slides to see how
   Haskell can help you to write scalable web applications

   http://qconlondon.com/dl/qcon-london-2011/slides/GregoryCollins_HighPerformanceWebApplicationsInHaskell.pdf

1. Parallel = functional: the way of future.

   Simon Peyton-Jones gave the keynote talk at the London FP Exchange
   (18 March), with a whirlwind overview of what's going on in the world
   of parallel Haskell.  You can watch the
   [video](http://skillsmatter.com/podcast/scala/talk-by-haskell-expert-simon-peyton-jones/js-1434)
   and follow along on the
   [slides](http://research.microsoft.com/%7Esimonpj/papers/parallel/Parallel-Haskell.pdf).

Blogs, papers and packages
----------------------------------------------------------------------
1. Mstate - a concurrent state monad (5 April).  Nils Schweinsberg
   released a small library called mstate which offers a threadsafe
   equivalent to the State monad (and transformer).  The library makes it
   easy to start stateful threads, grab their results when they finish and
   handle any exceptions along the way.

   http://blog.n-sch.de/2011/04/05/mstate-a-concurrent-state-monad/

1. monad-par - This library offers an alternative parallel programming
   API to that provided by the parallel package.  The Par monad allows
   the simple description of parallel computations, and can be used to
   add parallelism to pure Haskell code. The basic API is
   straightforward: the monad supports forking and simple communication
   in terms of IVars. The library comes with an efficient work-stealing
   implementation, but the internals are also exposed so that you can
   build your own scheduler if necessary.

   http://hackage.haskell.org/package/monad-par

1. Real-time edge detection with the latest release of the parallel array
   library Repa.  The folks at UNSW have released version 2.0.0.1 of the
   Repa high performance parallel array library, with along with an
   example use case doing edge detection in real time with the Canny
   algorithm.

   http://pls.posterous.com/real-time-edge-detection-with-the-latest-rele

1. Haskell for the Cloud (Jeff Epstein, Andrew Black and Simon Peyton Jones)
   Erlang style distributed programming in Haskell

   "We present Cloud Haskell, a domain-specific language for developing programs
   for a distributed-memory computing environment. Implemented as a shallow
   embedding in Haskell, it provides a message-passing communication model,
   inspired by Erlang, without introducing incompatibility with Haskell's
   established shared-memory concurrency. A key contribution is a method for
   serializing function closures for transmission across the network. Cloud
   Haskell has been implemented; we present example code and some preliminary
   performance measurements'"

   * http://research.microsoft.com/en-us/um/people/simonpj/papers/parallel/remote.pdf
   * http://www.reddit.com/r/haskell/comments/gjb52/haskell_for_the_cloud_new_paper_by_jeff_epstein/
   * http://hackage.haskell.org/trac/ghc/wiki/ErlangInHaskell

Parallel-Haskell and Haskell Cafe
----------------------------------------------------------------------
* How to daemonize a threaded Haskell program? (5 March).
  Bas van Dijk would like to turn his Haskell program into a Unix
  daemon. This is made complicated by the fact that his program needs to
  run simultaneous threads, and that forkProcess is not supported in GHC
  when running on multiple processors.  Bas wrote a C wrapper to do the
  forking and call his Haskell main function, and asked about the
  practicalities of combining the two pieces with Cabal.  Further
  discussion converged on doing the daemonization outside the program,
  in particular with the help of Upstart on Ubuntu.

  * http://www.haskell.org/ghc/docs/latest/html/libraries/unix-2.4.2.0/System-Posix-Process.html
  * http://www.haskell.org/pipermail/haskell-cafe/2011-March/089891.html

* Parallel Haskell stories (9 March).  Simon Peyton Jones was looking
  for stories about using parallel or concurrent Haskell to use in his
  keynote talk (see above).  Stories so far: basic concurrency in the
  IRC Hulk (see PH Digest 1), STM in a vending machine simulator, and
  the BitTorrent client Combinatorrent, and parallelism in the
  Cryptographic Protocol Shapes Analyzer

  http://www.haskell.org/pipermail/haskell-cafe/2011-March/090028.html

* Light and fast HTTP server (11 March) Victor Oliveira needed help
  choosing among all the HTTP servers on Hackage. He wanted something
  light, fast and basic along the lines of nginx.  Responses so far
  point mainly to Snap and Warp.

  http://www.haskell.org/pipermail/haskell-cafe/2011-March/090114.html

* Simple STM question (15 March) qld3303 noticed a bit of STM code
  seemed to stop working when he upgrade to GHC 7.0.2.  The issue
  turns out to have been a mistaken use of "pure" to run an IO action,
  which is a no-op as pure for IO is the same as return.

  http://www.haskell.org/pipermail/haskell-cafe/2011-March/090182.html

* STM, newArray, and a stack overflow? (23 March) Ketil Malde
  got a stack overflow creating a large array in STM. The equivalent
  code works fine in the ST monad, or in GHCi. After some digging
  around, Bas van Dijk tracked it down to a bug in the newArray
  method of TArray.  His fix has been pushed and should appear in
  GHC 7.2.

  * http://www.haskell.org/pipermail/haskell-cafe/2011-March/090419.html
  * http://hackage.haskell.org/trac/ghc/ticket/5042

* Writing to a FIFO when the reader stops reading (13 March) Brian D
  had some simple code that continuously writes to a FIFO and gets a
  "Broken pipe" error when the reader goes away.  He wanted to check
  that this was expected behaviour (it was) and that he wasn't missing
  something on the Haskell side (he wasn't).  The advice was to use a
  Unix file socket instead.

  http://www.haskell.org/pipermail/haskell-cafe/2011-March/090142.html

* BlockedIndefinitelyOnMVar exception (31 March) Mitar wanted to
  know if there was any way to disable throwing
  BlockedIndefinitelyOnMVar exceptions.  Such exceptions are thrown
  during a particular known deadlock, "when a thread is blocked on an
  MVar but there are no other references to the MVar so it can't ever
  continue."  This lead to more discussion about what to do during
  cases where it would be desirable to just keep the thread around
  (eg. until it's explicitly killed).  Some options are to catch and
  ignore the exception; or to contrive to keep some other references
  to the MVar (eg. putting it in a StablePtr), or otherwise prevent
  the thread from being garbage collected.

  * http://www.haskell.org/pipermail/haskell-cafe/2011-March/090624.html
  * http://www.haskell.org/ghc/docs/latest/html/libraries/base-4.3.1.0/Control-Exception-Base.html

* Parallelism and concurrency revisited and diagrammed.  To help me come to
  grips with how people talk about "parallelism" and "concurrency" in
  Haskell, I tried making a diagram how our language about the two interleave.

  https://groups.google.com/d/topic/parallel-haskell/j85VZvRujyg/discussion

Stack Overflow
----------------------------------------------------------------------
* How to choose between parList and parBuffer ?
  http://stackoverflow.com/questions/5352683/how-to-choose-between-parlist-and-parbuffer
* How to exploit any parallelism in my haskell parallel code ? (17 March)
  http://stackoverflow.com/questions/5345981/how-to-exploit-any-parallelism-in-my-haskell-parallel-code
* Performance considerations of Haskell FFI / C? (14 April)
  http://stackoverflow.com/questions/5665209/performance-considerations-of-haskell-ffi-c
* Reentrant caching of “referentially transparent” IO calls
  http://stackoverflow.com/questions/5484847/reentrant-caching-of-referentially-transparent-io-calls
  (30 March)

Help and feedback
----------------------------------------------------------------------
Got something for the digest? Send me an email at eric at well-typed.com.
I'm particularly interested in short parallelism/concurrency puzzles,
cool projects for featured code. Other comments and feedback (criticism
and corrections especially!) would be welcome too.

-- 
Eric Kow <http://erickow.com>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 195 bytes
Desc: not available
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20110511/e599f6d3/attachment-0001.pgp>