[Haskell-cafe] Parallel Haskell Digest 11
Eric Kow
eric at well-typed.com
Thu Jul 5 17:13:14 CEST 2012
Parallel Haskell Digest 11
==========================
HTML version: http://www.well-typed.com/blog/67
Hello Haskellers!
It's time for another Parallel Haskell Digest! Unfortunately, this may
just be our last one, at least within the context of the Parallel GHC
project. That said, we may as a community be at the very beginnings of
Haskell as *the* language of choice for your parallel and concurrent
needs. Maybe we need to keep something like the Digest going to help our
little FP monster through its infancy? Any volunteers in the community?
If you're interested in picking up the torch, please give us a shout!
Otherwise, if you can't take on a (perhaps rotating) digest commitment,
but still want to help, would you be kind enough to fill out a small
[survey](http://goo.gl/bP2fn) on the digest? There are just five
questions on it, plus a feedback form. Anything you can say will help
those of us in the Secret Haskell Propaganda Committee to fine tune our
efforts:
http://goo.gl/bP2fn
It's been a fantastic year for me, working on the Parallel GHC project,
learning about all sorts of neat ideas and technologies (as a basic
parallel-naive Haskeller), and trying to reflect them back in a way that
hopefully helps the broader community. Thanks to all of you in the
parallel Haskell world first for cranking out all this great stuff for
us to use, and second for your patience and support. Thanks especially
to my fellow Well-Typed-ers for all the fun chats, the feedback on
drafts, and help getting up to speed.
One last thing before signing off as your Parallel Haskell Digester.
While the digest may be coming to an end, there will at least be one
encore! It turns out we had so much to say in our last word of the
month, that we'll have to put it in a follow-up posting. In the
meantime, we'll just leave you with a little teaser…
News
----------------------------------------------------------------------
* [Announce: Haskell Platform 2012.2.0.0][n1] (3 Jun)
The new Haskell Platform is out! If you've been waiting for Haskell
Platform before moving on to GHC 7.4, now's a great time to upgrade.
Of particular interest to parallel Haskellers, this latest GHC
offers better profiling flags, multicore profiling, vastly improved
DPH, event logging (allows ThreadScope spark profiling), and more
convenient RTS flags.
* [Introducing FP Complete][n2] (6 Jun)
You might have seen Bartosz Milewski around. If not, have a look at the
[Downfall of Imperative Programming][downfall]. Bartosz posted a
quick message introducing himself to the community along with the
new company FP Complete, which aims to commercialise Haskell.
Bartosz believes that “now is the right time for Haskell to become
a strong software industry player, especially that functional
programming is being widely recognized as the answer to the recent
multicore and GPU explosion.” We'll hopefully find out more about
what FP Complete have in mind as their plans stabilise a bit.
* [3 year Bioinformatics R&D position in Granada, Spain][n3]
Love Functional Programming and concurrency? If you are a
CS/Math/IT graduate without a PhD, and have had no more
than 4 years of research experience, and have not lived in
Spain for more than 12 months (within the last 3 years),
Era7 has a position for you! You'll be hacking Scala and
using AWS for everything. So if Akka is the sort of thing
you're into, this could be the job for you.
Word of the month (teaser!)
----------------------------------------------------------------------
The word of the month series has given us a chance to survey the arsenal
of Haskell parallelism and concurrency constructs:
* some low level foundations (sparks and threads),
* three ways to do parallelism (parallel arrays, strategies, dataflow),
* and some concurrency abstractions (locks, transactions, channels)
The Haskell approach has been to explicitly recognise the vastness of
the parallelism/concurrency space, in other words, to provide a
multitude of right tools for a multitude of right jobs. Better still,
the tools we have are largely interoperable, should we find ourselves
with jobs that don't neatly fit into a single category.
The Haskell of 2012 may be in a great place for parallelism and
concurrency, but don't think this is the end of the story! What we've
seen so far is only a snapshot of the technology as it hurtles through
the twenty-tens (How quaint are we, Future Haskeller?). While we can't
say what exactly the future will bring, we can look at one of the
directions that Haskell might branch into in the coming decade.
The series so far has focused on things you might do with a single
computer, using parallelism to speed up your software, or using
concurrency abstractions to preserve your sanity in the face of
non-determinism. But now what if you have more than one computer?
Our final word of the month is *actor*. Actors are not specific to
distributed programming; they are really more of a low level concurrency
abstraction on a par with threads. And they certainly aren't new
either. The actor model has been around since the early 70s at least,
and has been seriously used for distributed programming since the late
80s with Erlang. Can you guess where this word of the month is going?
We have a bit more to say about it shortly, so while this is the last
Parallel Haskell Digest, watch this space for the final word of the
month :-)
Parallel GHC project update
----------------------------------------------------------------------
Our work on the [distributed-process][dist-p] implementation of Cloud
Haskell continues apace. We're almost there, having implemented most of
the API described in the original [Epstein *et al*][ch-pdf] paper,
except for node configuration and initialisation. We are very excited to
be getting this out of the door soon and into your hands. In fact,
we've even submitted a proposal to present this work at the upcoming
[Haskell Implementors Workshop][hiw]; so hopefully you'll be able to
join Duncan and Edsko in Copenhagen and catch up on the Cloud Haskell
news.
As for ThreadScope, we last mentioned that we were working to make use
of information from hardware performance counters (specifically, Linux
Perf Events). This took a bit more work and trickier GHC patches than
we had anticipated, but it does seem to be in order now and we are now
in the testing phase for the next release. The next ThreadScope release
will also include the use of heap statistics from the (eventual) GHC 7.6
RTS, and some user interface enhancements suggested by our users.
Tutorials
----------------------------------------------------------------------
* [Parallel and Concurrent Haskell Course slides][t1]
Looks like the recent Summer School was a great time for all!
We had expert Haskellers talking parallelism and concurrency, a
chateau, good food, and lots of wine. Did you miss out? We can't
help with the wine, but if you'd like to get in on some of the
parallel action, check out Simon's slides. There are seven
lectures in all, and some [lab exercises][lab-exo] to go with
them:
1. Basic pure parallelism
2. The Par Monad
3. Basic concurrency
4. Software Transactional Memory
5. Concurrent network servers
6. Distributed programming
7. GPU programming with Accelerate
* [Threading and Gtk2Hs][t2]
“So you're writing a Gtk2Hs application and you need to do some
threading.” Daniel Wagner has just the tutorial for you. The
post comes in two parts. First Daniel gives it all away with
two key points (and a simple concrete example):
1. Make all Gtk calls from the main thread. If other threads
need to affect the interface, use `postGUIAsync`
or `postGUISync` to send your code to the main thread.
2. Link your program with the threaded runtime system by passing
GHC the `-threaded` option at link time.
For readers who want to learn more, Daniel then goes into much
more depth: the things that make threading hard both in its own right
and when working in Gtk2Hs in particular; the perils of using
`unsafe` FFI imports (as opposed to `safe` ones); a peek into
the Gtk2Hs guts showing its interaction with Gtk and glib; and
finally, some of the possible pitfalls, wrong things Daniel
believed when he started and what he now believes instead.
Blogs and packages
----------------------------------------------------------------------
* [GHC-7.4.2-Eden - Parallel Haskell on multicore and cluster systems][b0] (17 Jun)
There is more than one way to do parallelism and distributed
programming in Haskell. Mischa Dieterle announced a new release of
[Eden][eden], an extension of Haskell which is “tailored for
distributed systems but works equally well on multicore
architectures”.
These extensions consist of a small number of constructs for
working with *processes* (processes work within disjoint address
spaces and do not share any data). Eden provides automatic process
handling to reduce the amount of low-level detail needed to
implement parallel algorithms, but also allows for the explicit
control you may need to get good performance.
You can either install the full Eden system including the compiler
(GHC with the Eden parallel runtime system), libraries and
tools; or just install a thread simulation by using a standard GHC
to install the libraries and tools off Hackage.
* [Forklift - a pattern for performing monadic actions in a worker thread][b1] (7 Jun)
Apfelmus recently noticed a recurring applied Haskell puzzle: how
do you combine two IO-wrapping monads that don't lend themselves to
being combined via monad transformers? This arose in the context
of Shae Erisson's Summer of Code Project (Web based GHCi) which
uses webserver and interpreter packages providing monads of their
own. A typical approach would be to run separate threads and have
the two communicate using something like an `MVar`; but each time
we reinvent this solution, we end up creating some mini communication
protocol specific to the task.
Apfelmus suggests a more generic variant of this approach, which he
calls the “forklift pattern” because it consists in *forking* a
worker thread to *lift* arbitrary monadic actions into IO.
    data ForkLift m = ForkLift { requests :: Chan (m ()) }
    carry :: MonadIO m => ForkLift m -> m a -> IO a
The worker thread maintains a queue of requests. To run an
arbitrary action of type `m a`, you “carry” it over into `IO` with
a helper function that wraps it up in an `MVar` sandwich, sticks it
on the queue, and reads the `MVar` to get the result back. See
Apfelmus's posting to see this cute trick in action.
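To make the "`MVar` sandwich" a little more concrete, here is a minimal sketch of how the two declarations above might be fleshed out using only `base`. It is not necessarily the code from Apfelmus's post: `newForkLift`, the toy `App` monad, `tick` and `forkLiftDemo` are all names made up for this illustration, and the sketch assumes the carried action does not throw (an exception would leave `carry` waiting forever).

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forever, join)
import Control.Monad.IO.Class (MonadIO (..))
import Data.IORef (IORef, atomicModifyIORef', newIORef)

-- The request queue, as quoted in the digest.
data ForkLift m = ForkLift { requests :: Chan (m ()) }

-- Hypothetical constructor: `runM` explains how to run the monad `m` in IO.
-- The worker thread loops forever, executing whatever lands on the queue.
newForkLift :: MonadIO m => (m () -> IO ()) -> IO (ForkLift m)
newForkLift runM = do
  chan <- newChan
  _ <- forkIO (runM (forever (join (liftIO (readChan chan)))))
  return (ForkLift chan)

-- Wrap the action so that it stores its result in a fresh MVar,
-- queue it for the worker, then block until the result arrives.
carry :: MonadIO m => ForkLift m -> m a -> IO a
carry fl act = do
  box <- newEmptyMVar
  writeChan (requests fl) (act >>= liftIO . putMVar box)
  takeMVar box

-- A toy IO-wrapping monad (assumption, for demonstration only):
-- a reader over a shared counter.
newtype App a = App { runApp :: IORef Int -> IO a }

instance Functor App where
  fmap f (App g) = App (fmap f . g)

instance Applicative App where
  pure x = App (\_ -> pure x)
  App f <*> App g = App (\r -> f r <*> g r)

instance Monad App where
  App g >>= k = App (\r -> g r >>= \a -> runApp (k a) r)

instance MonadIO App where
  liftIO io = App (\_ -> io)

-- Increment the counter and return its new value.
tick :: App Int
tick = App (\ref -> atomicModifyIORef' ref (\n -> (n + 1, n + 1)))

-- Carry three `tick` actions over from App into IO.
forkLiftDemo :: IO [Int]
forkLiftDemo = do
  ref <- newIORef 0
  fl <- newForkLift (\app -> runApp app ref)
  mapM (\_ -> carry fl tick) [1 .. 3 :: Int]
```

Loading this into GHCi and running `forkLiftDemo` should give `[1,2,3]`: every `tick` runs on the worker thread, in queue order, and its result travels back through the `MVar`.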
* [The Flavours of MVar][b2] (4 Jun)
Neil Mitchell finds that the flexibility of the `MVar` can
leave some room for confusion: both taking from and putting to an
`MVar` can block, whereas it is likely that you only expect it
to block on one of those. In a quick and practical tutorial, Neil
whips through three `MVar` patterns that he tends to use regularly:
* lock: guaranteed single-threaded access to some
resource
* var: thread-safe mutable variables that never block
on put
* barrier: starts empty, is written to once, then
read one or more times.
Building on these, Neil shows a couple of examples of how one might
go on to combine these `MVar` uses to get higher level
abstractions: an action that can be invoked multiple times but runs
at most once, and a queue that collects messages individually and
delivers them in bulk. Check his post out, and maybe see why `join
$ modifyMVar …` is becoming one of his favourite idioms.
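As a rough illustration of the three flavours, here is one way they could be written as thin wrappers around `MVar`, followed by a "runs at most once" action built from them. This is a reconstruction from the descriptions above rather than a copy of Neil's code, so treat the names (`Lock`, `Var`, `Barrier`, `once`, `onceDemo`) and their exact signatures as assumptions.

```haskell
import Control.Concurrent.MVar
import Control.Monad (join)
import Data.IORef (atomicModifyIORef', newIORef, readIORef)

-- lock: always holds (), so it only ever guards a critical section.
newtype Lock = Lock (MVar ())

newLock :: IO Lock
newLock = Lock <$> newMVar ()

withLock :: Lock -> IO a -> IO a
withLock (Lock m) act = withMVar m (const act)

-- var: always full between operations, so writers never block on put.
newtype Var a = Var (MVar a)

newVar :: a -> IO (Var a)
newVar x = Var <$> newMVar x

modifyVar :: Var a -> (a -> IO (a, b)) -> IO b
modifyVar (Var m) = modifyMVar m

-- barrier: starts empty, is written once, then read any number of times.
newtype Barrier a = Barrier (MVar a)

newBarrier :: IO (Barrier a)
newBarrier = Barrier <$> newEmptyMVar

signalBarrier :: Barrier a -> a -> IO ()
signalBarrier (Barrier m) = putMVar m

waitBarrier :: Barrier a -> IO a
waitBarrier (Barrier m) = readMVar m

-- An action that may be invoked many times but runs at most once.
-- `modifyVar` decides, while holding the Var, what to do next and
-- returns that decision as an IO action; `join` runs it afterwards,
-- outside the critical section.
once :: IO a -> IO (IO a)
once act = do
  var <- newVar Nothing
  return $ join $ modifyVar var $ \v -> case v of
    Just b -> return (v, waitBarrier b)
    Nothing -> do
      b <- newBarrier
      let runIt = do
            x <- act
            signalBarrier b x
            return x
      return (Just b, runIt)

-- Call the wrapped action twice and count how often it really ran.
onceDemo :: IO (Int, Int, Int)
onceDemo = do
  counter <- newIORef (0 :: Int)
  getValue <- once (atomicModifyIORef' counter (\n -> (n + 1, n + 1)))
  a <- getValue
  b <- getValue
  runs <- readIORef counter
  return (a, b, runs)
```

In GHCi, `onceDemo` should return `(1,1,1)`: both calls see the same value and the underlying action ran a single time. The `join $ modifyVar …` shape is the same idea as the `join $ modifyMVar …` idiom mentioned above.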
* [Being more clever about vectorising nested data parallelism][b3]
(4 Jun)
Manuel Chakravarty tumbles: Our new draft paper [on Vectorisation
Avoidance][chak1] introduces a novel program analysis for nested
data parallelism that lets us avoid vectorising purely scalar
subcomputations. It includes a set of benchmark kernels that
suggest that vectorisation avoidance improves runtimes over merely
using array stream fusion.
* [Repa 3: more control over array representations with indexed types][b4]
Another paper from the UNSW parallel Haskellers: We have got a new
draft paper on [Guiding Parallel Array Fusion with Indexed
Types][chak2]. It describes the design and use of the 3rd
generation Repa API, which uses type indices to give the programmer
control over the various parallel array representations. The result
is clearer programs that the compiler can more easily optimise.
The implementation of Repa 3 is ready for use on Hackage in the
repa package.
* [Protocol Buffers 2.0.7][b8] (19 May)
Chris Kuklewicz is back! In the last digest, some folks in the
community were looking for him because they had patches for the
protocol buffers package family. Chris has not only resurfaced,
but released an update to the packages, making them compile with
GHC 7.4.1 and handle missing package names better.
* [Generics and Protocol Buffers][b6] (20 May)
Nathan Howell thinks the protocol-buffers package is great:
full-featured, well-tested, no complaints about performance.
However, maintaining `.proto` files is “more than just a chore”.
The `hprotoc` tool could help but is tricky to integrate properly
into the build system. Maybe there's another way, one which does
not involve separate files or build tools. Have a look at his
[GitHub Gist][gproto] for a promising alternative solution using
the [type-level][type-level] library.
* [SafeSemaphore][b9] (2 Jun)
Chris Kuklewicz has a problem, a solution, and a plea for help.
* Problem: Control.Concurrent.QSem (and QSemN, SampleVar) are
[broken][ghc3160] (they provide no exception safety).
* Solution: his [SafeSemaphore][safesema] package,
just updated to 0.90, with several safer alternatives.
* Plea: Would it be possible to replace parts of GHC with
SafeSemaphore, so as to unbreak the Haskell Platform?
It looks like Chris' plea has been heard, as Simon Marlow has
recently suggested importing the STM version for GHC 7.6.1.
* [Paraiso][paraiso] (7 Jun)
Kazu Yamamoto [announced][b10] a couple of new parallel libraries from
Japan. The first is Paraiso (by Takayuki Muranushi of Monadius
fame), a high-level language for implementing explicit
partial-differential equations solvers on supercomputers as well as
today's advanced personal computers.
* [GTALib][gtalib] (7 Jun)
Also from the Japan Parallel Haskell world is GTALib by Kento
Emoto. It provides core functionalities of the GTA programming
framework described in the paper [Generate, Test, and Aggregate: A
Calculation-based Framework for Systematic Parallel Programming with
MapReduce][gta].
Mailing lists
----------------------------------------------------------------------
* [How to write Source for TChan working with LC.take?][m1] (20 May)
Hiromi Ishii is writing a Data.Conduit `Source` that supplies its
values from a `TChan`. He has three versions, one using the raw
Pipe constructors directly, one using `sourceState`, and one
using `yield`. The `yield` version does not seem to work as
expected. At first this seemed like an unfortunate necessity, but
after putting some thought into it, Michael Snoyman proposed some
[modifications to conduit's await/yield][b5] functions, which
should allow Hiromi to write things in the intuitive way.
* [Parallel cooperative multithreading?][m2] (22 May)
Benjamin Ylvisaker was wondering if it'd be possible to implement
something like [Observationally Cooperative
Multithreading][ocm-pdf] (OCM) in Haskell. The paper discusses Lua,
C, and C++ implementations. Ben thinks that Haskell would be an
awesome fit for such a framework. The premise behind OCM is that
cooperative concurrency can be easier than preemptive concurrency,
because you can reason sequentially between invocations of
pause/yield/wait. Historically, it has only worked on single
processors, because the blocks of code between the p/y/w calls need
to be run atomically. Recent research means we know more about how
to efficiently run blocks of code atomically, so maybe cooperative
concurrency can make a comeback?
Ryan Newton thinks the comeback is indeed happening. He points in
the Haskell world to monad-par's use of `ContT` as an example of a
framework in which tasks cooperatively yield control whenever
their desired input data is not yet available. Mario Blažević
has also thought about cooperative concurrency in Haskell,
particularly in the context of his monad-coroutine library; however,
he found no speedups when he added support for running multiple
co-routines in parallel. Ketil Malde is sceptical that the
proposed approach would be better than using STM. He wonders if
the paper's critique of STM applies to implementations that
keep transactional data segregated by the type system.
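For a feel of the cooperative style under discussion, here is a tiny single-threaded sketch in which each task runs uninterrupted until it explicitly yields. It is not OCM, monad-par or monad-coroutine; the `Coop` type, `runAll`, `worker` and `coopDemo` are names invented for this example, and the parallel execution that the thread is really about is deliberately left out.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- A cooperative task is either finished, or a chunk of IO that runs
-- without interruption and then hands back the rest of the task.
data Coop = Finished | Yield (IO Coop)

-- Round-robin scheduler: run one chunk of the first task, then move
-- what remains of it to the back of the queue.
runAll :: [Coop] -> IO ()
runAll [] = return ()
runAll (Finished : rest) = runAll rest
runAll (Yield step : rest) = do
  next <- step
  runAll (rest ++ [next])

-- A worker that logs `n` entries, yielding after each one.
worker :: IORef [String] -> String -> Int -> Coop
worker logRef name n = go 1
  where
    go i
      | i > n = Finished
      | otherwise = Yield $ do
          modifyIORef logRef ((name ++ show i) :)
          return (go (i + 1))

-- Interleave two workers and return the log in execution order.
coopDemo :: IO [String]
coopDemo = do
  logRef <- newIORef []
  runAll [worker logRef "a" 2, worker logRef "b" 2]
  reverse <$> readIORef logRef
```

Running `coopDemo` in GHCi should give `["a1","b1","a2","b2"]`. Since nothing else can run inside a chunk, the plain non-atomic `modifyIORef` is safe here, which is the "reason sequentially between yields" property the thread is after.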
* [How to translate Repa 2 program to efficient Repa 3 code?][m3] (26 May)
Michael Serra posted a StackOverflow thread asking about the
[differences between Repa 2 and 3 APIs][s1]. He has some
simple image convolution tests which run fast enough
in Repa 2 with judicious use of `force`. But when translating
the tests to Repa 3, he can't quite work out how to get the
same kind of performance. See the thread on StackOverflow
for more details. In short, `computeP` is the new `force`.
* [Is Repa suitable for boxed arrays?...][m4] (3 Jun)
Stuart Hungerford needs to build a 2D Array of boxed Haskell
values. He's attracted to Repa, but couldn't work out from the
documentation if it would work with arbitrary values. Moreover, he had
trouble getting the examples to work. Ben Lippmeier replies that it should
work (the array type would be something like `Array V DIM2 Float`).
The documentation is out of date (it's for Repa 2 and Repa 3 is
different). Until somebody gets a chance to update the
documentation, try the [Repa 3 paper][repa3-pdf] that Ben just
submitted for Haskell Symposium 2012.
* [Status and roadmap for Cloud Haskell?][m5] (19 May)
Ben Lee is very interested in our work at Well-Typed on
`distributed-process`, the followup implementation of Cloud
Haskell. How's progress? As mentioned in the Parallel GHC news
above, we're almost there! Edsko de Vries says that so far we've
been focusing on two aspects of the new implementation, the design
of the transport API, and a robust TCP implementation to sit on top
of it. These two parts are nearly done. Meanwhile, we've been
laying down [some documentation][dp-docs] on our GitHub project
wiki. If you want to help out, we'd love if you could play with the
TCP transport, and try to write some transports of your own.
* [`anyP` in DPH?][m6] (21 May)
Rob Stewart is trying to find the `anyP` function for Data Parallel
Haskell. Ben Lippmeier says that DPH is in flux at the moment and
that the current user facing API can be found in
[dph-lifted-vseg][dph-lv] which provides an `orP` function.
* [Everybody should write everything in Go?][m7] (28 May)
Ryan Hayes posted a small [snippet of Go][go-snippet] showing how
friendly he found it for writing concurrent programs, “No
pthread... not stupid crap... just works!”. The program seems to
create 4 threads which print out 1 to 100 each. What do Haskellers
think? See the comments for some discussion between Haskell people
like Simon Marlow, and some folks in the Go community about our
respective approaches to the problem.
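For comparison, a Haskell program of roughly the same shape (four threads, each counting from 1 to 100) might look like the sketch below. It is not taken from the discussion itself; `forkWorkers`, `printCounts` and `countDemo` are names invented here, and only `base` is assumed.

```haskell
import Control.Concurrent (forkFinally)
import Control.Concurrent.MVar
import Control.Monad (forM, forM_)

-- Fork `n` lightweight threads running `body` and wait for all of them.
-- Each thread fills its own MVar when it ends, even if it ends with an
-- exception, so the final wait cannot hang on a crashed worker.
forkWorkers :: Int -> (Int -> IO ()) -> IO ()
forkWorkers n body = do
  dones <- forM [1 .. n] $ \w -> do
    done <- newEmptyMVar
    _ <- forkFinally (body w) (\_ -> putMVar done ())
    return done
  mapM_ takeMVar dones

-- Four threads printing 1 to 100 each; output lines may interleave.
printCounts :: IO ()
printCounts = forkWorkers 4 $ \w ->
  forM_ [1 .. 100 :: Int] $ \i ->
    putStrLn ("worker " ++ show w ++ ": " ++ show i)

-- The same structure, but counting into a shared MVar instead of printing.
countDemo :: IO Int
countDemo = do
  total <- newMVar 0
  forkWorkers 4 $ \_ ->
    forM_ [1 .. 100 :: Int] $ \_ -> modifyMVar_ total (return . (+ 1))
  readMVar total
```

`countDemo` should return `400`. To spread the threads over several cores, the program would need to be linked with `-threaded` and run with `+RTS -N`.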
* [Proposal: Control.Concurrent.Async][m8] (8 June)
Deep into writing his book on Parallel Haskell, Simon Marlow
proposes a higher-level concurrency API for the base package.
The proposed [Control.Concurrent.Async][cc-async] would
help make sure that exceptions in child threads are dealt with
(returned or passed up), and that threads aren't accidentally left
running in the background.
A few Haskellers commented that they would prefer that base be
kept as minimal as possible, and have counter-proposed making it
a package to be included in the Haskell Platform instead. See
the thread for some discussion on the API itself.
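To illustrate the first of those goals (exceptions in a child thread being returned to whoever waits on it), here is a minimal sketch built on `forkFinally` and an `MVar`. It is not the proposed API: `Task`, `spawn`, `waitTask` and `asyncDemo` are hypothetical names chosen to avoid suggesting otherwise, and the sketch does nothing about cancelling threads that are left running.

```haskell
import Control.Concurrent (forkFinally)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, readMVar)
import Control.Exception (SomeException, throwIO, try)

-- A handle on a child thread's eventual outcome: a result or an exception.
newtype Task a = Task (MVar (Either SomeException a))

-- Run the action in a new thread, recording however it ends.
spawn :: IO a -> IO (Task a)
spawn act = do
  result <- newEmptyMVar
  _ <- forkFinally act (putMVar result)
  return (Task result)

-- Block until the child is done; rethrow its exception in the caller.
waitTask :: Task a -> IO a
waitTask (Task result) = readMVar result >>= either throwIO return

-- One child succeeds, the other fails; the failure surfaces at `waitTask`.
asyncDemo :: IO (Int, Bool)
asyncDemo = do
  ok <- spawn (return (6 * 7))
  bad <- spawn (throwIO (userError "boom") :: IO Int)
  x <- waitTask ok
  r <- try (waitTask bad) :: IO (Either SomeException Int)
  return (x, either (const True) (const False) r)
```

`asyncDemo` should return `(42,True)`: the successful child's value comes back as usual, while the failing child's exception is caught by the caller rather than vanishing with the thread.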
StackOverflow and Reddit
----------------------------------------------------------------------
* [What are the key differences between the Repa 2 and 3 APIs?][s1]
* [Does `par` create another thread?][s2]
* [How to improve performence of this Haskell code?][s3]
* [Haskell: Why was `par` defined the way it was?][s4]
* [Method for capturing monad stack state][s5]
* [Killing a thread when MVar is garbage collected][s6]
* [Improving simulation performance via concurrency][s7]
* [Haskell framework to parallelize non-threadsafe C++ lib][s8]
* [Thread-safe state with Warp/WAI][s9]
* [How to use multiple cores when compiling with GHC? : haskell][r1]
Help and Feedback
----------------------------------------------------------------------
Well, this is the end of the Parallel Haskell Digest, but feedback
would still be much appreciated! Get in touch with me,
Eric Kow, at <parallel at well-typed.com>. Bye for now!
[ch-pdf]: http://research.microsoft.com/en-us/um/people/simonpj/papers/parallel/remote.pdf
[dist-p]: https://github.com/haskell-distributed/distributed-process
[downfall]: http://fpcomplete.com/the-downfall-of-imperative-programming/
[hiw]: http://www.haskell.org/haskellwiki/HaskellImplementorsWorkshop
[lab-exo]: http://community.haskell.org/~simonmar/lab-exercises-cadarache.pdf
[eden]: http://www.mathematik.uni-marburg.de/~eden
[chak1]: http://www.cse.unsw.edu.au/~chak/papers/KCLLP12.html
[chak2]: http://www.cse.unsw.edu.au/~chak/papers/LCKP12.html
[paraiso]: http://hackage.haskell.org/package/Paraiso
[gtalib]: http://hackage.haskell.org/package/GTALib
[gta]: http://research.nii.ac.jp/~hu/pub/esop12.pdf
[ghc3160]: http://hackage.haskell.org/trac/ghc/ticket/3160
[safesema]: http://hackage.haskell.org/package/SafeSemaphore/
[gproto]: https://gist.github.com/2757253
[type-level]: http://hackage.haskell.org/package/type-level
[dp-docs]: https://github.com/haskell-distributed/distributed-process/wiki
[dph-lv]: http://hackage.haskell.org/package/dph-lifted-vseg
[cc-async]: http://community.haskell.org/~simonmar/async-stm/Control-Concurrent-Async.html
[repa3-pdf]: http://www.cse.unsw.edu.au/~benl/papers/guiding/guiding-Haskell2012-sub.pdf
[ocm-pdf]: http://www.cs.hmc.edu/~stone/papers/ocm-unpublished.pdf
[go-snippet]: https://gist.github.com/3010649
[n1]: http://www.haskell.org/pipermail/haskell-cafe/2012-June/101579.html
[n2]: http://www.haskell.org/pipermail/haskell-cafe/2012-June/101632.html
[n3]: http://functionaljobs.com/jobs/111-3-year-postgraduate-rd-position-at-era7-bioinformatics
[n4]: http://tomschrijvers.blogspot.co.uk/2012/05/ifl-2012-call-for-papers.html
[t1]: https://plus.google.com/107890464054636586545/posts/LThYZELANCg
[t2]: http://dmwit.com/gtk2hs/
[b0]: http://www.haskell.org/pipermail/haskell-cafe/2012-June/101808.html
[b1]: http://apfelmus.nfshost.com/blog/2012/06/07-forklift.html
[b2]: http://neilmitchell.blogspot.co.uk/2012/06/flavours-of-mvar_04.html
[b3]: http://tumblr.justtesting.org/post/24399176080/being-more-clever-about-vectorising-nested-data
[b4]: http://tumblr.justtesting.org/post/24398752358/repa-3-more-control-over-array-representations-with
[b5]: http://www.yesodweb.com/blog/2012/05/next-conduit-changes
[b6]: http://breaks.for.alienz.org/blog/2012/05/20/generics-and-protocol-buffers/
[b8]: http://www.haskell.org/pipermail/haskell-cafe/2012-May/101313.html
[b9]: http://www.haskell.org/pipermail/haskell-cafe/2012-June/101567.html
[b10]: http://www.haskell.org/pipermail/haskell-cafe/2012-June/101649.html
[b12]: http://fpcomplete.com/asynchronous-api-in-c-and-the-continuation-monad/
[m1]: http://www.haskell.org/pipermail/haskell-cafe/2012-May/101318.html
[m2]: http://www.haskell.org/pipermail/haskell-cafe/2012-May/101359.html
[m3]: http://www.haskell.org/pipermail/haskell-cafe/2012-May/101454.html
[m4]: http://www.haskell.org/pipermail/haskell-cafe/2012-June/101573.html
[m5]: https://groups.google.com/d/msg/parallel-haskell/A6mXn1Wv-KY/-63SHoGU31wJ
[m6]: https://groups.google.com/d/msg/parallel-haskell/Ykm3QJT12yw/8HpnoP8qsugJ
[m7]: https://plus.google.com/app/plus/mp/588/#~loop:view=activity&aid=z13pwzbajpqeg3qmo23hgporhlywe1fd5
[m8]: http://www.haskell.org/pipermail/libraries/2012-June/017892.html
[s1]: http://stackoverflow.com/questions/10747079/what-are-the-key-differences-between-the-repa-2-and-3-apis
[s2]: http://stackoverflow.com/questions/10724946/does-par-create-another-thread
[s3]: http://stackoverflow.com/questions/10557055/how-to-improve-performence-of-this-haskell-code
[s4]: http://stackoverflow.com/questions/10166640/haskell-why-was-par-defined-the-way-it-was
[s5]: http://stackoverflow.com/questions/11073610/method-for-capturing-monad-stack-state
[s6]: http://stackoverflow.com/questions/10871303/killing-a-thread-when-mvar-is-garbage-collected
[s7]: http://stackoverflow.com/questions/10627980/improving-simulation-performance-via-concurrency
[s8]: http://stackoverflow.com/questions/10567223/haskell-framework-to-parallelize-non-threadsafe-c-lib
[s9]: http://stackoverflow.com/questions/10449819/thread-safe-state-with-warp-wai
[r1]: http://www.reddit.com/r/haskell/comments/uyy80/how_to_use_multiple_cores_when_compiling_with_ghc/
--
Eric Kow <http://erickow.com>