[Haskell-cafe] Parallel Haskell Digest 11

Eric Kow eric at well-typed.com
Thu Jul 5 17:13:14 CEST 2012


Parallel Haskell Digest 11
==========================
HTML version: http://www.well-typed.com/blog/67

Hello Haskellers!

It's time for another Parallel Haskell Digest! Unfortunately, this may
just be our last one, at least within the context of the Parallel GHC
project. That said, we may as a community be at the very beginnings of
Haskell as *the* language of choice for your parallel and concurrent
needs. Maybe we need to keep something like the Digest going to help our
little FP monster through its infancy? Any volunteers in the community?
If you're interested in picking up the torch, please give us a shout!

Otherwise, if you can't take on a (perhaps rotating) digest commitment,
but still want to help, would you be kind enough to fill out a small
[survey](http://goo.gl/bP2fn) on the digest? There are just five
questions on it, plus a feedback form. Anything you can say will help
those of us in the Secret Haskell Propaganda Committee to fine-tune our
efforts:

http://goo.gl/bP2fn

It's been a fantastic year for me, working on the Parallel GHC project,
learning about all sorts of neat ideas and technologies (as a basic
parallel-naive Haskeller), and trying to reflect them back in a way that
hopefully helps the broader community.  Thanks to all of you in the
parallel Haskell world first for cranking out all this great stuff for
us to use, and second for your patience and support.  Thanks especially
to my fellow Well-Typed-ers for all the fun chats, the feedback on
drafts, and help getting up to speed.

One last thing before signing off as your Parallel Haskell Digester.
While the digest may be coming to an end, there will at least be one
encore! It turns out we had so much to say in our last word of the
month that we'll have to put it in a follow-up posting. In the
meantime, we'll just leave you with a little teaser…

News
----------------------------------------------------------------------
*    [Announce: Haskell Platform 2012.2.0.0][n1] (3 Jun)

     The new Haskell Platform is out! If you've been waiting for Haskell
     Platform before moving on to GHC 7.4, now's a great time to upgrade.
     Of particular interest to parallel Haskellers, this latest GHC
     offers better profiling flags, multicore profiling, vastly improved
     DPH, event logging (which allows ThreadScope spark profiling), and more
     convenient RTS flags.

*    [Introducing FP Complete][n2] (6 Jun)

     You might have seen Bartosz Milewski around. If not, have a look at the
     [Downfall of Imperative Programming][downfall]. Bartosz posted a
     quick message introducing himself to the community along with the
     new company FP Complete, which aims to commercialise Haskell.
     Bartosz believes that “now is the right time for Haskell to become
     a strong software industry player, especially that functional
     programming is being widely recognized as the answer to the recent
     multicore and GPU explosion.” We'll hopefully find out more about
     what FP Complete have in mind as their plans stabilise a bit.

*    [3 year Bioinformatics R&D position in Granada, Spain][n3]

     Love Functional Programming and concurrency? If you are a
     CS/Math/IT graduate without a PhD, and have had no more
     than 4 years of research experience, and have not lived in
     Spain for more than 12 months (within the last 3 years),
     Era7 has a position for you! You'll be hacking Scala and
     using AWS for everything. So if Akka is the sort of thing
     you're into, this could be the job for you.

Word of the month (teaser!)
----------------------------------------------------------------------
The word of the month series has given us a chance to survey the arsenal
of Haskell parallelism and concurrency constructs:

*    some low level foundations (sparks and threads),
*    three ways to do parallelism (parallel arrays, strategies, dataflow),
*    and some concurrency abstractions (locks, transactions, channels)

The Haskell approach has been to explicitly recognise the vastness of
the parallelism/concurrency space, in other words, to provide a
multitude of right tools for a multitude of right jobs. Better still,
the tools we have are largely interoperable, should we find ourselves
with jobs that don't neatly fit into a single category.

The Haskell of 2012 may be in a great place for parallelism and
concurrency, but don't think this is the end of the story! What we've
seen so far is only a snapshot of the technology as it hurtles through
the twenty-tens (How quaint are we, Future Haskeller?). While we can't
say what exactly the future will bring, we can look at one of the
directions that Haskell might branch into in the coming decade.
The series so far has focused on things you might do with a single
computer, using parallelism to speed up your software, or using
concurrency abstractions to preserve your sanity in the face of
non-determinism. But now what if you have more than one computer?

Our final word of the month is *actor*. Actors are not specific to
distributed programming; they are really more of a low level concurrency
abstraction on a par with threads.  And they certainly aren't new
either.  The actor model has been around since the early 70s at least,
and has been seriously used for distributed programming since the late
80s with Erlang.  Can you guess where this word of the month is going?
We have a bit more to say about it shortly, so while this is the last
Parallel Haskell Digest, watch this space for the final word of the
month :-)
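
As a rough taste of the idea ahead of that follow-up, here is a minimal
sketch of an actor assembled from pieces this series has already covered:
a thread that owns some private state, plus a `Chan` that serves as its
mailbox. The `Msg` type, `spawnCounter` and `demo` are names made up for
this illustration; they are not part of Cloud Haskell or any library, and
a real actor system would add addressing, failure handling and
distribution on top.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)

-- Messages our toy actor understands (hypothetical, for illustration).
data Msg
  = Add Int          -- add to the running total
  | Get (MVar Int)   -- ask for the total; the reply goes into the MVar

-- Start the actor and hand back its mailbox. The total lives only in
-- the loop's argument, so no other thread can touch it directly.
spawnCounter :: IO (Chan Msg)
spawnCounter = do
  mailbox <- newChan
  let loop total = do
        msg <- readChan mailbox
        case msg of
          Add n     -> loop (total + n)
          Get reply -> putMVar reply total >> loop total
  _ <- forkIO (loop 0)
  return mailbox

-- Send a few messages, then ask for the result.
demo :: IO Int
demo = do
  counter <- spawnCounter
  mapM_ (writeChan counter . Add) [1, 2, 3]
  reply <- newEmptyMVar
  writeChan counter (Get reply)
  takeMVar reply
```

Because a single sender's messages arrive in order, `demo` should return
6. The only way to observe or change the total is to send a message,
which is the property that makes the model a candidate for stretching
across machines.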

Parallel GHC project update
----------------------------------------------------------------------
Our work on the [distributed-process][dist-p] implementation of Cloud
Haskell continues apace. We're almost there, having implemented most of
the API described in the original [Epstein *et al*][ch-pdf] paper,
except for node configuration and initialisation. We are very excited to
be getting this out of the door soon and into your hands.  In fact,
we've even submitted a proposal to present this work at the upcoming
[Haskell Implementors Workshop][hiw]; so hopefully you'll be able to
join Duncan and Edsko in Copenhagen and catch up on the Cloud Haskell
news.

As for ThreadScope, we last mentioned that we were working to make use
of information from hardware performance counters (specifically, Linux
Perf Events). This took a bit more work and trickier GHC patches than
we had anticipated, but it does seem to be in order now and we are now
in the testing phase for the next release. The next ThreadScope release
will also include the use of heap statistics from the (eventual) GHC 7.6
RTS, and some user interface enhancements suggested by our users.

Tutorials
----------------------------------------------------------------------

*    [Parallel and Concurrent Haskell Course slides][t1]

     Looks like the recent Summer School was a great time for all!
     We had expert Haskellers talking parallelism and concurrency, a
     chateau, good food, and lots of wine. Did you miss out? We can't
     help with the wine, but if you'd like to get in on some of the
     parallel action, check out Simon's slides. There are seven
     lectures in all, and some [lab exercises][lab-exo] to go with
     them:

     1. Basic pure parallelism
     2. The Par Monad
     3. Basic concurrency
     4. Software Transactional Memory
     5. Concurrent network servers
     6. Distributed programming
     7. GPU programming with Accelerate

*    [Threading and Gtk2Hs][t2]

     “So you're writing a Gtk2Hs application and you need to do some
     threading.” Daniel Wagner has just the tutorial for you.  The
     post comes in two parts.  First Daniel gives it all away with
     two key points (and a simple concrete example):

     1. Make all Gtk calls from the main thread.  If other threads
        need to affect the interface, use `postGUIAsync`
        or `postGUISync` to send your code to the main thread.

     2. Link your program with the threaded runtime system by passing
        GHC the `-threaded` option at link time.

     For readers who want to learn more, Daniel then goes into much
     more depth: the things that make threading hard both in its own right
     and when working in Gtk2Hs in particular; the perils of using
     `unsafe` FFI imports (as opposed to `safe` ones); a peek into
     the Gtk2Hs guts showing its interaction with Gtk and glib; and
     finally, some of the possible pitfalls, wrong things Daniel
     believed when he started and what he now believes instead.


Blogs and packages
----------------------------------------------------------------------
*    [GHC-7.4.2-Eden - Parallel Haskell on multicore and cluster systems][b0] (17 Jun)

     There is more than one way to do parallelism and distributed
     programming in Haskell. Mischa Dieterle announced a new release of
     [Eden][eden], an extension of Haskell which is “tailored for
     distributed systems but works equally well on multicore
     architectures”.

     These extensions consist of a small number of constructs for
     working with *processes* (processes work within disjoint address
     spaces and do not share any data).  Eden provides automatic process
     handling to reduce the amount of low-level detail needed to
     implement parallel algorithms, but also allows for the explicit
     control you may need to get good performance.

     You can either install the full Eden system including the compiler
     (GHC with the Eden parallel runtime system), libraries and
     tools; or just install a thread simulation by using a standard GHC
     to install the libraries and tools off Hackage.

*    [Forklift - a pattern for performing monadic actions in a worker thread][b1] (7 Jun)

     Apfelmus recently noticed a recurring applied Haskell puzzle: how
     do you combine two IO-wrapping monads that don't lend themselves to
     being combined via monad transformers?  This arose in the context
     of Shae Erisson's Summer of Code Project (Web based GHCi) which
     uses webserver and interpreter packages providing monads of their
     own.  A typical approach would be to run separate threads and have
     the two communicate using something like an `MVar`; but each time
     we reinvent this solution, we end up creating some mini communication
     protocol specific to the task.

     Apfelmus suggests a more generic variant of this approach, which he
     calls the “forklift pattern” because it consists in *forking* a
     worker thread to *lift* arbitrary monadic actions into IO.

         data ForkLift m = ForkLift { requests :: Chan (m ()) }
         carry :: MonadIO m => ForkLift m -> m a -> IO a

     The worker thread maintains a queue of requests. To run an
     arbitrary action of type `m a`, you “carry” it over into `IO` with
     a helper function that wraps it up in an `MVar` sandwich, sticks it
     on the queue, and reads the `MVar` to get the result back.  See
     Apfelmus's posting to see this cute trick in action.
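
     To make the description above concrete, here is a minimal sketch of
     how the two signatures might be filled in using only `base`. It is
     our reading of the idea, not a copy of Apfelmus's code: the
     `newForkLift` function and its "how to run `m` in IO" argument are
     assumptions, and the `Counter` monad is a toy stand-in for a web
     server or interpreter monad.

     ```haskell
     import Control.Concurrent (forkIO)
     import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
     import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
     import Control.Monad (forever)
     import Control.Monad.IO.Class (MonadIO, liftIO)
     import Data.IORef (IORef, atomicModifyIORef', newIORef)

     -- The request queue from the post: actions in the worker's monad.
     data ForkLift m = ForkLift { requests :: Chan (m ()) }

     -- Fork a worker that runs queued actions one at a time. The first
     -- argument (our assumption) says how to run the monad `m` in IO.
     newForkLift :: MonadIO m => (m () -> IO ()) -> IO (ForkLift m)
     newForkLift run = do
       chan <- newChan
       _ <- forkIO $ run $ forever $ do
         action <- liftIO (readChan chan)
         action
       return (ForkLift chan)

     -- The "MVar sandwich": queue the action, then wait for its result.
     carry :: MonadIO m => ForkLift m -> m a -> IO a
     carry fl action = do
       result <- newEmptyMVar
       writeChan (requests fl) (action >>= liftIO . putMVar result)
       takeMVar result

     -- A toy monad standing in for a web server or interpreter monad.
     newtype Counter a = Counter { runCounter :: IORef Int -> IO a }

     instance Functor Counter where
       fmap f (Counter g) = Counter (fmap f . g)
     instance Applicative Counter where
       pure x = Counter (\_ -> pure x)
       Counter f <*> Counter g = Counter (\r -> f r <*> g r)
     instance Monad Counter where
       Counter g >>= k = Counter (\r -> g r >>= \a -> runCounter (k a) r)
     instance MonadIO Counter where
       liftIO io = Counter (\_ -> io)

     -- Bump the counter and return its new value.
     tick :: Counter Int
     tick = Counter (\r -> atomicModifyIORef' r (\n -> (n + 1, n + 1)))

     demo :: IO (Int, Int)
     demo = do
       ref <- newIORef 0
       fl  <- newForkLift (\m -> runCounter m ref)
       a   <- carry fl tick
       b   <- carry fl tick
       return (a, b)
     ```

     Here `demo` should return `(1, 2)`: both actions run in the same
     worker, so the monad's state carries over between calls. Note that
     this sketch has no exception handling; if a carried action throws,
     the caller is left waiting on an `MVar` that nobody will fill.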

*    [The Flavours of MVar][b2] (4 Jun)

     Neil Mitchell finds that the flexibility of the `MVar` can
     leave some room for confusion: both taking from and putting to an
     `MVar` can block, whereas it is likely that you only expect it
     to block on one of those. In a quick and practical tutorial, Neil
     whips through three `MVar` patterns that he tends to use regularly:

     * lock: guaranteed single-threaded access to some
       resource
     * var: thread-safe mutable variables that never block
       on put
     * barrier: starts empty, is written to once, then
       read one or more times.

     Building on these, Neil shows a couple of examples of how one might
     go on to combine these `MVar` uses to get higher level
     abstractions: an action that can be invoked multiple times but runs
     at most once, and a queue that collects messages individually and
     delivers them in bulk.  Check his post out, and maybe see why `join
     $ modifyMVar …` is becoming one of his favourite idioms.
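
     For a feel of what these three flavours might look like as thin
     wrappers over `MVar`, here is a sketch along the lines described
     above. The type and function names are our guesses at a reasonable
     API rather than a transcription of Neil's code, so please treat his
     post as the reference.

     ```haskell
     import Control.Concurrent.MVar
     import Control.Monad (join)
     import Data.IORef (atomicModifyIORef', newIORef, readIORef)

     -- Lock: single-threaded access to some resource.
     newtype Lock = Lock (MVar ())

     newLock :: IO Lock
     newLock = Lock <$> newMVar ()

     withLock :: Lock -> IO a -> IO a
     withLock (Lock m) act = withMVar m (const act)

     -- Var: a mutable variable that always holds a value between calls.
     newtype Var a = Var (MVar a)

     newVar :: a -> IO (Var a)
     newVar x = Var <$> newMVar x

     modifyVar :: Var a -> (a -> IO (a, b)) -> IO b
     modifyVar (Var m) = modifyMVar m

     -- Barrier: starts empty, is written once, then read many times.
     newtype Barrier a = Barrier (MVar a)

     newBarrier :: IO (Barrier a)
     newBarrier = Barrier <$> newEmptyMVar

     signalBarrier :: Barrier a -> a -> IO ()
     signalBarrier (Barrier m) = putMVar m

     waitBarrier :: Barrier a -> IO a
     waitBarrier (Barrier m) = readMVar m

     -- An action that may be invoked many times but runs at most once.
     -- The Var decides who runs it; the Barrier hands out the result.
     once :: IO a -> IO (IO a)
     once act = do
       var <- newVar Nothing
       return $ join $ modifyVar var $ \v -> case v of
         Nothing -> do
           b <- newBarrier
           return (Just b, do x <- act; signalBarrier b x; return x)
         Just b -> return (Just b, waitBarrier b)

     demo :: IO (Int, Int, Int)
     demo = do
       counter <- newIORef (0 :: Int)
       action  <- once (atomicModifyIORef' counter (\n -> (n + 1, n + 1)))
       a <- action
       b <- action
       runs <- readIORef counter
       return (a, b, runs)
     ```

     Here `demo` should return `(1, 1, 1)`: both calls see the same
     result and the underlying action ran once. The `join $ modifyVar`
     shape lets the decision happen while the `Var` is held, and the
     possibly slow follow-up happen after it has been released.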

*    [Being more clever about vectorising nested data parallelism][b3]
     (4 Jun)

     Manuel Chakravarty tumbles: Our new draft paper [on Vectorisation
     Avoidance][chak1] introduces a novel program analysis for nested
     data parallelism that lets us avoid vectorising purely scalar
     subcomputations. It includes a set of benchmark kernels that
     suggest that vectorisation avoidance improves runtimes over merely
     using array stream fusion.

*    [Repa 3: more control over array representations with indexed types][b4]

     Another paper from the UNSW parallel Haskellers: We have got a new
     draft paper on [Guiding Parallel Array Fusion with Indexed
     Types][chak2]. It describes the design and use of the 3rd
     generation Repa API, which uses type indices to give the programmer
     control over the various parallel array representations. The result
     is clearer programs that the compiler can more easily optimise.
     The implementation of Repa 3 is ready for use on Hackage in the
     repa package.

*    [Protocol Buffers 2.0.7][b8] (19 May)

     Chris Kuklewicz is back!  In the last digest, some folks in the
     community were looking for him because they had patches for the
     protocol buffers package family.  Chris has not only resurfaced,
     but released an update to the packages, making them compile with
     GHC 7.4.1 and handle missing package names better.

*    [Generics and Protocol Buffers][b6] (20 May)

     Nathan Howell thinks the protocol-buffers package is great:
     full-featured, well-tested, no complaints about performance.
     However, maintaining `.proto` files is “more than just a chore”.
     The `hprotoc` tool could help but is tricky to integrate properly
     into the build system.  Maybe there's another way, one which does
     not involve separate files or build tools.  Have a look at his
     [GitHub Gist][gproto] for a promising alternative solution using
     the [type-level][type-level] library.

*    [SafeSemaphore][b9] (2 Jun)

     Chris Kuklewicz has a problem, a solution, and a plea for help.

     * Problem: Control.Concurrent.QSem (and QSemN, SampleVar) are
       [broken][ghc3160] (they provide no exception safety).
     * Solution: his [SafeSemaphore][safesema] package,
       just updated to 0.90, with several safer alternatives.
     * Plea: Would it be possible to replace parts of GHC with
       SafeSemaphore, so as to unbreak the Haskell Platform?

     It looks like Chris' plea has been heard, as Simon Marlow has
     recently suggested importing the STM version for GHC 7.6.1.
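
     As we understand it, the ticket concerns the semaphore's own
     internal bookkeeping when exceptions arrive, which only a library
     fix can address. There is also a caller-side half to exception
     safety, and that half can be sketched with `base` alone: pair every
     acquisition with a release using `bracket_`. The `withSem` helper
     below is a hypothetical name for illustration and is not taken from
     the SafeSemaphore package.

     ```haskell
     import Control.Concurrent.QSem (QSem, newQSem, signalQSem, waitQSem)
     import Control.Exception (IOException, bracket_, throwIO, try)

     -- Run an action while holding one unit of the semaphore, giving
     -- the unit back even if the action throws.
     withSem :: QSem -> IO a -> IO a
     withSem sem = bracket_ (waitQSem sem) (signalQSem sem)

     demo :: IO Bool
     demo = do
       sem <- newQSem 1
       failed <- try (withSem sem (throwIO (userError "boom")))
                   :: IO (Either IOException ())
       -- Had the unit leaked above, this acquisition would never return.
       withSem sem (return (either (const True) (const False) failed))
     ```

     Here `demo` should return `True`: the first action failed, yet the
     semaphore could still be acquired afterwards.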

*    [Paraiso][paraiso] (7 Jun)
     
     Kazu Yamamoto [announced][b10] a couple of new parallel libraries from
     Japan. The first is Paraiso (by Takayuki Muranushi of Monadius
     fame), a high-level language for implementing explicit
     partial-differential equations solvers on supercomputers as well as
     today's advanced personal computers.

*    [GTALib][gtalib] (7 Jun)

     Also from the Japan Parallel Haskell world is GTALib by Kento
     Emoto.  It provides core functionalities of the GTA programming
     framework described in the paper [Generate, Test, and Aggregate: A
     Calculation-based Framework for Systematic Parallel Programming with
     MapReduce][gta].

Mailing lists
----------------------------------------------------------------------
*    [How to write Source for TChan working with LC.take?][m1] (20 May)

     Hiromi Ishii is writing a Data.Conduit `Source` that supplies its
     values from a `TChan`.  He has three versions, one using the raw
     Pipe constructors directly, one using `sourceState`, and one
     using `yield`. The `yield` version does not seem to work as
     expected. At first this seemed like an unfortunate necessity, but
     after putting some thought into it, Michael Snoyman proposed some
     [modifications to conduit's await/yield][b5] functions, which
     should allow Hiromi to write things in the intuitive way.

*    [Parallel cooperative multithreading?][m2] (22 May)

     Benjamin Ylvisaker was wondering if it'd be possible to implement
     something like [Observationally Cooperative
     Multithreading][ocm-pdf] (OCM) in Haskell. The paper discusses Lua,
     C, and C++ implementations. Ben thinks that Haskell would be an
     awesome fit for such a framework. The premise behind OCM is that
     cooperative concurrency can be easier than preemptive concurrency,
     because you can reason sequentially between invocations of
     pause/yield/wait. Historically, it has only worked on single
     processors, because the blocks of code between the p/y/w calls need
     to be run atomically. Recent research means we know more about how
     to efficiently run blocks of code atomically, so maybe cooperative
     concurrency can make a comeback?

     Ryan Newton thinks the comeback is indeed happening. He points in
     the Haskell world to monad-par's use of ContT as an example of
     a framework in which tasks cooperatively yield control whenever
     their desired input data is not yet available.  Mario Blažević
     has also thought about cooperative concurrency in Haskell,
     particularly in the context of his monad-coroutine library; however,
     he found no speedups when he added support for running multiple
     co-routines in parallel.  Ketil Malde is sceptical that the
     proposed approach would be better than using STM. He wonders if
     the paper's critique of STM applies to implementations that
     keep transactional data segregated by the type system.

*    [How to translate Repa 2 program to efficient Repa 3 code?][m3] (26 May)

     Michael Serra posted a StackOverflow thread asking about the
     [differences between Repa 2 and 3 APIs][s1].  He has some
     simple image convolution tests which run fast enough
     in Repa 2 with judicious use of `force`.  But when translating
     the tests to Repa 3, he can't quite work out how to get the
     same kind of performance.  See the thread on StackOverflow
     for more details.  In short, `computeP` is the new `force`.

*    [Is Repa suitable for boxed arrays?...][m4] (3 Jun)

     Stuart Hungerford needs to build a 2D Array of boxed Haskell
     values. He's attracted to Repa, but couldn't work out from the
     documentation if it would work with arbitrary values.  Moreover, he
     had trouble getting the examples to work.  Ben Lippmeier replies that it should
     work (the array type would be something like `Array V DIM2 Float`).
     The documentation is out of date (it's for Repa 2 and Repa 3 is
     different). Until somebody gets a chance to update the
     documentation, try the [Repa 3 paper][repa3-pdf] Ben just
     submitted for Haskell Symposium 2012.

*    [Status and roadmap for Cloud Haskell?][m5] (19 May)

     Ben Lee is very interested in our work at Well-Typed on
     `distributed-process`, the followup implementation of Cloud
     Haskell. How's progress? As mentioned in the Parallel GHC news
     above, we're almost there!  Edsko de Vries says that so far we've
     been focusing on two aspects of the new implementation, the design
     of the transport API, and a robust TCP implementation to sit on top
     of it. These two parts are nearly done. Meanwhile, we've been
     laying down [some documentation][dp-docs] on our GitHub project
     wiki. If you want to help out, we'd love if you could play with the
     TCP transport, and try to write some transports of your own.

*    [`anyP` in DPH?][m6] (21 May)

     Rob Stewart is trying to find the `anyP` function for Data Parallel
     Haskell. Ben Lippmeier says that DPH is in flux at the moment and
     that the current user facing API can be found in
     [dph-lifted-vseg][dph-lv] which provides an `orP` function.

*    [Everybody should write everything in Go?][m7] (28 May)

     Ryan Hayes posted a small [snippet of Go][go-snippet] showing how
     friendly he found it for writing concurrent programs, “No
     pthread... not stupid crap... just works!”.  The program seems to
     create 4 threads which print out 1 to 100 each. What do Haskellers
     think? See the comments for some discussion between Haskell people
     like Simon Marlow, and some folks in the Go community about our
     respective approaches to the problem.

*    [Proposal: Control.Concurrent.Async][m8] (8 June)

     Deep into writing his book on Parallel Haskell, Simon Marlow
     proposes a higher-level concurrency API for the base package.
     The proposed [Control.Concurrent.Async][cc-async] would
     help make sure that exceptions in child threads are dealt with
     (returned or passed up), and that threads aren't accidentally left
     running in the background.
     
     A few Haskellers commented that they would prefer that base be
     kept as minimal as possible, and have counter-proposed making it
     a package to be included in the Haskell Platform instead. See
     the thread for some discussion on the API itself.
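
     To illustrate the two goals mentioned above (exceptions in child
     threads get back to someone, and threads can be stopped rather than
     left running), here is a simplified sketch of the idea using only
     `base`. The names `Async`, `async`, `wait` and `cancel` are chosen
     for illustration and the implementation is our own; please check
     the linked documentation for the API actually being proposed.

     ```haskell
     import Control.Concurrent (ThreadId, forkIO, killThread)
     import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, readMVar)
     import Control.Exception
       (IOException, SomeException, mask, throwIO, try)

     -- A handle on a running computation: its thread plus a slot that
     -- will hold either the exception it died with or its result.
     data Async a = Async ThreadId (MVar (Either SomeException a))

     -- Fork the action, capturing whatever happens to it in the slot.
     async :: IO a -> IO (Async a)
     async act = do
       var <- newEmptyMVar
       tid <- mask $ \restore ->
         forkIO (try (restore act) >>= putMVar var)
       return (Async tid var)

     -- Block until the action finishes; rethrow its exception, if any.
     wait :: Async a -> IO a
     wait (Async _ var) = readMVar var >>= either throwIO return

     -- Ask the child thread to stop.
     cancel :: Async a -> IO ()
     cancel (Async tid _) = killThread tid

     demo :: IO (Int, Bool)
     demo = do
       ok  <- async (return (6 * 7))
       bad <- async (throwIO (userError "boom") :: IO Int)
       x <- wait ok
       r <- try (wait bad) :: IO (Either IOException Int)
       return (x, either (const True) (const False) r)
     ```

     Here `demo` should return `(42, True)`: the first result comes back
     as a value, and the second child's exception resurfaces in the
     parent at the point where it calls `wait`.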

StackOverflow and Reddit
----------------------------------------------------------------------
* [What are the key differences between the Repa 2 and 3 APIs?][s1]
* [Does `par` create another thread?][s2]
* [How to improve performence of this Haskell code?][s3]
* [Haskell: Why was `par` defined the way it was?][s4]
* [Method for capturing monad stack state][s5]
* [Killing a thread when MVar is garbage collected][s6]
* [Improving simulation performance via concurrency][s7]
* [Haskell framework to parallelize non-threadsafe C++ lib][s8]
* [Thread-safe state with Warp/WAI][s9]

* [How to use multiple cores when compiling with GHC? : haskell][r1]

Help and Feedback
----------------------------------------------------------------------
Well, this is the end of the Parallel Haskell Digest, but feedback
would still be much appreciated!  Get in touch with me,
Eric Kow, at <parallel at well-typed.com>.  Bye for now!

[ch-pdf]: http://research.microsoft.com/en-us/um/people/simonpj/papers/parallel/remote.pdf
[dist-p]: https://github.com/haskell-distributed/distributed-process
[downfall]: http://fpcomplete.com/the-downfall-of-imperative-programming/
[hiw]: http://www.haskell.org/haskellwiki/HaskellImplementorsWorkshop
[lab-exo]: http://community.haskell.org/~simonmar/lab-exercises-cadarache.pdf

[eden]: http://www.mathematik.uni-marburg.de/~eden
[chak1]: http://www.cse.unsw.edu.au/~chak/papers/KCLLP12.html
[chak2]: http://www.cse.unsw.edu.au/~chak/papers/LCKP12.html
[paraiso]: http://hackage.haskell.org/package/Paraiso
[gtalib]: http://hackage.haskell.org/package/GTALib
[gta]:    http://research.nii.ac.jp/~hu/pub/esop12.pdf
[ghc3160]: http://hackage.haskell.org/trac/ghc/ticket/3160
[safesema]: http://hackage.haskell.org/package/SafeSemaphore/
[gproto]:  https://gist.github.com/2757253
[type-level]: http://hackage.haskell.org/package/type-level

[dp-docs]: https://github.com/haskell-distributed/distributed-process/wiki
[dph-lv]:   http://hackage.haskell.org/package/dph-lifted-vseg
[cc-async]: http://community.haskell.org/~simonmar/async-stm/Control-Concurrent-Async.html
[repa3-pdf]: http://www.cse.unsw.edu.au/~benl/papers/guiding/guiding-Haskell2012-sub.pdf
[ocm-pdf]: http://www.cs.hmc.edu/~stone/papers/ocm-unpublished.pdf
[go-snippet]: https://gist.github.com/3010649

[n1]: http://www.haskell.org/pipermail/haskell-cafe/2012-June/101579.html
[n2]: http://www.haskell.org/pipermail/haskell-cafe/2012-June/101632.html
[n3]: http://functionaljobs.com/jobs/111-3-year-postgraduate-rd-position-at-era7-bioinformatics
[n4]: http://tomschrijvers.blogspot.co.uk/2012/05/ifl-2012-call-for-papers.html

[t1]: https://plus.google.com/107890464054636586545/posts/LThYZELANCg
[t2]: http://dmwit.com/gtk2hs/

[b0]: http://www.haskell.org/pipermail/haskell-cafe/2012-June/101808.html
[b1]: http://apfelmus.nfshost.com/blog/2012/06/07-forklift.html
[b2]: http://neilmitchell.blogspot.co.uk/2012/06/flavours-of-mvar_04.html
[b3]: http://tumblr.justtesting.org/post/24399176080/being-more-clever-about-vectorising-nested-data
[b4]: http://tumblr.justtesting.org/post/24398752358/repa-3-more-control-over-array-representations-with
[b5]: http://www.yesodweb.com/blog/2012/05/next-conduit-changes
[b6]: http://breaks.for.alienz.org/blog/2012/05/20/generics-and-protocol-buffers/
[b8]: http://www.haskell.org/pipermail/haskell-cafe/2012-May/101313.html
[b9]: http://www.haskell.org/pipermail/haskell-cafe/2012-June/101567.html
[b10]: http://www.haskell.org/pipermail/haskell-cafe/2012-June/101649.html
[b12]: http://fpcomplete.com/asynchronous-api-in-c-and-the-continuation-monad/

[m1]: http://www.haskell.org/pipermail/haskell-cafe/2012-May/101318.html
[m2]: http://www.haskell.org/pipermail/haskell-cafe/2012-May/101359.html
[m3]: http://www.haskell.org/pipermail/haskell-cafe/2012-May/101454.html
[m4]: http://www.haskell.org/pipermail/haskell-cafe/2012-June/101573.html
[m5]: https://groups.google.com/d/msg/parallel-haskell/A6mXn1Wv-KY/-63SHoGU31wJ
[m6]: https://groups.google.com/d/msg/parallel-haskell/Ykm3QJT12yw/8HpnoP8qsugJ
[m7]: https://plus.google.com/app/plus/mp/588/#~loop:view=activity&aid=z13pwzbajpqeg3qmo23hgporhlywe1fd5
[m8]: http://www.haskell.org/pipermail/libraries/2012-June/017892.html

[s1]: http://stackoverflow.com/questions/10747079/what-are-the-key-differences-between-the-repa-2-and-3-apis
[s2]: http://stackoverflow.com/questions/10724946/does-par-create-another-thread
[s3]: http://stackoverflow.com/questions/10557055/how-to-improve-performence-of-this-haskell-code
[s4]: http://stackoverflow.com/questions/10166640/haskell-why-was-par-defined-the-way-it-was
[s5]: http://stackoverflow.com/questions/11073610/method-for-capturing-monad-stack-state
[s6]: http://stackoverflow.com/questions/10871303/killing-a-thread-when-mvar-is-garbage-collected
[s7]: http://stackoverflow.com/questions/10627980/improving-simulation-performance-via-concurrency
[s8]: http://stackoverflow.com/questions/10567223/haskell-framework-to-parallelize-non-threadsafe-c-lib
[s9]: http://stackoverflow.com/questions/10449819/thread-safe-state-with-warp-wai

[r1]: http://www.reddit.com/r/haskell/comments/uyy80/how_to_use_multiple_cores_when_compiling_with_ghc/
-- 
Eric Kow <http://erickow.com>
