[Haskell-cafe] Is there any experience using Software Transactional Memory in substantial applications?

Tue Aug 10 05:04:56 EDT 2010

| Recently we discussed Haskell and especially types in Russian part of
| LiveJournal and of course we talk about STM.
| 	
| My opponent gave me that link:
| http://logicaloptimizer.blogspot.com/2010/06/so-microsofts-experiments-with-
| software.html
| 
| It says that performance with STM in Microsoft Research was more than
| horrible.

I can't resist replying to this.  Several things to say:

* STM requires runtime logging of every memory read and write. In an imperative language (such as C#), in which the very fabric of computation involves reads and writes, STM is bound to be expensive.  [Lots of work has been done to omit redundant logging based on program analysis, but it's still expensive.]

In contrast, in a pure functional language there are no reads and writes, so all the pure part has zero overhead.  Only when you do 'readTVar' and 'writeTVar' do you pay the overhead; these are a tiny fraction of all memory accesses.  So we get a huge win from the fact that Haskell's computational fabric is pure.  

* When do you need readTVar/writeTVar?  Answer, precisely when you are manipulating state shared between threads.  If you do a *lot* of this, your program is *bound* to perform badly; data has to be shuffled between processors, caches have to do their thing, etc.  The key to high performance in parallel applications is to minimise sharing.  If you do that, then you'll (dynamically) have few readTVar/writeTVars, and so regardless of how expensive they are it'll run fast.

There are occasional data structures that are (a) necessarily shared and (b) necessarily hot-spots.  Then STM is perhaps not the best solution, which is why GHC provides a range of primitives: as well as STM we've kept MVars, and we also have atomicModifyIORef which is even cheaper.  Horses for courses. 

* The GHC STM implementation is simple.  It's a direct implementation of our paper "Composable Memory Transactions".  We could make strong simplifying assumptions.  In contrast, the Microsoft .NET project had to tackle a much more challenging set of requirements.  In particular, they thought that programmers would *require* to be able to manipulate the same locations both *inside* and *outside* a transaction. It turns out that this simple requirement dramatically complicates the implementation [see multiple papers by Tim Harris on this topic].  Also they wanted nested transactions, and a raft of other complications.  

Each of these features was strongly motivated, and the MS guys in Redmond are extremely smart, but the result was necessarily complex and, as it turned out, unacceptably slow.

Summary: it's simplistic to say "STM good" or "STM bad".  For STM to work well you need
	- very limited side effects in the computational fabric
	- limited communication between threads
	- a simple design

Then STM can work extremely well.  And it does in Haskell.  But it's not a silver bullet; concurrency is too complex to be slain with one bullet.  Indeed those precise words appear in every talk about STM I have given (eg http://research.microsoft.com/en-us/um/people/simonpj/papers/stm)

Simon