[Haskell-cafe] Help with "shootout"

Wed Jan 4 05:42:28 EST 2006

On 03 January 2006 12:03, Chris Kuklewicz wrote:

> STM* is usually slower than IO/MVar.  STM has to do the transactional
> record keeping and throws away work (i.e. CPU cycles and speed) when
> it aborts.  The Chameneos benchmark has 4 writers working *very*
> quickly, so the contention is high.  Taking the MVar acts like a
> mutex to serialize access without throwing away work.

We found that once you add exception safety to your MVar code, i.e.
using withMVar and modifyMVar rather than raw takeMVar/putMVar, then
MVar code is comparable in speed to STM.  For example, Chan and TChan
perform about the same.

Of course, for a benchmark, you can forget about the exception safety,
as the Ch implementation does (plus it is tweaked in various other
ways).

> Also, I suspect the following may be true:
> 
> If 4 threads block trying to take an MVar, and a 5th thread puts a
> value in the MVar, then *exactly one* of the 4 blocked threads is
> woken up and scheduled to run.

Yes, that's true.  We haven't documented this as a property of MVars,
but we don't intend to change this behaviour, so I think we probably
should document it.

Furthermore, the thread to be woken up is always the first thread to
block on the MVar.  That is, there's a kind of fairness built into MVars
which isn't present in STM and TMVars.

> If 4 threads retry while in STM (e.g. takeTMVar), and a 5th thread
> commits a change to that TMVar, then *all 4 threads* are woken up and
> rescheduled.  This is what the Apache httpd server folks called the
> "thundering herd" problem when many processes are blocked waiting on
> socket 80.
> 
> If 4000 threads were getting woken up when only 1 was needed, then
> performance would be poor.  Certainly I found 4 writer thread and STM
> to be much slower for this shootout problem than the Einar's custom
> MVar channel.
> 
> Could someone who knows the STM implementation comment on this?

Hope this helps...

Simon PJ and I talked about this briefly this morning, and agreed that
it might be possible to provide more primitive MVar-like objects in STM
to recover single-wakeup and some of the speed that MVars currently
have.  We haven't worked through the details though.

>> This is the CVS code. newTChanIO is exported but undocumented in GHC
>> 6.4.1. I'm not sure what purpose it serves.
> 
> I get tired of writing "do tv <- atomically $ newTVar foo"
> 
> I bet this is just shorthand: "do tv <- newTVarIO foo"
> 
> Same for newTChanIO.

It's not just shorthand: you can use newTVarIO and friends inside
unsafePerformIO, which doesn't always work for the longhand atomically
version.

Cheers,
	Simon