[Haskell-cafe] IORef vs TVar performance: 6 seconds versus 4 minutes

Mon Dec 29 06:28:06 EST 2008

Evan Laforge wrote:
> On Mon, Dec 29, 2008 at 1:15 PM, Ryan Ingram <ryani.spam at gmail.com> wrote:
> > Both readTVar and writeTVar are worse than O(1); they have to look up
> > the TVar in the transaction log to see if you have made local changes
> > to it.
> >
> > Right now it looks like that operation is O(n) where n is the number
> > of TVars accessed by a transaction, so your big transaction which is
> > just accessing a ton of TVars is likely O(n^2).
> 
> So this actually brings up a tangential question I've wondered about
> for a while.
> 
> The lock-free datastructure paper at
> http://research.microsoft.com/users/Cambridge/simonpj/papers/stm/lock-free-flops06.pdf
> shows lock vs. STM with very similar performance for single processor
> with STM quickly winning on multi-processors.

I have not verified this, but a possible cause is that
Control.Concurrent.QSem isn't efficient if there are many waiters.
It should use two lists for managing the waiters (ala Okasaki).

But why does it manually manage the waiters at all? MVars are fair, in
ghc at least. So this should work:

    newtype Sem = Sem (MVar Int) (MVar Int)

    newSem :: Int -> IO Sem
    newSem initial = liftM2 Sem (newMVar initial) newEmptyMVar

    -- | Wait for a unit to become available
    waitSem :: Sem -> IO ()
    waitSem (Sem sem wakeup) = do
       avail' <- modifyMVar sem (\avail -> return (avail-1, avail-1))
       when (avail' < 0) $ takeMVar wakeup >>= putMVar sem

    -- | Signal that a unit of the 'Sem' is available
    signalSem :: Sem -> IO ()
    signalSem (Sem sem wakeup) = do
       avail <- takeMVar sem
       if avail < 0 then putMVar wakeup (avail+1)
                    else putMVar sem (avail+1)

(I should turn this into a library proposal.)

Bertram