[Haskell-cafe] When are MVars better than STM?

Sun Jan 24 06:55:18 UTC 2016

Could you post the code please?

On Sun, Jan 24, 2016 at 12:46 AM, Thomas Koster <tkoster at gmail.com> wrote:

> Hi friends,
>
> Using Criterion, I have been running benchmarks to measure the
> relative performance of STM and MVars for some simple transactions
> that I expect will be typical in my application. I am using GHC 7.10.2
> and libraries as at Stackage LTS 3.2.
>
> I have found that STM is faster than MVars in all my benchmarks,
> without exception. This seems to go against accepted wisdom [1][2][3].
> I have not included my source code here to save space, but if you
> suspect that I am using MVars incorrectly, just say so and I will post
> my source code separately.
>
> I have two questions:
>
> 1. When are MVars faster than STM? If the answer is "never", then when
> are MVars "better" than STM? (Choose your own definition of "better".)
>
> 2. When given two capabilities (+RTS -N2), MVars are suddenly an order
> of magnitude slower than with just one capability. Why?
>
>
> For those who want details:
>
> My benchmark forks four Haskell threads. Each thread repeats a
> transaction that increments a shared counter many, many times. These
> transactions must be serialized. The counter is therefore highly
> contended. One version uses an MVar to store the counter in the
> obvious way. The other version uses a TVar instead.
>
> By the way, simply using "atomic-primops" to increment the counter
> won't do because the increment operation is actually a mock substitute
> for a more complex operation. I use the counter for my benchmarks
> because the real operation needs much more memory and I don't want the
> additional, unpredictable GC cost to affect my measurements.
>
> Typical measurements are:
>
> 1 capability, using MVar: 37.30 ms
> 1 capability, using TVar: 24.88 ms
> 2 capabilities, using MVar: 1.564 s
> 2 capabilities, using TVar: 80.09 ms
> 4 capabilities, using MVar: 2.890 s
> 4 capabilities, using TVar: 207.8 ms
>
> Notice that the MVar version suddenly slows by an order of magnitude
> when run with more than one capability. Why is this so? (This is
> question 2.)
>
> Despite the absolute time elapsed, I realize that the CPU usage
> characteristics of the two versions are also quite different. I
> realize that the MVar version interlocks the four threads so that only
> one capability is ever busy at a time, irrespective of the number of
> capabilities available, whereas the STM version allows up to four
> capabilities to be busy at once. However, I believe that the
> additional parallel transactions in the STM version would be mostly
> wasted, destined to be retried. Unless I am mistaken, this assumption
> appears to be consistent with the observation that the STM version
> with -N1 is the fastest of all. Despite all this wasted work by the
> thundering herd, the total CPU time (i.e. my power bill) for the STM
> version is still less than for the MVar version, because the MVar
> version so dramatically slow.
>
> Paradoxically, MVars seem to be the wrong tool for this job. So when
> are MVars faster than STM? (This is question 1.)
>
> [1]
> https://stackoverflow.com/questions/15439966/when-why-use-an-mvar-over-a-tvar
> [2]
> https://www.reddit.com/r/haskell/comments/39ef3y/ioref_vs_mvar_vs_tvar_vs_tmvar/
> [3]
> https://mail.haskell.org/pipermail/haskell-cafe/2014-January/112158.html
>
> Thanks,
> Thomas Koster
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
>

-- 
Chris Allen
Currently working on http://haskellbook.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/haskell-cafe/attachments/20160124/e6d04620/attachment.html>