[Haskell-cafe] Best ways to achieve throughput, for large M:N ratio of STM threads, with hot TVar updates?

Wed Jul 29 17:37:10 UTC 2020

On 24.07.20 at 17:48, Compl Yue via Haskell-Cafe wrote:
> The global 
> counter is only used to reveal the technical traits of my situation, 
> it's of course not a requirement of my business needs.

Given the other discussion here, I'm not sure if it's really relevant to 
your situation, but that stats counter could indeed be causing lock 
contention. Which means your numbers may be skewed, and you may be 
drawing wrong conclusions - which is actually commonplace in benchmarking.

Two things you could do:
1) Leave the global counter out and see whether the running times vary. 
There's still a chance that while the overall running time is the same, 
the code might now be hitting a different bottleneck. Or maybe the 
counter isn't the bottleneck but it would become one once you have done 
the other optimizations. So that experiment is cheap but gives you no 
more than a preliminary result.
2) Let each thread collect its own statistics, and coalesce into the 
global counter only once in a while. (Vary the "once in a while" 
determination and see whether it changes anything.)
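
To make option 2 concrete, here is a minimal sketch of what I have in mind. It is not taken from your code: the names worker and runBatched, the Int counter, and the placeholder for the real work are all assumptions of mine. Each thread keeps a plain local tally and touches the shared TVar only once every "batch" units of work, so the hot TVar sees roughly 1/batch as many writes.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM
  (TVar, atomically, modifyTVar', newTVarIO, readTVarIO)
import Control.Monad (forM, forM_, when)

-- | One worker (hypothetical): performs 'iterations' units of placeholder
-- work, counting them locally and adding the local tally to the shared
-- counter every 'batch' units, plus once more at the end.
worker :: TVar Int -> Int -> Int -> IO ()
worker global batch iterations = go 0 0
  where
    step = max 1 batch  -- guard against a non-positive batch size

    go :: Int -> Int -> IO ()
    go done pending
      | done >= iterations = flush pending
      | pending >= step    = flush pending >> go done 0
      | otherwise          =
          -- The real per-item transaction would run here (assumption).
          go (done + 1) (pending + 1)

    -- The only place where the contended TVar is written.
    flush :: Int -> IO ()
    flush n = when (n > 0) $ atomically $ modifyTVar' global (+ n)

-- | Start 'threads' workers, wait for all of them, return the final count.
runBatched :: Int -> Int -> Int -> IO Int
runBatched threads iterations batch = do
  global <- newTVarIO 0
  finished <- forM [1 .. threads] $ \_ -> do
    done <- newEmptyMVar
    _ <- forkIO (worker global batch iterations >> putMVar done ())
    pure done
  forM_ finished takeMVar
  readTVarIO global

main :: IO ()
main = do
  total <- runBatched 8 100000 256
  print total  -- 8 * 100000 = 800000
```

Running this with batch sizes of, say, 1, 16 and 256 and comparing wall-clock times would be the "vary the once in a while" experiment: a batch size of 1 reproduces the per-item global update, and if larger batches make no difference, the counter was probably not the bottleneck.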

Just my 2c from the sideline.

Regards,
Jo