[Haskell-cafe] Best ways to achieve throughput, for large M:N ratio of STM threads, with hot TVar updates?

Thu Jul 23 14:51:21 UTC 2020

Hello Cafe,

I'm working on an in-memory database, in Client/Server mode I just let each connected client submit remote procedure call running in its dedicated lightweight thread, modifying TVars in RAM per its business needs, then in case many clients connected concurrently and trying to insert new data, if they are triggering global index (some TVar) update, the throughput would drop drastically. I reduced the shared state to a simple int counter by TVar, got same symptom. While the parallelism feels okay when there's no hot TVar conflicting, or M is not much greater than N.

As an empirical test workload, I have a `+RTS -N10` server process, it handles 10 concurrent clients okay, got ~5x of single thread throughput; but in handling 20 concurrent clients, each of the 10 CPUs can only be driven to ~10% utilization, the throughput seems even worse than single thread. More clients can even drive it thrashing without much  progressing.

 I can understand that pure STM doesn't scale well after reading [1], and I see it suggested [7] attractive and planned future work toward that direction.

But I can't find certain libraries or frameworks addressing large M over small N scenarios, [1] experimented with designated N parallelism, and [7] is rather theoretical to my empirical needs.

Can you direct me to some available library implementing the methodology proposed in [7] or other ways tackling this problem?

I think the most difficult one is that a transaction should commit with global indices (with possibly unique constraints) atomically updated, and rollback with any violation of constraints, i.e. transactions have to cover global states like indices. Other problems seem more trivial than this.

Specifically, [7] states:

> It must be emphasized that all of the mechanisms we deploy originate, in one form or another, in the database literature from the 70s and 80s. Our contribution is to adapt these techniques to software transactional memory, providing more effective solutions to important STM problems than prior proposals.

I wonder any STM based library has simplified those techniques to be composed right away? I don't really want to implement those mechanisms by myself, rebuilding many wheels from scratch.

Best regards,
Compl

[1] Comparing the performance of concurrent linked-list implementations in Haskell 
https://simonmar.github.io/bib/papers/concurrent-data.pdf <https://simonmar.github.io/bib/papers/concurrent-data.pdf>

[7] M. Herlihy and E. Koskinen. Transactional boosting: a methodology for highly-concurrent transactional objects. In Proc. of PPoPP ’08, pages 207–216. ACM Press, 2008.
https://www.cs.stevens.edu/~ejk/papers/boosting-ppopp08.pdf <https://www.cs.stevens.edu/~ejk/papers/boosting-ppopp08.pdf>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/haskell-cafe/attachments/20200723/d4aa0eb9/attachment.html>