<div dir="ltr">Unfortunately my STM benchmarks are rather disorganized.  The most relevant paper using them is:<div><br></div><div>Leveraging hardware TM in Haskell (PPoPP '19)<br></div><div><a href="https://dl.acm.org/doi/10.1145/3293883.3295711" target="_blank">https://dl.acm.org/doi/10.1145/3293883.3295711</a></div><div><br></div><div>Or my thesis:</div><div><a href="https://urresearch.rochester.edu/institutionalPublicationPublicView.action?institutionalItemId=34931" target="_blank">https://urresearch.rochester.edu/institutionalPublicationPublicView.action?institutionalItemId=34931</a> <br></div><div><br></div><div> The PPoPP benchmarks are on a branch (or the releases tab on github):</div><div><a href="https://github.com/fryguybob/ghc-stm-benchmarks/tree/wip/mutable-fields/benchmarks/PPoPP2019/src" target="_blank">https://github.com/fryguybob/ghc-stm-benchmarks/tree/wip/mutable-fields/benchmarks/PPoPP2019/src</a> </div><div><br></div><div><br></div><div> All that to say, without an implementation of mutable constructor fields (which I'm working on getting into GHC) the scaling is limited.</div><div><br></div><div>Ryan</div><div> <br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Jul 25, 2020 at 3:45 AM Compl Yue via Haskell-Cafe <<a href="mailto:haskell-cafe@haskell.org" target="_blank">haskell-cafe@haskell.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
  
    
  
  <div>
    <p>Dear Cafe,</p>
<p>As Chris Allen suggested, I learned that  <a href="https://hackage.haskell.org/package/stm-containers" target="_blank">https://hackage.haskell.org/package/stm-containers</a>
      and <a href="https://hackage.haskell.org/package/ttrie" target="_blank">https://hackage.haskell.org/package/ttrie</a>
      can help a lot when used in place of a traditional HashMap for STM
      transaction processing under heavy concurrency, while still getting the
      automatic parallelism GHC provides. Then I realized that in addition to
      a hash map (used to implement dicts and scopes), I also need a
      TreeMap-like replacement data structure to implement the db index. I've
      been focusing on the uniqueness-constraint aspect, but it's still an
      index and needs to provide a range-scan API for db clients, so a hash
      map is not sufficient for it.<br>
    </p>
    <p>I see Ryan shared the code benchmarking RBTree with stm in mind:</p>
    <p> <a href="https://github.com/fryguybob/ghc-stm-benchmarks/tree/master/benchmarks/RBTree-Throughput" target="_blank">https://github.com/fryguybob/ghc-stm-benchmarks/tree/master/benchmarks/RBTree-Throughput</a>
      <br>
    </p>
    <p><a href="https://github.com/fryguybob/ghc-stm-benchmarks/tree/master/benchmarks/RBTree" target="_blank">https://github.com/fryguybob/ghc-stm-benchmarks/tree/master/benchmarks/RBTree</a></p>
    <p>But I can't find any conclusions or interpretation of that benchmark
      suite. And here's a follow-up question:</p>
    <p><br>
    </p>
    <p>Are there any STM contention-optimized data structures that keep
      their keys ordered and provide a sub-range traversal API? <br>
    </p>
    <p>(Production-ready libraries would of course be most desirable.)<br>
    </p>
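    <p>To make the ask concrete, here is the rough shape of the interface I
      have in mind (a purely hypothetical sketch; the class and method names
      are made up, not from any existing library):</p>
    <pre>
-- Hypothetical interface only, to illustrate what I'm looking for;
-- none of these names come from an existing package.
import Control.Concurrent.STM (STM)

class OrderedSTMMap m where
  lookupT :: Ord k => k -> m k v -> STM (Maybe v)
  insertT :: Ord k => k -> v -> m k v -> STM ()
  -- inclusive sub-range scan, returned in key order
  rangeT  :: Ord k => k -> k -> m k v -> STM [(k, v)]
</pre>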
    <p><br>
    </p>
    <p>Thanks with regards,</p>
    <p>Compl</p>
    <p><br>
    </p>
    <div>On 2020/7/25 下午2:04, Compl Yue via
      Haskell-Cafe wrote:<br>
    </div>
    <blockquote type="cite">
      
      <p>Shame on me, I haven't used `perf` either; I'll learn these
        essential tools soon and put them to good use.</p>
      <p>It's great to learn how `orElse` actually works. I was confused
        about why so few retries were captured, and now I know. So that
        little trick should definitely be removed before going to
        production, as it does little useful work at excessive cost. I put
        it there to help me understand the internal workings of STM, and
        now I understand them even better ;-)<br>
      </p>
      <p>I think a debugger would trap every single abort; wouldn't that be
        annoying when many aborts occur? If I'd like to count the number of
        aborts, ideally broken down per service endpoint, time period,
        source module, etc., are there some tricks for that?</p>
      <p>Thanks with best regards,</p>
      <p>Compl</p>
      <p><br>
      </p>
      <div>On 2020/7/25 上午2:02, Ryan Yates
        wrote:<br>
      </div>
      <blockquote type="cite">
        
        <div dir="ltr">To be clear, I was trying to refer to Linux
          `perf` [^1].  Sampling based profiling can do a good job with
          concurrent and parallel programs where other methods are
          problematic.  For instance,
          <div> changing the size of heap objects can drastically change
            cache performance and completely different behavior can show
            up.</div>
          <div><br>
          </div>
          <div>[^1]: <a href="https://en.wikipedia.org/wiki/Perf_(Linux)" target="_blank">https://en.wikipedia.org/wiki/Perf_(Linux)</a></div>
          <div><br>
          </div>
          <div>The spinning in `readTVar` should always be very short
            and it typically shows up as intensive CPU use, though it
            may not be high energy use with `pause` in the loop on x86
            (looks like we don't have it [^2], I thought we did, but
            maybe that was only in some of my code... )</div>
          <div><br>
          </div>
          <div>[^2]: <a href="https://github.com/ghc/ghc/blob/master/rts/STM.c#L1275" target="_blank">https://github.com/ghc/ghc/blob/master/rts/STM.c#L1275</a> <br>
          </div>
          <div><br>
          </div>
          <div>All that to say, I doubt that you are spending much time
            spinning (but it would certainly be interesting to know if
            you are!  You would see `perf` attribute a large amount of
            time to `read_current_value`).  The time spent in commit (when
            locks are held) is always much shorter than the time it takes
            to execute the transaction body.  As you add more conflicting
            threads this of course gets worse, as commits are serialized.</div>
          <div><br>
          </div>
          <div>The code you have counts committed executions of
            `retry`.  Note that `retry` is a user-level idea, that is,
            you are counting user-level *explicit* retries (see the short
            sketch below).  This is different from a transaction failing
            to commit and starting again; those are invisible to the
            user.  Also, using your trace converts `retry` from the
            efficient wake-on-write implementation into an active retry
            that will always attempt again.  We don't have cheap logging
            of transaction aborts in GHC, but I have built such logging
            in my work.  You can observe these aborts with a debugger by
            looking for execution of this line:</div>
          <div><br>
          </div>
          <div><a href="https://github.com/ghc/ghc/blob/master/rts/STM.c#L1123" target="_blank">https://github.com/ghc/ghc/blob/master/rts/STM.c#L1123</a></div>
          <div><br>
          </div>
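          <div>A minimal, self-contained sketch (my own illustration, not
            code from GHC or from this thread) of the wake-on-write `retry`
            described above: the transaction commits an execution of
            `retry`, the thread is parked on the TVar's wakeup list, and a
            later write to that TVar wakes it rather than the thread
            spinning:</div>
          <pre>
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.STM

main :: IO ()
main = do
  flag <- newTVarIO False
  _ <- forkIO $ do
    threadDelay 1000000                 -- writer commits one second later
    atomically $ writeTVar flag True
  -- While the flag is False this transaction executes `retry`; the thread
  -- blocks (wake on write) instead of spinning, until the writer's commit.
  atomically $ do
    v <- readTVar flag
    if v then pure () else retry
  putStrLn "woken by the writer's commit"
</pre>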
          <div>Ryan </div>
          <div><br>
          </div>
          <div><br>
          </div>
        </div>
        <br>
        <div class="gmail_quote">
          <div dir="ltr" class="gmail_attr">On Fri, Jul 24, 2020 at
            12:35 PM Compl Yue <<a href="mailto:compl.yue@icloud.com" target="_blank">compl.yue@icloud.com</a>> wrote:<br>
          </div>
          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
            <div>
              <p>I'm not familiar with profiling GHC programs yet; I may
                need more time to become proficient with it.</p>
              <p>And a bit more detail about my test workload, for
                diagnosis: the db clients are Python processes from a
                cluster of worker nodes. They consult the db server to
                register a path for data files under a data dir within a
                shared filesystem, then mmap those data files and fill in
                the actual array data. So the db server doesn't have much
                computation to perform; it just puts the data file path
                into a global index, which at the same time validates its
                uniqueness. As many client processes try to insert one
                metadata record each, concurrently, with my naive
                implementation the global index's TVar will almost always
                be in a locked state, held by one client after another,
                with the queue never falling empty.</p>
              <p>So if `readTVar` spin-waits, I would expect the threads
                to show high CPU utilization, because at any instant of
                time all threads except the committing one would be doing
                exactly that.<br>
              </p>
              <p>And I have something in my code to track STM retries,
                like this:</p>
              <pre>
    -- blocking wait not expected, track stm retries explicitly
    trackSTM :: Int -> IO (Either () a)
    trackSTM !rtc = do
      when -- todo increase the threshold of reporting?
           (rtc > 0) $ do
        -- trace out the retries so the end users can be aware of them
        tid <- myThreadId
        trace
            (  "🔙\n"
            <> show callCtx
            <> "🌀 "
            <> show tid
            <> " stm retry #"
            <> show rtc
            )
          $ return ()
      atomically ((Just <$> stmJob) `orElse` return Nothing) >>= \case
        Nothing -> -- stm failed, do a tracked retry
          trackSTM (rtc + 1)
        Just ... -> ...
</pre>
              <p>No such trace message fires during my test, neither in a
                single-threaded run nor in runs under pressure. I'm sure
                this tracing mechanism works, as I can see such traces
                fire in other cases: e.g. if posting a TMVar to a TQueue
                for some other thread to fill, and then reading the result
                out, are composed into a single tx, then of course it's an
                infinite retry loop, and a sequence of such messages is
                logged with an ever-increasing rtc #.<br>
              </p>
              <p>So I believe no retry has ever been triggered.</p>
              <p>What could be going on there?</p>
              <p><br>
              </p>
              <div>On 2020/7/24 下午11:46, Ryan Yates wrote:<br>
              </div>
              <blockquote type="cite">
                <div dir="ltr">
                  <div dir="ltr">
                    <div>> Then, to explain the low CPU utilization
                      (~10%): am I right to understand that upon reading a
                      TVar locked by another committing tx, a lightweight
                      thread puts itself into the `waiting STM`,
                      descheduled state, so the CPUs stay idle because not
                      many threads are willing to proceed?<br>
                    </div>
                    <div><br>
                    </div>
                    <div>Since the commit happens in finite steps, the
                      expectation is that the lock will be released very
                      soon.  Given this, when the body of a transaction
                      executes `readTVar` it spins (active CPU!) until
                      the `TVar` is observed unlocked.  If a lock is
                      observed while committing, it immediately starts
                      the transaction again from the beginning.  To get
                      the behavior of suspending a transaction you have
                      to successfully commit a transaction that executed
                      `retry`.  Then the transaction is put on the
                      wakeup lists of its read set and subsequent
                      commits will wake it up if its write set overlaps.</div>
                    <div><br>
                    </div>
                    <div>I don't think any of these things would explain
                      low CPU utilization.  You could try running with
                      `perf` and see if lots of time is spent in some
                      recognizable part of the RTS.</div>
                    <div><br>
                    </div>
                    <div>Ryan</div>
                  </div>
                  <div><br>
                  </div>
                  <br>
                  <div class="gmail_quote">
                    <div dir="ltr" class="gmail_attr">On Fri, Jul 24,
                      2020 at 11:22 AM Compl Yue <<a href="mailto:compl.yue@icloud.com" target="_blank">compl.yue@icloud.com</a>>
                      wrote:<br>
                    </div>
                    <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                      <div>
                        <p>Thanks very much for the insightful
                          information, Ryan! I'm glad my suspicion about
                          the Haskell scheduler was wrong:<br>
                        </p>
                        <p>> The Haskell capability that is
                          committing a transaction will not yield to
                          another Haskell thread while it is doing the
                          commit.  The OS thread may be preempted, but
                          once commit starts the haskell scheduler is
                          not invoked until after locks are released.</p>
                        <div>So a best effort has already been made in GHC
                          and I just need to cooperate better with its
                          design. Then, to explain the low CPU
                          utilization (~10%): am I right to understand
                          that upon reading a TVar locked by another
                          committing tx, a lightweight thread puts itself
                          into the `waiting STM`, descheduled state, so
                          the CPUs stay idle because not many threads are
                          willing to proceed?<br>
                        </div>
                        <div><br>
                        </div>
                        <div>Anyway, I see a way forward with better data
                          structures to improve my situation; let me try
                          them and report back. Actually, I later changed
                          `TVar (HashMap k v)` to `TVar (HashMap k
                          Int)`, where the `Int` is an array index into
                          `TVar (Vector (TVar (Maybe v)))`, to get the
                          insertion-order-preserving semantics of dict
                          entries (like in Python 3.7+). From there it
                          looks promising to incorporate stm-containers'
                          Map or ttrie to get closer to contention-free
                          behavior.<br>
                        </div>
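                        <div>For concreteness, a rough sketch of that
                          layout (my own illustration of the idea, not my
                          actual code; `OrderedDict`, `insertOD`, etc. are
                          made-up names):</div>
                        <pre>
import Control.Concurrent.STM
import Data.Hashable (Hashable)
import qualified Data.HashMap.Strict as HM
import qualified Data.Vector as V

-- insertion-order preserving dict: key -> slot index, plus slots in order
data OrderedDict k v = OrderedDict
  { odIndex   :: TVar (HM.HashMap k Int)
  , odEntries :: TVar (V.Vector (TVar (Maybe v)))
  }

insertOD :: (Eq k, Hashable k) => OrderedDict k v -> k -> v -> STM ()
insertOD (OrderedDict idx entries) k v = do
  im <- readTVar idx
  case HM.lookup k im of
    Just i  -> do                  -- key exists: overwrite in place, order kept
      vec <- readTVar entries
      writeTVar (vec V.! i) (Just v)
    Nothing -> do                  -- new key: append a fresh slot
      slot <- newTVar (Just v)
      vec  <- readTVar entries
      writeTVar entries (V.snoc vec slot)
      writeTVar idx (HM.insert k (V.length vec) im)
</pre>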
                        <p>Thanks with regards,</p>
                        <p>Compl</p>
                        <p><br>
                        </p>
                        <div>On 2020/7/24 下午10:03, Ryan Yates wrote:<br>
                        </div>
                        <blockquote type="cite">
                          <div dir="ltr">
                            <div>Hi Compl,</div>
                            <div><br>
                            </div>
                            <div>Having a pool of transaction processing
                              threads can be helpful in a certain way. 
                              If the body of the transaction takes more
                              time to execute than the Haskell thread is
                              allowed and it yields, the
                              suspended thread won't get in the way of
                              other threads, but when it is rescheduled,
                              it will have a low probability of success. 
                              Even worse, it will probably not discover
                              that it is doomed to failure until commit
                              time.  If transactions are more likely to
                              reach commit without yielding, they are
                              more likely to succeed.  If the
                              transactions are not conflicting, it
                              doesn't make much difference other than
                              cache churn.</div>
                            <div><br>
                            </div>
                            <div>The Haskell capability that is
                              committing a transaction will not yield to
                              another Haskell thread while it is doing
                              the commit.  The OS thread may be
                              preempted, but once commit starts the
                              haskell scheduler is not invoked until
                              after locks are released.</div>
                            <div><br>
                            </div>
                            <div>To get good performance from STM you
                              must pay attention to what TVars are
                              involved in a commit.  All STM systems are
                              working under the assumption of low
                              contention, so you want to minimize
                              "false" conflicts (conflicts that are not
                              essential to the computation).   
                              Something like `TVar (HashMap k v)` will
                              work pretty well for a low thread count,
                              but every transaction that writes to that
                              structure will be in conflict with every
                              other transaction that accesses it. 
                              Pushing the `TVar` into the nodes of the
                              structure reduces the possibilities for
                              conflict, while increasing the amount of
                              bookkeeping STM has to do.  I would like
                              to reduce the cost of that bookkeeping
                              using better structures, but we need to do
                              so without harming performance in the low
                              TVar count case.  Right now it is
                              optimized for good cache performance with
                              a handful of TVars.</div>
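                            <div>A hedged sketch of that trade-off
                              (illustrative types and names, not code from
                              GHC or from this thread): with a single
                              coarse TVar every writer conflicts with
                              every transaction touching the map, while
                              pushing TVars into the values lets
                              transactions on different keys commit
                              without conflicting, at the cost of a larger
                              read/write set for STM to track:</div>
                            <pre>
import Control.Concurrent.STM
import Data.Hashable (Hashable)
import qualified Data.HashMap.Strict as HM

-- coarse-grained: one TVar guards the whole map
type CoarseMap k v = TVar (HM.HashMap k v)

-- finer-grained: a TVar per value; only structural changes touch the outer TVar
type FineMap k v = TVar (HM.HashMap k (TVar v))

insertCoarse :: (Eq k, Hashable k) => CoarseMap k v -> k -> v -> STM ()
insertCoarse m k v = modifyTVar' m (HM.insert k v)

-- updating an existing key writes only that key's TVar
updateFine :: (Eq k, Hashable k) => FineMap k v -> k -> v -> STM Bool
updateFine m k v = do
  hm <- readTVar m
  case HM.lookup k hm of
    Nothing  -> pure False
    Just ref -> writeTVar ref v >> pure True
</pre>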
                            <div><br>
                            </div>
                            <div>There is another way to play with
                              performance by moving work into and out of
                              the transaction body.  A transaction body
                              that executes quickly will reach commit
                              faster.  But it may be delaying work that
                              moves into another transaction.  Forcing
                              values at the right time can make a big
                              difference.</div>
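                            <div>For instance, a small sketch of forcing a
                              value before entering the transaction so the
                              commit window stays short (my own
                              illustration; for deeply nested values you
                              would reach for `deepseq` rather than plain
                              `seq`):</div>
                            <pre>
import Control.Concurrent.STM
import Data.Hashable (Hashable)
import qualified Data.HashMap.Strict as HM

-- evaluate the value outside the transaction body, then run a tiny tx
insertForced :: (Eq k, Hashable k) => TVar (HM.HashMap k v) -> k -> v -> IO ()
insertForced idx k v =
  v `seq` atomically (modifyTVar' idx (HM.insert k v))
</pre>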
                            <div><br>
                            </div>
                            <div>Ryan</div>
                            <div dir="ltr"><br>
                            </div>
                            <div dir="ltr">On Fri, Jul 24, 2020 at 2:14
                              AM Compl Yue via Haskell-Cafe <<a href="mailto:haskell-cafe@haskell.org" target="_blank">haskell-cafe@haskell.org</a>>
                              wrote:<br>
                            </div>
                            <div class="gmail_quote">
                              <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                                <div>
                                  <p>Thanks Chris. I confess I haven't
                                    paid enough attention to
                                    STM-specialized container libraries so
                                    far. I skimmed through the
                                    descriptions of stm-containers and
                                    ttrie, and feel they would definitely
                                    improve my code's performance,
                                    provided I limit the server's
                                    parallelism to the hardware's
                                    capabilities. That may be because I'm
                                    still prototyping the API and
                                    infrastructure for correctness, so
                                    even `TVar (HashMap k v)` performs
                                    okay for me at the moment, but only
                                    under low contention (surely there are
                                    plenty of CPU cycles to be optimized
                                    away in later steps). I model my data
                                    after a graph model, so most data, and
                                    even most indices, are localized to
                                    nodes and edges and can be manipulated
                                    without conflict. That's why I assumed
                                    from the very beginning that I have a
                                    low-contention use case, until I found
                                    there are still (though minor) needs
                                    for global indices to guarantee global
                                    uniqueness. I'm confident
                                    stm-containers/ttrie can be used to
                                    implement a more scalable global index
                                    data structure; thanks for the hint.</p>
                                  <p>So an evident solution that comes to
                                    mind now is to run the server with a
                                    pool of tx-processing threads matching
                                    the number of CPU cores; client RPC
                                    requests then get queued to be
                                    executed on some thread from the pool
                                    (a rough sketch follows below). But
                                    I'm really fond of the M:N scheduler,
                                    which solves massive/dynamic
                                    concurrency so elegantly. I had good
                                    results with Go in this regard, and
                                    see GHC on par in doing this, so I
                                    don't want to give up this enjoyable
                                    machinery.</p>
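                                  <p>A minimal sketch of that pool idea,
                                    under my own assumptions (the names
                                    `TxPool`, `startTxPool` and `submitTx`
                                    are made up, and error handling is
                                    omitted): at most N worker threads run
                                    transactions at any moment, while
                                    client connections keep their own
                                    lightweight threads and just queue
                                    jobs:</p>
                                  <pre>
import Control.Concurrent (forkIO, getNumCapabilities)
import Control.Concurrent.STM
import Control.Monad (forever, replicateM_)

newtype TxPool = TxPool (TQueue (IO ()))

-- one worker per capability, all draining the same job queue
startTxPool :: IO TxPool
startTxPool = do
  q <- newTQueueIO
  n <- getNumCapabilities
  replicateM_ n $ forkIO $ forever $ do
    job <- atomically (readTQueue q)
    job
  pure (TxPool q)

-- called from a client's lightweight thread; the job runs in the pool
submitTx :: TxPool -> IO a -> IO a
submitTx (TxPool q) act = do
  resultVar <- newEmptyTMVarIO
  atomically $ writeTQueue q (act >>= atomically . putTMVar resultVar)
  atomically (takeTMVar resultVar)
</pre>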
                                  <p>But looking at the STM implementation
                                    in GHC, it seems written TVars are
                                    exclusively locked during the commit
                                    of a tx. I suspect this is the culprit
                                    when a large number M of lightweight
                                    threads is scheduled onto a small
                                    number N of hardware capabilities: if
                                    a lightweight thread yields control
                                    during an STM transaction commit, the
                                    TVars it locked stay locked until it
                                    is scheduled again (and again) and can
                                    finish the commit. This way,
                                    descheduled threads could keep live
                                    threads from progressing. I haven't
                                    gone into more detail there, but I
                                    wonder whether GHC's RTS could be
                                    improved to keep an STM-committing
                                    thread from being descheduled, though
                                    seemingly that may introduce more
                                    starvation potential; or whether STM
                                    could be improved to make its TVar
                                    locks preemptible when the owning
                                    trec/thread is descheduled? Neither
                                    would be easy, but I'd really love to
                                    see massive numbers of lightweight
                                    threads doing STM well in practice.<br>
                                  </p>
                                  <p>Best regards,</p>
                                  <p>Compl</p>
                                  <p><br>
                                  </p>
                                  <div>On 2020/7/24 上午12:57, Christopher
                                    Allen wrote:<br>
                                  </div>
                                  <blockquote type="cite">
                                    <div dir="ltr">It seems like you
                                      know how to run practical tests
                                      for tuning thread count and
                                      contention for throughput. Part of
                                      the reason you haven't gotten a
                                      super clear answer is "it
                                      depends." You give up fairness
                                      when you use STM instead of MVars
                                      or equivalent structures. That
                                      means a long running transaction
                                      might get stampeded by many small
                                      ones invalidating it over and
                                      over. The long-running transaction
                                      might never clear if the small
                                      transactions keep moving the
                                      cheese. I mention this because
                                      transaction runtime and size and
                                      count all affect throughput and
                                      latency. What might be ideal for
                                      one pattern of work might not be
                                      ideal for another. Optimizing for
                                      overall throughput might make the
                                      contention and fairness problems
                                      worse too. I've done practical
                                      tests to optimize this in the
                                      past, both for STM in Haskell and
                                      for RDBMS workloads.
                                      <div><br>
                                      </div>
                                      <div>The next step is sometimes
                                        figuring out whether you really
                                        need a data structure within a
                                        single STM container or if
                                        perhaps you can break up your
                                        STM container boundaries into
                                        zones or regions that roughly
                                        map onto update boundaries. That
                                        should make the transactions
                                        churn less. On the outside
                                        chance you do need to touch more
                                        than one container in a
                                        transaction, well, they compose.
                                        <div><br>
                                        </div>
                                        <div>e.g. <a href="https://hackage.haskell.org/package/stm-containers" target="_blank">https://hackage.haskell.org/package/stm-containers</a></div>
                                        <div><a href="https://hackage.haskell.org/package/ttrie" target="_blank">https://hackage.haskell.org/package/ttrie</a><br>
                                        </div>
                                        <div><br>
                                        </div>
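                                        <div>A rough sketch of the zoning
                                          idea above (my own illustration,
                                          not code from either library):
                                          split one big STM container into
                                          fixed shards so that updates
                                          landing in different regions
                                          rarely share a TVar:</div>
                                        <pre>
import Control.Concurrent.STM
import Data.Hashable (Hashable, hash)
import qualified Data.HashMap.Strict as HM
import qualified Data.Vector as V

newtype Sharded k v = Sharded (V.Vector (TVar (HM.HashMap k v)))

newSharded :: Int -> IO (Sharded k v)
newSharded n = Sharded <$> V.replicateM n (newTVarIO HM.empty)

-- pick a shard by key hash; transactions on different shards don't conflict
shardFor :: Hashable k => Sharded k v -> k -> TVar (HM.HashMap k v)
shardFor (Sharded shards) k = shards V.! (hash k `mod` V.length shards)

insertS :: (Eq k, Hashable k) => Sharded k v -> k -> v -> STM ()
insertS s k v = modifyTVar' (shardFor s k) (HM.insert k v)
</pre>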
                                        <div>It also sounds a bit like
                                          your question bumps into
                                          Amdahl's Law.</div>
                                        <div><br>
                                        </div>
                                        <div>If all else fails, stop using
                                          STM and find something more
                                          tuned to your problem space.</div>
                                        <div><br>
                                        </div>
                                        <div>Hope this helps,</div>
                                        <div>Chris Allen</div>
                                        <div><br>
                                        </div>
                                      </div>
                                    </div>
                                    <br>
                                    <div class="gmail_quote">
                                      <div dir="ltr" class="gmail_attr">On
                                        Thu, Jul 23, 2020 at 9:53 AM
                                        YueCompl via Haskell-Cafe <<a href="mailto:haskell-cafe@haskell.org" target="_blank">haskell-cafe@haskell.org</a>>
                                        wrote:<br>
                                      </div>
                                      <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                                        <div>Hello Cafe,
                                          <div><br>
                                          </div>
                                          <div>I'm working on an
                                            in-memory database. In
                                            client/server mode I let each
                                            connected client submit remote
                                            procedure calls that run in
                                            their own dedicated
                                            lightweight threads, modifying
                                            TVars in RAM per their
                                            business needs. When many
                                            clients are connected
                                            concurrently and try to insert
                                            new data, and they trigger an
                                            update of a global index (some
                                            TVar), the throughput drops
                                            drastically. I reduced the
                                            shared state to a simple int
                                            counter in a TVar and got the
                                            same symptom, while the
                                            parallelism feels okay when
                                            there's no hot TVar conflict,
                                            or when M is not much greater
                                            than N.</div>
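                                          <div>A minimal sketch of that
                                            reduced counter case, as a
                                            hypothetical repro of my own
                                            (not my actual server code): M
                                            client threads, far more than
                                            the -N capability count, all
                                            bumping one shared TVar; build
                                            with -threaded and run with
                                            e.g. +RTS -N10.</div>
                                          <pre>
import Control.Concurrent (forkIO, newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM
import Control.Monad (forM_, replicateM, replicateM_)

main :: IO ()
main = do
  counter <- newTVarIO (0 :: Int)
  let clients = 200   :: Int    -- M, far above the capability count N
      txsEach = 10000 :: Int
  dones <- replicateM clients newEmptyMVar
  forM_ dones $ \d -> forkIO $ do
    replicateM_ txsEach $ atomically $ modifyTVar' counter (+ 1)
    putMVar d ()                -- signal this client is done
  mapM_ takeMVar dones
  readTVarIO counter >>= print  -- expect clients * txsEach
</pre>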
                                          <div><br>
                                          </div>
                                          <div>As an empirical test
                                            workload, I have a `+RTS
                                            -N10` server process. It
                                            handles 10 concurrent clients
                                            okay, getting ~5x the
                                            single-thread throughput; but
                                            when handling 20 concurrent
                                            clients, each of the 10 CPUs
                                            can only be driven to ~10%
                                            utilization, and the
                                            throughput seems even worse
                                            than single-threaded. More
                                            clients can even drive it to
                                            thrash without making much
                                            progress.</div>
                                          <div><br>
                                          </div>
                                          <div> Having read [1], I can
                                            understand that pure STM
                                            doesn't scale well, and I see
                                            that it points to [7] as
                                            attractive, planned future
                                            work in that direction.</div>
                                          <div><br>
                                          </div>
                                          <div>But I can't find concrete
                                            libraries or frameworks
                                            addressing
                                            large-M-over-small-N
                                            scenarios: [1] experimented
                                            with a designated degree N of
                                            parallelism, and [7] is rather
                                            theoretical relative to my
                                            empirical needs.</div>
                                          <div><br>
                                          </div>
                                          <div>Can you direct me to an
                                            available library implementing
                                            the methodology proposed in
                                            [7], or to other ways of
                                            tackling this problem?</div>
                                          <div><br>
                                          </div>
                                          <div>I think the most
                                            difficult part is that a
                                            transaction should commit
                                            with global indices (possibly
                                            with unique constraints)
                                            atomically updated, and roll
                                            back on any constraint
                                            violation, i.e. transactions
                                            have to cover global state
                                            such as indices. Other
                                            problems seem more trivial
                                            than this.</div>
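                                          <div>To make that requirement
                                            concrete, a tiny sketch of the
                                            kind of thing I mean (a
                                            hypothetical helper of my own,
                                            not from any library): the
                                            unique-index insert runs
                                            inside the same transaction as
                                            the rest of the update, so a
                                            duplicate key aborts
                                            everything together.</div>
                                          <pre>
import Control.Concurrent.STM
import Data.Hashable (Hashable)
import qualified Data.HashMap.Strict as HM

-- insert into a global unique index; a duplicate key aborts the whole tx
uniqueInsert :: (Eq k, Hashable k) => TVar (HM.HashMap k v) -> k -> v -> STM ()
uniqueInsert idx k v = do
  m <- readTVar idx
  case HM.lookup k m of
    Just _  -> throwSTM (userError "unique constraint violated")
    Nothing -> writeTVar idx (HM.insert k v m)
</pre>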
                                          <div><br>
                                          </div>
                                          <div>Specifically, [7] states:</div>
                                          <div><br>
                                          </div>
                                          <div>> It must be
                                            emphasized that all of the
                                            mechanisms we deploy
                                            originate, in one form or
                                            another, in the database
                                            literature from the 70s and
                                            80s. Our contribution is to
                                            adapt these techniques to
                                            software transactional
                                            memory, providing more
                                            effective solutions to
                                            important STM problems than
                                            prior proposals.</div>
                                          <div><br>
                                          </div>
                                          <div>I wonder whether any
                                            STM-based library has packaged
                                            those techniques so they can
                                            be composed right away? I
                                            don't really want to implement
                                            those mechanisms myself,
                                            rebuilding many wheels from
                                            scratch.</div>
                                          <div><br>
                                          </div>
                                          <div>Best regards,</div>
                                          <div>Compl</div>
                                          <div><br>
                                          </div>
                                          <div><br>
                                          </div>
                                          <div>[1] Comparing the
                                            performance of concurrent
                                            linked-list implementations
                                            in Haskell </div>
                                          <div><a href="https://simonmar.github.io/bib/papers/concurrent-data.pdf" target="_blank">https://simonmar.github.io/bib/papers/concurrent-data.pdf</a></div>
                                          <div><br>
                                          </div>
                                          <div>[7] M. Herlihy and E.
                                            Koskinen. Transactional
                                            boosting: a methodology for
                                            highly-concurrent
                                            transactional objects. In
                                            Proc. of PPoPP ’08, pages
                                            207–216. ACM Press, 2008.</div>
                                          <div><a href="https://www.cs.stevens.edu/~ejk/papers/boosting-ppopp08.pdf" target="_blank">https://www.cs.stevens.edu/~ejk/papers/boosting-ppopp08.pdf</a></div>
                                          <div><br>
                                          </div>
                                        </div>
</blockquote>
                                    </div>
                                    <br clear="all">
                                    <div><br>
                                    </div>
                                    -- <br>
                                    <div dir="ltr">
                                      <div dir="ltr">
                                        <div>
                                          <div dir="ltr">
                                            <div dir="ltr">Chris Allen<br>
                                              <div><span style="font-size:12.8px">Currently
                                                  working on </span><a href="http://haskellbook.com" target="_blank">http://haskellbook.com</a></div>
                                            </div>
                                          </div>
                                        </div>
                                      </div>
                                    </div>
                                  </blockquote>
                                </div>
</blockquote>
                            </div>
                          </div>
                        </blockquote>
                      </div>
                    </blockquote>
                  </div>
                </div>
              </blockquote>
            </div>
          </blockquote>
        </div>
      </blockquote>
      <br>
    </blockquote>
  </div>

_______________________________________________<br>
Haskell-Cafe mailing list<br>
To (un)subscribe, modify options or view archives go to:<br>
<a href="http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe" rel="noreferrer" target="_blank">http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe</a><br>
Only members subscribed via the mailman list are allowed to post.</blockquote></div>