[Haskell-cafe] STM friendly TreeMap (or similar with range scan api) ? WAS: Best ways to achieve throughput, for large M:N ratio of STM threads, with hot TVar updates?

YueCompl compl.yue at icloud.com
Thu Jul 30 08:27:27 UTC 2020


Jo,

I have some updates wrt nonmoving GC in another post to the list just now. 

And per my understanding, GHC's GC doesn't seek free segments within a heap, it instead will copy all live objects to a new heap after then swap the new heap to be the live one, so I assume memory (address space) fragmentation doesn't make much trouble for a GHC process, as for other runtimes.

I suspect the difficulty resides in the detection of circular/cyclic circumstances wrt live data structures within the old heap, especially the circles form with arbitrary number of pointers of indirection. If the GC has to perform some dict lookup to decide if an object has been copied to new heap, that's O(n*log(n)) complexity in best case, where n is number of live objects in the heap.

To efficiently copy circular structures, one optimization I can imagine is to have a `new ptr` field in every heap object, then in copying another object with a pointer to one object, the `new ptr` can be read out and if not nil, assign the pointer field on another object' in the new heap to that value and it's done; or copy one object' to the new heap, and update the field on one object in the old heap pointing to the new heap. But I don't know details of GHC GC and can't imagine even feasibility of this technique. And even the new nonmoving GC may have similar difficulty to jump out of a circle when following pointers.

Regards,
Compl


> On 2020-07-30, at 15:24, Joachim Durchholz <jo at durchholz.org> wrote:
> 
> Am 30.07.20 um 07:31 schrieb Compl Yue via Haskell-Cafe:
>> And now I realize what presuring GC in my situation is not only the large number of pointers (TVars), and at the same time, they form many circular structures, that might be nightmare for a GC.
> 
> Cycles are relevant only for reference-counting collectors.
> As far as I understand http://downloads.haskell.org/~ghc/latest/docs/html/users_guide/runtime_control.html, GHC offers only tracing collectors, and cycles are irrelevant there.
> 
>> I'm still curious why the new non-moving GC in 8.10.1 still don't get obvious business progressing in my situation. I tested it on my Mac yesterday and there I don't know how to see how CPU time is distributed over threads within a process, I'll further test it with some Linux boxes to try understand it better.
> 
> Hmm... can GHC's memory management fragment?
> If that's the case, you may be seeing GC trying to find free blocks in fragmented memory, and having to re-run the GC cycle to free a block so there's enough contiguous memory.
> It's a bit of a stretch, but it can happen, and testing that hypothesis would be relatively quick: Run the program with moving GC, observe running time and if it's still slow, check if the GC is actually eating CPU, or if it's merely waiting for other threads to respond to the stop-the-world signal.
> 
> Regards,
> Jo
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.



More information about the Haskell-Cafe mailing list