Stable pointers and hash table performance
Simon Marlow
marlowsd at gmail.com
Thu Feb 7 11:24:24 CET 2013
I've replied on the ticket, which is just about the bad interaction
between StablePtrs and generational GC. We should probably have a
separate ticket to track Eyal's patches to separate StablePtrs from
StableNames, but maybe he wants to work on it for a while longer before
submitting them for review. (I support the idea in principle)
Cheers,
Simon
On 07/02/13 08:39, Simon Peyton-Jones wrote:
> The ticket contains vastly less information than Eyal's post, though.
>
> Simon
>
> | -----Original Message-----
> | From: Edward Z. Yang [mailto:ezyang at MIT.EDU]
> | Sent: 07 February 2013 08:34
> | To: Simon Peyton-Jones
> | Cc: Eyal Lotem; "ghc-devs at haskell.org"
> | Subject: RE: Stable pointers and hash table performance
> |
> | http://hackage.haskell.org/trac/ghc/ticket/7670
> |
> | Excerpts from Simon Peyton-Jones's message of Thu Feb 07 00:30:21 -0800
> | 2013:
> | > Would it be worth turning this into a Trac ticket?
> | >
> | > Simon
> | >
> | > From: ghc-devs-bounces at haskell.org [mailto:ghc-devs-
> | bounces at haskell.org] On Behalf Of Eyal Lotem
> | > Sent: 07 February 2013 02:14
> | > To: ghc-devs at haskell.org
> | > Subject: Stable pointers and hash table performance
> | >
> | > Hey,
> | >
> | > Background
> | > ----------------
> | > I built a hash table in C: https://github.com/Peaker/small_hash
> | >
> | > According to a few simple benchmarks (mainly, 10 million insertions),
> | my C hash table is much faster than any Haskell data structure I tried
> | (e.g: 8 times faster than IntMap).
> | > You can run the C benchmark (in small_hash) via: "make && ./benchmark
> | 10000000".
> | >
> | > The other benchmarks are at:
> | https://github.com/Peaker/hashtable_benchmarks
> | > Benchmarks results on my hardware: http://hpaste.org/81902
> | >
> | > I was hoping to get that fast mapping goodness to Haskell, by building
> | an FFI to it:
> | > https://github.com/Peaker/small_hash_hs
> | >
> | > In order to do FFI bindings to a C data structure, that can contain
> | arbitrary Haskell values, I must create a lot of StablePtrs (for each
> | Haskell value stored in the C hash table).
> | >
> | > Problem 1:
> | > -------------
> | > Disappointingly, the FFI bindings were dozens of times slower(!) than
> | the C code.
> | >
> | > When using +RTS -H200M the speed is about 10 times slower(!) than the
> | C code.
> | > oprofile shows that the culprit is the Stable Ptr code.
> | >
> | > Digging in, I found that stable ptrs and stable names shared a table,
> | despite having quite different characteristics:
> | > * Stable Names aren't considered roots by GC, ptrs are.
> | > * Making two stable ptrs for the same Haskell value doesn't really
> | require them to unify -- so there's no need for the hash table or the
> | "refs" counter for stable ptrs.
> | >
> | > My Changes:
> | > -----------------
> | > I split the Stable Names into their own table (which allowed removing
> | the "refs" field from them).
> | > Stable Ptrs no longer use a hash table to unify, don't need "refs", or
> | "sn_obj", or "old".
> | >
> | > This made Stable Ptrs cost just 1 C ptr in the table and really cheap
> | to create/destroy.
> | >
> | > The commits are here:
> | > https://github.com/Peaker/ghc/commits/master
> | >
> | > They are not polished (I didn't thoroughly review myself yet, didn't
> | run the test suite, fix out-of-date comments involved) and not ready for
> | integration yet.
> | >
> | > Result:
> | > ---------
> | > The hash table benchmark via the FFI is now only ~2 times slower than
> | the C data structure and much faster than any other mapping structure
> | for that particular benchmark (10 million insertions of identity
> | integers, sequentially). A lot of the cost here is due to fine-grained
> | allocation, as the C code does bulk allocation. I shall improve this
> | benchmark later, and I believe it will tighten the gap some more.
> | >
> | > Problem 2:
> | > --------------
> | > http://hackage.haskell.org/trac/ghc/ticket/7670
> | >
> | > Every minor GC iterates a whole lot of stable ptrs. This means that
> | with the default heap size, runtime is about 8 times longer (than with a
> | 200M heap size).
> | >
> | > To avoid this, I'd like to split the stable ptrs table into
> | generations.
> | >
> | > To do this, I need to know which of the pointed objects were promoted
> | and to what generation, so I can promote my stable ptrs accordingly.
> | > It seems like the correct time to do this is in the
> | updateStablePtrTable (now named "updateStableTables")
> | >
> | > This is the part I need help with.
> | > How do I know if an object was promoted to a higher generation when I
> | traverse the objects?
> | >
> | > Thanks!
> | > Eyal
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://www.haskell.org/mailman/listinfo/ghc-devs
>
More information about the ghc-devs
mailing list