Stable pointers and hash table performance

Thu Feb 7 09:39:18 CET 2013

The ticket contains vastly less information than Eyal's post, though.

Simon

| -----Original Message-----
| From: Edward Z. Yang [mailto:ezyang at MIT.EDU]
| Sent: 07 February 2013 08:34
| To: Simon Peyton-Jones
| Cc: Eyal Lotem; "ghc-devs at haskell.org"
| Subject: RE: Stable pointers and hash table performance
| 
| http://hackage.haskell.org/trac/ghc/ticket/7670
| 
| Excerpts from Simon Peyton-Jones's message of Thu Feb 07 00:30:21 -0800
| 2013:
| > Would it be worth turning this into a Trac ticket?
| >
| > Simon
| >
| > From: ghc-devs-bounces at haskell.org [mailto:ghc-devs-
| bounces at haskell.org] On Behalf Of Eyal Lotem
| > Sent: 07 February 2013 02:14
| > To: ghc-devs at haskell.org
| > Subject: Stable pointers and hash table performance
| >
| > Hey,
| >
| > Background
| > ----------------
| > I built a hash table in C:  https://github.com/Peaker/small_hash
| >
| > According to a few simple benchmarks (mainly, 10 million insertions),
| my C hash table is much faster than any Haskell data structure I tried
| (e.g: 8 times faster than IntMap).
| > You can run the C benchmark (in small_hash) via: "make && ./benchmark
| 10000000".
| >
| > The other benchmarks are at:
| https://github.com/Peaker/hashtable_benchmarks
| > Benchmarks results on my hardware: http://hpaste.org/81902
| >
| > I was hoping to get that fast mapping goodness to Haskell, by building
| an FFI to it:
| > https://github.com/Peaker/small_hash_hs
| >
| > In order to do FFI bindings to a C data structure, that can contain
| arbitrary Haskell values, I must create a lot of StablePtrs (for each
| Haskell value stored in the C hash table).
| >
| > Problem 1:
| > -------------
| > Disappointingly, the FFI bindings were dozens of times slower(!) than
| the C code.
| >
| > When using +RTS -H200M the speed is about 10 times slower(!) than the
| C code.
| > oprofile shows that the culprit is the Stable Ptr code.
| >
| > Digging in, I found that stable ptrs and stable names shared a table,
| despite having quite different characteristics:
| >  * Stable Names aren't considered roots by GC, ptrs are.
| >  * Making two stable ptrs for the same Haskell value doesn't really
| require them to unify -- so there's no need for the hash table or the
| "refs" counter for stable ptrs.
| >
| > My Changes:
| > -----------------
| > I split the Stable Names into their own table (which allowed removing
| the "refs" field from them).
| > Stable Ptrs no longer use a hash table to unify, don't need "refs", or
| "sn_obj", or "old".
| >
| > This made Stable Ptrs cost just 1 C ptr in the table and really cheap
| to create/destroy.
| >
| > The commits are here:
| > https://github.com/Peaker/ghc/commits/master
| >
| > They are not polished (I didn't thoroughly review myself yet, didn't
| run the test suite, fix out-of-date comments involved) and not ready for
| integration yet.
| >
| > Result:
| > ---------
| > The hash table benchmark via the FFI is now only ~2 times slower than
| the C data structure and much faster than any other mapping structure
| for that particular benchmark (10 million insertions of identity
| integers, sequentially). A lot of the cost here is due to fine-grained
| allocation, as the C code does bulk allocation. I shall improve this
| benchmark later, and I believe it will tighten the gap some more.
| >
| > Problem 2:
| > --------------
| > http://hackage.haskell.org/trac/ghc/ticket/7670
| >
| > Every minor GC iterates a whole lot of stable ptrs. This means that
| with the default heap size, runtime is about 8 times longer (than with a
| 200M heap size).
| >
| > To avoid this, I'd like to split the stable ptrs table into
| generations.
| >
| > To do this, I need to know which of the pointed objects were promoted
| and to what generation, so I can promote my stable ptrs accordingly.
| > It seems like the correct time to do this is in the
| updateStablePtrTable (now named "updateStableTables")
| >
| > This is the part I need help with.
| > How do I know if an object was promoted to a higher generation when I
| traverse the objects?
| >
| > Thanks!
| > Eyal