[GHC] #8885: Add inline versions of clone array primops

GHC ghc-devs at haskell.org
Thu Mar 13 22:51:47 UTC 2014


#8885: Add inline versions of clone array primops
-------------------------------------+------------------------------------
        Reporter:  tibbe             |            Owner:  simonmar
            Type:  feature request   |           Status:  patch
        Priority:  normal            |        Milestone:
       Component:  Compiler          |          Version:  7.9
      Resolution:                    |         Keywords:
Operating System:  Unknown/Multiple  |     Architecture:  Unknown/Multiple
 Type of failure:  None/Unknown      |       Difficulty:  Unknown
       Test Case:                    |       Blocked By:
        Blocking:                    |  Related Tickets:
-------------------------------------+------------------------------------

Comment (by tibbe):

 There's not much point in comparing the new inline version vs the old,
 incorrect version. Fixing the old, incorrect version just to get the
 benchmark numbers doesn't seem worth it, as we'd have to replicate
 `MAYBE_GC` in `StgCmmPrim`. Instead I compare the new inline version
 against the new out-of-line version (which calls `allocate`). The inline
 version is 69% faster (0.08s vs. 0.26s total time).
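
 For readers unfamiliar with the primops in question, here is a minimal
 sketch of what a call to one of them (`cloneArray#` from `GHC.Exts`) looks
 like. This is not the benchmark program, which is not shown in this
 comment. The `Arr` wrapper and the helpers `replicateArr`, `cloneArr`,
 `indexArr` and `lengthArr` are hypothetical names used only for
 illustration.

 {{{
 {-# LANGUAGE MagicHash, UnboxedTuples #-}
 module Main where

 import GHC.Exts
 import GHC.ST (ST (..), runST)

 -- Hypothetical boxed wrapper around the unlifted Array# type.
 data Arr a = Arr (Array# a)

 -- Build an array holding n copies of x.
 replicateArr :: Int -> a -> Arr a
 replicateArr (I# n) x = runST (ST (\s0 ->
   case newArray# n x s0 of
     (# s1, marr #) -> case unsafeFreezeArray# marr s1 of
       (# s2, arr #) -> (# s2, Arr arr #)))

 -- Copy len elements starting at offset off into a fresh array.
 -- This is the kind of call whose code generation the ticket is about.
 cloneArr :: Arr a -> Int -> Int -> Arr a
 cloneArr (Arr arr) (I# off) (I# len) = Arr (cloneArray# arr off len)

 indexArr :: Arr a -> Int -> a
 indexArr (Arr arr) (I# i) = case indexArray# arr i of (# x #) -> x

 lengthArr :: Arr a -> Int
 lengthArr (Arr arr) = I# (sizeofArray# arr)

 main :: IO ()
 main = do
   let src  = replicateArr 16 'x'
       copy = cloneArr src 4 8  -- 8 elements starting at offset 4
   print (lengthArr copy)
   print (indexArr copy 0)
 }}}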

 Here are the `+RTS -s` numbers for the new out-of-line version:

 {{{
    1,600,041,120 bytes allocated in the heap
           57,992 bytes copied during GC
           35,992 bytes maximum residency (1 sample(s))
           21,352 bytes maximum slop
                1 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
   Gen  0      3173 colls,     0 par    0.01s    0.01s     0.0000s    0.0000s
   Gen  1         1 colls,     0 par    0.00s    0.00s     0.0002s    0.0002s

   INIT    time    0.00s  (  0.00s elapsed)
   MUT     time    0.25s  (  0.25s elapsed)
   GC      time    0.01s  (  0.01s elapsed)
   EXIT    time    0.00s  (  0.00s elapsed)
   Total   time    0.26s  (  0.26s elapsed)

   %GC     time       2.1%  (2.9% elapsed)

   Alloc rate    6,417,285,798 bytes per MUT second

   Productivity  97.9% of total user, 95.6% of total elapsed
 }}}

 And for the inline version:

 {{{
    1,600,041,120 bytes allocated in the heap
           57,224 bytes copied during GC
           35,992 bytes maximum residency (1 sample(s))
           21,352 bytes maximum slop
                1 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
   Gen  0      3125 colls,     0 par    0.00s    0.01s     0.0000s    0.0000s
   Gen  1         1 colls,     0 par    0.00s    0.00s     0.0002s    0.0002s

   INIT    time    0.00s  (  0.00s elapsed)
   MUT     time    0.08s  (  0.08s elapsed)
   GC      time    0.00s  (  0.01s elapsed)
   EXIT    time    0.00s  (  0.00s elapsed)
   Total   time    0.08s  (  0.09s elapsed)

   %GC     time       6.1%  (8.2% elapsed)

   Alloc rate    20,999,017,271 bytes per MUT second

   Productivity  93.9% of total user, 89.2% of total elapsed
 }}}
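
 As a rough check of the 69% figure, the sketch below recomputes it from
 the rounded `Total time` values in the two reports above. The function
 names are made up for illustration, and the figure may originally have
 been derived from unrounded timings, so treat this as a sanity check only.

 {{{
 module Main where

 -- Fraction of running time saved when going from old to new (seconds).
 timeSaved :: Double -> Double -> Double
 timeSaved old new = (old - new) / old

 -- How many times faster the new version is than the old one.
 speedupFactor :: Double -> Double -> Double
 speedupFactor old new = old / new

 main :: IO ()
 main = do
   let outOfLine = 0.26  -- Total time, out-of-line version
       inline    = 0.08  -- Total time, inline version
       saved     = round (100 * timeSaved outOfLine inline) :: Int
   putStrLn ("time saved: " ++ show saved ++ "%")
   putStrLn ("speedup factor: " ++ show (speedupFactor outOfLine inline))
 }}}

 With these inputs the time saved rounds to 69%, which corresponds to the
 inline version running a little over three times as fast.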

 You can see that the GC issue has been fixed: both versions now perform a
 comparable number of Gen 0 collections (3173 vs. 3125).

 I've attached updated versions of my patches that address the `MAYBE_GC`
 issue, which was also present in my new out-of-line implementation.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8885#comment:8>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

