[GHC] #15418: Performance drop 60 times on non-profiling binary

GHC ghc-devs at haskell.org
Thu Aug 30 06:26:28 UTC 2018


#15418: Performance drop 60 times on non-profiling binary
-------------------------------------+-------------------------------------
        Reporter:  hth313            |                Owner:  (none)
            Type:  bug               |               Status:  infoneeded
        Priority:  high              |            Milestone:  8.8.1
       Component:  Runtime System    |              Version:  8.4.3
      Resolution:                    |             Keywords:
Operating System:  MacOS X           |         Architecture:  x86_64
 Type of failure:  Runtime           |  (amd64)
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:  #14414, #9599     |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by osa1):

 OK, so the profiling build is faster than any of these. Because we can't
 see the source, you'll have to debug this yourself. Here's what I'd do next:

 - Add `-ddump-simpl -ddump-to-file -dsuppress-uniques` to `ghc-options` in
   your .cabal file and build without profiling. Copy the generated
   .dump-simpl files to another directory. Then build again, this time with
   profiling (make sure to use the same optimisation settings in both!), and
   copy the .dump-simpl files to a second directory. Do a directory diff
   (perhaps using kdiff3) and look at the differences. You should see lots
   of minor changes (like the extra `scc` expressions in the profiled
   version); those should not matter, so focus on the larger changes.
   Judging by the results in comment:13, you should find some code in the
   non-profiled version that allocates more. One example where this happens
   is when GHC unboxes strict arguments/fields, but not throughout the whole
   program, so the boxed version of the value is still needed somewhere. In
   that case the value gets re-boxed at some point, causing more allocation.
   There are other possible reasons too; it's hard to list them all.

 - Looking at the numbers in comment:13, the profiling version seems to have
   lower max residency. Perhaps in the non-profiled version some closures
   are floated to the top level, causing increased residency. Keep this in
   mind when comparing the Core outputs.
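
 To make the two patterns above easier to spot in a Core diff, here is a
 small sketch. It is not taken from the reporter's code: `record`,
 `runningSums` and `lookupSquare` are made-up names, and whether GHC
 actually unboxes or floats here depends on the GHC version and on the
 optimisation flags, so treat it as an illustration of what to look for
 rather than a reproduction of this ticket.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

-- Hypothetical lazy consumer: it stores its argument in a list, so it
-- needs an ordinary boxed Int. NOINLINE keeps the call visible in Core.
record :: Int -> [Int] -> [Int]
record x acc = x : acc
{-# NOINLINE record #-}

-- Pattern 1 (re-boxing): `acc` is strict, so with -O GHC may pass it
-- unboxed (as an Int#) to the worker function. `record` still wants a
-- boxed Int, so the worker then has to allocate a fresh `I#` box on each
-- iteration. In the dump this would show up as `I# ...` inside the worker.
runningSums :: Int -> [Int] -> Int -> [Int]
runningSums !acc out 0 = record acc out
runningSums !acc out k = runningSums (acc + k) (record acc out) (k - 1)

-- Pattern 2 (floating): `table` does not depend on `n`, so with -O the
-- full-laziness pass may lift it out of the function to the top level.
-- Once evaluated, such a top-level value can stay alive between calls,
-- which raises residency compared to rebuilding it on every call.
lookupSquare :: Int -> Int
lookupSquare n = table !! n
  where
    table = [x * x | x <- [0 .. 9999]]

main :: IO ()
main = do
  print (runningSums 0 [] 3)  -- prints [6,5,3,0]
  print (lookupSquare 12)     -- prints 144
```

 Compiling this once with and once without profiling, using the dump
 flags from the first item, gives two small .dump-simpl files to practise
 the comparison on before diffing the real code base.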

 Failing all that, try to extract a minimal reproducer from your code base
 and share it :-)

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15418#comment:19>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

