[GHC] #14208: Performance with O0 is much better than the default or with -O2, runghc performs the best

GHC ghc-devs at haskell.org
Wed Mar 28 07:44:35 UTC 2018


#14208: Performance with O0 is much better than the default or with -O2, runghc
performs the best
-------------------------------------+-------------------------------------
        Reporter:  harendra          |                Owner:  osa1
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  8.2.1
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:  Unknown/Multiple
 Type of failure:  Runtime           |            Test Case:
  performance bug                    |
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by simonpj):

 > If I change the optimization flags to -O0 for benchmark stanza in cabal
 file I can get close to ghci performance.

 That contradicts what Omer found in comment:27.

 Nevertheless, if what you say is true, it'd be easier to debug with -O0
 than GHCi (which brings the bytecode generator into the picture).

 > GHCi is 6x faster than my regular compiled code

 This is totally bonkers and we MUST find out what is happening :-).

 I suggest not getting diverted into speculation about CPS.  We have a
 repro case; let's just dig into it and find out what is going on.

 My suggestions:

 * In comment:31, does the same thing happen with -O0 vs -O, or only with
 GHCi vs -O?

 * In all repros, do the huge differences also show up in the
 bytes-allocated numbers?  (If so, we don't need the Criterion apparatus.)

 * I notice that in comment:27, in the 2-module case, comparing -O0 and
 -O1:
   * Allocation is about halved in -O1
   * But runtime actually increases

   That is most peculiar.

 * Matthew says in comment:34 "I can reproduce this..".  That's great.  But
 what is "this" precisely?  Which version of GHC?  What timing data?  What
 happened to allocation and GC numbers?

 Somehow a 6x increase in execution time ought not to be hard to find!
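
 As a rough illustration of collecting allocation and GC numbers without
 Criterion, here is a minimal sketch that reads the RTS counters around a
 workload.  It assumes GHC 8.2 or later (for GHC.Stats.getRTSStats) and
 that the program is run with +RTS -T.  The `workload` function is a
 hypothetical stand-in, not the code from the ticket's repro.

```haskell
import Control.Exception (evaluate)
import Data.List (foldl')
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)
import System.Mem (performMinorGC)

-- Hypothetical stand-in for the benchmark body; substitute the repro's code.
workload :: Int -> Int
workload n = foldl' (+) 0 [1 .. n]

-- Snapshot the RTS counters. Some counters are only refreshed when a GC
-- finishes, so force a minor collection before reading them.
snapshot :: IO RTSStats
snapshot = performMinorGC >> getRTSStats

main :: IO ()
main = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS statistics are off; rerun with +RTS -T"
    else do
      before <- snapshot
      _ <- evaluate (workload 10000000)
      after <- snapshot
      -- Difference of one counter between the two snapshots.
      let delta :: Num a => (RTSStats -> a) -> a
          delta field = field after - field before
      putStrLn ("bytes allocated: " ++ show (delta allocated_bytes))
      -- The GC count includes the minor GC forced by the second snapshot.
      putStrLn ("GCs:             " ++ show (delta gcs))
      putStrLn ("mutator CPU ns:  " ++ show (delta mutator_cpu_ns))
      putStrLn ("GC CPU ns:       " ++ show (delta gc_cpu_ns))
```

 Building the same file once with -O0 and once with -O (linking with
 -rtsopts may be needed for the RTS to accept -T) and comparing the
 printed numbers gives a quick check of whether allocation tracks the
 runtime difference.  Running an unmodified binary with +RTS -s reports
 similar totals for the whole program.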

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14208#comment:38>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

