[GHC] #14644: Improve cmm/assembly for pattern matches with two constants.

GHC ghc-devs at haskell.org
Sun Jan 14 11:31:42 UTC 2018


#14644: Improve cmm/assembly for pattern matches with two constants.
-------------------------------------+-------------------------------------
        Reporter:  AndreasK          |                Owner:  AndreasK
            Type:  task              |               Status:  patch
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  8.2.2
  (CodeGen)                          |             Keywords:  Codegen, CMM,
      Resolution:                    |  Patterns, Pattern Matching
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):  Phab:D4294
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by AndreasK):

 Replying to [comment:3 svenpanne]:

 > Additional things to consider: Performance in tight loops is often
 vastly different, because branch prediction/caching will most likely kick
 in visibly. Correctly predicted branches will cost you almost nothing,
 while unknown/incorrectly predicted branches will be much more costly. In
 the absence of more information from their branch predictor, quite a few
 processors assume that backward branches are taken and forward branches
 are assumed to be not taken. So code layout has a non-trivial performance
 impact.

 I went over Agner's guide and it seems like this is only the case for
 Netburst CPUs, the last of which was released in 2001 so I'm not too
 worried about these. And even if you have one of these, according to Agner:

 > It is rarely worth the effort to take static prediction into account.
 Almost any branch that is executed sufficiently often for its timing to
 have any significant effect is likely to stay in the BTB so that only the
 dynamic prediction counts.

 All other architectures he lists default to not taken if they use static
 prediction at all.


 ----

 What might help explain the difference is that jumps not taken should be
 faster than taken jumps on both modern Intel and AMD CPUs.

 If someone wants to dig deeper, Agner probably has enough info in the
 guides to explain the change completely based on the assembly generated.
 But I don't think that is necessary.
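
 For readers without the ticket at hand, here is a minimal sketch of the
 kind of source the title refers to: a match on two constants plus a
 default. The function names and the constants are hypothetical and chosen
 only for illustration. The second definition spells the match out as two
 equality tests in sequence, which is one way to picture why the default
 case can be reached without any "equal" jump being taken. It does not
 claim to show the Cmm or assembly GHC actually emits.

```haskell
-- Hypothetical example; names and constants are illustrative only.

-- A pattern match on two constants with a default case.
classify :: Int -> Int
classify 1 = 111
classify 7 = 222
classify _ = 333

-- The same function written as two equality tests tried in order.
-- When neither test succeeds, the default is reached by falling
-- through both guards.
classifyTests :: Int -> Int
classifyTests x
  | x == 1    = 111
  | x == 7    = 222
  | otherwise = 333
```

 Both definitions compute the same results. To see what a given GHC
 version really generates for such a function, compile with -ddump-cmm or
 -ddump-asm and compare the output.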

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14644#comment:11>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

