[GHC] #8971: Native Code Generator 7.8.1 RC2 is not as optimized as 7.6.3...

Fri Apr 25 05:38:43 UTC 2014

#8971: Native Code Generator 7.8.1 RC2 is not as optimized as 7.6.3...
--------------------------------------------+------------------------------
        Reporter:  GordonBGood              |            Owner:
            Type:  bug                      |           Status:  new
        Priority:  normal                   |        Milestone:
       Component:  Compiler (NCG)           |          Version:  7.8.1-rc2
      Resolution:                           |         Keywords:
Operating System:  Unknown/Multiple         |     Architecture:
 Type of failure:  Runtime performance bug  |  Unknown/Multiple
       Test Case:                           |       Difficulty:  Unknown
        Blocking:                           |       Blocked By:
                                            |  Related Tickets:
--------------------------------------------+------------------------------

Comment (by jstolarek):

 I know nothing about 7.6 Cmm generation, but I worked on the 7.8 Cmm
 pipeline this summer so I can offer some guidance. First I'd like to clarify a few
 things:

 > both NCG and LLVM will share the same C-- output

 No, they will not. The LLVM backend requires "proc-point splitting". This
 means we need to turn every Cmm block that is a successor of more than one
 block into a separate procedure (at least that is my understanding). See
 [https://github.com/ghc/ghc/blob/f8e12e2b396e0c475e1403ab8ac3fc4d63c1681e/compiler/cmm/CmmPipeline.hs#L104
 here] and
 [https://github.com/ghc/ghc/blob/f8e12e2b396e0c475e1403ab8ac3fc4d63c1681e/compiler/cmm/CmmPipeline.hs#L77
 here] to see the source of the differences between the Cmm generated for
 the two backends.
 [https://github.com/ghc/ghc/blob/f8e12e2b396e0c475e1403ab8ac3fc4d63c1681e/compiler/cmm/CmmProcPoint.hs#L35
 Here] you'll find more on proc-points.
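
 To make the "successor of more than one block" criterion concrete, here is
 a minimal sketch on a toy control-flow graph. It only illustrates the
 simplified description given above; it is not the algorithm in
 CmmProcPoint.hs, and the names `Label`, `CFG`, `predecessorCounts`,
 `joinPoints` and `exampleCFG` are hypothetical and not part of GHC.

```haskell
import Data.List (nub)
import qualified Data.Map.Strict as Map

-- Hypothetical toy control-flow graph (not GHC's Cmm representation):
-- each block label maps to the labels of its successor blocks.
type Label = String
type CFG = Map.Map Label [Label]

-- Count how many distinct blocks jump to each label.
-- 'nub' ensures a block that lists the same successor twice counts once.
predecessorCounts :: CFG -> Map.Map Label Int
predecessorCounts cfg =
  Map.fromListWith (+)
    [ (target, 1)
    | (_, succs) <- Map.toList cfg
    , target <- nub succs
    ]

-- Blocks that are a successor of more than one block. Under the simplified
-- reading above, these are the candidates for becoming separate procedures.
joinPoints :: CFG -> [Label]
joinPoints = Map.keys . Map.filter (> 1) . predecessorCounts

-- A diamond-shaped graph: both branches of a conditional meet at "join".
exampleCFG :: CFG
exampleCFG = Map.fromList
  [ ("entry", ["then", "else"])
  , ("then",  ["join"])
  , ("else",  ["join"])
  , ("join",  [])
  ]

main :: IO ()
main = print (joinPoints exampleCFG)
```

 In this diamond-shaped example only "join" has two predecessors, so it is
 the only block the sketch reports.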

 > straight cmm dump has regressed to have lost even the basic
 optimizations that were there with the older version.

 I'm not sure what the philosophy behind 7.6 Cmm generation was, but in 7.8
 we just generate Cmm from STG in the simplest possible way and then
 optimize that Cmm. This is similar to generating Core from Haskell and
 then doing a series of core-to-core transformations. So you need to look
 at the final Cmm, not the one that comes out of the STG->Cmm pass.

 > It may be that the NCG is using the non-optimized version of CMM as a
 source

 It is using the optimized version. You can see for yourself in the
 [https://github.com/ghc/ghc/blob/f8e12e2b396e0c475e1403ab8ac3fc4d63c1681e/compiler/main/HscMain.hs#L1237
 tryNewCodeGen function] and its
 [https://github.com/ghc/ghc/blob/f8e12e2b396e0c475e1403ab8ac3fc4d63c1681e/compiler/main/HscMain.hs#L1159
 call site].

 >  Anticipating your next request to look at the STG output

 Bad anticipation :-) Look at the generated assembly. Only that can tell
 you what the real difference between the generated code is. I wouldn't be
 surprised to see something along the lines of #8048.

 > Thus, the bug/regression appears to be go further back than just the new
 NCG (which is likely using the non-optimized CMM code as input) but also
 to the CMM code generator in that it is producing much less efficient CMM
 code.

 Let me repeat: a) NCG is using optimized Cmm (look at the code); b) don't
 look at the un-optimized Cmm - it's irrelevant.

 Finally, if you want to learn more about the Cmm pipeline and Cmm
 debugging, see [wiki:Commentary/Compiler/CodeGen].

 Hope that helps.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8971#comment:11>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler