[GHC] #16004: Vector performance regression in GHC 8.6

GHC ghc-devs at haskell.org
Thu Dec 6 18:09:35 UTC 2018


#16004: Vector performance regression in GHC 8.6
-------------------------------------+-------------------------------------
        Reporter:  guibou            |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:  8.6.3
       Component:  Compiler          |              Version:  8.6.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by AndreasK):

 Reproduced with 8.4.3 and 8.6.1


 For the original code, compiling with -fno-full-laziness makes performance
 almost the same for 8.4 and 8.6; what little difference remains probably
 comes from a different branch order at the Cmm level.


 {{{
 $ ~/bench-exe.exe ./test-8.6-nofloat.exe -- ./test-8.4.exe
 benchmarking execute: ./test-8.6-nofloat.exe
 time                 1.632 s    (1.352 s .. 1.859 s)
                      0.997 R²   (0.988 R² .. 1.000 R²)
 mean                 1.629 s    (1.600 s .. 1.658 s)
 std dev              49.03 ms   (0.0 s .. 49.85 ms)
 variance introduced by outliers: 19% (moderately inflated)

 benchmarking execute: ./test-8.4.exe
 time                 1.646 s    (1.493 s .. 1.863 s)
                      0.998 R²   (0.994 R² .. NaN R²)
 mean                 1.597 s    (1.560 s .. 1.622 s)
 std dev              37.88 ms   (0.0 s .. 43.65 ms)
 variance introduced by outliers: 19% (moderately inflated)

 }}}

 The difference with full-laziness enabled seems to be that we float out
 the creation of the [0..n] list, instead of transforming the code into a
 simple loop as intended.

 So we end up with this inner loop that passes around the list explicitly.
 I assume deforestation fails here?

 {{{
 joinrec {
   go2_s7mc
   go2_s7mc ds3_X6W3 eta2_X3r
     = case ds3_X6W3 of {
         [] -> jump exit2_XF eta2_X3r;
         : y3_X6Yn ys2_X6Yq ->
           case readDoubleArray#
                   (ipv1_a6zy `cast` <Co:50>)
                   (+# (*# x1_a5F9 1000#) x_a69v)
                   (eta2_X3r `cast` <Co:14>)
           of
           { (# ipv4_X6am, ipv5_X6ao #) ->
           case y3_X6Yn of { I# y4_X6c8 ->
           case writeDoubleArray#
                   (ipv1_a6zy `cast` <Co:50>)
                   (+# (*# x1_a5F9 1000#) y4_X6c8)
                   ipv5_X6ao
                   ipv4_X6am
           of s'#1_X6b3
           { __DEFAULT ->
           jump go2_s7mc ys2_X6Yq (s'#1_X6b3 `cast` <Co:13>)
           }
           }
           }
       }; }
 }}}
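
 For readers who want to poke at this, here is a minimal sketch of the kind
 of source that depends on the [0..n] list being fused away, next to a
 hand-written counting loop that never builds a list. This is not the
 reporter's code: the names (Grid, copyColumnList, copyColumnLoop, loopTo,
 viaList, viaLoop) are hypothetical, it uses STUArray from the array package
 rather than vector, and the index arithmetic only loosely mirrors the Core
 above. Whether the list version actually regresses will depend on the GHC
 version and flags.

```haskell
{-# LANGUAGE BangPatterns #-}

import Control.Monad (forM_)
import Control.Monad.ST (ST)
import Data.Array.ST (STUArray, newListArray, readArray, writeArray, runSTUArray)
import Data.Array.Unboxed (UArray, elems)

-- Hypothetical n*n grid of doubles stored row-major in a flat array.
type Grid s = STUArray s Int Double

-- List-driven version: relies on foldr/build fusion to turn [0 .. n - 1]
-- into a counting loop. If the list is floated out and shared instead,
-- the inner loop has to walk its cons cells.
copyColumnList :: Int -> Int -> Grid s -> ST s ()
copyColumnList n x arr =
  forM_ [0 .. n - 1] $ \row ->
    forM_ [0 .. n - 1] $ \y -> do
      v <- readArray arr (row * n + x)
      writeArray arr (row * n + y) v

-- Explicit counting loop: there is no list for full laziness to float out.
loopTo :: Int -> (Int -> ST s ()) -> ST s ()
loopTo n body = go 0
  where
    go !i
      | i >= n = pure ()
      | otherwise = body i >> go (i + 1)

-- Same kernel as copyColumnList, written with loopTo.
copyColumnLoop :: Int -> Int -> Grid s -> ST s ()
copyColumnLoop n x arr =
  loopTo n $ \row ->
    loopTo n $ \y -> do
      v <- readArray arr (row * n + x)
      writeArray arr (row * n + y) v

-- Run each kernel on a grid whose cell i initially holds the value i.
viaList, viaLoop :: Int -> Int -> UArray Int Double
viaList n x = runSTUArray (do
  arr <- newListArray (0, n * n - 1) (map fromIntegral [0 .. n * n - 1])
  copyColumnList n x arr
  pure arr)
viaLoop n x = runSTUArray (do
  arr <- newListArray (0, n * n - 1) (map fromIntegral [0 .. n * n - 1])
  copyColumnLoop n x arr
  pure arr)

main :: IO ()
main = do
  print (elems (viaList 4 1))
  print (viaList 4 1 == viaLoop 4 1)
```

 Compiling with -O2 -ddump-simpl -dsuppress-all, with and without
 -fno-full-laziness, should show whether the inner loop of copyColumnList
 still scrutinises a list ([] / (:)) as in the Core above or has become a
 loop over an unboxed Int#.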

 I'm not an expert in the deforestation machinery, so I'm leaving that for
 someone else.

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/16004#comment:4>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

