[GHC] #12893: Profiling defeats stream fusion when using vector library

GHC ghc-devs at haskell.org
Mon Nov 28 23:18:29 UTC 2016


#12893: Profiling defeats stream fusion when using vector library
-------------------------------------+-------------------------------------
           Reporter:  newhoggy       |             Owner:
               Type:  bug            |            Status:  new
           Priority:  normal         |         Milestone:
          Component:  Compiler       |           Version:  8.0.1
           Keywords:  stream-fusion  |  Operating System:  MacOS X
  profiling                          |
       Architecture:  x86_64         |   Type of failure:  Runtime
  (amd64)                            |  performance bug
          Test Case:                 |        Blocked By:
           Blocking:                 |   Related Tickets:
Differential Rev(s):                 |         Wiki Page:
-------------------------------------+-------------------------------------
 I have this problem where enabling profiling in Haskell undoes stream
 fusion and I’d like to do my profiling with stream fusion intact.

 The following reproduces my problem:

 {{{
 $ git clone git at github.com:haskell-works/hw-tutorial-performance.git
 $ cd hw-tutorial-performance/
 $ stack init
 $ stack build --executable-profiling
 $ time $(find .stack-work/dist/ -type f -name hw-tutorial-performance-
 rwhe) 1e7
 5000000.5

 real    0m4.432s
 user    0m3.484s
 sys    0m0.936s

 $ rm -rf .stack-work/
 $ stack build
 $ time $(find .stack-work/dist/ -type f -name hw-tutorial-performance-
 rwhe) 1e7
 5000000.5

 real    0m0.034s
 user    0m0.018s
 sys    0m0.013s
 With profiling, my program takes over 4 seconds to run. Without profiling,
 it only takes 0.034 seconds.
 }}}

 The problem has nothing to do with stack.


 {{{
 (java-1.8.0 ghc-7.10.3) ✔ ~/wrk/haskell-works/hw-tutorial-performance/hw-
 tutorial-performance-rwhe [master|✚ 1]
 08:12 $ ghc -fforce-recomp --make -O2 -funbox-strict-fields -rtsopts
 -optc-O2 Main.hs
 [1 of 1] Compiling Main             ( Main.hs, Main.o )
 Linking Main ...
 (java-1.8.0 ghc-7.10.3) ✔ ~/wrk/haskell-works/hw-tutorial-performance/hw-
 tutorial-performance-rwhe [master|✚ 1]
 08:12 $ time ./Main +RTS -sstderr -RTS 1e7
 5000000.5
          115,888 bytes allocated in the heap
            3,480 bytes copied during GC
           44,384 bytes maximum residency (1 sample(s))
           17,056 bytes maximum slop
                1 MB total memory in use (0 MB lost due to fragmentation)

                                      Tot time (elapsed)  Avg pause  Max
 pause
   Gen  0         0 colls,     0 par    0.000s   0.000s     0.0000s
 0.0000s
   Gen  1         1 colls,     0 par    0.000s   0.005s     0.0049s
 0.0049s

   INIT    time    0.000s  (  0.002s elapsed)
   MUT     time    0.010s  (  0.025s elapsed)
   GC      time    0.000s  (  0.005s elapsed)
   EXIT    time    0.000s  (  0.005s elapsed)
   Total   time    0.012s  (  0.037s elapsed)

   %GC     time       1.8%  (13.1% elapsed)

   Alloc rate    11,904,262 bytes per MUT second

   Productivity  97.1% of total user, 31.2% of total elapsed


 real    0m0.057s
 user    0m0.012s
 sys     0m0.010s
 (java-1.8.0 ghc-7.10.3) ✔ ~/wrk/haskell-works/hw-tutorial-performance/hw-
 tutorial-performance-rwhe [master|✚ 1]
 08:12 $ ghc -prof -fforce-recomp --make -O2 -funbox-strict-fields -rtsopts
 -optc-O2 Main.hs
 [1 of 1] Compiling Main             ( Main.hs, Main.o )
 Linking Main ...
 (java-1.8.0 ghc-7.10.3) ✔ ~/wrk/haskell-works/hw-tutorial-performance/hw-
 tutorial-performance-rwhe [master|✚ 1]
 08:12 $ time ./Main +RTS -sstderr -RTS 1e7
 5000000.5
    3,012,099,000 bytes allocated in the heap
    4,079,698,408 bytes copied during GC
    1,162,199,352 bytes maximum residency (13 sample(s))
       19,313,368 bytes maximum slop
             1990 MB total memory in use (0 MB lost due to fragmentation)

                                      Tot time (elapsed)  Avg pause  Max
 pause
   Gen  0      4869 colls,     0 par    1.046s   1.055s     0.0002s
 0.0017s
   Gen  1        13 colls,     0 par    1.230s   1.960s     0.1508s
 0.5764s

   INIT    time    0.000s  (  0.004s elapsed)
   MUT     time    0.963s  (  0.825s elapsed)
   GC      time    2.276s  (  3.016s elapsed)
   RP      time    0.000s  (  0.000s elapsed)
   PROF    time    0.000s  (  0.000s elapsed)
   EXIT    time    0.009s  (  0.185s elapsed)
   Total   time    3.251s  (  4.029s elapsed)

   %GC     time      70.0%  (74.8% elapsed)

   Alloc rate    3,126,345,026 bytes per MUT second

   Productivity  30.0% of total user, 24.2% of total elapsed


 real    0m4.197s
 user    0m3.251s
 sys     0m0.883s
 }}}

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/12893>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list