[GHC] #14980: Runtime performance regression with binary operations on vectors
GHC
ghc-devs at haskell.org
Wed Jun 27 10:48:27 UTC 2018
#14980: Runtime performance regression with binary operations on vectors
-------------------------------------+-------------------------------------
Reporter: ttylec | Owner: bgamari
Type: bug | Status: new
Priority: high | Milestone: 8.8.1
Component: Compiler | Version: 8.2.2
Resolution: | Keywords: vector bitwise operations
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: Runtime performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by ttylec):
Replying to [comment:20 tdammers]:
This is not totally "bad" behavior. This:
> {{{
> "Generated"
> benchmarking 64 columns/raw unbox vectors
> time 460.0 μs (447.0 μs .. 473.6 μs)
> 0.995 R² (0.992 R² .. 0.997 R²)
> mean 446.4 μs (440.2 μs .. 455.1 μs)
> std dev 24.42 μs (17.29 μs .. 31.22 μs)
> variance introduced by outliers: 48% (moderately inflated)
>
> benchmarking 64 columns/binary packed
> time 52.25 μs (51.66 μs .. 53.04 μs)
> 0.998 R² (0.997 R² .. 0.999 R²)
> mean 52.60 μs (51.99 μs .. 53.67 μs)
> std dev 2.665 μs (1.919 μs .. 4.073 μs)
> variance introduced by outliers: 55% (severely inflated)
> }}}
this is "good": we have a significant speedup. But this:
> {{{
> benchmarking 256 columns/raw unbox vectors
> time 439.9 μs (434.4 μs .. 447.4 μs)
> 0.998 R² (0.997 R² .. 1.000 R²)
> mean 439.0 μs (435.0 μs .. 446.4 μs)
> std dev 17.95 μs (10.79 μs .. 27.38 μs)
> variance introduced by outliers: 35% (moderately inflated)
>
> benchmarking 256 columns/binary packed
> time 304.7 μs (288.7 μs .. 330.4 μs)
> 0.965 R² (0.940 R² .. 0.998 R²)
> mean 324.1 μs (302.9 μs .. 364.4 μs)
> std dev 62.33 μs (26.19 μs .. 97.61 μs)
> variance introduced by outliers: 91% (severely inflated)
> }}}
is "bad". However, what I observed with the full code of our project is
that the speed-up is lost once we exceed a certain number of columns... but
that number is platform-specific (AMD performs worst, Intel is usually good,
but then the i5 CPU in a MacBook Pro seems to do better than the i7 in a
Lenovo running Ubuntu).
But the fact that, on the same platform, you get different results for 256
columns while still having a speedup with 64 columns makes me wonder: can
the system kernel and/or libraries be affecting that?
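To put rough numbers on "good" and "bad" above, here is a minimal sketch that computes the speedup of the packed variant from the criterion output quoted earlier. It uses only the `time` point estimates (not the means or confidence intervals); the names `timings_us` and `speedup` are made up for this illustration and are not part of the benchmark code.

```python
# Criterion "time" point estimates in microseconds, copied from the
# benchmark output quoted above. Names here are illustrative only.
timings_us = {
    64: {"raw unbox vectors": 460.0, "binary packed": 52.25},
    256: {"raw unbox vectors": 439.9, "binary packed": 304.7},
}


def speedup(raw_us: float, packed_us: float) -> float:
    """Return how many times faster the packed variant is than the raw one."""
    return raw_us / packed_us


# Speedup of "binary packed" over "raw unbox vectors", keyed by column count.
speedups = {
    columns: speedup(t["raw unbox vectors"], t["binary packed"])
    for columns, t in timings_us.items()
}

for columns, ratio in speedups.items():
    print(f"{columns} columns: binary packed is {ratio:.1f}x faster")
```

On those figures the packed version is roughly 8.8x faster at 64 columns but only about 1.4x faster at 256 columns. Given the 91% outlier-inflated variance reported for the 256-column packed run, the second ratio should be treated as approximate.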
As for me not being able to reproduce my original report: I did try
removing `~/.stack` and `~/.stack-work`, and I tried both with and without
`split-objs` in the stack config. I still can't get "bad" results on my
current hardware/software.
I am using only stack; I don't have a system-wide GHC.
I will try to run more tests on a different machine (with Debian 9), and on
the MacBook at home too.
But to sum up what we know so far: libraries are ruled out, and the compiler
version seems to be ruled out too. What's left? The GHC binary package, the
OS kernel, system libs? Does any of that make sense?
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14980#comment:22>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler