<div dir="ltr"><div>I am admittedly unsure how GHC's optimisation benchmarks are currently implemented and carried out, but I think this paper and its findings could be relevant to GHC devs:</div><div><br></div><a href="http://cis.upenn.edu/~cis501/papers/producing-wrong-data.pdf">http://cis.upenn.edu/~cis501/papers/producing-wrong-data.pdf</a><br><div><br></div><div>Basically, according to this paper, the total size of the environment variables changes where the stack starts, and the resulting cache effects are huge for many compiler benchmarks. Adjusting for this effect shows that gcc -O3 is in fact only 1% faster than gcc -O2.</div><div><br></div><div>Some further thoughts, per <a href="http://aftermath.rocks/2016/04/11/wrong_data/">http://aftermath.rocks/2016/04/11/wrong_data/</a> :</div><div><br></div><div>"The question they looked at was the following: does the compiler’s -O3 optimization flag result in speedups over -O2? This question is investigated in the light of measurement biases caused by two sources: Unix environment size, and linking order.</div><div>[Unix environment size refers] to the total size of the representation of Unix environment variables (such as PATH, HOME, etc.). Typically, these variables are part of the memory image of each process. The call stack begins where the environment ends. This gives rise to the following hypothesis: changing the sizes of (unused!) environment variables can change the alignment of variables on the stack and thus the performance of the program under test due to different behavior of hardware buffers such as caches or TLBs. (This is the source of the hypothetical example in the first paragraph, which I made up. On the machine where I am typing this, my user name appears in 12 of the environment variables that are set by default. All other things being equal, another user with a user name of a different length will have an environment size that differs by a multiple of 12 bytes.)"</div><div><br></div><div>"So does this hypothesis hold? 
Yes. Using a simple computational kernel the authors observe that changing the size of the environment can often cause a slowdown of 33% and, in one particular case, by 300%. On larger benchmarks the effects are less pronounced but still present. Using the C programs from the standard SPEC CPU2006 benchmark suite, the effects of -O2 and -O3 optimizations were compared across a wide range of environment sizes. For several of the programs a wide range of variations was observed, and the results often included both positive and negative observations. The effects were not correlated with the environment size. All this means that for some benchmarks, a compiler engineer might by accident test a purported optimization in a lucky environment and observe a 10% speedup, while users of the same optimization in an unlucky environment may have a 10% slowdown on the same workload."</div><div><br></div><div>I write this out of curiosity, as well as concern, over how this may affect GHC.</div></div>
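<div><br></div><div>To make the concern concrete, here is a minimal sketch of how one might check a benchmark for this bias: run the same command under environments of different sizes and compare the timings. It is only an illustration, not the paper's setup. The variable name BENCH_ENV_PADDING, both function names and the placeholder workload are my own assumptions, and wall-clock timing of a whole process is itself a noisy measure.</div>

```python
import os
import statistics
import subprocess
import sys
import time


def time_with_env_padding(cmd, pad_bytes, repeats=5):
    """Run cmd `repeats` times with an extra, unused environment variable
    of pad_bytes bytes; return the wall-clock times in seconds."""
    env = dict(os.environ)
    # Hypothetical variable name; the program under test never reads it.
    env["BENCH_ENV_PADDING"] = "x" * pad_bytes
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, env=env, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def sweep_env_sizes(cmd, pad_sizes, repeats=5):
    """Map each padding size to the median wall-clock time of cmd."""
    return {pad: statistics.median(time_with_env_padding(cmd, pad, repeats))
            for pad in pad_sizes}


if __name__ == "__main__":
    # Placeholder workload; substitute the benchmark binary under test.
    cmd = [sys.executable, "-c", "sum(range(100000))"]
    for pad, median in sweep_env_sizes(cmd, range(0, 2048, 512)).items():
        print(f"padding={pad:5d} bytes  median={median:.4f}s")
```

<div>If the medians for a real benchmark binary vary with the padding by more than the speedup being claimed for an optimisation, that would suggest the measurement is sensitive to the kind of bias described above.</div>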