Compile times: Average vs worst case runtime

Andreas Klebinger klebinger.andreas at gmx.at
Thu Aug 30 17:08:25 UTC 2018


Hello Devs,

I developed a patch improving GHC's code layout during GSoC: 
https://ghc.haskell.org/trac/ghc/ticket/15124
The gains are pretty nice, with most library benchmarks showing 
improvements in the 1-2% range or more.

Even compile times went down when comparing HEAD against a stage2 
compiler built with, and using, my patch!
Overall this makes compiling nofib 0.5-1% faster, depending on the 
exact setup.

Now the bad news:
In some cases the algorithm has bad big-O behaviour. In practice this 
seems to happen for code containing things like case expressions with 
200+ alternatives, where these cases also make up most of the code 
being compiled.

The worst case I saw was a doubled compile time for a module that 
consisted only of a 500-deep if/else chain selecting a value.
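
For illustration, the problematic shape is roughly the following (a 
hypothetical sketch, not the actual module I measured): one large 
function whose body is a single deep if/else chain, which is the same 
thing as a case with hundreds of alternatives, each branch merely 
returning a constant.

    -- Sketch of the pathological shape (imagine ~500 branches instead of 3).
    module DeepChain (select) where

    select :: Int -> Int
    select n =
      if n == 0 then 0
      else if n == 1 then 1
      else if n == 2 then 2
      -- ... hundreds more branches of the same form ...
      else -1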

While there are some small optimizations possible to improve on this, I 
don't expect to make these edge cases much faster overall.
However, it will also be possible to fall back to the old behavior to 
avoid the long compile times for modules known to trigger this
edge case.

Which brings me to my main question: when should we use the new code layout?
1. Always, since on average both compile time and runtime are faster.
2. At -O1/-O2, since users often use these when performance matters.
3. At -O2 only, since users expect compile times to blow up with it.
4. Never, as users will experience compile time slowdowns when they hit 
these edge cases.

I would prefer 2, 3, 1, 4, in that order, but I wonder what the wider 
community thinks.

Cheers
Andreas