Measuring compiler performance

Simon Jakobi simon.jakobi at googlemail.com
Wed Apr 8 15:14:50 UTC 2020


Many thanks, Richard, Andreas, Joachim, and Ben, for your responses! I
have a few things to try now. :)

>     * what I call the "Cabal test"; namely:
>
>         $ _build/stage1/bin/ghc -O -ilibraries/Cabal/Cabal \
>           libraries/Cabal/Cabal/Setup.hs +RTS -s

Thanks for spelling it out like that, Ben! I'm slightly embarrassed to
say that I hadn't been aware that I could use GHC directly in this way
to build a package!

Andreas, you wrote:

> In general I only compile, as linking adds overhead which isn't really part of GHC.

How do I tell GHC to build e.g. nofib/spectral/simple/Main.hs or Cabal
without linking?
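
(Skimming the users guide, I suspect the answer is -c for a single
module, or -no-link in --make mode:

    $ _build/stage1/bin/ghc -O -c nofib/spectral/simple/Main.hs +RTS -s
    $ _build/stage1/bin/ghc -O -ilibraries/Cabal/Cabal \
      libraries/Cabal/Cabal/Setup.hs -no-link +RTS -s

Please correct me if I've misread the docs!)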

I'll eventually try to distill a wiki page from all this!

Cheers,
Simon

>
>  * My WIP nofib branch [1] makes nofib much faster and easier to work
>    with and adds the ability to measure perf counters, in addition to
>    the usual RTS and cachegrind statistics.
>
>  * My nofib branch produces output in a uniform, easy to consume format
>    and provides a tool for comparing sets of measurements in this format.
>
>  * My ghc_perf tool [2] is very useful for extracting runtime and perf
>    statistics from Haskell program runs; furthermore, it produces output
>    in the same format as expected by the aforementioned nofib-compare
>    utility (see the usage sketch after the links below).
>
>  * I have a utility [3] which I use to reproducibly build a set of
>    branches, run the testsuite, nofib, and the Cabal test on each of
>    them. Admittedly it could use a bit of cleanup but it does its job
>    reasonably well, making performance measurement a "set it and forget
>    it" sort of task.
>
>  * We collect and record a complete set of testsuite statistics (saved
>    to git notes [4]; see the fetch example after the links below);
>    however, we currently do not import these into gipeda.
>
>  * We don't currently have a box which can measure reliable timings
>    (since our builders are nearly all virtualised cloud instances). I'm
>    going to need to do some shuffling to change this.
>
>  * One potentially useful source of performance information (which sadly
>    we currently do not exploit) is the -ddump-timing output produced
>    during head.hackage runs (see the example after the links below).
>
> [1] https://gitlab.haskell.org/ghc/nofib/merge_requests/24
> [2] https://gitlab.haskell.org/bgamari/ghc-utils/blob/master/ghc_perf.py
> [3] https://gitlab.haskell.org/bgamari/ghc-utils/-/tree/master/build-all
> [4] https://gitlab.haskell.org/ghc/ghc/-/wikis/building/running-tests/performance-tests
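>
> For instance, a ghc_perf run might look roughly like this (a
> hypothetical invocation; check ghc_perf.py's --help for its actual
> interface):
>
>     $ ghc_perf.py -o before.json \
>         _build/stage1/bin/ghc -O -c nofib/spectral/simple/Main.hs
>
> producing a file that nofib-compare can consume alongside others.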
>
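> To look at the recorded statistics locally, the notes need to be
> fetched first; from memory (the precise refs are documented on the
> wiki page [4]) this is roughly:
>
>     $ git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git \
>         refs/notes/perf:refs/notes/ci/perf
>     $ git notes --ref=ci/perf show HEAD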
>
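> For example, any compiler invocation can be made to print per-phase
> time and allocation figures:
>
>     $ _build/stage1/bin/ghc -O -c -ddump-timing \
>         nofib/spectral/simple/Main.hs
>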
> > A problem in this context is that reliable performance measurements
> > require a quiet machine. Closing my browser and turning off other
> > programs is – in my perception – rather inconvenient, particularly
> > when I have to do it for a prolonged time.
> >
> > Ideally I wouldn't have to perform these measurements on my local
> > machine at all! Do you usually use a separate machine for this? _Very_
> > convenient would be some kind of bot whom I could tell e.g.
> >
>
> Indeed it is inconvenient. I am in the lucky situation that I have
> another machine locally that can be made reasonably quiet without
> interfering with my workflow. However, in general …
>
> > @perf-bot compiler perf
> >
> > …or more concretely
> >
> > @perf-bot compile nofib/spectral/simple/Main.hs
> >
> > …or just
> >
> > @nofib-bot run
> >
> > … or something like that.
> >
> > I've noticed that CI now includes a perf-nofib job. But since it
> > appears to run on a different machine each time, I'm not sure whether
> > it's actually useful for comparing performance. Could it be made more
> > useful by running it consistently on the same dedicated machine?
> >
> Indeed, we currently don't have a dedicated machine for timings.
> However, allocations and executable sizes are still useful.
>
> Nevertheless, as noted above I think that we should make more of an
> effort to measure time. I need to do some shuffling of our runners so we
> have a quiet bare-metal machine which can be dedicated to performance
> measurement. I'll try to get to this in the next day or so.
>
> > Another question regarding performing compiler perf measurements
> > locally is which build flavour to use: So far I have used the "perf"
> > flavour. A problem here is that a full build seems to take close to an
> > hour. A rebuild with --freeze1 takes ~15 minutes on my machine. Is
> > this the right flavour to use?
> >
> I think perf is the best option for performance measurement (after all,
> we want to know what users would see). However, it is indeed a bit
> painful.
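>
> Concretely, that would be something like:
>
>     $ hadrian/build -j --flavour=perf
>
> and, once a baseline build exists, adding --freeze1 avoids rebuilding
> the stage-1 compiler on each iteration:
>
>     $ hadrian/build -j --flavour=perf --freeze1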
>
> > BTW what's the purpose of the profiled GHC modules built with this
> > flavour, which just seem to additionally prolong compile time? I don't
> > see a ghc-prof binary or similar in _build/stage1/bin.
> >
> Indeed; there is little sense in building profiled modules just for
> performance measurement. However, I don't believe we currently have a
> build flavour which provides comparable optimisation but without the
> profiled way. Perhaps we should add one.
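>
> A sketch (untested, field and helper names from memory): custom
> flavours can be defined in hadrian/src/UserSettings.hs, e.g. a perf
> variant without the profiled library ways:
>
>     -- hadrian/src/UserSettings.hs
>     userFlavours :: [Flavour]
>     userFlavours = [perfNoProf]
>
>     -- The perf flavour minus the profiled library ways.
>     perfNoProf :: Flavour
>     perfNoProf = performanceFlavour
>         { name        = "perf-noprof"
>         , libraryWays = remove [profiling] (libraryWays performanceFlavour)
>         }
>
> which should then be selectable with --flavour=perf-noprof.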
>
> > Also, what's the status of gipeda? The most recent commit at
> > https://perf.haskell.org/ghc/ is from "about a year ago"?
> >
> Indeed the machine which was previously providing gipeda builds is sadly
> no longer around; consequently it's on ice at the moment. I would like
> to get it going again but recently correctness issues have been taking
> up more time than I would like to admit.
>
> > Sorry for this load of questions and complaints! I do believe though
> > that if work on compiler performance were a bit better documented and
> > more convenient, we might see even more progress on that front. :)
> >
> Quite alright! Typing out the points above made me realize that there is
> indeed quite a bit of knowledge that the wiki leaves unsaid.
>
> Cheers,
>
> - Ben
>

