GHC perf

Simon Peyton Jones simonpj at microsoft.com
Thu Jan 23 12:31:19 UTC 2020


Thanks

This information is a bit spread out over the wiki page.

Which wiki page?   Yes, it'd be fantastic to write this out clearly.  Thanks!


$ git checkout a12b34c56 && git submodule update --init

$ ./hadrian/build.sh test --only-perf

$ git checkout x98y76z54 && git submodule update --init

$ ./hadrian/build.sh test --only-perf

$ python3 testsuite/driver/perf_notes.py --chart chart.html --test-env local a12b34c56 x98y76z54

$ firefox chart.html

Ah.  Now I'm lost.  Somehow the second and fourth lines must be recording info, locally in my tree, but as two distinct batches of information.   Perhaps kept distinct by the current commit?  Where is the info actually stored?

OK, suppose I start from commit XX, and make some local changes.   Then I do the --only-perf thing.  Presumably that'll be recorded tagged with XX.  That's fine; I just want it to be clear.  Worth adding this info to the wiki page, so we have a clear mental model.

Thanks

Simon



From: David Eichmann <davide at well-typed.com>
Sent: 23 January 2020 11:19
To: Simon Peyton Jones <simonpj at microsoft.com>
Subject: Re: GHC perf


Simon

  *   This compares two local builds

Yes

  *   It does not require fetching CI perf data; in fact it is 100% independent of the CI system

Yes

  *   It does require two separate build trees (that is fine)

No, this does not require different build trees: <baseline> and <target> are git commits (or similar, e.g. a branch name). The actual process might look like:

$ git checkout a12b34c56 && git submodule update --init

$ ./hadrian/build.sh test --only-perf

$ git checkout x98y76z54 && git submodule update --init

$ ./hadrian/build.sh test --only-perf

$ python3 testsuite/driver/perf_notes.py --chart chart.html --test-env local a12b34c56 x98y76z54

$ firefox chart.html
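
As a rough sketch, the same sequence could be generated for any pair of commits. The function name `perf_compare_plan` and the print-then-review approach are illustrative assumptions, not part of the GHC tree; the commands it prints are exactly the ones listed above.

```shell
#!/usr/bin/env bash
# Sketch: print (do not run) the local perf-comparison workflow for two commits.
# perf_compare_plan is a hypothetical helper, not something shipped with GHC.
#   $1: baseline commit or branch
#   $2: target commit or branch
#   $3: chart output file (defaults to chart.html)
perf_compare_plan() {
    local baseline="$1" target="$2" chart="${3:-chart.html}"
    local commit
    # Same two steps for each commit: check it out, then run only the perf tests.
    for commit in "$baseline" "$target"; do
        printf 'git checkout %s && git submodule update --init\n' "$commit"
        printf './hadrian/build.sh test --only-perf\n'
    done
    # Finally chart the locally recorded metrics for both commits.
    printf 'python3 testsuite/driver/perf_notes.py --chart %s --test-env local %s %s\n' \
        "$chart" "$baseline" "$target"
}

# Usage, from the root of a GHC checkout: review the plan, then run it.
#   perf_compare_plan a12b34c56 x98y76z54
#   perf_compare_plan a12b34c56 x98y76z54 | bash -e
```

Printing the plan first keeps the checkout steps visible, which matters because each checkout changes the working tree.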

This information is a bit spread out over the wiki page. Perhaps a "quick start" section describing this use case would be helpful.
On 1/22/20 10:54 AM, Simon Peyton Jones wrote:
David

Thanks.   Concerning this:

  1.  Check out the <baseline> commit.
  2.  Use `git status` to double check git sees a clean working tree.
  3.  Run the performance tests.
  4.  Check out your <target> branch.
  5.  Use `git status` to double check git sees a clean working tree (else commit any changes)
  6.  Run the performance tests.
  7.  Compare metrics (filtering for `local` metrics and outputting a chart):

            python3 testsuite/driver/perf_notes.py --chart chart.html --test-env local <baseline> <target>
I believe that

  *   This compares two local builds
  *   It does not require fetching CI perf data; in fact it is 100% independent of the CI system
  *   It does require two separate build trees (that is fine)

Is that right?  If so, two questions

  *   In that Python command line (step 7) is "<baseline>" the path to the root of the baseline tree, or to some file within that tree?
  *   Is this process (and what it does) written up on some wiki page somewhere?  Where? Rather than replying to me individually, it'd be better to use this conversation to produce better guidance for everyone.
Thanks

Simon


From: David Eichmann <davide at well-typed.com>
Sent: 20 January 2020 10:37
To: Simon Peyton Jones <simonpj at microsoft.com>; Ben Gamari <ben at well-typed.com>
Cc: ghc-devs <ghc-devs at haskell.org>
Subject: Re: GHC perf

Hi Simon,

  *   There are two things going on:

     *   CI perf measurements
     *   Local machine perf measurements

I think that they are somehow handled differently (why?) but they are all muddled up on the wiki page.

They are handled differently because we do not want to compare local metrics with CI metrics. The exception is when local metrics don't exist; then we fall back to CI metrics as a baseline (see How baseline metrics are calculated <https://gitlab.haskell.org/ghc/ghc/wikis/building/running-tests/performance-tests#how-baselines-are-calculated>).

  *   My goal is this:

     *   Start with a master commit, say from Dec 2019.
     *   Implement some change, on a branch.
     *   sh validate --legacy (or something else if you like)
     *   Look at perf regressions.
Getting to the *raw data* should be easy:

  1.  Check out the <baseline> commit.
  2.  Use `git status` to double check git sees a clean working tree.
  3.  Run the performance tests.
  4.  Check out your <target> branch.
  5.  Use `git status` to double check git sees a clean working tree (else commit any changes)
  6.  Run the performance tests.
  7.  Compare metrics (filtering for `local` metrics and outputting a chart):

            python3 testsuite/driver/perf_notes.py --chart chart.html --test-env local <baseline> <target>
See `python3 testsuite/driver/perf_notes.py --help` for more filtering options. This doesn't detect regressions automatically; it only shows you the raw data. Ideally we'd add an option to the test runner to let you specify a baseline commit manually. I suspect that would be close to what you're looking for.
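
Steps 2 and 5 (the `git status` check) can be scripted. A minimal sketch, assuming that untracked files should also count as "not clean"; the helper name `require_clean_tree` is hypothetical.

```shell
# Hypothetical helper: succeed only if git sees a clean working tree.
# Returns 0 when clean, 1 when there are changes, 2 when git status itself fails
# (for example when run outside a git repository).
require_clean_tree() {
    local changes
    # --porcelain gives stable, script-friendly output: one line per changed
    # or untracked path, and nothing at all when the tree is clean.
    changes="$(git status --porcelain)" || return 2
    if [ -n "$changes" ]; then
        echo "working tree is not clean; commit or stash these first:" >&2
        printf '%s\n' "$changes" >&2
        return 1
    fi
}

# Usage, before each perf run:
#   require_clean_tree && ./hadrian/build.sh test --only-perf
```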



  *   I believe I have first to utter the incantation
$ git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/ci/perf
Yes, this fetches the latest CI metrics into your git notes.
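
Because the fetch refspec above writes the CI metrics to `refs/notes/ci/perf`, they can be inspected with plain git. A minimal sketch: the helper name `show_perf_note` is hypothetical, and the idea that local runs live under the note ref `perf` is an assumption that should be checked against `testsuite/driver/perf_notes.py`.

```shell
# Hypothetical helper: print the raw perf note attached to a commit.
#   $1: notes ref, e.g. ci/perf (CI metrics, per the fetch refspec above)
#       or perf (ASSUMED ref for local runs; verify in perf_notes.py)
#   $2: commit-ish, defaults to HEAD
show_perf_note() {
    local ref="$1" commit="${2:-HEAD}"
    # git prefixes refs/notes/ to a bare ref name, so ci/perf means refs/notes/ci/perf.
    git notes --ref="$ref" show "$commit"
}

# Usage, from a GHC checkout:
#   git for-each-ref refs/notes/     # list the note refs that exist locally
#   show_perf_note ci/perf HEAD~2    # fails if that commit has no CI note
```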



  *   But then:

     *   How do I ensure that the baseline perf numbers I get relate to the master commit I started from, back in Dec 2019?  I don't want numbers from Jan 2020.
see above.



     *   If I rebase my branch on top of HEAD, say, how do I update the perf baseline numbers to be for HEAD
The test runner should use HEAD's metrics automatically (see How baseline metrics are calculated <https://gitlab.haskell.org/ghc/ghc/wikis/building/running-tests/performance-tests#how-baselines-are-calculated>), though you will need to fetch CI metrics or run the perf tests locally on HEAD to get the relevant metrics.



     *   Generally, how can I tell the commit to which the baseline numbers relate?
The test runner will output (per test) which baseline commit is used, e.g. "... from local baseline @ HEAD~2" says the baseline was a local run from 2 commits ago.

  *   Also, in my tree I have a series of incremental changes; I want to see if any of them have perf regressions.    How do I do that?

You can run the perf tests on each commit *in commit order*, and the previous commit will always be used as the baseline. You can also then chart the results:

            python3 testsuite/driver/perf_notes.py --chart chart.html --test-env local <oldestCommit>..<newestCommit>
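
Running the tests on each commit in order could be scripted along these lines. This is only a sketch: `perf_series_plan` is a hypothetical helper that prints the per-commit commands rather than running them, and it expects the commit list oldest first.

```shell
# Hypothetical helper: read commit hashes (oldest first) on stdin and print the
# checkout + perf-test commands for each one, in that order.
perf_series_plan() {
    local commit
    while IFS= read -r commit; do
        [ -n "$commit" ] || continue    # skip blank lines
        printf 'git checkout %s && git submodule update --init\n' "$commit"
        printf './hadrian/build.sh test --only-perf\n'
    done
}

# Usage, from a GHC checkout. Note that A..B excludes A itself, so run the
# perf tests on <oldestCommit> separately if it should serve as the first baseline.
#   git rev-list --reverse <oldestCommit>..<newestCommit> | perf_series_plan
#   git rev-list --reverse <oldestCommit>..<newestCommit> | perf_series_plan | bash -e
```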

Sorry if this is a bit suboptimal, but I hope that helps.

- David E







--

David Eichmann, Haskell Consultant

Well-Typed LLP, http://www.well-typed.com



Registered in England & Wales, OC335890

118 Wymering Mansions, Wymering Road, London W9 2NF, England
