<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p>Yes, I think the counterpoint of "automating what Ben does" so
      that people besides Ben can do it is very important. In this case,
      I think a good thing we could do is asynchronously build more of
      master post-merge, for example using the perf stats to
      automatically bisect anything that looks fishy, including within
      Marge bot roll-ups, whose individual commits wouldn't be built by
      the regular workflow anyway. <br>

    </p>

    <p>I also agree with Sebastian that the overfit, overly synthetic
      nature of our current tests, plus the sketchy way we have ignored
      drift, makes the current approach worth abandoning in any event.
      The gold standard must include tests of larger, "real world" code,
      which unfortunately takes longer to build; I think that is another
      point in favor of this asynchronous approach. We trade MR latency
      for stat latency, but we make better use of our build machines and
      get better stats, and when a human sets out to fix something a few
      days later, they have a much better foundation for their
      investigation.</p>

    <p>Finally, I agree with SPJ that for fairness and sustainability's
      sake, the people investigating issues after the fact should
      ideally be the MR authors, and definitely not Ben. I hope that
      better stats, nice-looking graphs, and maybe a system to
      automatically ping MR authors will make perf debugging much more
      accessible, enabling that goal.<br>

    </p>

    <p>John<br>

    </p>

    <div class="moz-cite-prefix">On 3/17/21 9:47 AM, Sebastian Graf

      wrote:<br>

    </div>

    <blockquote type="cite"

cite="mid:CAAS+=P8U6PCqaN7ZC-15rJeTjp5rV-VQReKp_2tL6pCcmeEmWw@mail.gmail.com">

      <meta http-equiv="content-type" content="text/html; charset=UTF-8">

      <div dir="ltr">

        <div>Re: Performance drift: I opened <a

            href="https://gitlab.haskell.org/ghc/ghc/-/issues/17658"

            moz-do-not-send="true">https://gitlab.haskell.org/ghc/ghc/-/issues/17658</a>

          a while ago with an idea of how to measure drift a bit better.</div>

        <div>It's basically an automatically checked version of "Ben

          stares at performance reports every two weeks and sees that

          T9872 has regressed by 10% since 9.0"</div>

        <div><br>

        </div>

        <div>Maybe we can have Marge check for drift and each individual

          MR for incremental perf regressions?<br>

        </div>

        <br>

        <div>Sebastian<br>

        </div>

      </div>

      <br>

      <div class="gmail_quote">

        <div dir="ltr" class="gmail_attr">On Wed, Mar 17, 2021 at
          14:40, Richard Eisenberg &lt;<a
            href="mailto:rae@richarde.dev" moz-do-not-send="true">rae@richarde.dev</a>&gt; wrote:<br>

        </div>

        <blockquote class="gmail_quote" style="margin:0px 0px 0px

          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

          <div style="overflow-wrap: break-word;"><br>

            <div><br>

              <blockquote type="cite">

                <div>On Mar 17, 2021, at 6:18 AM, Moritz Angermann &lt;<a
                    href="mailto:moritz.angermann@gmail.com"
                    target="_blank" moz-do-not-send="true">moritz.angermann@gmail.com</a>&gt;

                  wrote:</div>

                <br>

                <div><span

style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none;float:none;display:inline">But

                    what do we expect of patch authors? Right now, if
                    five people write patches to GHC, and each of them
                    eventually manages to get their MR green after a
                    long review, they finally see it assigned to Marge,
                    and then it starts failing? Their patch on its own
                    was fine, but the aggregate with other people's
                    code leads to regressions? So we now expect all
                    patch authors together to try to figure out what
                    happened? Figuring out why something regressed is
                    hard enough, and we have only a very few people who
                    are actually capable of debugging this. Thus I
                    believe it would end up with Ben, Andreas, Matthiew,
                    Simon, ... or someone else from GHC HQ anyway to
                    figure out why it regressed, be it in the review
                    stage, when dissecting a Marge aggregate, or on
                    master.</span></div>

              </blockquote>

            </div>

            <br>

            <div>I have previously posted against the idea of allowing

              Marge to accept regressions... but the paragraph above is

              sadly convincing. Maybe Simon is right about opening up

              the windows to, say, 100% (which would catch a 10x

              regression) instead of infinite, but I'm now convinced

              that Marge should be very generous in allowing regressions

              -- provided we also have some way of monitoring drift over

              time.</div>

            <div><br>

            </div>

            <div>Separately, I've been concerned for some time about the

              peculiarity of our perf tests. For example, I'd be quite

              happy to accept a 25% regression on T9872c if it yielded a

              1% improvement on compiling Cabal. T9872 is very very very

              strange! (Maybe if *all* the T9872 tests regressed, I'd be

              more worried.) I would be very happy to learn that some

              more general, representative tests are included in our

              examinations.</div>

            <div><br>

            </div>

            <div>Richard</div>

          </div>

          _______________________________________________<br>

          ghc-devs mailing list<br>

          <a href="mailto:ghc-devs@haskell.org" target="_blank"

            moz-do-not-send="true">ghc-devs@haskell.org</a><br>

          <a

            href="http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs"

            rel="noreferrer" target="_blank" moz-do-not-send="true">http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs</a><br>

        </blockquote>

      </div>

      <br>


    </blockquote>

  </body>

</html>