On CI

Wed Mar 17 15:16:10 UTC 2021

> I'd be quite happy to accept a 25% regression on T9872c if it yielded
> a 1% improvement on compiling Cabal. T9872 is very very very strange!
> (Maybe if *all* the T9872 tests regressed, I'd be more worried.)

While I fully agree with this, we should *always* want to know if a
small synthetic benchmark regresses by a lot.
In other words, we don't want CI to ever accept such a regression for
us; the developer of a patch should have to explicitly OK it.

Otherwise we just slow down a lot of seldom-used code paths by a lot.

Now that isn't really an issue anyway, I think. The question is rather:
is 2% a large enough regression to worry about? 5%? 10%?
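
To make the threshold question concrete, here is a minimal sketch in
Python of a check where CI never accepts a large regression by itself
and the patch author has to acknowledge it explicitly. All names
(`review`, `ci_passes`, `acknowledged`) and all numbers are hypothetical
and chosen for illustration; this is not how the GHC testsuite driver
works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    test: str
    change_pct: float   # positive means the metric got worse (grew)
    needs_ack: bool     # True if the author still has to OK this change


def relative_change_pct(baseline: float, measured: float) -> float:
    """Relative change of a metric (e.g. bytes allocated) in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (measured - baseline) / baseline * 100.0


def review(baselines, measurements, threshold_pct, acknowledged=frozenset()):
    """Flag every test that regressed by more than threshold_pct,
    unless the patch author explicitly acknowledged that test."""
    verdicts = []
    for test, baseline in sorted(baselines.items()):
        change = relative_change_pct(baseline, measurements[test])
        needs_ack = change > threshold_pct and test not in acknowledged
        verdicts.append(Verdict(test, change, needs_ack))
    return verdicts


def ci_passes(verdicts) -> bool:
    """CI is green only if no regression is left unacknowledged."""
    return not any(v.needs_ack for v in verdicts)


# Made-up numbers mirroring the example above: T9872c regresses by 25%
# while compiling Cabal improves by 1%.
baselines = {"T9872c": 1000.0, "Cabal": 2000.0}
measurements = {"T9872c": 1250.0, "Cabal": 1980.0}

unreviewed = review(baselines, measurements, threshold_pct=10.0)
reviewed = review(baselines, measurements, threshold_pct=10.0,
                  acknowledged=frozenset({"T9872c"}))
```

With these numbers the unreviewed run fails CI because of T9872c, and
the same run passes once the author acknowledges T9872c. Picking
`threshold_pct` (2? 5? 10?) is exactly the open question.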

Cheers,
Andreas

On 17/03/2021 at 14:39, Richard Eisenberg wrote:
>
>
>> On Mar 17, 2021, at 6:18 AM, Moritz Angermann
>> <moritz.angermann at gmail.com> wrote:
>>
>> But what do we expect of patch authors? Right now if five people
>> write patches to GHC, and each of them eventually manages to get their
>> MRs green, after a long review, they finally see it assigned to
>> marge, and then it starts failing? Their patch on its own was fine,
>> but their aggregate with other people's code leads to regressions? So
>> we now expect all patch authors together to try to figure out what
>> happened? Figuring out why something regressed is hard enough, and we
>> only have a very few people who are actually capable of debugging
>> this. Thus I believe it would end up with Ben, Andreas, Matthew,
>> Simon, ... or someone else from GHC HQ anyway to figure out why it
>> regressed, be it in the Review Stage, or dissecting a marge
>> aggregate, or on master.
>
> I have previously posted against the idea of allowing Marge to accept
> regressions... but the paragraph above is sadly convincing. Maybe
> Simon is right about opening up the windows to be, say, 100% (which
> would catch a 10x regression) instead of infinite, but I'm now
> convinced that Marge should be very generous in allowing regressions
> -- provided we also have some way of monitoring drift over time.
>
> Separately, I've been concerned for some time about the peculiarity of
> our perf tests. For example, I'd be quite happy to accept a 25%
> regression on T9872c if it yielded a 1% improvement on compiling
> Cabal. T9872 is very very very strange! (Maybe if *all* the T9872
> tests regressed, I'd be more worried.) I would be very happy to learn
> that some more general, representative tests are included in our
> examinations.
>
> Richard