On CI

Sebastian Graf sgraf1337 at gmail.com
Wed Feb 17 10:31:39 UTC 2021


Hi Moritz,

I, too, had my gripes with CI turnaround times in the past. Here's a
somewhat radical proposal:

   - Run "full-build" stage builds only on Marge MRs. Then we can assign to
   Marge much earlier, but will probably have to do a bit more (manual)
   bisecting of spoiled Marge batches.
      - I hope this gets rid of a bit of the friction of small MRs. I
      recently caught myself wanting to do a bunch of small, independent, but
      related changes as part of the same MR, simply because it's such a hassle
      to post them in individual MRs right now and also because it steals so
      much CI capacity.
   - Regular MRs should still have the ability to easily run individual
   builds of what is now the "full-build" stage, similar to how we can run
   optional "hackage" builds today. This is probably useful to pin down the
   reason for a spoiled Marge batch.
   - The CI capacity we free up can probably be used to run a perf build
   (such as the fedora release build) on the "build" stage (the one where we
   currently run stack-hadrian-build and the validate-deb9-hadrian build), in
   parallel.
   - If we decide against the latter, a micro-optimisation could be to
   cache the build artifacts of the "lint-base" build and continue the build
   in the validate-deb9-hadrian build of the "build" stage.

The usefulness of this approach depends on how many MRs cause metric
changes on different architectures.

Another frustrating aspect is that if you want to merge an n-sized chain of
dependent changes individually, you have to

   - Open an MR for each change (initially the last change will
   consist of n commits)
   - Review first change, turn pipeline green   (A)
   - Assign to Marge, wait for batch to be merged   (B)
   - Review second change, turn pipeline green
   - Assign to Marge, wait for batch to be merged
   - ... and so on ...

Note that (A) incurs many context switches for the dev and the latency of
*at least* one run of CI.
And then (B) incurs the latency of *at least* one full-build, if you're
lucky and the batch succeeds. I've recently seen batches that were
resubmitted by Marge at least 5 times due to spurious CI failures and
timeouts. I think this is a huge factor for latency.
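
To put a rough number on this, here is a minimal sketch of the lower bound
for such a chain. The function name and the example figures are my own
assumptions for illustration: I simply reuse the ~220 minute Windows build
time quoted below for both (A) and (B), and I ignore review time, queueing
and resubmitted batches, so real latency can only be higher.

```python
def chain_latency_lower_bound(n_changes, ci_minutes, full_build_minutes):
    """Lower bound, in minutes, for merging n dependent changes one by one.

    Each change pays for at least one regular CI run (step A) and at least
    one Marge batch with a full build (step B). Review time, queueing and
    resubmitted batches are ignored.
    """
    if n_changes < 0:
        raise ValueError("n_changes must be non-negative")
    return n_changes * (ci_minutes + full_build_minutes)


# Illustrative numbers only: 220 minutes for both steps is an assumption.
total = chain_latency_lower_bound(n_changes=3, ci_minutes=220,
                                  full_build_minutes=220)
print(f"{total} minutes, i.e. {total / 60:.0f} hours")
```

The point is only that the cost grows linearly in n, because the steps of a
dependent chain cannot overlap.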

Although after (A) I should be able to just pop the patch off my mental
stack, that isn't really the case, because Marge keeps reminding me whenever
a batch fails or succeeds, both of which require at least some attention
from me: Failed 2 times => Make sure it was spurious, Succeeds => Rebase
next change.

Maybe we can also learn from other projects like Rust, GCC or clang, which
I haven't had a look at yet.
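
As for checking the math on the Windows queue quoted below: a quick sketch,
assuming perfect scheduling and that every build takes the quoted average.
The function name is made up for this example; the inputs (15 queued
builds, five builders, ~220 minutes each) are the figures from the quoted
mail.

```python
import math


def queue_drain_minutes(queued_builds, builders, minutes_per_build):
    """Minutes until all queued builds have finished, under perfect scheduling.

    Builds run in rounds of `builders` at a time, and every build is assumed
    to take exactly `minutes_per_build`.
    """
    rounds = math.ceil(queued_builds / builders)
    return rounds * minutes_per_build


minutes = queue_drain_minutes(queued_builds=15, builders=5,
                              minutes_per_build=220)
print(minutes, "minutes =", minutes / 60, "hours")
```

Under those assumptions the queue needs three rounds, which agrees with the
~660 minutes (11 hours) estimate.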

Cheers,
Sebastian

On Wed, 17 Feb 2021 at 09:11, Moritz Angermann <
moritz.angermann at gmail.com> wrote:

> Friends,
>
> I've been looking at CI again recently, as I was facing CI turnaround
> times of 9-12 hours; this just keeps dragging on and makes progress hard.
>
> The pending pipeline currently has 2 Darwin and 15 Windows builds
> waiting. Windows builds take ~220 minutes on average. We have five builders,
> so we can expect this queue to be done in ~660 minutes, assuming perfect
> scheduling and good performance. That is 11 hours! The next Windows build
> can be started in 11 hours. Please check my math and tell me I'm wrong!
>
> If you submit an MR today, with some luck, you'll know whether it
> will be mergeable some time tomorrow. At that point you can assign it to
> Marge, and Marge, if you are lucky and the set of patches she tries to
> merge together is mergeable, will merge your work into master probably some
> time on Friday. If a job fails, well, you have to start over again.
>
> What are our options here? Ben has been pretty clear about not wanting a
> broken commit for Windows to end up in the tree, and I'm with him on that.
>
> Cheers,
>  Moritz