Continuous Integration and Cross Compilation

Mon Jul 7 01:40:17 UTC 2014

Hi again,

I think I may have been too brief in my reply. To recap the previous discussion, there seem to be a few pieces which can be approached separately:

1) arbitrary/discretionary cross compilation
2) continuous integration for all patchsets
3) nightly builds

The first, as has been pointed out, is a lot of nontrivial work. The second either requires the first and a cloud service, or a lot of hardware (though it was mentioned that the buildbots can work in a CI mode). The third, we already have, thanks to the buildbots and those who have set them up.

I think using Jenkins may be a step in the right direction for a few reasons:

• there are hundreds of supported plugins [1] which cover notifications, code review [2], cloud computing services, and so on
• there is quite a lot of polish as far as generated reports go [3]
• it seems easy/nice to use out of the box (from a few minutes’ fiddling on my part)

Now, I don’t have much experience with buildbots, so I may be unfairly elevating Jenkins here. If buildbots can be easily extended to do exactly what we need, I’m all for it, and I’d volunteer to help with that.
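
As a rough check on the load estimate discussed in the quoted messages below, here is a minimal sketch of the arithmetic. The function name is purely illustrative, and the inputs are only the approximate timings mentioned in this thread (about 80 minutes of compilation and 20 minutes of testing from the buildbot page, and Páli's 1 hour 15 minutes of compilation out of 1 hour 40 minutes in total). It assumes the whole compilation step could be moved off the buildbot, which is exactly the nontrivial part.

```python
def remaining_load_fraction(compile_minutes: float, other_minutes: float) -> float:
    """Fraction of a full run left on a buildbot if compilation happens elsewhere.

    Hypothetical helper for illustration; it is not part of any buildbot
    or Jenkins tooling.
    """
    total = compile_minutes + other_minutes
    if total <= 0:
        raise ValueError("total run time must be positive")
    return other_minutes / total


# Approximate averages read off the buildbot page: 80 min compile, 20 min tests.
page_estimate = remaining_load_fraction(80, 20)

# Timings reported for Páli's builders: 100 min in total, 75 min of it compiling.
# The remaining 25 min covers the git clone as well as the test suite.
reported = remaining_load_fraction(75, 100 - 75)

print(f"buildbot page averages: {page_estimate:.2f} of the current load")
print(f"reported builder times: {reported:.2f} of the current load")
```

With these inputs the two estimates come out at about 1/5 and 1/4 of the current per-run time, so the "roughly 5x" figure is sensitive to which timings are used.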

Will

[1] https://wiki.jenkins-ci.org/display/JENKINS/Plugins
[2] http://www.dctrwatson.com/2013/01/jenkins-and-phabricator/
[3] https://ci.jenkins-ci.org

On Jul 6, 2014, at 7:26 PM, William Knop <william.knop.nospam at gmail.com> wrote:

> Hi Pali,
> 
> Apologies for the delayed response.
> 
> I treated cloud compilation as “free” in the context of the buildbots. If we can cross-compile (on Amazon EC2 or the like) ghcs which run on each arch we have for buildbots, the buildbots themselves will have about 1/5 the load, since only the test suite run would remain. I came to that figure from the buildbot page, where it looked like the average compile time was around 80 minutes, and the average test suite run was around 20 minutes.
> 
> I see your point about cloud cross compilation and buildbot testing not covering all cases of regressions. I think this is where the CI vs. nightly builds distinction applies well. Cloud compilation and buildbot testing may be fast enough to do CI on every patch set, while total regression coverage could be provided by nightly builds. Jenkins CI allows us to roll our own CI with our own machines, cloud compute services, and loads of other content/auditing/workflow services.
> 
> That said, while I think it would be nice to have quick CI in addition to nightly builds, I don’t know if it’s sensible/desired for ghc. Since Jenkins CI is stable yet very actively developed, it seems it would at least not incur too much maintenance on our part. Of course, the devil is in the details, so I’d be happy to set it up on a few of my machines to investigate.
> 
> Will
> 
> 
> On Jun 20, 2014, at 6:15 AM, Páli Gábor János <pali.gabor at gmail.com> wrote:
> 
>> Hello William,
>> 
>> 2014-06-20 0:50 GMT+02:00 William Knop <william.knop.nospam at gmail.com>:
>>> 1. We have a pretty good spread of buildbots, but as far as I know there aren’t
>>> very many of them. Running only the test suite would increase their utility by
>>> roughly 5x (from looking at the buildbot time breakdowns [1]).
>> 
>> How would this increase their utility?  I naively believe the purpose
>> of CI is to rebuild and test the source code after each changeset to
>> see if it was bringing regressions.  Running the test suite only does
>> not seem to convey this.  Many of the regressions could be observed
>> at build time, which means the safest bet would be to rebuild and test
>> everything on the very same platform.
>> 
>>> 2. Building ghc is time and resource intensive, which makes it hard for people
>>> to host buildbots. Even though my machines are relatively new, I can’t usually
>>> host one because it would interfere with my other work. I would be more
>>> tempted to if it was limited to just the test suite, and perhaps others would as
>>> well.
>> 
>> My buildbots complete the steps (git clone, full build, testing) in
>> about 1 hour 40 minutes (with about 1 hour 15 minutes spent in the
>> compilation phase), while they run in parallel with a shift of about an
>> hour.  They run on the same machine, together with the coordination
>> server.  This is just a 3.4-GHz 4-core Intel Core i5, with a couple of
>> GBs of RAM; I would not call it a high-end box, though.
>> 
>> Note that it is on purpose that the builders do not use -j for builds,
>> meaning that they do not parallelize the invoked make(1)-subprocesses,
>> which automatically makes the builds longer.  Perhaps it would be
>> worth experimenting with incremental builds and allowing for parallel
>> builds as they could cut down on the build times more efficiently.
>