Continuous Integration and Cross Compilation

William Knop william.knop.nospam at gmail.com
Thu Jun 19 22:50:51 UTC 2014


Hi Austin,

Thank you for the quick and thorough reply! There are a lot of points to cover, so I’ll respond in a few sections.



*** The CI Scheme

I realize the vast majority of the work would be in #1, but I just want to highlight that there is a real benefit to be had. To address the latter part of your email, I suggested splitting the test suite from the build for a few reasons:

1. We have a pretty good spread of buildbots, but as far as I know there aren’t very many of them. Running only the test suite would increase their utility by roughly 5x (from looking at the buildbot time breakdowns [1]).

2. Building ghc is time- and resource-intensive, which makes it hard for people to host buildbots. Even though my machines are relatively new, I can’t usually host one because it would interfere with my other work. I would be more tempted to if it were limited to just the test suite, and perhaps others would be as well.

3. Cloud computing would enable very fast builds, such that we could conceivably automatically build (and then test on the buildbots) for every patch set submission / pull request.

I believe that sort of streamlining would make ghc development both more accessible to others and more enjoyable for all.



*** Cross Compilation and Template Haskell

Now on to the meat of the problem! I’m not too familiar with the really scary bits of TH, but I'll start with:

> TemplateHaskell must load and run object code on the *host* platform, but the compiler must generate code for the *target* platform.


As you pointed out, this is a big deal. How clearly does TH delineate what runs on each platform? I believe this segregation is fundamental to TH, which I’ll explain below.

> There are ways around some of these problems; for one, we could compile every module twice, once for the host, and once for the target.


I don’t think that’s necessary (or maybe I’m misunderstanding and we’re saying the same thing). Consider the following:

1. TH compiles to object code
2. The object code is run on the build machine, which generates Haskell AST
3. “Regular” GHC compiles the Haskell AST to object code
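The three steps above can be illustrated with a toy splice (the module and the literal 42 are mine, purely for illustration): the splice is run at compile time on the build machine and yields a Haskell AST, which the rest of GHC then compiles like any hand-written expression.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main where

import Language.Haskell.TH

-- Steps 1 and 2: this splice is compiled to object code and *run* on
-- the build machine at compile time; it produces a Haskell AST (an Exp).
-- Step 3: "regular" GHC then compiles that AST to object code for the
-- target, the same as any ordinary expression.
answer :: Int
answer = $(litE (integerL 42))

main :: IO ()
main = print answer
```

Running the resulting executable prints 42; the interesting part is that the splice itself never needs to run anywhere but the build machine.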

Currently, the notions of build, host, and target are sort of mashed together, with the assumption that build and host will be the same. It seems like “all” we have to do is tell the TH part of GHC to target the build arch, and the rest of GHC to target the host arch. But then there’s this...

> There are many, many subtle points to consider if we go down this route - what happens for example if I cross compile from a 64bit machine to a 32bit one, but TemplateHaskell wants some knowledge like what "sizeOf (undefined :: CLong)" is?


This comes back to the line between build and host in TH: there needs to be one. Perhaps there should be buildSizeOf and hostSizeOf for TH to use, and similar for other machine-specific things?

I think the messiest part of this is that existing packages assume build == host. Their maintainers would have to be prodded to respect the build/host division, and the packages would have to be updated. Actually, one advantage of adding build and host variants of machine-specific functions is that we can simply deprecate the unsegregated versions and not break anyone’s stuff. Using one of the deprecated functions in a cross-compile would either spit out an error and terminate the build, or perhaps fall back to double compilation.
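As a sketch of what that segregation might look like: none of the names below exist in template-haskell today, and the sizes are hard-coded purely to mimic the 64-bit-build / 32-bit-host scenario from your CLong example. The point is only that splices would have to say *which* machine they are asking about, instead of calling an ambiguous sizeOf.

```haskell
import Foreign.C.Types (CLong)
import Foreign.Storable (sizeOf)

-- Hypothetical per-platform machine descriptions the compiler would
-- supply to TH; the field and values here are illustrative only.
data Machine = Machine { cLongSize :: Int }

build, host :: Machine
build = Machine { cLongSize = 8 }  -- e.g. a 64-bit Linux build machine
host  = Machine { cLongSize = 4 }  -- e.g. a 32-bit host/target

-- The proposed segregated queries (hypothetical names):
buildSizeOfCLong, hostSizeOfCLong :: Int
buildSizeOfCLong = cLongSize build
hostSizeOfCLong  = cLongSize host

main :: IO ()
main = do
  -- Today's ambiguous query silently answers for whatever machine is
  -- compiling this module:
  print (sizeOf (undefined :: CLong))
  -- Under the proposal, the two answers can legitimately differ:
  print (buildSizeOfCLong, hostSizeOfCLong)
```

A deprecated, unsegregated sizeOf would then be the thing that errors out (or triggers double compilation) in a cross-compile.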

Regarding the many subtle points to consider, if the sort of path I describe is at all sane (please tell me if not!), I can open a trac ticket so we can chip away at them.



*** Cross Compilation, Redux

There is one more part to this story, however. Ultimately, a single build of ghc should be able to have multiple targets (or in other words, one build of ghc should be able to target multiple hosts). LLVM allows us to do this, but ghc’s notion of a cross compiler is limited. Here is the current setup [2]:

Stage 0:
• built on: ---
• runs on: host
• targets: host

Libs Boot:
• built on: build
• runs on: build
• targets: ---

Stage 1:
• built on: build
• runs on: host
• targets: target

Libs Install:
• built on: build
• runs on: target
• targets: ---

Stage 2:
• built on: build
• runs on: target
• targets: target

What I propose is the following (stage 0 and libs boot are unchanged):

Stage 1:
• built on: build
• runs on: build
• targets: targets

Libs Toolchain Host:
• built on: build
• runs on: host
• targets: ---

Libs Toolchain Target-x:
• built on: build
• runs on: target-x
• targets: ---

Libs Toolchain Target-y:
• built on: build
• runs on: target-y
• targets: ---

Libs Toolchain Target-z:
• built on: build
• runs on: target-z
• targets: ---

Stage 2:
• built on: build
• runs on: host
• targets: host, target-x, target-y, target-z

Most people will only want targets == host, in which case only the host toolchain will be built, so "regular" builds should be exactly the same as they are now. One may also produce a specialized cross-compiler (i.e. no host toolchain and one target toolchain), which is equivalent to how ghc currently builds a cross compiler. Or, one may choose to produce a compiler that targets whatever combination of targets one desires (currently impossible).

For a build of ghc that runs on the cloud (as proposed above), one might have host=linux-x86_64 and targets=the-whole-shebang. For the compilers produced by the cloud, one would have targets == host, simply because we just want to be able to run the test suite on a given machine.



*** Build Machine Needs

> Really, we need to distinguish between two needs:
> 
> 1) Continuous integration.
> 
> 2) Nightly builds.
> 
> These two systems have very different needs in practice:
> 
> 1) A CI system needs to be *fast*, and it needs to have dedicated
> resources to respond to changes quickly. This means we need to
> *minimize* the amount of time for developer turn around to see
> results. That includes minimizing the needed configurations. Shipping
> builds to remote machines just for CI would greatly complicate this
> and likely make it far longer on its own, not to mention it increases
> with every system we add.
> 
> 2) A nightly build system is under nowhere near the same time
> constraints, although it also needs to be dedicated. If an ARM/Linux
> machine takes 6 hours to build (perhaps it's shared or something, or
> just really wimpy), that's totally acceptable. These can then report
> nightly about the results and we can reasonably blame
> people/changesets based on that.


I totally agree with the distinction you’ve drawn here, though I don’t think the CI proposed above would increase build times. On the contrary, I think it would greatly reduce them (assuming we use fast cloud compute nodes). I’ll try to collect some stats (and costs) to back that up.

I didn’t realize Travis CI has a build time limit, so thanks for pointing that out. Fifty minutes, though! Not enough for us, certainly.

I’ve read a fair amount about Jenkins CI [3], which is very actively developed, has zillions of plugins, and integrates with all sorts of sites. It’s also open source and locally installable, which means we could set it to email, generate online reports, tell Phabricator (or GitHub*) that a patch set is bogus, dispense coffee, etc. It might warrant more investigation as a possible replacement for buildbots.

* I agree with most of your points about mixing so many tools, each with its own methodology. Although I’d get a warm and fuzzy feeling from being able to fork and send a pull request that gets automatically validated, it probably doesn’t make sense to pursue that right now.



*** Oh my!

This response has gotten pretty long! Apologies if I missed something, or otherwise misunderstood. Anyway, if there’s a path here that seems sensible, I’ll have a go at it.

Will



[1] http://haskell.inf.elte.hu/builders/
[2] https://ghc.haskell.org/trac/ghc/wiki/CrossCompilation
[3] http://jenkins-ci.org



On Jun 18, 2014, at 7:53 PM, Austin Seipp <austin at well-typed.com> wrote:

> Hi William,
> 
> Thanks for the email. Here're some things to consider.
> 
> For one, cross compilation is a hot topic, but it is going to be a
> rather large amount of work to fix and it won't be easy. The primary
> problem is that we need to make Template Haskell cross-compile, but in
> general this is nontrivial: TemplateHaskell must load and run object
> code on the *host* platform, but the compiler must generate code for
> the *target* platform. There are ways around some of these problems;
> for one, we could compile every module twice, once for the host, and
> once for the target. Upon requesting TH, the Host GHC would load Host
> Object Code, but the final executable would link with the Target
> Object Code.
> 
> There are many, many subtle points to consider if we go down this
> route - what happens for example if I cross compile from a 64bit
> machine to a 32bit one, but TemplateHaskell wants some knowledge like
> what "sizeOf (undefined :: CLong)" is? The host code sees a 64-bit
> quantity while the target actually will deal with a 32bit one. This
> could later explode horribly. And this isn't limited to different
> endianness either - it applies to the ABI in general. 64bit Linux ->
> 64bit Windows would be just as problematic with this exact case, as
> one uses LP64, while the other uses LLP64 data models.
> 
> So #1 by itself is a very, very non-trivial amount of work, and IMO I
> don't think it's necessary for better builds. There are other routes
> possible for cross compilation perhaps, but I'd speculate they are all
> equally as non-trivial as this one.
> 
> Finally, the remainder of the scheme, including shipping builds to
> remote machines and have them be tested sounds a bit more complicated,
> and I'm wondering what the advantages are. In particular it seems like
> this merely exposes more opportunities for failure points in the CI
> system, because now all CI depends on cross compilation working
> properly, being able to ship reports back and forth, and more.
> Depending on CC in particular is a huge burden it sounds: it makes it
> hard to distinguish when a cross-compilation bug may cause a failure
> as opposed to a changeset from a committer, which widens the scope of
> what we need to consider. A CI system should be absolutely as
> predictable as possible, and this adds a *lot* of variables to the
> mix. Cross compilation is really something that's not just one big
> task - there will be many *small* bugs laying in wait after that, the
> pain of a thousand cuts.
> 
> Really, we need to distinguish between two needs:
> 
> 1) Continuous integration.
> 
> 2) Nightly builds.
> 
> These two systems have very different needs in practice:
> 
> 1) A CI system needs to be *fast*, and it needs to have dedicated
> resources to respond to changes quickly. This means we need to
> *minimize* the amount of time for developer turn around to see
> results. That includes minimizing the needed configurations. Shipping
> builds to remote machines just for CI would greatly complicate this
> and likely make it far longer on its own, not to mention it increases
> with every system we add.
> 
> 2) A nightly build system is under nowhere near the same time
> constraints, although it also needs to be dedicated. If an ARM/Linux
> machine takes 6 hours to build (perhaps it's shared or something, or
> just really wimpy), that's totally acceptable. These can then report
> nightly about the results and we can reasonably blame
> people/changesets based on that.
> 
> Finally, both of these become more complicated by the fact GHC is a
> large project that has a highly variable number of configurations we
> have to keep under control: static, dynamic, static+dynamic,
> profiling, LLVM builds, builds where GHC itself is profiled, as well
> as the matrix of those combinations: LLVM+GHC Profiled, etc etc etc.
> Each of these configurations expose bugs in their own right.
> Unfortunately doing #1 with all these configurations would be
> ludicrous: it would explode the build times for any given system, and
> it also drastically multiplies the hardware resources we'd need for CI
> if we wanted them to respond quickly to any given changeset, because
> you not only have to *build* them, you must run them. And now you have
> to run a lot of them. A nightly build system is more reasonable for
> these problems, because taking hours and hours is expected. These
> problems would still be true even with cross compilation, because it
> multiplies the amount of work every CI run must do no matter what.
> 
> We actually already do have both of these already, too: Joachim
> Breitner for example has set us up a Travis-CI[1] setup, while Gabor
> Pali has set us up nightly builds[2]. Travis-CI does the job of fast
> CI, but it's not good for a few reasons:
> 
> 1) We have literally zero visibility into it for reports. Essentially
> we only know when it explodes because Joachim yells at us (normally at
> me :) This is because GitHub is not our center-of-the-universe,
> despite how much people yearn for it to be so.
> 
> 2) The time limit is unacceptable. Travis-CI for example actually
> cannot do dynamic builds of GHC because it takes too long. Considering
> GHC is shipping dynamically on major platforms now, that's quite a
> huge loss for a CI system to miss (and no, a separate build matrix
> configuration doesn't work here - GHC builds statically and
> dynamically at the same time, and ships both - there's no way to have
> "only static" and "only dynamic" entries.)
> 
> 3) It has limited platform support - only recently did it have OS X,
> and Windows is not yet in sight. Ditto for FreeBSD. These are crucial
> for CI as well, as they encompass all our Tier-1 platforms. This could
> be fixed with cross compilation, but again, that's a big, big project.
> 
> And finally, on the GitHub note, as I said in the prior thread about
> Phabricator, I don't actually think it offers us anything useful at
> this point in time - literally almost nothing other than "other
> projects use GitHub", which is not an advantage, it's an appeal to
> popularity IMO. Webhooks still cannot do things like ban tabs,
> trailing whitespace, or enforce submodule integrity. We have to have
> our own setup for all of that. I'm never going to hit the 'Merge
> Button' for PRs - validation is 100% mandatory on behalf of the
> merger, and again, Travis-CI cannot provide coherent coverage even if
> we could use it for that. And because of that there's no difference
> between GitHub any other code site - I have to pull the branch
> manually and test myself, which I could do with any random git
> repository in the world.
> 
> The code review tools are worse than Phabricator. Finally, if we are
> going to accept patches from people, we need to have a coherent,
> singular way to do it - mixing GitHub PRs, Phabricator, and uploading
> patches to Trac is just a nightmare for pain, and not just for me,
> even though I do most of the patch work - it incurs the burden on
> *every* person who wants to review code to now do so in many separate
> places. And we need to make code review *easier*, not harder! If
> anything, we should be consolidating on a single place (obviously, I'd
> vote for Phabricator), not adding more places to make changes that we
> all have to keep up with, when we don't even use the service itself!
> That's why I proposed Phabricator: because it is coherent and a
> singular place to go to, and very good at what it does, and does not
> attempt to 'take over' GHC itself. GitHub is a fairly all-or-nothing
> proposition if you want any benefits it delivers, if you ask me (I say
> this as someone who likes GitHub for smaller projects). I just don't
> think their tools are suitable for us.
> 
> So, back to the topic. I think the nightly builds are actually in an
> OK state at the moment, since we do get reports from them, and
> builders do check in regularly. The nightly builders also cover a more
> diverse set of platforms than our CI will. But the CI and turnaround
> could be *greatly* improved, I think, because ghc-complete is
> essentially ignored or unknown by many people.
> 
> So I'll also make a suggestion: just to actually get something that
> will pull GHC's repo every 10 minutes or so, do a build, and then
> email ghc-devs *only* if failures pop up. In fact, we could just
> re-use the existing nightly build infrastructure for this, and just
> make it check very regularly, and just run standard amd64/Linux and
> Windows builds upon changes. I could provide hardware for this. This
> would increase the visibility of reports, not require *any* new code,
> and already works.
> 
> Overall, I will absolutely help you in every possible way, because
> this really is a problem for newcomers, and existing developers, when
> we catch dumb failures later than we should. But I think the proposed
> solution here is extraordinarily complex in comparison to what we
> actually need right now.
> 
> ... I will say that if you *did* fix cross compilation however to work
> with TH you would be a hero to many people - myself included -
> continuous integration aside! :)
> 
> [1] https://github.com/nomeata/ghc-complete
> [2] http://haskell.inf.elte.hu/builders/
> 
> On Wed, Jun 18, 2014 at 3:10 PM, William Knop
> <william.knop.nospam at gmail.com> wrote:
>> Hello all,
>> 
>> I’ve seen quite a few comments on the list and elsewhere lamenting the time it takes to compile and validate ghc. It’s troublesome not only because it’s inconvenient, but, more seriously, people are holding off on sending patches in which stifles development. I would like to propose a solution:
>> 
>> 1. Implement proper cross-compilation, such that build and host may be different— e.g. a linux x86_64 machine can build ghc that runs on Windows x86. What sort of work would this entail?
>> 
>> 2. Batch cross-compiled builds for all OSs/archs on a continuous integration service (e.g. Travis CI) or cloud service, then package up the binaries with the test suite.
>> 
>> 3. Send the package to our buildbots, and run the test suite.
>> 
>> 4. (optional) If using a CI service, have the buildbots send results back to the CI. This could be useful if we'd use GitHub for pulls in the future *.
>> 
>> Cheers,
>> Will
>> 
>> 
>> * I realize vanilla GitHub currently has certain annoying limitations, though some of them are pretty easy to solve via the github-services and/or webhooks. I don’t think this conflicts with the desire to use Phabricator, either, so I’ll send details and motivations to that thread.
>> 
>> 
>> _______________________________________________
>> ghc-devs mailing list
>> ghc-devs at haskell.org
>> http://www.haskell.org/mailman/listinfo/ghc-devs
>> 
> 
> 
> 
> -- 
> Regards,
> 
> Austin Seipp, Haskell Consultant
> Well-Typed LLP, http://www.well-typed.com/


