[GHC DevOps Group] CI effort status

Manuel M T Chakravarty manuel.chakravarty at tweag.io
Mon Feb 5 03:12:32 UTC 2018


Hi Ben,

Thanks a lot for the summary.

> 04.02.2018 04:28 Ben Gamari <ben at well-typed.com>:
> 
> Ben Gamari <ben at well-typed.com> writes:
> 
>> Manuel M T Chakravarty <manuel.chakravarty at tweag.io> writes:
>> 
>>> Hi Ben,
>>> 
>>> I meant to post on https://github.com/appveyor/ci/issues/517 to
>>> request an increased limit, but didn’t get around to it yet. If you’d
>>> be able to put a request on that issue, that’d be great.
>>> 
>> Sure, I'm on it.
>> 
> I have created a new Appveyor project [1], named GHCAppveyor due to
> name length constraints, and configured it to pull from the ghc/ghc
> GitHub mirror. I have also requested, and was granted, the typical
> build time limit extension to 90 minutes.
> 
> Unfortunately, it seems that 90 minutes is insufficient even to
> finish a build, much less run the testsuite, under Appveyor's build
> environment. Given where the build was terminated, I would guess that
> it would need at least another 10 minutes of compilation to make it to
> the testsuite. On top of this the testsuite will require another ~35
> minutes (as it is quite heavy on process spawning, which is very
> expensive on Windows).
> 
> I haven't yet inquired as to whether a further build time extension
> would be possible. However, I am not hopeful that our plan of using
> Appveyor will be feasible without purchasing build time.

As we previously discussed, if we need to purchase build time, then so be it. However, I have been wondering about the following approach. Given that we eventually want to run the testsuite against a vanilla distribution produced by the build process, would it be feasible to split building and running the testsuite into two separate runs?
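
To make the timing concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above (the 90-minute limit, at least 10 further minutes of compilation, roughly 35 minutes for the testsuite). The names are hypothetical, and it assumes that each of the two runs would get its own 90-minute limit, which is an assumption rather than something Appveyor has confirmed.

```python
# Figures taken from the estimates quoted above; all are rough lower bounds.
LIMIT_MIN = 90           # Appveyor build time limit after the extension
BUILD_MIN = 90 + 10      # build was cut off at 90 min, needs at least 10 more
TESTSUITE_MIN = 35       # approximate testsuite duration on Windows


def fits(duration_min, limit_min=LIMIT_MIN):
    """Return True if a run of the given length stays within the limit."""
    return duration_min <= limit_min


# Single combined run: build followed by the testsuite.
combined = BUILD_MIN + TESTSUITE_MIN

# Hypothetical split: one run builds the distribution, a second run tests it.
# Assumes each run gets its own limit; download/unpack overhead is ignored.
print("combined run:", combined, "min, fits:", fits(combined))
print("build-only run:", BUILD_MIN, "min, fits:", fits(BUILD_MIN))
print("testsuite-only run:", TESTSUITE_MIN, "min, fits:", fits(TESTSUITE_MIN))
```

On these numbers the testsuite run would fit comfortably on its own, but the build-only run would still exceed 90 minutes, so a split would probably also need a somewhat higher limit or a shorter build.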

> On the CircleCI front, I have been continuing work to clear up the
> remaining build failures. At this point only two remain:
> 
> * I have a patch (D4360) to fix T11489 by running our build jobs as an
>   unprivileged user
> 
> * scc01 appears to be slightly non-deterministic; I am investigating
>   this.
> 
> Unfortunately the CircleCI infrastructure is still exhibiting a fair
> amount of flakiness. See, for instance, this build which is shown to be
> "Cancelled" despite having finished (and having apparently been run at
> least twice). Judging from the build history, this seems to be a fairly
> regular occurrence. I have contacted CircleCI about this but have not
> yet heard back.

I’ll ask our devs what their experience has been lately.

Cheers,
Manuel

> I am also occasionally seeing rather extreme variance in test times.
> In particular the linux-llvm target usually completes in around 4 hours
> 20 minutes, but sometimes takes over 5 hours, resulting in the build
> timing out. It appears that the build hangs during the testsuite run
> (e.g. [2]); it's not impossible that this is due to a bug in the
> testsuite driver but I have been able to reproduce this neither locally
> nor remotely on CircleCI infrastructure so it has proved to be a tough
> nut to crack.
> 
> Cheers,
> 
> - Ben
> 
> [1] https://ci.appveyor.com/project/GHCAppveyor/ghc
> [2] https://circleci.com/gh/ghc/ghc/1558
> _______________________________________________
> Ghc-devops-group mailing list
> Ghc-devops-group at haskell.org
> https://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devops-group


