[GHC DevOps Group] CI effort status

Gershom B gershomb at gmail.com
Mon Feb 5 02:41:36 UTC 2018


A question from an observer here -- my understanding was that part of
the plan with the shift in CI infrastructure was that the burden
would be lifted from Ben's exclusive shoulders here and there would be
some greater division of labor, which is made possible in part by
using shared standard services rather than self-hosted solutions. But
at the moment I see reports largely of Ben continuing to try to
resolve issues and move this plan forward on his own. Is there still
some medium-term plan to have a more collective effort in this
transition?

--Gershom


On Sat, Feb 3, 2018 at 12:28 PM, Ben Gamari <ben at well-typed.com> wrote:
> Ben Gamari <ben at well-typed.com> writes:
>
>> Manuel M T Chakravarty <manuel.chakravarty at tweag.io> writes:
>>
>>> Hi Ben,
>>>
>>> I meant to post on https://github.com/appveyor/ci/issues/517 to
>>> request an increased
>>> limit, but didn’t get around to it yet. If you’d be able to put a
>>> request on that issue, that’d be great.
>>>
>> Sure, I'm on it.
>>
> I have created a new GHCAppveyor (due to name length constraints)
> Appveyor [1] project and configured it to pull from the ghc/ghc GitHub
> mirror. I have also requested, and was granted, the typical build time
> limit extension to 90 minutes.
>
> Unfortunately, it seems that 90 minutes is insufficient to even
> finish a build, much less run the testsuite, under Appveyor's build
> environment. Given where the build was terminated, I would guess that
> it would need at least another 10 minutes of compilation to make it to
> the testsuite. On top of this the testsuite will require another ~35
> minutes (as it is quite heavy on process spawning, which is very
> expensive on Windows).
>
> I haven't yet inquired as to whether a further build time extension
> would be possible. However, I am not hopeful that our plan of using
> Appveyor will be feasible without purchasing build time.
>
>
> On the CircleCI front, I have been continuing work to clear up the
> remaining build failures. At this point only two remain:
>
>  * I have a patch (D4360) to fix T11489 by running our build jobs as an
>    unprivileged user
>
>  * scc01 appears to be slightly non-deterministic; I am investigating
>    this.
>
> Unfortunately the CircleCI infrastructure is still exhibiting a fair
> amount of flakiness. See, for instance, this build which is shown to be
> "Cancelled" despite having finished (and having apparently been run at
> least twice). Judging from the build history, this seems to be a fairly
> regular occurrence. I have contacted CircleCI about this but have not
> yet heard back.
>
> I am also occasionally seeing rather extreme variance in test times.
> In particular the linux-llvm target usually completes in around 4 hours
> 20 minutes, but sometimes takes over 5 hours, resulting in the build
> timing out. It appears that the build hangs during the testsuite run
> (e.g. [2]); it's not impossible that this is due to a bug in the
> testsuite driver but I have been able to reproduce this neither locally
> nor remotely on CircleCI infrastructure so it has proved to be a tough
> nut to crack.
>
> Cheers,
>
> - Ben
>
> [1] https://ci.appveyor.com/project/GHCAppveyor/ghc
> [2] https://circleci.com/gh/ghc/ghc/1558

