[GHC DevOps Group] CI effort status

Simon Peyton Jones simonpj at microsoft.com
Wed Feb 7 08:39:25 UTC 2018


Manuel, my apologies: I should certainly have included Tweag in the list, which I typed too hurriedly.  It has been a huge relief to me to have a new /proactive/ source of leadership on GHC.  That's a really big contribution. Thank you Tweag!

My main point was that we should seek to widen the group that contributes, whether it's leadership, time, compute resources, or money, so that the burden does not fall too heavily on a few, such as yourself.

Simon

|  -----Original Message-----
|  From: Manuel M T Chakravarty [mailto:manuel.chakravarty at tweag.io]
|  Sent: 07 February 2018 02:34
|  To: Simon Peyton Jones <simonpj at microsoft.com>
|  Cc: Gershom Bazerman <gershomb at gmail.com>;
|  ghc-devops-group at haskell.org
|  Subject: Re: [GHC DevOps Group] CI effort status
|  
|  I would more than welcome concrete offers of resources or suggestions
|  on how to get more resources. Mathieu and I have worked towards
|  getting additional resources since we announced the group, but these
|  things (apparently) take time. We surely could use the help of
|  everybody involved in this group!
|  
|  Cheers,
|  Manuel
|  
|  PS: Just a gentle reminder that several Tweag people (including me)
|  have spent Tweag time on this effort. This certainly doesn’t match the
|  investment of Facebook or Microsoft, but it serves as constructive
|  proof that this is not just for large firms.
|  
|  > 07.02.2018 00:31 Simon Peyton Jones <simonpj at microsoft.com>:
|  >
|  > Thanks Gershom.
|  >
|  > I think of the devops group as
|  >
|  > 1 Broadening "ownership" of GHC's development and release processes,
|  > so that a larger group of people feel that they can influence and
|  > contribute to GHC's development, and hence feel more comfortable
|  > making GHC mission-critical to their business or other plans
|  >
|  > 2 Making it more likely that what we do with GHC actually matches
|  > what GHC's users want
|  >
|  > 3 Broadening and deepening the pool of stakeholders who are  willing
|  > to contribute time and/or money to making GHC into the
|  >  solidly reliable tool that they need.   (Currently we have
|  >  Microsoft, Facebook, IOHK contributing directly, I think.)
|  >
|  > I think Gershom's message is really about (3).  To me, progress on
|  > (1) and (2) will help to make the case for (3).  But I don’t want to
|  > lose sight of (3).  The factor that precipitated the devops group's
|  > formation was a sudden awareness about how vulnerable we are, as a
|  > community, to a very small number of supporters.
|  >
|  > Discussion at ICFP made me think that several other companies would
|  > consider making donations, if (a) we had a compelling case that it'd
|  > be money well spent, and (b) the actual process worked. For (b) I
|  > think some would prefer a central fund; others might prefer a specific
|  > task or set of tasks to fund.  The discussion on mechanism is a bit
|  > stalled I think.
|  >
|  > We don't currently have a crisis.  But I think there may already be
|  > things for which application of money might help: e.g. paying for
|  > CircleCI cycles rather than spending Ben's time trying to shoehorn
|  > everything into the for-free limits.  Maybe Appveyor is similar.
|  >
|  > So I would very much welcome it if the Devops group could take, as an
|  > important (even if not day-to-day urgent) task, working out
|  > a sustainable model for GHC's maintenance, support, CI, and releases.
|  >
|  > Simon
|  >
|  >
|  > |  -----Original Message-----
|  > |  From: Ghc-devops-group
|  > | [mailto:ghc-devops-group-bounces at haskell.org]
|  > |  On Behalf Of Gershom B
|  > |  Sent: 05 February 2018 03:58
|  > |  To: Manuel Chakravarty <mchakravarty at me.com>
|  > |  Cc: ghc-devops-group at haskell.org
|  > |  Subject: Re: [GHC DevOps Group] CI effort status
|  > |
|  > |  Let me articulate my question a bit more clearly. Looking at the
|  > | devops group charter
|  > | (https://ghc.haskell.org/trac/ghc/wiki/DevOpsGroupCharter), it says
|  > | the following about the goals:
|  > |
|  > |
|  > |  The mission of the GHC DevOps Group is to
|  > |
|  > |  * to take leadership of the devops aspects of GHC,
|  > |  * to resource it better, and
|  > |  * to broaden the sense of community ownership and control of GHC.
|  > |
|  > |
|  > |  Further it says under “Resources”:
|  > |
|  > |  "The GHC DevOps Group identifies the ongoing and one-off devops
|  > | requirements of GHC. It develops and manages the strategies and
|  > | projects to implement the needed tools, processes, and documentation
|  > | to meet those requirements. To that end and on the basis of
|  > | actionable project plans, it seeks to obtain the necessary
|  > | resources from organisations that rely on GHC as a production-ready
|  > | tool. By doing this, we aim to unlock more resources than are
|  > | currently available. At the same time, we seek broad community
|  > | ownership to minimise the load on any single contributor and to
|  > | avoid a single point of failure."
|  > |
|  > |  My concern is that at the moment there has been discussion regarding
|  > | devops aspects, and perhaps a broadened sense of community
|  > | ownership and control.
|  > |
|  > |  But I do not see better resourcing, although the initial
|  > | contributions of CI configurations were certainly a good kickstart.
|  > | As such, I do not see community ownership in the sense of the
|  > | latter paragraph -- i.e. in the sense that it will “minimise the
|  > | load on any single contributor” and thus “avoid a single point of
|  > | failure.”
|  > |
|  > |  The way this works, as I understand it, is a quid-pro-quo. In order
|  > | to accomplish goals with regard to regularity of GHC releases,
|  > | streamlined processes, etc., there needs to be at least some
|  > | infusion of resources, presumably “unlocked” from “organisations
|  > | that rely on GHC as a production-ready tool”.
|  > |
|  > |  Otherwise this quickly becomes expecting a variety of new work from
|  > | the same cast of characters, just with more voices on a mailing list
|  > | chiming in with proposals as to what they would like to see
|  > | accomplished.
|  > |
|  > |  I am well aware that assembling resources and pulling them together
|  > | is _hard_, and many attempts to do so founder. I’ve been a
|  > | participant in any number of foundered attempts myself over the
|  > | years, or attempts that have accomplished a few useful things, but
|  > | fallen far short of even the modest initial goals they set out with.
|  > |
|  > |  But I do not want this aspect of the DevOps Group charter to fade
|  > | from consciousness -- getting these resources is not automatic. It
|  > | requires constant shaking of tree branches, and constant attempts
|  > | to reformulate problems and break them down in ways that make them
|  > | more amenable to collaboration -- as well as not-infrequent followup
|  > | on partial commitments or indications towards such in the past, to
|  > | try to pin down their concrete implementation.
|  > |
|  > |  What I am seeing right now is that there is a danger of settling
|  > | into a “new status quo” with no new resources, and I think that
|  > | would not be a good thing for the future prospects of the DevOps
|  > | Group, and would probably quickly lead to it being yet another
|  > | stillborn effort.
|  > |
|  > |  I am not offering anything at the moment here -- I have nothing _to_
|  > | offer. But this is my attempt to provide a gentle “poke” to all
|  > | those on the list who thought they might have some ability to play
|  > | a role in this to please be forthcoming with some proposals as to
|  > | how they might help. One important aspect to bear in mind is that
|  > | at this point, it seems to me that money is _not_ the issue. That
|  > | is to say, if there was a volunteer of some skilled-ops time, with
|  > | experience wrangling CI, that would be ideal. But if there
|  > | was a proffer of some such time, but from e.g. some associated
|  > | contractor who would need to be paid to help out, then it would
|  > | probably be feasible to fund that as well, from any number of
|  > | sources.
|  > |
|  > |  The plan for migrating CI is only partially complete. The working
|  > | hypothesis is that when it is complete, it will mean less work in
|  > | the long run. But in my opinion, dealing with someone else’s flaky
|  > | boxes (e.g. those of CircleCI) is not much better than dealing with
|  > | your own flaky boxes, except you have to now bother other people to
|  > | figure out more stuff for you. So if we could get someone with
|  > | experience to be a deputy CircleCI ombudsman or the like, and take
|  > | charge of some aspect of this work, I think we would have a much
|  > | greater chance of A) success, and B) genuinely distributing the
|  > | workload more widely.
|  > |
|  > |  Best,
|  > |  Gershom
|  > |
|  > |
|  > |  On February 4, 2018 at 10:19:51 PM, Manuel Chakravarty
|  > |  (mchakravarty at me.com) wrote:
|  > |
|  > |  > Hi Gershom,
|  > |  >
|  > |  > Ben is surely the main actor and he has put considerable effort
|  > |  > into this. However, we (Tweag) did help out initially writing some
|  > |  > of the original CI configurations.
|  > |  >
|  > |  > Having said that, it would be absolutely fabulous if other
|  > |  > developers could help out. Please let Ben and me know if you know
|  > |  > anybody who would be happy to help!
|  > |  >
|  > |  > Cheers,
|  > |  > Manuel
|  > |  >
|  > |  > > 05.02.2018 13:41 Gershom B :
|  > |  > >
|  > |  > > A question from an observer here -- my understanding was that part
|  > |  > > of the plan with the shift in CI infrastructure was that the burden
|  > |  > > would be lifted from Ben's exclusive shoulders here and there would
|  > |  > > be some greater division of labor, which is made possible in part by
|  > |  > > using shared standard services rather than self-hosted solutions.
|  > |  > > But at the moment I see reports largely of Ben continuing to try to
|  > |  > > resolve issues and move this plan forward on his own. Is there still
|  > |  > > some medium-term plan to have a more collective effort in this
|  > |  > > transition?
|  > |  > >
|  > |  > > --Gershom
|  > |  > >
|  > |  > >
|  > |  > > On Sat, Feb 3, 2018 at 12:28 PM, Ben Gamari wrote:
|  > |  > >> Ben Gamari writes:
|  > |  > >>
|  > |  > >>> Manuel M T Chakravarty writes:
|  > |  > >>>
|  > |  > >>>> Hi Ben,
|  > |  > >>>>
|  > |  > >>>> I meant to post on
|  > |  > >>>> https://github.com/appveyor/ci/issues/517
|  > |  > >>>> to request an increased
|  > |  > >>>> limit, but didn’t get around to it yet. If you’d be able to put a
|  > |  > >>>> request on that issue, that’d be great.
|  > |  > >>>>
|  > |  > >>> Sure, I'm on it.
|  > |  > >>>
|  > |  > >> I have created a new GHCAppveyor (due to name length constraints)
|  > |  > >> Appveyor [1] project and configured it to pull from the ghc/ghc
|  > |  > >> GitHub mirror. I have also requested, and was granted, the typical
|  > |  > >> build time limit extension to 90 minutes.
|  > |  > >>
|  > |  > >> Unfortunately, it seems that even 90 minutes is insufficient to
|  > |  > >> even finish a build, much less run the testsuite, under Appveyor's
|  > |  > >> build environment. Given where the build was terminated, I would
|  > |  > >> guess that it would need at least another 10 minutes of compilation
|  > |  > >> to make it to the testsuite. On top of this the testsuite will
|  > |  > >> require another ~35 minutes (as it is quite heavy on process
|  > |  > >> spawning, which is very expensive on Windows).
|  > |  > >>
|  > |  > >> I haven't yet inquired as to whether a further build time extension
|  > |  > >> would be possible. However, I am not hopeful that our plan of using
|  > |  > >> Appveyor will be feasible without purchasing build time.
|  > |  > >>
|  > |  > >>
|  > |  > >> On the CircleCI front, I have been continuing work to clear up the
|  > |  > >> remaining build failures. At this point only two remain:
|  > |  > >>
|  > |  > >> * I have a patch (D4360) to fix T11489 by running our build jobs as
|  > |  > >>   an unprivileged user
|  > |  > >>
|  > |  > >> * scc01 appears to be slightly non-deterministic; I am
|  > |  > >>   investigating this.
|  > |  > >>
|  > |  > >> Unfortunately the CircleCI infrastructure is still exhibiting a
|  > |  > >> fair amount of flakiness. See, for instance, this build which is
|  > |  > >> shown to be "Cancelled" despite having finished (and having
|  > |  > >> apparently been run at least twice). Judging from the build
|  > |  > >> history, this seems to be a fairly regular occurrence. I have
|  > |  > >> contacted CircleCI about this but have not yet heard back.
|  > |  > >>
|  > |  > >> I am also occasionally seeing rather extreme variance in test times.
|  > |  > >> In particular the linux-llvm target usually completes in around 4
|  > |  > >> hours 20 minutes, but sometimes takes over 5 hours, resulting in the
|  > |  > >> build timing out. It appears that the build hangs during the
|  > |  > >> testsuite run (e.g. [2]); it's not impossible that this is due to a
|  > |  > >> bug in the testsuite driver but I have been able to reproduce this
|  > |  > >> neither locally nor remotely on CircleCI infrastructure so it has
|  > |  > >> proved to be a tough nut to crack.
|  > |  > >>
|  > |  > >> Cheers,
|  > |  > >>
|  > |  > >> - Ben
|  > |  > >>
|  > |  > >> [1] https://ci.appveyor.com/project/GHCAppveyor/ghc
|  > |  > >> [2] https://circleci.com/gh/ghc/ghc/1558
|  > |  > >>
|  > |  > >> _______________________________________________
|  > |  > >> Ghc-devops-group mailing list
|  > |  > >> Ghc-devops-group at haskell.org
|  > |  > >> https://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devops-group
|  > |  > >>


