Proposal: accept pull requests on GitHub

Simon Peyton Jones simonpj at microsoft.com
Mon Sep 7 13:47:01 UTC 2015


I am very much at the ignorant end of this debate: I'll just use whatever I'm told to use.  But I do resonate with this observation from Austin:

|  For one, having two code review tools of any form is completely
|  bonkers, TBQH. This is my biggest 'obvious' blocker. If we're going to
|  switch, we should just switch. Having to have people decide how to
|  contribute with two tools is as crazy as having two VCSs and just a
|  way of asking people to get *more* confused, and have us answer more
|  questions. That's something we need to avoid.

As a code contributor and reviewer, this is awkward. As a contributor, how do I choose? As a reviewer I'm presumably forced to learn both tools.

But I'll go with the flow... I do not have a well-informed opinion about the tradeoffs.

(I'm tempted naively to ask: is there an automated way to go from a GitHub PR to a Phab ticket?  Then we could convert the former (if someone wants to submit that way) into the latter.)

Simon


|  -----Original Message-----
|  From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of
|  Austin Seipp
|  Sent: 03 September 2015 05:42
|  To: Niklas Hambüchen
|  Cc: Simon Marlow; ghc-devs at haskell.org
|  Subject: Re: Proposal: accept pull requests on GitHub
|  
|  (JFYI: I hate to announce my return with a giant novel of negative-
|  nancy-ness about a proposal that just came up. I'm sorry about this!)
|  
|  TL;DR: I'm strongly -1 on this, because I think it introduces a lot of
|  associated costs for everyone, the benefits aren't really clear, and I
|  think it obscures the real core issue about "how do we get more
|  contributors" and how to make that happen. Needless to say, GitHub
|  does not magically solve both of these AFAICS.
|  
|  As is probably already widely known, I'm fairly against GitHub because
|  I think at best its tools are mediocre and inappropriate for GHC - but
|  I also don't think this proposal or the alternatives stemming from it
|  are very good, and that it reduces visibility of the real, core
|  complaints about what is wrong. Some of those problems may be with
|  Phabricator, but it's hard to sort the wheat from the chaff, so to
|  speak.
|  
|  For one, having two code review tools of any form is completely
|  bonkers, TBQH. This is my biggest 'obvious' blocker. If we're going to
|  switch, we should just switch. Having to have people decide how to
|  contribute with two tools is as crazy as having two VCSs and just a
|  way of asking people to get *more* confused, and have us answer more
|  questions. That's something we need to avoid.
|  
|  For the same reason, I'm also not a fan of 'use third party thing to
|  augment other thing to remove its deficiencies making it OK', because
|  the problem is _it adds surface area_ and other problems in other
|  cases. It is a solution that should be considered a last resort,
|  because it is a logical solution that applies to everything. If we
|  have a bot that moves GH PRs into Phab and then review them there, the
|  surface area of what we have to maintain and explain has suddenly
|  exploded: because now instead of 1 thing we have 3 things (GH, Phab,
|  bot) and the 3 interactions between them, for a multiplier of *six*
|  things we have to deal with. And then we use reviewable,io, because GH
|  reviews are terrible, adding a 4th mechanism? It's rube goldberg-ian.
|  We can logically 'automate' everything in all ways to make all
|  contributors happy, but there's a real *cognitive* overhead to this
|  and humans don't scale as well as computers do. It is not truly
|  'automated away' if the cognitive burden is still there.
|  
|  I also find it extremely strange to tell people "By the way, this
|  method in which you've contributed, as was requested by community
|  members, is actually a complete proxy for the real method of
|  contributing, you can find all your imported code here". How is this
|  supposed to make contribution *easier* as opposed to just more
|  confusing? Now you've got the impression you're using "the real thing"
|  when in reality it's shoved off somewhere else to have the nitpicking
|  done. Just using Phabricator would be less complicated, IMO, and much
|  more direct.
|  
|  The same thing goes for reviewable.io. Adding it as a layer over
|  GitHub just makes the surface area larger, and puts less under our
|  control. And is it going to exist in the same form in 2 or 3 years?
|  Will it continue to offer the same tools, the same workflows that we
|  "like", and what happens when we hit a wall? It's easy to say
|  "probably" or "sure" to all this, until we hit something we dislike
|  and have no possibility of fixing.
|  
|  And once you do all this, BTW, you can 'never go back'. It seems so
|  easy to just say 'submit pull requests' once and nothing else, right?
|  Wrong. Once you commit to that infrastructure, it is *there* and
|  simply taking it out from under the feet of those using it is not only
|  unfortunate, it is *a huge timesink to undo it all*. Which amounts to
|  it never happening. Oh, but you can import everything elsewhere! The
|  problem is you *can't* import everything, but more importantly you
|  can't *import my memories in another way*, so it's a huge blow to
|  contributors to ask them about these mental time sinks, then to forget
|  them all. And as your project grows, this becomes more of a memory as
|  you made a first and last choice to begin with.
|  
|  Phabricator was 'lucky' here because it had the gateway into being the
|  first review tool for us. But that wasn't because it was *better* than
|  GitHub. It was because we were already using it, and it did not
|  interact badly with our other tools or force us to compromise things -
|  so the *cost* was low. The cost is immeasurably higher by default
|  against GitHub because of this, at least to me. That's just how it is
|  sometimes.
|  
|  Keep in mind there is a cost to everything and how you fix it. GitHub
|  is not a simple patch to add a GHC feature. It is a question that
|  fundamentally concerns itself with the future of the project for a
|  long time. The costs must be analyzed more aggressively. Again,
|  Phabricator had 'first child' preferential treatment. That's not
|  something we can undo now.
|  
|  I know this sounds like a lot of ad hoc mumbo jumbo, but please bear
|  with me: we need to identify the *root issue* here to fix it.
|  Otherwise we will pay for the costs of an improper fix for a long
|  time, and we are going to keep having this conversation over, and over
|  again. And we need to weigh in the cost of fixing it, which is why I
|  mention that so much.
|  
|  So with all this in mind, you're back to just using GitHub. But again
|  GitHub is quite mediocre at best. So what is the point of all this?
|  It's hinted at here:
|  
|  > the number of contributions will go up, commits will be smaller, and
|  there will be more of them per pull request (contributors will be able
|  to put style changes and refactorings into separate commits, without
|  jumping through a bunch of hoops).
|  
|  The real hint is that "the number of contributions will go up". That's
|  a noble goal and I think it's at the heart of this proposal.
|  
|  Here's the meat of it question: what is the cost of achieving this
|  goal? That is, what amount of work is sufficient to make this goal
|  realizable, and finally - why is GitHub *the best use of our time for
|  achieving this?* That's one aspect of the cost - that it's the best
|  use of the time. I feel like this is fundamentally why I always seem
|  to never 'get' this argument, and I'm sure it's very frustrating on
|  behalf of the people who have talked to me about it and like GitHub.
|  But I feel like I've never gotten a straight answer for GHC.
|  
|  If the goal is actually "make more people contribute", that's pretty
|  broad. I can make that very easy: give everyone who ever submits a
|  patch push access. This is a legitimate way to run large projects that
|  has worked. People will almost certainly be more willing to commit,
|  especially when overhead on patch submission is reduced so much. Why
|  not just do that instead? It's not like we even mandate code review,
|  although we could. You could reasonably trust CI to catch and revert
|  things a lot of the time for people who commit directly to master. We
|  all do it sometimes.
|  
|  I'm being serious about this. I can start doing that tomorrow because
|  the *cost is low*, both now and reasonably speaking into some
|  foreseeable future. It is one of many solutions to raw heart of the
|  proposal. GitHub is not a low cost move, but also, it is a *long term
|  cost* because of the technical deficiencies it won't aim to address
|  (merge commits are ugly, branch reviews are weak, ticket/PR namespace
|  overlaps with Trac, etc etc) or that we'll have to work around.
|  
|  That means that if we want GitHub to fix the "give us more
|  contributors" problem, and it has a high cost, it not only has _to fix
|  the problem_, it also has to do that well enough to offset its cost. I
|  don't think it's clear that is the case right now, among a lot of
|  other solutions.
|  
|  I don't think the root issue is "We _need_ GitHub to get more
|  contributors". It sounds like the complaint is more "I don't like how
|  Phabricator works right now". That's an important distinction, because
|  the latter is not only more specific, it's more actionable:
|  
|    - Things like Arcanist can be tracked as a Git submodule. There is
|  little to no pain in this, it's low cost, and it can always be
|  synchronized with Phabricator. This eliminates the "Must clone
|  arcanist" and "need to upgrade arcanist" points.
|  
|    - Similarly when Phabricator sometimes kills a lot of builds, it's
|  because I do an upgrade. That's mostly an error on my part and I can
|  simply schedule upgrades regularly, barring hotfixes or somesuch. That
|  should basically eliminate these. The other build issues are from
|  picking the wrong base commit from the revision, I think, which I
|  believe should be fixable upstream (I need to get a solid example of
|  one that isn't a mega ultra patch.)
|  
|    - If Harbormaster is not building dependent patches as mentioned in
|  WhyNotPhabricator, that is a bug, and I have not been aware of it.
|  Please make me aware of it so I can file bugs! I seriously don't look
|  at _every_ patch, I need to know this. That could have probably been
|  fixed ASAP otherwise.
|  
|    - We can get rid of the awkwardness of squashes etc by using
|  Phabricator's "immutable" history, although it introduces merge
|  commits. Whether this is acceptable is up to debate (I dislike merge
|  commits, but could live with it).
|  
|    - I do not understand point #3, about answering questions. Here's
|  the reality: every single one of those cases is *almost always an
|  error*. That's not a joke. Forgetting to commit a file, amending
|  changes in the working tree, and specifying a reviewer are all total
|  errors as it stands today. Why is this a minus? It catches a useful
|  class of 'interaction bugs'. If it's because sometimes Phabricator
|  yells about build arifacts in the tree, those should be .gitignore'd.
|  If it's because you have to 'git stash' sometimes, this is fairly
|  trivial IMO. Finally, specifying reviewers IS inconvenient, but
|  currently needed. We could easily assign a '#reviewers' tag that would
|  add default reviewers.
|      - In the future, Phabricator will hopefully be able to
|  automatically assign the right reviewers to every single incoming
|  patch, based on the source file paths in the tree, using the Owners
|  tool. Technically, we could do that today if we wanted, it's just a
|  little more effort to add more Herald rules. This will be far, far
|  more robust than anything GitHub can offer, and eliminates point #3.
|  
|    - Styling, linting etc errors being included, because reviews are
|  hard to create: This is tangential IMO. We need to just bite the
|  bullet on this and settle on some lint and coding styles, and apply
|  them to the tree uniformly. The reality is *nobody ever does style
|  changes on their own*, and they are always accompanied by a diff, and
|  they always have to redo the work of pulling them out, Phab or not.
|  Literally 99% of the time we ask for this, it happens this way.
|  Perhaps instead we should just eliminate this class of work by just
|  running linters over all of the source code at once, and being happy
|  with it.
|  
|    Doing this in fact has other benefits: like `arc lint` will always
|  _correctly_ report when linting errors are violated. And we can reject
|  patches that violate them, because they will always be accurate.
|  
|    - As for some of the quotes, some of them are funny, but the real
|  message lies in the context. :) In particular, there have been several
|  cases (such as the DWARF work) where the idea was "write 30 commits
|  and put them on Phabricator". News flash: *this is bad*, no matter
|  whether you're using Phabricator or not, because it makes reviewing
|  the whole thing immensely difficult from a reviewer perspective. The
|  point here is that we can clear this up by being more communicative
|  about what we expect of authors of large patches, and communicating
|  your intent ASAP so we can get patches in as fast as possible. Writing
|  a patch is the easiest part of the work.
|  
|  And more:
|  
|    - Clean up the documentation, it's a mess. It feels nice that
|  everything has clear, lucid explanations on the wiki, but the wiki is
|  ridiculously massive and we have a tendancy for 'link creep' where we
|  spread things out. The contributors docs could probably stand to be
|  streamlined. We would have to do this anyway, moving to GitHub or not.
|  
|    - Improve the homepage, directly linking to this aforementioned
|  page.
|  
|    - Make it clear what we expect of contributors. I feel like a lot of
|  this could be explained by having a 5 minute drive-by guide for
|  patches, and then a longer 10-minute guide about A) How to style
|  things, B) How to format your patches if you're going to contribute
|  regularly, C) Why it is this way, and D) finally links to all the
|  other things you need to know. People going into Phabricator expecting
|  it to behave like GitHub is a problem (more a cultural problem IMO but
|  that's another story), and if this can't be directly fixed, the best
|  thing to do is make it clear why it isn't.
|  
|  Those are just some of the things OTTOMH, but this email is already
|  way too long. This is what I mean though: fixing most of these is
|  going to have *seriously smaller cost* than moving to GitHub. It does
|  not account for "The GitHub factor" of people contributing "just
|  because it's on GitHub", but again, that value has to outweigh the
|  other costs. I'm not seriously convinced it does.
|  
|  I know it's work to fix these things. But GitHub doesn't really
|  magically make a lot of our needs go away, and it's not going to
|  magically fix things like style or lint errors, the fact Travis-CI is
|  still pretty insufficient for us in the long term (and Harbormaster is
|  faster, on our own hardware, too), or that it will cause needlessly
|  higher amounts of spam through Trac and GitHub itself. I don't think
|  settling on it as - what seems to be - a first resort, is a really
|  good idea.
|  
|  
|  On Wed, Sep 2, 2015 at 4:09 PM, Niklas Hambüchen <mail at nh2.me> wrote:
|  > On 02/09/15 22:42, Kosyrev Serge wrote:
|  >> As a wild idea -- did anyone look at /Gitlab/ instead?
|  >
|  > Hi, yes. It does not currently have a sufficient review
|  functionality
|  > (cannot handle multiple revisions easily).
|  >
|  > On 02/09/15 20:51, Simon Marlow wrote:
|  >> It might feel better
|  >> for the author, but discovering what changed between two branches
|  of
|  >> multiple commits on github is almost impossible.
|  >
|  > I disagree with the first part of this: When the UI of the review
|  tool
|  > is good, it is easy to follow. But there's no open-source
|  > implementation of that around.
|  >
|  > I agree that it is not easy to follow on Github.
|  > _______________________________________________
|  > ghc-devs mailing list
|  > ghc-devs at haskell.org
|  > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
|  >
|  
|  
|  
|  --
|  Regards,
|  
|  Austin Seipp, Haskell Consultant
|  Well-Typed LLP, http://www.well-typed.com/
|  _______________________________________________
|  ghc-devs mailing list
|  ghc-devs at haskell.org
|  http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


More information about the ghc-devs mailing list