Proposal: accept pull requests on GitHub
Simon Peyton Jones
simonpj at microsoft.com
Mon Sep 7 13:47:01 UTC 2015
I am very much at the ignorant end of this debate: I'll just use whatever I'm told to use. But I do resonate with this observation from Austin:
| For one, having two code review tools of any form is completely
| bonkers, TBQH. This is my biggest 'obvious' blocker. If we're going to
| switch, we should just switch. Having to have people decide how to
| contribute with two tools is as crazy as having two VCSs and just a
| way of asking people to get *more* confused, and have us answer more
| questions. That's something we need to avoid.
As a code contributor and reviewer, this is awkward. As a contributor, how do I choose? As a reviewer I'm presumably forced to learn both tools.
But I'll go with the flow... I do not have a well-informed opinion about the tradeoffs.
(I'm tempted naively to ask: is there an automated way to go from a GitHub PR to a Phab ticket? Then we could convert the former (if someone wants to submit that way) into the latter.)
Simon
| -----Original Message-----
| From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of
| Austin Seipp
| Sent: 03 September 2015 05:42
| To: Niklas Hambüchen
| Cc: Simon Marlow; ghc-devs at haskell.org
| Subject: Re: Proposal: accept pull requests on GitHub
|
| (JFYI: I hate to announce my return with a giant novel of negative-
| nancy-ness about a proposal that just came up. I'm sorry about this!)
|
| TL;DR: I'm strongly -1 on this, because I think it introduces a lot of
| associated costs for everyone, the benefits aren't really clear, and I
| think it obscures the real core issue about "how do we get more
| contributors" and how to make that happen. Needless to say, GitHub
| does not magically solve both of these AFAICS.
|
| As is probably already widely known, I'm fairly against GitHub because
| I think at best its tools are mediocre and inappropriate for GHC - but
| I also don't think this proposal or the alternatives stemming from it
| are very good, and that it reduces visibility of the real, core
| complaints about what is wrong. Some of those problems may be with
| Phabricator, but it's hard to sort the wheat from the chaff, so to
| speak.
|
| For one, having two code review tools of any form is completely
| bonkers, TBQH. This is my biggest 'obvious' blocker. If we're going to
| switch, we should just switch. Having to have people decide how to
| contribute with two tools is as crazy as having two VCSs and just a
| way of asking people to get *more* confused, and have us answer more
| questions. That's something we need to avoid.
|
| For the same reason, I'm also not a fan of 'use third party thing to
| augment other thing to remove its deficiencies making it OK', because
| the problem is _it adds surface area_ and other problems in other
| cases. It is a solution that should be considered a last resort,
| because it is a logical solution that applies to everything. If we
| have a bot that moves GH PRs into Phab and then review them there, the
| surface area of what we have to maintain and explain has suddenly
| exploded: because now instead of 1 thing we have 3 things (GH, Phab,
| bot) and the 3 interactions between them, for a multiplier of *six*
| things we have to deal with. And then we use reviewable,io, because GH
| reviews are terrible, adding a 4th mechanism? It's rube goldberg-ian.
| We can logically 'automate' everything in all ways to make all
| contributors happy, but there's a real *cognitive* overhead to this
| and humans don't scale as well as computers do. It is not truly
| 'automated away' if the cognitive burden is still there.
|
| I also find it extremely strange to tell people "By the way, this
| method in which you've contributed, as was requested by community
| members, is actually a complete proxy for the real method of
| contributing, you can find all your imported code here". How is this
| supposed to make contribution *easier* as opposed to just more
| confusing? Now you've got the impression you're using "the real thing"
| when in reality it's shoved off somewhere else to have the nitpicking
| done. Just using Phabricator would be less complicated, IMO, and much
| more direct.
|
| The same thing goes for reviewable.io. Adding it as a layer over
| GitHub just makes the surface area larger, and puts less under our
| control. And is it going to exist in the same form in 2 or 3 years?
| Will it continue to offer the same tools, the same workflows that we
| "like", and what happens when we hit a wall? It's easy to say
| "probably" or "sure" to all this, until we hit something we dislike
| and have no possibility of fixing.
|
| And once you do all this, BTW, you can 'never go back'. It seems so
| easy to just say 'submit pull requests' once and nothing else, right?
| Wrong. Once you commit to that infrastructure, it is *there* and
| simply taking it out from under the feet of those using it is not only
| unfortunate, it is *a huge timesink to undo it all*. Which amounts to
| it never happening. Oh, but you can import everything elsewhere! The
| problem is you *can't* import everything, but more importantly you
| can't *import my memories in another way*, so it's a huge blow to
| contributors to ask them about these mental time sinks, then to forget
| them all. And as your project grows, this becomes more of a memory as
| you made a first and last choice to begin with.
|
| Phabricator was 'lucky' here because it had the gateway into being the
| first review tool for us. But that wasn't because it was *better* than
| GitHub. It was because we were already using it, and it did not
| interact badly with our other tools or force us to compromise things -
| so the *cost* was low. The cost is immeasurably higher by default
| against GitHub because of this, at least to me. That's just how it is
| sometimes.
|
| Keep in mind there is a cost to everything and how you fix it. GitHub
| is not a simple patch to add a GHC feature. It is a question that
| fundamentally concerns itself with the future of the project for a
| long time. The costs must be analyzed more aggressively. Again,
| Phabricator had 'first child' preferential treatment. That's not
| something we can undo now.
|
| I know this sounds like a lot of ad hoc mumbo jumbo, but please bear
| with me: we need to identify the *root issue* here to fix it.
| Otherwise we will pay for the costs of an improper fix for a long
| time, and we are going to keep having this conversation over, and over
| again. And we need to weigh in the cost of fixing it, which is why I
| mention that so much.
|
| So with all this in mind, you're back to just using GitHub. But again
| GitHub is quite mediocre at best. So what is the point of all this?
| It's hinted at here:
|
| > the number of contributions will go up, commits will be smaller, and
| there will be more of them per pull request (contributors will be able
| to put style changes and refactorings into separate commits, without
| jumping through a bunch of hoops).
|
| The real hint is that "the number of contributions will go up". That's
| a noble goal and I think it's at the heart of this proposal.
|
| Here's the meat of it question: what is the cost of achieving this
| goal? That is, what amount of work is sufficient to make this goal
| realizable, and finally - why is GitHub *the best use of our time for
| achieving this?* That's one aspect of the cost - that it's the best
| use of the time. I feel like this is fundamentally why I always seem
| to never 'get' this argument, and I'm sure it's very frustrating on
| behalf of the people who have talked to me about it and like GitHub.
| But I feel like I've never gotten a straight answer for GHC.
|
| If the goal is actually "make more people contribute", that's pretty
| broad. I can make that very easy: give everyone who ever submits a
| patch push access. This is a legitimate way to run large projects that
| has worked. People will almost certainly be more willing to commit,
| especially when overhead on patch submission is reduced so much. Why
| not just do that instead? It's not like we even mandate code review,
| although we could. You could reasonably trust CI to catch and revert
| things a lot of the time for people who commit directly to master. We
| all do it sometimes.
|
| I'm being serious about this. I can start doing that tomorrow because
| the *cost is low*, both now and reasonably speaking into some
| foreseeable future. It is one of many solutions to raw heart of the
| proposal. GitHub is not a low cost move, but also, it is a *long term
| cost* because of the technical deficiencies it won't aim to address
| (merge commits are ugly, branch reviews are weak, ticket/PR namespace
| overlaps with Trac, etc etc) or that we'll have to work around.
|
| That means that if we want GitHub to fix the "give us more
| contributors" problem, and it has a high cost, it not only has _to fix
| the problem_, it also has to do that well enough to offset its cost. I
| don't think it's clear that is the case right now, among a lot of
| other solutions.
|
| I don't think the root issue is "We _need_ GitHub to get more
| contributors". It sounds like the complaint is more "I don't like how
| Phabricator works right now". That's an important distinction, because
| the latter is not only more specific, it's more actionable:
|
| - Things like Arcanist can be tracked as a Git submodule. There is
| little to no pain in this, it's low cost, and it can always be
| synchronized with Phabricator. This eliminates the "Must clone
| arcanist" and "need to upgrade arcanist" points.
|
| - Similarly when Phabricator sometimes kills a lot of builds, it's
| because I do an upgrade. That's mostly an error on my part and I can
| simply schedule upgrades regularly, barring hotfixes or somesuch. That
| should basically eliminate these. The other build issues are from
| picking the wrong base commit from the revision, I think, which I
| believe should be fixable upstream (I need to get a solid example of
| one that isn't a mega ultra patch.)
|
| - If Harbormaster is not building dependent patches as mentioned in
| WhyNotPhabricator, that is a bug, and I have not been aware of it.
| Please make me aware of it so I can file bugs! I seriously don't look
| at _every_ patch, I need to know this. That could have probably been
| fixed ASAP otherwise.
|
| - We can get rid of the awkwardness of squashes etc by using
| Phabricator's "immutable" history, although it introduces merge
| commits. Whether this is acceptable is up to debate (I dislike merge
| commits, but could live with it).
|
| - I do not understand point #3, about answering questions. Here's
| the reality: every single one of those cases is *almost always an
| error*. That's not a joke. Forgetting to commit a file, amending
| changes in the working tree, and specifying a reviewer are all total
| errors as it stands today. Why is this a minus? It catches a useful
| class of 'interaction bugs'. If it's because sometimes Phabricator
| yells about build arifacts in the tree, those should be .gitignore'd.
| If it's because you have to 'git stash' sometimes, this is fairly
| trivial IMO. Finally, specifying reviewers IS inconvenient, but
| currently needed. We could easily assign a '#reviewers' tag that would
| add default reviewers.
| - In the future, Phabricator will hopefully be able to
| automatically assign the right reviewers to every single incoming
| patch, based on the source file paths in the tree, using the Owners
| tool. Technically, we could do that today if we wanted, it's just a
| little more effort to add more Herald rules. This will be far, far
| more robust than anything GitHub can offer, and eliminates point #3.
|
| - Styling, linting etc errors being included, because reviews are
| hard to create: This is tangential IMO. We need to just bite the
| bullet on this and settle on some lint and coding styles, and apply
| them to the tree uniformly. The reality is *nobody ever does style
| changes on their own*, and they are always accompanied by a diff, and
| they always have to redo the work of pulling them out, Phab or not.
| Literally 99% of the time we ask for this, it happens this way.
| Perhaps instead we should just eliminate this class of work by just
| running linters over all of the source code at once, and being happy
| with it.
|
| Doing this in fact has other benefits: like `arc lint` will always
| _correctly_ report when linting errors are violated. And we can reject
| patches that violate them, because they will always be accurate.
|
| - As for some of the quotes, some of them are funny, but the real
| message lies in the context. :) In particular, there have been several
| cases (such as the DWARF work) where the idea was "write 30 commits
| and put them on Phabricator". News flash: *this is bad*, no matter
| whether you're using Phabricator or not, because it makes reviewing
| the whole thing immensely difficult from a reviewer perspective. The
| point here is that we can clear this up by being more communicative
| about what we expect of authors of large patches, and communicating
| your intent ASAP so we can get patches in as fast as possible. Writing
| a patch is the easiest part of the work.
|
| And more:
|
| - Clean up the documentation, it's a mess. It feels nice that
| everything has clear, lucid explanations on the wiki, but the wiki is
| ridiculously massive and we have a tendancy for 'link creep' where we
| spread things out. The contributors docs could probably stand to be
| streamlined. We would have to do this anyway, moving to GitHub or not.
|
| - Improve the homepage, directly linking to this aforementioned
| page.
|
| - Make it clear what we expect of contributors. I feel like a lot of
| this could be explained by having a 5 minute drive-by guide for
| patches, and then a longer 10-minute guide about A) How to style
| things, B) How to format your patches if you're going to contribute
| regularly, C) Why it is this way, and D) finally links to all the
| other things you need to know. People going into Phabricator expecting
| it to behave like GitHub is a problem (more a cultural problem IMO but
| that's another story), and if this can't be directly fixed, the best
| thing to do is make it clear why it isn't.
|
| Those are just some of the things OTTOMH, but this email is already
| way too long. This is what I mean though: fixing most of these is
| going to have *seriously smaller cost* than moving to GitHub. It does
| not account for "The GitHub factor" of people contributing "just
| because it's on GitHub", but again, that value has to outweigh the
| other costs. I'm not seriously convinced it does.
|
| I know it's work to fix these things. But GitHub doesn't really
| magically make a lot of our needs go away, and it's not going to
| magically fix things like style or lint errors, the fact Travis-CI is
| still pretty insufficient for us in the long term (and Harbormaster is
| faster, on our own hardware, too), or that it will cause needlessly
| higher amounts of spam through Trac and GitHub itself. I don't think
| settling on it as - what seems to be - a first resort, is a really
| good idea.
|
|
| On Wed, Sep 2, 2015 at 4:09 PM, Niklas Hambüchen <mail at nh2.me> wrote:
| > On 02/09/15 22:42, Kosyrev Serge wrote:
| >> As a wild idea -- did anyone look at /Gitlab/ instead?
| >
| > Hi, yes. It does not currently have a sufficient review
| functionality
| > (cannot handle multiple revisions easily).
| >
| > On 02/09/15 20:51, Simon Marlow wrote:
| >> It might feel better
| >> for the author, but discovering what changed between two branches
| of
| >> multiple commits on github is almost impossible.
| >
| > I disagree with the first part of this: When the UI of the review
| tool
| > is good, it is easy to follow. But there's no open-source
| > implementation of that around.
| >
| > I agree that it is not easy to follow on Github.
| > _______________________________________________
| > ghc-devs mailing list
| > ghc-devs at haskell.org
| > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
| >
|
|
|
| --
| Regards,
|
| Austin Seipp, Haskell Consultant
| Well-Typed LLP, http://www.well-typed.com/
| _______________________________________________
| ghc-devs mailing list
| ghc-devs at haskell.org
| http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
More information about the ghc-devs
mailing list