Version control systems

Max Bolingbroke batterseapower at hotmail.com
Fri Aug 15 08:01:08 EDT 2008


2008/8/15 Isaac Dupree <isaacdupree at charter.net>:
> So let's figure out how it would work (I have doubts too!) So, within the
> directory that's a git repo (ghc), we have some other repos, git (testsuite)
> and darcs (some libraries).  Does anyone know how git handles nested repos
> even natively?

You can explicitly tell Git about nested Git repos using
http://www.kernel.org/pub/software/scm/git/docs/git-submodule.html.
This essentially associates a particular version of each subrepo with
every version of the repo that contains them, so e.g. checking out GHC
from 2 weeks ago could check out the libraries from the same point in
time.
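
To make that association concrete, here is a toy model in Python. It
is only a sketch of the idea, not Git's storage format or API: the
Commit type, the checkout_as_of helper, the "libraries/base" path and
all the version names are made up for illustration. Each version of
the containing repo records one exact version per nested repo, so
picking an old version of the container also picks the matching
versions of the subrepos.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Commit:
    """One version of the containing repo (hypothetical toy type)."""
    ident: str
    date: str  # ISO date, so plain string comparison orders commits
    # Path of each nested repo -> the exact nested version recorded here.
    pinned: Dict[str, str] = field(default_factory=dict)


def checkout_as_of(history: List[Commit], date: str) -> Optional[Commit]:
    """Return the newest commit made on or before `date`, or None."""
    candidates = [c for c in history if c.date <= date]
    return max(candidates, key=lambda c: c.date) if candidates else None


# Invented history: two versions of the container, each pinning its subrepos.
history = [
    Commit("ghc-1", "2008-08-01",
           {"testsuite": "ts-10", "libraries/base": "base-7"}),
    Commit("ghc-2", "2008-08-15",
           {"testsuite": "ts-12", "libraries/base": "base-9"}),
]

# Going back two weeks selects ghc-1 and, with it, the subrepo versions
# that were recorded alongside it.
old = checkout_as_of(history, "2008-08-01")
if old is not None:
    for path, version in sorted(old.pinned.items()):
        print(f"{path} -> {version}")
```

The point of the model is that the pinning lives in the containing
repo's history, which is why a nested repo under a different VCS gets
no such treatment.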

AFAIK, nothing in Git caters for subrepos of a different VCS.

> Then, adding complexity, git branches are normally done by
> switching in-place.  So how does this interact with VCS like darcs that
> doesn't have a concept of in-place switching of branches?

Since we will set up Git to ignore the contents of the Darcs repos, it
will simply leave them unmodified. This is exactly like the current
situation, where rolling back / patching the GHC repo does not affect
the others. If you prefer Darcs-like behaviour (one branch per repo), you
are free to work that way in Git as well. Since you then never switch
branches in place, the nested Darcs repos can never get out of step
with your branch.

Personally, since I only ever hack GHC and tend to leave the libraries
alone, I could still use the in-place branching without difficulty.

> (Now, I wouldn't
> be surprised if git, the monstrosity that it is, has already invented
> answers for these sort of questions :-) But we need to figure out the
> answers for whatever situation we choose for the 6.11 development cycle, and
> probably document them somewhere on the wiki (that I lazily didn't bother to
> check again before writing this message).

The situation above is pretty much the whole story, if we are taking
the route where we just convert the GHC+testsuite repo to Git. I don't
think it's particularly confusing, but maybe that's because I've spent
too long thinking about VCSs :-).

This thread has got quite large, and doesn't appear to have made much
progress towards a resolution. Let me try and sum up the discussion so
far.

There seem to be four stakeholders in this switch:
a) Current GHC developers
b) Future GHC developers
c) People who just contribute to the libraries
d) Maintainers of other compilers GHC shares repos with

And there are at least 5 options for how to proceed:
1) Convert just GHC and Testsuite to Git, leave everything else in Darcs
  Pros:
    - No change in habits required for stakeholders c, d
    - Resolves all Darcs issues discussed at length before, pleasing
stakeholders a, b

  Cons:
    - Requires two VCSs to be installed and learnt (more points of
failure, makes source tree less accessible, doesn't solve any of
Darcs's build+install problems), affecting stakeholders a and b
    - Difficult to check out a consistent version of the source tree (no
submodules), affecting stakeholders a and b

2) Wait for Darcs2 to get better
  Pros:
    - No change in habits required for any stakeholders (though we
still have a one-off switching cost)
    - Potentially resolves all Darcs issues, pleasing stakeholders a, b
    - Only option that will not require a workflow change for GHC
developers (more topic branches rather than "spontaneous branches" and
cherry-picking), pleasing stakeholders a

  Cons:
    - Darcs will probably continue to be less popular and well
supported than Git (see Debian popcon graphs for the trend
difference). Reduced popularity will affect the ability of
stakeholders b to contribute (learning barrier), and less support/real
world use may potentially lead to a higher incidence of bugs
encountered, affecting stakeholders a-d. This point is certainly
debatable.
    - Apparently somewhat vaporware at the moment

3) Convert all repos to Git
  Pros:
    - Native Git submodule integration, makes life easier for stakeholders a-b
    - Single (popular) command set to learn, single thing to install:
makes life better for stakeholder b at least

  Cons:
    - Significant inconvenience for stakeholders c-d as they have to
change their own projects

4) Branch all repos into Git but leave the Darcs repos alone and push
Darcs patches into the Git repos automatically. Never push to these
Git repos in any other way, similar to how the Cabal repo is handled
currently
  Pros:
    - As option 3
    - Stakeholders c-d do not need to do anything

  Cons:
    - Makes it harder to hack on the libraries within a GHC checkout,
affecting a, b
    - Automatic synchronisation will require occasional maintenance by someone

5) Branch all repos into Git and then set up a manual merging / sync
process that tries to turn Git commits into Darcs patches and
vice-versa
  Pros:
    - As option 3
    - Hack on the libraries in a GHC checkout with ease, pleasing a, b
    - Stakeholders c-d do not need to do anything

  Cons:
    - Synchronisation much more fragile than in option 4), will likely
require constant maintenance

This summary is probably incomplete and inaccurate. However, if people
find it useful for organising the various lines of discussion on this
issue, perhaps someone could Wikify it so we can get a complete, clear
picture?

My personal preference is for 3), but that's because I'm a stakeholder
"a" who isn't a great fan of spontaneous branches!

Anyway, there are good arguments on every side, so I don't want to
advocate a particular position (and indeed, my opinions quite rightly
do not carry any weight! :-). However, I'd really like for us to work
out what is going on so we have a clear plan for moving away from
Darcs 1, which is an inadequate VCS for GHC for reasons that have been
discussed to death. I hope (perhaps naively) that this email can
provide a framework for reaching a consensus agreeable to all parties.

All the best,
Max


More information about the Glasgow-haskell-users mailing list