[Haskell-cafe] cheap in-repo local branches (just needs implementation)

Justin Bailey jgbailey at gmail.com
Tue Jul 21 19:42:19 EDT 2009


I like it. git branches are nice to work with, and they don't have the
conceptual pain of "creating" a new repository.

Things that make them nice:

  * When switching branches, all your files magically update (if they
have not been modified).
  * Easy to maintain multiple branches, say "stable" and
"experimental". That helps me avoid getting clobbered by others'
changes to APIs I depend on.

Things that are a pain:

  * Comparing commits (patches) between branches. It's hard to tell
what is in one and what is in another.
  * When you have modified files, git is super picky about switching branches.
  * Once a remote branch is pushed to a public repo, it's scary to
remove it. You don't want to break somebody, but you don't want that
old junk hanging around either.

I don't mean to write about git, but if darcs were to have branches,
that's the kind of stuff I would love to see.

On Tue, Jul 21, 2009 at 2:23 PM, Eric Kow<kowey at darcs.net> wrote:
> Hi everyone,
>
> Max Battcher had an idea that I thought I should post on the mailing list.
>
> The idea is about making branches in darcs.  Right now, we take the view that a
> darcs branch is a darcs repository plain and simple.  If you want to create a
> branch, all you have to do is darcs get (darcs get --lazy to be faster).  While
> this is very simple, a lot of us think that it's inconvenient (one because it's
> slow, and two because you have to think of where to put the branch).
>
> So darcs users have been asking about in-repo branches for a while.  And now,
> Max has come up with a way to implement them.  What's nice about his approach
> is that it lets us keep the simplicity of darcs, while giving more demanding
> users a chance to work with branches.  It also takes advantage of Petr
> Ročkai's Summer of Code project to make darcs faster in our daily lives and, for
> that matter, paves the way for a possible darcs plugin system in the future.
>
> On Max's advice, I'm cross-posting to Haskell Cafe.  Haskellers: here's a nice
> chance for you to get a cool Darcs feature without very much effort or Darcs
> hacking experience :-)
>
> More info on: http://bugs.darcs.net/issue555
>
> ------------------------------------------------------------
> Max's write-up
> ------------------------------------------------------------
>
> Here's a quick primer: Basically, darcs >= 2.0 uses a hashed pristine
> store that acts as a file object cache. An interesting artifact of the
> pristine.hashed store, which is being pushed into a useful third-party
> accessible library named hashed-storage, however, is that it does (for
> many reasons, most co-evolutionary) resemble the git object store. There
> are several differences, but one of the key differences that applies to
> the topic at hand is that darcs generally garbage collects
> pristine.hashed objects much faster than git.
>
> Darcs is very quick to garbage collect old objects partly because many
> aren't all that useful, but mostly because the primary representation
> for a repository state is the patch store (and inventory), so there is
> only one root pointer in the pristine store. Petr, the author of the
> hashed-storage library, briefly discusses this in his most recent design
> post about the future of hashed-storage:
>
> http://mornfall.net/blog/designing_storage_for_darcs.html
>
> Here's where the primer meets the topic at hand: A darcs branch consists
> of three major components: an inventory store, a patch store, and a
> pristine store. To store multiple branches "in the same place" you need
> to take care of: 1) storing the alternate inventories, and 2) if you
> want it to be relatively fast, storing additional objects in the
> pristine store. (The patch store will already happily hold more patches
> than are referenced in the current inventory.) (1) is mostly a matter of
> naming alternate inventories and swapping between them. (2) is helped by the
> *ahem* git-like nature of pristine.hashed/hashed-storage: darcs could
> easily keep (many) more pristine objects in pristine.hashed than it does
> during normal operation, and it may be as simple as storing
> additional, useful "root pointers" visible to hashed-storage so that it
> knows not to garbage collect the objects from other branches.
>
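
A toy sketch of the root-pointer idea, in Python rather than against
hashed-storage itself. The store layout and the names `put`, `reachable`
and `gc` are made up for illustration and are not the library's API; the
point is only that an extra root per branch keeps that branch's objects
alive during collection.

```python
import hashlib
import json

def put(store, kind, payload):
    """Store an object under the hash of its content and return that hash."""
    blob = json.dumps([kind, payload], sort_keys=True).encode()
    key = hashlib.sha256(blob).hexdigest()
    store[key] = (kind, payload)
    return key

def reachable(store, roots):
    """Collect every object hash reachable from the given root hashes."""
    seen = set()
    stack = list(roots)
    while stack:
        key = stack.pop()
        if key in seen:
            continue
        seen.add(key)
        kind, payload = store[key]
        if kind == "tree":
            # A tree maps file names to the hashes of its children.
            stack.extend(payload.values())
    return seen

def gc(store, roots):
    """Drop every object that no root can reach; return how many were dropped."""
    live = reachable(store, roots)
    dead = [key for key in store if key not in live]
    for key in dead:
        del store[key]
    return len(dead)

# Two hypothetical branches sharing one file and differing in another.
store = {}
readme = put(store, "file", "hello")
old_main = put(store, "file", "main = undefined")
new_main = put(store, "file", "main = print 42")
stable = put(store, "tree", {"README": readme, "Main.hs": old_main})
experimental = put(store, "tree", {"README": readme, "Main.hs": new_main})

roots = {"current": experimental, "stable": stable}
removed_with_both_roots = gc(store, roots.values())     # nothing is collected
removed_with_one_root = gc(store, [roots["current"]])   # "stable" objects go
```

With only the single "current" root, the second call collects the tree and
file that only the other branch used, which is roughly what happens to old
pristine objects today.
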
> Here's where the fun happens: It seems to me that a branch switching
> tool, utilizing darcs' existing repository data stores, could be built
> almost "purely" on top of mostly just the hashed-storage library (which
> has been designed for reuse), as it exists today or hopefully with only
> minor tweaking, and with only minimal interaction with darcs itself.
> That is, in-repo branching could be provided entirely, today or soon, as
> a second/third-party tool to darcs. (!)
>
> I think this is great from a darcs perspective: darcs itself remains
> conceptually simple (1 repository == 1 branch), which is something that
> I for one love about darcs, and doesn't need additional commands in
> darcs itself. Yet power users (and git escapees) would have easy
> access to a ``darcs-branch`` tool that provides simple and powerful
> in-repo switching. Potentially, such a tool is also a great candidate to
> be an early adopter of the darcs library support and can help better
> define and enhance darcs' public API. (It's also interesting in that it
> mirrors the fact that hg's support for branches is an addon, and that both
> hg and git have darcs-like patch queues as addons.)
>
> I think this is even better from a hashed-storage perspective:
> ``darcs-branch`` would be a strong (new) use case for hashed-storage as
> a public API. The tool would provide good incentive to keep
> hashed-storage's API clean, and better incentive (than darcs' normal
> operation) to keep hashed-storage's garbage collection and object
> compaction strong. (The 'cheap' cost of in-repo branches is primarily
> a consequence of how well hashed-storage stores the additional objects
> of multiple branches. As a bonus, normal darcs operations should benefit
> as well from the gc/compaction optimizations that darcs-branch
> operations may make more obvious.)
>
> At a high-level, a ``darcs-branch`` tool would provide core commands to:
>
> 1) Store the current repository state as a new branch by copying the
> current inventory and inserting a new pristine root for the branch.
> (``darcs-branch new`` or ``darcs-branch freeze``, perhaps)
>
> 2) Switch to a previously stored branch, by making the branch's
> inventory the new current inventory and the branch's pristine root the
> new current pristine root; updating the working directory as necessary.
> (``darcs-branch switch``)
>
> Additionally, there would be other useful management tools
> (``darcs-branch list``, ``darcs-branch remove`` (or unfreeze)). I think
> that these four commands could be done with no darcs interaction at all
> (unless the branch being switched to has an incomplete/lazy pristine).
>
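
The four management commands can be sketched over a toy in-memory
repository. Everything here (the `Repo` class, its fields, the method
names) is hypothetical and stands in for darcs' real on-disk inventory and
pristine.hashed layout; updating the working directory is left out.

```python
from dataclasses import dataclass, field

@dataclass
class Repo:
    """Toy stand-in for a repository: one current state plus saved branches."""
    inventory: list          # ordered patch names of the current state
    pristine_root: str       # hash of the current pristine tree
    branches: dict = field(default_factory=dict)

    def new(self, name):
        """Freeze the current state under `name` (copy inventory, record root)."""
        if name in self.branches:
            raise ValueError(f"branch {name!r} already exists")
        self.branches[name] = (list(self.inventory), self.pristine_root)

    def switch(self, name):
        """Make a stored branch's inventory and pristine root the current ones."""
        inventory, root = self.branches[name]
        self.inventory = list(inventory)
        self.pristine_root = root

    def list_branches(self):
        """Names of all stored branches, sorted."""
        return sorted(self.branches)

    def remove(self, name):
        """Forget a branch; its root no longer protects objects from collection."""
        del self.branches[name]

repo = Repo(inventory=["add README"], pristine_root="root-a")
repo.new("stable")
repo.inventory.append("experimental API")   # pretend a patch was recorded
repo.pristine_root = "root-b"
repo.new("experimental")
repo.switch("stable")
```

After the final `switch`, the current inventory and pristine root are the
ones frozen as "stable", and both branch names are still listed.
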
> Useful commands that would need darcs interaction for patch management
> would be things like ``darcs-branch push`` to push patches between named
> branches (equivalent at a high level to ``darcs send -o new.dpatch
> --context branchB.context; darcs-branch switch branchB; darcs apply
> new.dpatch``), and ``darcs-branch obtain`` to obtain new in-repo local
> branches from an existing context file, remote/external-local
> repository, tag, or other matcher (that is, darcs get from one in-repo
> branch to a new one).
>
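
Since a branch is named by its inventory, deciding what a push between two
in-repo branches would send is, to a first approximation, a difference of
patch lists. The sketch below assumes patches are identified by plain
names and ignores commutation, dependencies and conflicts, all of which
real darcs handles when applying; `patches_to_push` and `push` are
illustrative names only.

```python
def patches_to_push(source_inventory, target_inventory):
    """Patches in the source branch that the target lacks, in source order."""
    present = set(target_inventory)
    return [patch for patch in source_inventory if patch not in present]

def push(branches, source, target):
    """Append the missing patches to the target inventory; return them."""
    missing = patches_to_push(branches[source], branches[target])
    branches[target] = branches[target] + missing
    return missing

branches = {
    "stable": ["add README", "fix typo"],
    "experimental": ["add README", "new API", "fix typo", "use new API"],
}
pushed = push(branches, "experimental", "stable")
```

The same difference, computed in both directions, also answers the
"what is in one branch and not the other" question.
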
> I doubt that a ``darcs-branch get`` to download all of the branches
> other than "current" (or HEAD, if you prefer, or "main" as I prefer) of
> a remote repository would need any darcs interaction (downloading the
> inventories and then many/most/all of the pristine objects). We can bet
> that darcs' usual lazy patch-getting behavior should work out of the box
> even for multiple branches.
>
> Well, that's the general idea, at least. I believe that a willing
> volunteer, with a bit of help from Petr, could build such a tool
> "relatively quickly", and that it might even work with today's
> darcs as it is.
>
> --
> Eric Kow <http://www.nltg.brighton.ac.uk/home/Eric.Kow>
> PGP Key ID: 08AC04F9
>

