[Haskell-cafe] darcs hacking sprint 4 report

Eric Y. Kow kowey at darcs.net
Sun Mar 28 18:33:42 EDT 2010

Hi everybody,

Here's our report from ZuriHac.  It has also been posted to the Darcs blog
with a couple of photos stolen from Johan's blog, some Darcs rebase
scribbling and a screenshot of Darcsden's intriguing fork-tracking.


The Fourth Darcs Hacking Sprint took place last weekend (19 to 21 March)
as part of the Zurich Haskell Hackathon.  We had a very productive
sprint: we wrote a bit of code, polished off many key discussions, had a
little beer and a lot of fun.

In this sprint, we worked on finishing some performance work for the
upcoming Darcs 2.5 release this summer (hashed storage, patch index,
global caches, inventory hashing); planning our work for the Darcs 2.6
release next year (smart servers, cache cleanup, darcs rebase) and
working with new users of the Darcs library.

Issues resolved
* issue643   darcs send -o output  - Guillaume Hoffmann
* issue1473  annotate command line - Stefan Wehr
* issue1456  portable darcs dist   - Guillaume Hoffmann

New Darcs Hackers
We're always happy to work with new Darcs developers.  At this sprint,
we were joined by four new contributors.

Guillaume Hoffmann
Guillaume has been writing our Darcs Weekly News articles for a year
now.  Over the weekend he got his first taste of Darcs hacking, knocking
out three ProbablyEasy bugs (darcs dist internals, darcs send -o UI,
darcs apply with gzipped patch bundles).  Guillaume reports that he can
see himself doing more of this in the future!

Steven Keuchel
Steven worked on a new feature to display the file content hashes
associated with any patch.  This makes it easier for third party tools
to inspect the patch files behind Darcs.

Stefan Wehr and David Leuschner
Stefan and David mostly worked on the Darcs Patch Manager, but to warm
up, they tackled a couple of ProbablyEasy bugs, particularly a bug in
darcs annotate that was affecting Redmine.

Hacking continued...

Bugfix: Darcs on Windows shares
Salvatore tracked down the Windows regression in Darcs 2.4 that stopped
Darcs from working on Windows shares.

Performance: Fast darcs annotate
Benedikt Schmidt continued his work on the patch index (formerly known
as the filecache). The patch index keeps track of which patches affect
which files.  This index will bring a big boost to darcs annotate
performance, particularly for files which are affected by a relatively
small number of patches.

Performance: Global cache
Luca continued his work on breaking up the global cache
($HOME/.darcs/cache) into buckets for faster access.  Working with
Reinier and Petr, Luca has developed an approach to migrating from old
style caches to the new style bucketed ones.  He has also improved the
implementation to use hard links, to avoid disk space doubling and to
preserve backwards compatibility with prior versions of Darcs.
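
As a rough illustration of bucketing with hard links, consider the
sketch below.  The choice of the first two characters of the hash as the
bucket name, and the helper names `bucketed_path` and `migrate_entry`,
are assumptions made for this example; Luca's actual layout and
migration code may well differ.

```python
import os
import tempfile

def bucketed_path(cache_dir, file_hash, prefix_len=2):
    """Return where a cache entry lives inside its bucket directory.

    Using a hash prefix as the bucket name is an assumption of this
    sketch, not the documented Darcs cache layout.
    """
    return os.path.join(cache_dir, file_hash[:prefix_len], file_hash)

def migrate_entry(cache_dir, file_hash):
    """Hard-link a flat cache entry into its bucket, keeping the old name.

    Both names then point at the same data on disk, so no space is
    duplicated and anything expecting the flat layout still works.
    """
    old = os.path.join(cache_dir, file_hash)
    new = bucketed_path(cache_dir, file_hash)
    os.makedirs(os.path.dirname(new), exist_ok=True)
    if not os.path.exists(new):
        os.link(old, new)  # second name for the same file contents
    return new

# Demonstration in a throwaway directory (the hash is made up).
with tempfile.TemporaryDirectory() as cache:
    fake_hash = "ab12cd34"
    with open(os.path.join(cache, fake_hash), "wb") as f:
        f.write(b"patch contents")
    new_path = migrate_entry(cache, fake_hash)
    relative = os.path.relpath(new_path, cache)
    shares_data = os.path.samefile(os.path.join(cache, fake_hash), new_path)
```

Smaller directories are the point of the buckets: looking up one entry
no longer means searching a single directory holding every cached file.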

Windows installer
Salvatore put together a nice Windows installer using the `bamse package
<http://hackage.haskell.org/package/bamse>`_.  It looks like we will be
able to use this for the planned Darcs 2.5 release this summer.  This
work will also open the door to nicer integration with Windows tools,
for example, using a bundled Tortoise SSH for a better experience working
with SSH passphrases.

Interactive cherry picking
Florent improved the quality of the Darcs cherry picking code, making it
easier to fine tune our user interface and some day support graphical
interfaces via the Darcs library.  Witnessed list zippers for the win?

Interactive diff
Florent also started work on adding Darcs's interactive cherry picking
to darcs diff, making it possible to choose a set of patches to view as
a diff.

Performance: Hashed storage completion
Darcs has a representation of file and directory trees called slurpies.
Petr polished off his work to replace the slurpies with his more
efficient, general purpose hashed-storage library.  Slurpies are going
away, and Darcs will be faster for it.  He and Ganesh also discussed
how to gracefully transition from repositories created before the
hashed-storage refactor.

Performance: Using tags when writing patches
Petr ported work by David Roundy to solve a `scalability
regression <http://bugs.darcs.net/issue1106>`_ in hashed repositories.
For darcs commands that write out patches, we had a naive hashing
operation that did not account for the fact that patches behind tags
cannot be modified.  Darcs was unnecessarily traversing the entire
sequence of patches (i.e. O(n) time) when it could simply have
traversed the sequence since the last tag.
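
The gist of the fix can be sketched as follows.  This is an illustration
only, not the code Petr ported: the `patches_since_last_tag` helper and
the (name, is_tag) representation are hypothetical.

```python
def patches_since_last_tag(patches):
    """Return the names of the patches after the most recent tag.

    `patches` is a list of (name, is_tag) pairs, oldest first.  The
    walk starts at the newest patch and stops at the first tag it
    meets, so the work done is proportional to the number of patches
    since that tag rather than to the whole history.
    """
    recent = []
    for name, is_tag in reversed(patches):
        if is_tag:
            break  # everything behind a tag is fixed; stop here
        recent.append(name)
    recent.reverse()  # restore oldest-first order
    return recent

# A made-up history with one tag in the middle.
history = [
    ("initial import", False),
    ("add parser", False),
    ("TAG 2.4", True),
    ("fix windows shares", False),
    ("bucketed cache", False),
]

to_rehash = patches_since_last_tag(history)
```

A repository that is tagged regularly therefore keeps this step cheap,
however long its history grows.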

UTF-8 metadata
Reinier continued to improve the encoding of Darcs patch metadata.
Darcs is completely agnostic with respect to the encoding of your
files.  Unfortunately, this agnosticism extends to patch metadata (patch
name, patch author), making it difficult for people to collaborate
across different locales.  To address this problem, Reinier has been
working to make Darcs store its patch metadata in a single encoding
(UTF-8) while gracefully supporting older patches (with metadata in
potentially any encoding).
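One common way to read metadata that may or may not be UTF-8 is to try
UTF-8 first and fall back to another encoding.  The sketch below shows
that pattern; the Latin-1 fallback and the function names are
assumptions for illustration, and Reinier's actual handling of older
patches may be different.

```python
def decode_metadata(raw: bytes) -> str:
    """Decode patch metadata, preferring UTF-8.

    Falling back to Latin-1 when the bytes are not valid UTF-8 is an
    assumption of this sketch.  Latin-1 maps every byte to a character,
    so the fallback never fails, although it may guess wrongly for
    metadata written in some other legacy encoding.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")

def encode_metadata(text: str) -> bytes:
    """New metadata is always written in a single encoding, UTF-8."""
    return text.encode("utf-8")

# The same author name as stored by a UTF-8 and by a Latin-1 locale.
new_style = "Zürich hacker".encode("utf-8")
old_style = "Zürich hacker".encode("latin-1")

decoded_new = decode_metadata(new_style)
decoded_old = decode_metadata(old_style)
```
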


Release process
The Darcs 2.4 release was quite a tricky one to navigate.  We found that bugs
were only being flushed out at release candidate time and sometimes after the
release proper.

We would like to encourage more people to try out Darcs work in progress and
give us feedback early in the release process.  After chatting about this,
Reinier (with Ganesh, Eric and Petr) decided that as Release Manager, he would
put out a Darcs alpha every 4 weeks.

In the future we may investigate automatic nightly builds via the buildbot
and a platform support policy such as the one used by Tahoe.

Darcs patch index (fast darcs annotate)
Benedikt updated us on the recent status of his ongoing patch index work
(formerly known as the filecache).  We discussed the things that make
the patch index convincing (permanent, repo-local, unique identifiers
for files), the interaction between the patch index and the type
witnesses and also ways of tuning the patch index performance and
keeping it small.

We're looking forward to sharing the new patch index optimisation with
you in upcoming releases.  Darcs annotate may become a lot more useful
in the next couple of releases!

Readable darcs annotate
Fast darcs annotate won't be useful if nobody can read it. Benedikt and
Eric worked on designing a better output format for darcs annotate.  Taking
a page from git blame, there will be one line per source file line, with
columns for patch identifier, author name, date and finally the line.
One of the design questions was how we should best refer to darcs
patches, the current best candidate being a prefix of the darcs patch
metadata hash.
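
To give a feel for the layout under discussion, here is a small mock-up.
Nothing in it is final: the use of SHA-1, the fields fed into the hash,
the column widths and the helper names are all assumptions made for this
sketch, not the format that darcs annotate will actually print.

```python
import hashlib

def short_patch_id(name, author, date, length=8):
    """Return a prefix of a hash over the patch metadata.

    Hashing name, author and date with SHA-1 is an assumption of this
    sketch; Darcs computes its own patch metadata hash.
    """
    meta = "\n".join([name, author, date]).encode("utf-8")
    return hashlib.sha1(meta).hexdigest()[:length]

def format_annotated_line(patch_id, author, date, text, author_width=12):
    """One output line per source line: patch id, author, date, then the line."""
    shown_author = author[:author_width]
    return f"{patch_id} {shown_author:<{author_width}} {date} | {text}"

# Made-up metadata for a single patch and one source line it introduced.
pid = short_patch_id("add parser", "Eric", "2010-03-21")
line = format_annotated_line(pid, "Eric", "2010-03-21", "main = return ()")
```

A short prefix keeps the identifier column narrow while still pointing
at a single patch in most repositories.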

Fast darcs over networks
Darcs get over networks is slow, painfully slow.  Petr has suggested two
priorities for improving the performance of network operations.  The
first would be to introduce a `darcs optimize --http
<http://bugs.darcs.net/issue1771>`_ feature which would optimise the
Darcs repository for fetching over a network (for example, by creating
a "snapshot" of the pristine cache to be fetched in one go).  The second
priority would be to develop a `smart server
<http://bugs.darcs.net/issue1773>`_ that would provide darcs clients
with only the files they need and in the optimal number of chunks.
The two ideas combined would make an excellent Google Summer of Code
project.
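
The "snapshot" idea boils down to replacing many small fetches with one
large one.  The sketch below packs a set of files into a single
compressed archive and unpacks it again.  It only illustrates the
principle: the tar format, the in-memory handling, the helper names and
the file names are assumptions of this example, not a design for
darcs optimize --http.

```python
import io
import tarfile

def make_snapshot(files):
    """Pack {name: bytes} into one gzip-compressed tar archive.

    A client can then fetch this single blob instead of issuing one
    request per file.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def read_snapshot(blob):
    """Unpack a snapshot back into a {name: bytes} dictionary."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
        return {
            member.name: tar.extractfile(member).read()
            for member in tar.getmembers()
            if member.isfile()
        }

# Made-up pristine files standing in for a repository's pristine cache.
pristine = {
    "README": b"hello\n",
    "src/Main.hs": b"main = return ()\n",
}

snapshot = make_snapshot(pristine)
restored = read_snapshot(snapshot)
```
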

Darcs rebase
Prior to the sprint, Ganesh had been working on a `darcs rebase
<http://wiki.darcs.net/Ideas/RebaseDesign>`_ feature which will help
Darcs users work with long term branches, and other cases where patch
commutation by itself is not enough.  At the sprint, Ganesh explained
his work to everyone interested.  Together we settled on a rough plan
for the user interface.  It looks like our new rebase command will offer
a typically Darcs-ish twist: interactive cherry picking.

Darcs library
Ganesh and Florent talked with three teams building software in the
Darcs ecosystem (DPM: Stefan Wehr and David Leuschner, Mac Darcs record
GUI: Benedikt Huber and David Markvica, DarcsDen: Alex Suraci).  There
was a surprising degree of commonality.

The conversations have given us a much stronger sense of direction with
the Darcs library.  In particular, Ganesh is convinced that we should
commit to our use of witnesses - at the very least getting them completely
finished so we can run with them, probably turning them on by default,
and quite possibly dropping the non-witnesses builds.

Default switches
We held a quick roundtable discussion to settle some decisions on
Darcs default switches that have been hanging in the air.  Our decisions
for Darcs 2.5:

* --no-set-scripts-executable [unchanged]
* pull/push/send --no-set-default
* send --edit-description
* record --no-test
* check --no-test

Performance presentations
Petr and Benedikt gave lightning talks, showing some of our recent performance
work to the Haskell community.  Some exciting numbers from Benedikt's work
(`notes <http://wiki.darcs.net/Ideas/PatchIndex>`_) include a 6 second darcs
annotate on a file in the GHC repository (previously this did not complete
within a half hour).  With Petr's work, we are able to [TODO numbers].

Google Summer of Code
We discussed our priorities for this year's Google Summer of Code.  We have
decided that we would focus our attention on performance issues.  If we had two
GSoC students this year, we would be mainly interested in dividing them between

network performance
  developing a smart server for much faster darcs get and pull over a network
local performance
  performing a comprehensive overhaul of the Darcs hashed file cache handling

We also discussed ways to make the best use of our students' time.  The Darcs
team has participated in GSoC twice and learned a lot from the experience.
This year we would like to see if we could publish some clear guidelines both
on what we expect from GSoC students and what they can expect from us.  Watch
the mailing list for more discussion on this topic.

Budding Ecosystem
We were pleasantly surprised to find ourselves with users of the (still
unstable) Darcs API.  These new arrivals give us the feeling that the
collection of `related software
<http://wiki.darcs.net/RelatedSoftware>`_ is coalescing into a new
Darcs ecosystem.

Darcs Patch Manager
David Leuschner and Stefan Wehr worked on an exciting new patch management
program for project maintainers.  The Darcs Patch Manager (DPM) offers a new
way for repository maintainers to keep track of incoming Darcs patches,
including their amendments and dependencies. ::

     $ dpm -r MAIN_REPO -s DPM_DB list
        very cool feature [State: OPEN]
          2481 Tue Mar 16 17:50:23  2010 Dave Devloper <dave at example.com>
               State: UNDECIDED, Reviewed: no
          7861 Tue Mar 16 17:20:45  2010 Dave Devloper <dave at example.com>
               State: REJECTED, Reviewed: yes
               marked as rejected: one minor bug
        some other patch [State: OPEN]
          7631 Tue Mar 16 13:15:20  2010 Eric E. <eric at example.com>
               State: REJECTED, Reviewed: yes

Towards the end of the hackathon, David gave a nice short demo of DPM in
action and deftly avoided the wrath of the demo Gods.

MacOS X GUI for Darcs record
Benedikt Huber and David Markvica started work on a graphical interface
to the Darcs record command.  One key twist is that they make use of the
Darcs API to get the kind of dependency-tracking interactiveness
goodness that Darcs offers.  Benedikt and David report that they have
spent most of the hackathon getting to grips with the library.  Darcs
type witnesses were very helpful for avoiding errors, but they also
impose a steep learning curve.

Darcsden
Alex Suraci and Simon Michael made several improvements to `Darcsden
<http://darcsden.com/>`_, an open source hosting solution (akin to
Github and Patch-tag).  Some recent changes were Atom feeds, the ability
to view forks of your repository and cherry-pick patches from them
(work in progress).  Darcsden also makes use of the Darcs API.

Want to host Darcs Hacking Sprint 2010-10?
The Darcs Team would like to hold hacking sprints twice a year.  These sprints
are an important occasion for us to hold design discussions, hack some code,
train new Darcs hackers and generally bond as a team.

Do you think you can help?  Please get in touch with me if you think you may be
able to host a group of around 20 Darcs hackers one of these October or
November weekends.

Getting over 75 Haskell hackers into Zürich and having them up and
running on arrival (Swiss power plugs notwithstanding) was no easy task!
We'd like to thank Johan Tibell, David Anderson and the rest of the
Google Crew for their hard work organising this hackathon.

Thanks also to the generous donors who chipped into our 2010 Darcs
Travel Fund.  We'll be looking forward to using the leftover cash for
the upcoming 5th Darcs Hacking Sprint in October or November.

Finally here are some words from happy Darcs hackers:

   The sprint was a wonderful social occasion, and it was great meeting
   most of the Darcs hackers, and also seeing other Haskell hackers
   interested in working in the Darcs ecosystem.  I especially enjoyed
   teaching them how to use our API. -- Florent

   The atmosphere was wonderful and I consider the sprint to have been
   very productive overall. -- Petr

   This is coolest thing I ever did -- Luca

See you in half a year!

We had ten Darcs hackers in Zürich along with four Haskellers using
the Darcs API to do awesome things (plus two more on IRC).

- Florent Becker
- Guillaume Hoffmann
- Eric Kow
- Reinier Lamers
- Ganesh Sittampalam
- Petr Rockai
- Salvatore Insalaco
- Luca Molteni
- Benedikt Schmidt
- Steven Keuchel
- Benedikt Huber
- David Markvica
- Stefan Wehr
- David Leuschner
- Simon Michael [IRC] - Darcsden
- Alex Suraci [IRC] - Darcsden

Eric Kow <http://www.nltg.brighton.ac.uk/home/Eric.Kow>
PGP Key ID: 08AC04F9