[arch-haskell] Thoughts on Procedure

Thu Oct 14 15:04:36 EDT 2010

Hi guys,

in my understanding, our current update procedure works like this:

 1) We notice that a package was updated (or added) on Hackage by means
    of RSS.

 2) A maintainer runs cabal2arch to generate an updated PKGBUILD.

 3) If the generated PKGBUILD looks good, the file is committed to the
    Git repository and uploaded to AUR.

There are a few things worth noting about that procedure:

 - A maintainer must perform one manual step per updated package, so
   the manual effort grows linearly, O(n), in the number of updates.

 - There is no mechanism to guarantee that the updated set of PKGBUILD
   files actually works.

 - It's common practice to use version control systems like Git to track
   original source code. Our setup, however, tracks generated files: the
   PKGBUILDs are produced automatically by cabal2arch. So why do we
   track them? Shouldn't we rather track the Cabal files?

Naturally, one wonders how to improve the update process. There are a
few possible optimizations:

 - The simplest way to verify whether all PKGBUILDs compile is to, well,
   compile them. Given a set of updated packages, all packages that
   directly or indirectly depend on any of the updated packages need
   re-compilation, and the current set of PKGBUILDs is to be considered
   valid only if all those builds succeed.

 - It is possible to download the entire state of Hackage in a single
   tarball. Given all the Cabal files, a Makefile can automatically
   re-generate those PKGBUILDs that need updating. The same Makefile can
   also run the necessary builds and perform the necessary uploads to
   AUR.
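
To make the "directly or indirectly depend on" part concrete, here is a
minimal sketch in Python of how the set of packages needing
re-compilation could be computed. The function name rebuild_set and the
package names "a" to "d" are made up for illustration; a real version
would read the dependencies from the Cabal files.

```python
from collections import defaultdict, deque


def rebuild_set(deps, updated):
    """Return the updated packages plus every package that depends on
    one of them, directly or indirectly."""
    # Invert "X depends on Y" into "Y is required by X".
    required_by = defaultdict(set)
    for pkg, requirements in deps.items():
        for req in requirements:
            required_by[req].add(pkg)

    # Breadth-first walk along the inverted edges.
    to_rebuild = set(updated)
    queue = deque(updated)
    while queue:
        current = queue.popleft()
        for dependant in required_by[current]:
            if dependant not in to_rebuild:
                to_rebuild.add(dependant)
                queue.append(dependant)
    return to_rebuild


# Hypothetical dependency data: b needs a, c needs b, d stands alone.
deps = {"a": [], "b": ["a"], "c": ["b"], "d": []}
print(sorted(rebuild_set(deps, ["a"])))  # ['a', 'b', 'c']
```

Package "d" is left alone because nothing connects it to the update of
"a"; only the packages in the returned set would have to be built.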

Based on these thoughts, I would like to propose an improved procedure
for discussion. Let our Git repository track a set of Cabal files. Then
an update would work like this:

 1) A maintainer downloads

      http://hackage.haskell.org/packages/archive/00-index.tar.gz

    and extracts the Cabal files into a checked-out Git repository.

 2) Optionally, inspect changes with "git status" and "git diff".

 3) Run "make all" to re-build all PKGBUILD files that need updating.

 4) Run "make check" to perform all necessary re-builds of binary
    packages. If all builds succeed, proceed with (5). Otherwise, figure
    out which package broke the build and revert the changes in the
    corresponding Cabal file. Go back to (3).

 5) Run "make upload" and "git commit" the changes.
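
As a rough illustration of step (1), the following Python sketch copies
the Cabal files out of an index tarball into a directory. It assumes
that the archive stores them under paths like "foo/1.0/foo.cabal"; the
function name extract_cabal_files and the tiny in-memory demo archive
are invented for this example and are not part of any existing tool.

```python
import io
import os
import tarfile
import tempfile


def extract_cabal_files(archive, dest_dir):
    """Copy every *.cabal member of an opened tar archive into dest_dir,
    keeping its relative path. Returns the sorted member names written."""
    written = []
    for member in archive.getmembers():
        if not (member.isfile() and member.name.endswith(".cabal")):
            continue
        parts = member.name.split("/")
        # Skip absolute paths and ".." components instead of trusting them.
        if member.name.startswith("/") or ".." in parts:
            continue
        target = os.path.join(dest_dir, *parts)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with archive.extractfile(member) as src, open(target, "wb") as dst:
            dst.write(src.read())
        written.append(member.name)
    return sorted(written)


def demo_archive():
    """Build a small stand-in for the index tarball (hypothetical content)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, body in [("foo/1.0/foo.cabal", b"name: foo\n"),
                           ("README", b"not a Cabal file\n")]:
            info = tarfile.TarInfo(name)
            info.size = len(body)
            tar.addfile(info, io.BytesIO(body))
    buf.seek(0)
    return tarfile.open(fileobj=buf, mode="r:gz")


with tempfile.TemporaryDirectory() as tmp, demo_archive() as archive:
    result = extract_cabal_files(archive, tmp)
    exists = os.path.isfile(os.path.join(tmp, "foo", "1.0", "foo.cabal"))
print(result)  # ['foo/1.0/foo.cabal']
```

For the real thing one would open the downloaded file with
tarfile.open("00-index.tar.gz") and pass the checked-out repository as
dest_dir, after which "git status" shows what changed.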

Now, this procedure is supposed to update AUR, but "make upload" can be
easily extended to copy the generated packages into a binary repository
as well.

The worst case scenario occurs when every single available update breaks
during "make check". Each failure then costs one revert and another pass
through (3) and (4), so the manual effort is linear, O(n), in the number
of updates. The best case scenario, on the other hand, is the one where
every single update succeeds. That case is handled by running "make all
&& make check && make upload", a constant number of manual steps, O(1),
no matter how many packages were updated.
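
The "&&" chain can also be written as a small driver that stops at the
first failing target. This is only a sketch: the function run_update
and the stand-in runner fake_make are made up for illustration, and the
Makefile with the targets "all", "check" and "upload" is the proposed
one, which does not exist yet.

```python
import subprocess


def run_update(run=subprocess.call, targets=("all", "check", "upload")):
    """Run the make targets in order and stop at the first failure.
    Returns the name of the failing target, or None if all succeeded."""
    for target in targets:
        if run(["make", target]) != 0:
            return target
    return None


# Stand-in runner so that the demo does not invoke a real "make":
# it records each call and pretends that "make check" fails.
calls = []


def fake_make(argv):
    calls.append(argv[-1])
    return 1 if argv[-1] == "check" else 0


print(run_update(run=fake_make))  # check
print(calls)                      # ['all', 'check']
```

Because the loop returns as soon as "check" fails, "upload" is never
reached, which is exactly the guarantee we want from the procedure.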

More importantly, however, the "make check" phase would guarantee that
we never ever publish a configuration that doesn't compile.

How do you feel about the idea?

Take care,
Peter