[Hackage] #447: do parallel builds
Hackage
trac at galois.com
Sat Jan 10 11:01:35 EST 2009
#447: do parallel builds
---------------------------------+------------------------------------------
Reporter: duncan | Owner:
Type: enhancement | Status: new
Priority: normal | Milestone:
Component: cabal-install tool | Version:
Severity: normal | Keywords:
Difficulty: normal | Ghcversion: 6.8.3
Platform: |
---------------------------------+------------------------------------------
The latest version of Gentoo's Portage tool is rather slick. It can do
parallel builds and it displays a nice summary on the command line, e.g.:
{{{
# emerge -uD system -j --load-average=4.5
Calculating dependencies... done!
>>> Verifying ebuild manifests
>>> Starting parallel fetch
>>> Emerging (1 of 14) dev-libs/expat-2.0.1-r1
>>> Emerging (2 of 14) sys-devel/autoconf-wrapper-6
>>> Emerging (3 of 14) sys-kernel/linux-headers-2.6.27-r2
>>> Installing sys-devel/autoconf-wrapper-6
>>> Jobs: 0 of 14 complete, 1 running Load avg: 2.99, 1.59, 0.67
}}}
Note how they solve the problem of displaying what is going on when
multiple builds are running at once: they do not display the build output
at all! For us this would have to go hand-in-hand with logging every build
to a file so that we can still diagnose failures.
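A minimal sketch of the "log instead of display" idea, written in Python for brevity rather than in cabal-install's Haskell. The function name `run_logged` and the one-log-file-per-package layout are assumptions made for illustration, not anything Portage or cabal-install is known to do in exactly this way.

```python
import subprocess
from pathlib import Path


def run_logged(name, command, log_dir):
    """Run one build command with all of its output sent to a per-package log.

    Nothing is written to the terminal. Returns (succeeded, log_path) so the
    caller can point the user at the log when a build fails.
    """
    log_path = Path(log_dir) / f"{name}.log"
    with open(log_path, "w") as log:
        # Merge stderr into stdout so the log keeps the original ordering.
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode == 0, log_path
```

On failure the caller would print only the package name and the log path, leaving the summary display uncluttered.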
Note the final line, which is updated in place to show the number of jobs
currently running and the number completed. It also shows the load average.
The job scheduler has two parameters: a maximum number of jobs (or
unlimited) and a maximum load average. It will only launch new jobs
if the load average is less than the given maximum. That allows it to
interact reasonably well with builds that use `make -j` internally. In the
example above I set the maximum load average to just slightly more than the
number of CPUs I've got.
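The admission rule described above can be sketched as a single predicate. This is an illustration in Python, not Portage's actual code; the name `may_launch` and its parameters are hypothetical. One extra assumption is labelled in the code: the first job is always allowed, so that a machine that is already busy still makes progress.

```python
import os


def may_launch(running, max_jobs, max_load, current_load=None):
    """Decide whether the scheduler may start one more job.

    max_jobs=None means the job count is unlimited; max_load=None means the
    load average is ignored.
    """
    if max_jobs is not None and running >= max_jobs:
        return False
    if max_load is not None:
        if current_load is None:
            # 1-minute load average; os.getloadavg() is only available on Unix.
            current_load = os.getloadavg()[0]
        # Assumption of this sketch: the load limit only applies once at
        # least one job is running.
        if running > 0 and current_load >= max_load:
            return False
    return True
```

With the settings from the example (`--load-average=4.5`, no job limit), `may_launch(1, None, 4.5, current_load=2.99)` allows another job, while the same call with a load of 5.0 would hold it back.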
It looks to me like it serialises some steps, such as installing, since
saturating the disk with multiple parallel installs is generally of no
benefit and can indeed be slower. Downloads also seem to be serialised,
again because there is probably little benefit to making multiple
connections to the same server.
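One simple way to get "parallel builds, serialised installs" is a shared lock around the install step. The sketch below is in Python and is only illustrative; `build_and_install`, `run_all` and the `build`/`install` callbacks are hypothetical names, and whether Portage uses a lock or some other mechanism is not something the output above tells us.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# At most one package may be in its install phase at any time.
install_lock = threading.Lock()


def build_and_install(name, build, install):
    """Build may overlap with other packages; the install step may not."""
    build(name)
    with install_lock:
        install(name)
    return name


def run_all(names, build, install, jobs=4):
    """Process all packages with up to `jobs` concurrent workers."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(lambda n: build_and_install(n, build, install), names))
```

This sketch ignores dependency ordering; it only shows how one phase can be serialised while the others overlap.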
Anyway, the point is, cabal-install ought to be able to do all this. Some
bits we can do now. We already have a graph representation of the install
plan and we recalculate it when a package fails to install.
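To make the role of the graph concrete, here is a small Python sketch of the two queries a parallel scheduler would need from an install plan: which packages are ready to start, and what remains after a failure. The dictionary representation and the names `ready` and `drop_failed` are assumptions for illustration and do not reflect cabal-install's actual install plan data structure.

```python
def ready(plan, done, running):
    """Packages that can be started now.

    `plan` maps each package name to the set of packages it depends on.
    A package is ready when it has not been started and all of its
    dependencies are already installed.
    """
    return sorted(
        pkg for pkg, deps in plan.items()
        if pkg not in done and pkg not in running and deps <= done
    )


def drop_failed(plan, failed):
    """Return a new plan without `failed` and everything that depends on it."""
    doomed = {failed}
    changed = True
    while changed:  # repeat until no further dependents are found
        changed = False
        for pkg, deps in plan.items():
            if pkg not in doomed and deps & doomed:
                doomed.add(pkg)
                changed = True
    return {pkg: deps for pkg, deps in plan.items() if pkg not in doomed}
```

Every package returned by `ready` is independent of the others in that list, so they are exactly the candidates for running in parallel.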
We will need an improved download API, probably involving sending requests
off to a dedicated download thread (which would serialise them).
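A rough sketch of such a download thread, again in Python for illustration only. The class name `Downloader`, its methods and the `fetch` callback are all hypothetical; the point is just that callers enqueue requests and get back a handle for the result, while a single worker performs the downloads one at a time.

```python
import queue
import threading
from concurrent.futures import Future


class Downloader:
    """A single worker thread that performs download requests in order."""

    def __init__(self, fetch):
        self._fetch = fetch  # function taking a URL and returning the result
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def request(self, url):
        """Queue a download and return a Future for its result."""
        fut = Future()
        self._requests.put((url, fut))
        return fut

    def close(self):
        """Finish the queued downloads, then stop the worker thread."""
        self._requests.put(None)
        self._thread.join()

    def _run(self):
        while True:
            item = self._requests.get()
            if item is None:  # sentinel queued by close()
                return
            url, fut = item
            try:
                fut.set_result(self._fetch(url))
            except Exception as exc:
                # Report the failure to the requester instead of killing the thread.
                fut.set_exception(exc)
```

A build job would call `request(url)` and block on `result()` only when it actually needs the tarball, so fetching can overlap with building other packages.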
--
Ticket URL: <http://hackage.haskell.org/trac/hackage/ticket/447>
Hackage <http://haskell.org/cabal/>
Hackage: Cabal and related projects