[Hackage] #447: do parallel builds

Hackage trac at galois.com
Sat Jan 10 11:01:35 EST 2009


#447: do parallel builds
---------------------------------+------------------------------------------
  Reporter:  duncan              |        Owner:       
      Type:  enhancement         |       Status:  new  
  Priority:  normal              |    Milestone:       
 Component:  cabal-install tool  |      Version:       
  Severity:  normal              |     Keywords:       
Difficulty:  normal              |   Ghcversion:  6.8.3
  Platform:                      |  
---------------------------------+------------------------------------------
 The latest version of the Gentoo Portage tool is rather slick. It can do
 parallel builds and it displays a nice summary on the command line, e.g.:

 {{{
 # emerge -uD system -j --load-average=4.5
 Calculating dependencies... done!
 >>> Verifying ebuild manifests
 >>> Starting parallel fetch
 >>> Emerging (1 of 14) dev-libs/expat-2.0.1-r1
 >>> Emerging (2 of 14) sys-devel/autoconf-wrapper-6
 >>> Emerging (3 of 14) sys-kernel/linux-headers-2.6.27-r2
 >>> Installing sys-devel/autoconf-wrapper-6
 >>> Jobs: 0 of 14 complete, 1 running  Load avg: 2.99, 1.59, 0.67
 }}}

 Note how they solve the problem of displaying what is going on when
 several builds run at once: they do not display the build output at
 all! This would have to go hand-in-hand with logging all builds so that we
 can still diagnose failures.
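
 As a rough illustration of "log everything, display nothing", here is a
 minimal sketch in Python (chosen for brevity; it is not cabal-install
 code). The function name `run_logged` and the one-log-file-per-package
 layout are assumptions made for the example.

```python
import subprocess
import sys
from pathlib import Path


def run_logged(package, command, log_dir):
    """Run one build command with all output sent to a per-package log.

    Returns (succeeded, log_path). Nothing is written to the terminal, so
    on failure the caller can point the user at log_path.
    The naming scheme "<package>.log" is an assumption of this sketch.
    """
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    log_path = log_dir / f"{package}.log"
    with open(log_path, "wb") as log:
        # stderr is merged into stdout so the log keeps the original ordering.
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode == 0, log_path
```

 On failure the front end would then only need to print the path of the
 log rather than interleaved output from several builds.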

 Note the final line, which gets updated to display the number of jobs
 currently running, the number completed, etc. It also shows the load
 average.
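
 A status line of that kind could be produced along these lines. This is
 only a sketch: `format_status` and `show_status` are hypothetical names,
 and redrawing with a carriage return plus an ANSI erase sequence is one
 common way to update a line in place, not necessarily what emerge does.

```python
import sys


def format_status(done, total, running, load_avg):
    """Build the one-line summary in the style of the emerge output above."""
    loads = ", ".join(f"{x:.2f}" for x in load_avg)
    return (f">>> Jobs: {done} of {total} complete, {running} running"
            f"  Load avg: {loads}")


def show_status(line, out=sys.stdout):
    """Overwrite the current terminal line instead of scrolling."""
    # "\r" returns to column 0; "\x1b[K" (ANSI erase to end of line) clears
    # leftovers from a longer previous status line.
    out.write("\r" + line + "\x1b[K")
    out.flush()
```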
 The job scheduler has two parameters: a maximum number of jobs (or
 unlimited) and a maximum load average. It will only launch new jobs
 if the load average is less than the given maximum. That allows it to
 interact reasonably well with builds that use `make -j` internally. In the
 example above I set the load average to be just slightly more than the
 number of CPUs I've got.
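
 The launch rule described above can be sketched as a single predicate.
 The name `may_launch` is hypothetical. Always allowing one job when
 nothing is running is an assumption of this sketch, added so that a busy
 machine cannot stall the whole plan. `os.getloadavg` is Unix-only.

```python
import os


def may_launch(running, max_jobs, max_load, current_load=None):
    """Decide whether the scheduler may start one more job.

    max_jobs=None means "unlimited"; max_load=None disables the load check.
    current_load may be passed in (e.g. for testing); otherwise the
    1-minute load average is read from the operating system.
    """
    if max_jobs is not None and running >= max_jobs:
        return False
    if running == 0:
        # Assumption: never block the very first job on load alone.
        return True
    if max_load is None:
        return True
    if current_load is None:
        current_load = os.getloadavg()[0]  # Unix only
    return current_load < max_load
```

 With the settings from the example (`-j` with no limit and
 `--load-average=4.5`), a second job would be started at a load of 2.99 but
 held back at 5.0.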

 It looks to me like it serialises some steps, such as installing, since
 saturating the disk with multiple parallel installs is generally of no
 benefit; indeed, it can be slower. Downloads also seem to be serialised,
 again because there is probably little benefit in making multiple
 connections to the same server.
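
 One simple way to get "parallel builds, serial installs" is to guard the
 install step with a single lock. The sketch below assumes hypothetical
 `build` and `install` callables supplied by the caller, and it ignores
 dependency ordering entirely to keep the locking visible.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# One process-wide lock: whoever holds it is the only job installing.
install_lock = threading.Lock()


def build_and_install(package, build, install):
    """Build concurrently with other packages, but install one at a time."""
    build(package)          # may overlap with other builds
    with install_lock:      # at most one install touches the disk
        install(package)


def run_all(packages, build, install, jobs=4):
    """Run build_and_install for every package on a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        futures = [pool.submit(build_and_install, p, build, install)
                   for p in packages]
        for future in futures:
            future.result()  # re-raise any exception from a worker
```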

 Anyway, the point is that cabal-install ought to be able to do all this.
 Some bits we can do now: we already have a graph representation of the
 install plan, and we recalculate it when a package fails to install.
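
 To make the connection to parallel scheduling concrete, here is a sketch
 of how a dependency graph yields the set of packages that can start now,
 and how a failure prunes its dependents. It does not reflect
 cabal-install's actual data structures; the dict-of-sets representation
 and the names `ready` and `mark_failed` are assumptions for illustration.

```python
def ready(deps, done, running, failed):
    """Packages not yet started whose dependencies are all installed.

    deps maps each package name to the set of packages it depends on.
    """
    started = done | running | failed
    return sorted(p for p, ds in deps.items()
                  if p not in started and ds <= done)


def mark_failed(deps, failed, package):
    """Record a failure and everything that transitively depends on it."""
    failed = set(failed) | {package}
    changed = True
    while changed:  # iterate to a fixed point over the dependency edges
        changed = False
        for p, ds in deps.items():
            if p not in failed and ds & failed:
                failed.add(p)
                changed = True
    return failed
```

 A scheduler would call `ready` whenever a job finishes and launch as many
 of the returned packages as the job and load limits allow.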

 We will need an improved download API, probably involving sending requests
 off to a dedicated download thread (which would serialise them).
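
 The shape of such an API might look like the following sketch: callers
 submit a request and get back a handle to wait on, while a single worker
 thread performs the downloads in order. The `Downloader` class and the
 `fetch` callable are hypothetical; the real fetch function would be
 whatever cabal-install uses for HTTP.

```python
import queue
import threading
from concurrent.futures import Future


class Downloader:
    """Serialises download requests through one worker thread."""

    def __init__(self, fetch):
        self._fetch = fetch  # hypothetical callable: url -> downloaded data
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def request(self, url):
        """Queue a download and return a Future for its result."""
        future = Future()
        self._requests.put((url, future))
        return future

    def close(self):
        """Finish the requests already queued, then stop the worker."""
        self._requests.put(None)  # sentinel
        self._thread.join()

    def _run(self):
        while True:
            item = self._requests.get()
            if item is None:
                return
            url, future = item
            try:
                future.set_result(self._fetch(url))
            except Exception as exc:
                # Report the failure to the requester; keep serving others.
                future.set_exception(exc)
```

 Build jobs would call `request(url)` and block on `.result()` only when
 they actually need the tarball, so fetching can overlap with building.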

-- 
Ticket URL: <http://hackage.haskell.org/trac/hackage/ticket/447>

