Plan of Attack for Parallel Builds

Simon Meier iridcode at
Wed Mar 30 10:37:03 CEST 2011

Hi all,

I'm very much looking forward to a future where cabal install exercises all
my cores with some heavy-duty Haskell work ;-) Thanks Frank for taking this

I personally like progress reports on the individual builds very much. I
agree that they are not super important, but nevertheless I think that a
progress report significantly improves the user experience. I also have a
simple, ad-hoc scheme that should result in an OK progress report for most

Gather patterns of the form

  "["<integer>" of "<integer>"]"

in the program output and interpret the resulting sequence such that the
second to last "measurement" is a conservative estimate of the real
progress; i.e.,

import Control.Monad (mzero)

-- Use the second-to-last "[i of n]" measurement as the estimate.
progress :: [(Int,Int)] -> Maybe Double
progress xs = case reverse xs of
  (_:(i,n):_) -> return (fromIntegral i / fromIntegral n)
  _           -> mzero

Probably some more filtering of this sequence is required to cater for
repeated calls to GHC. I guess that, as long as progress never goes from
100% to something below, the user will be happy about the progress estimate.
Moreover, the chance that such a pattern occurs where it doesn't indicate
some interesting progress is reasonably low.
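
A minimal sketch of how the patterns could be gathered, using only base.
The names marker, markers, estimate and monotone are hypothetical and not
part of Cabal. Tolerating padding spaces inside the brackets is an
assumption about how GHC pads its counters, and the guard against n = 0 is
an addition. The progress function is restated with Just/Nothing so the
block stands on its own.

```haskell
import Data.Char (isDigit)
import Data.List (stripPrefix, tails)
import Data.Maybe (mapMaybe)

-- | Try to read "[<i> of <n>]" at the very start of a string.
-- Leading spaces before each integer are skipped (assumption).
marker :: String -> Maybe (Int, Int)
marker s0 = do
  s1 <- stripPrefix "[" s0
  let (ds1, s2) = span isDigit (dropWhile (== ' ') s1)
  s3 <- stripPrefix " of " s2
  let (ds2, s4) = span isDigit (dropWhile (== ' ') s3)
  _  <- stripPrefix "]" s4
  if null ds1 || null ds2
    then Nothing
    else Just (read ds1, read ds2)

-- | All markers found anywhere in the build output, in order.
markers :: String -> [(Int, Int)]
markers = mapMaybe marker . tails

-- | Second-to-last measurement as a conservative estimate.
progress :: [(Int, Int)] -> Maybe Double
progress xs = case reverse xs of
  (_ : (i, n) : _) | n > 0 -> Just (fromIntegral i / fromIntegral n)
  _                        -> Nothing

-- | Estimate for the output seen so far.
estimate :: String -> Maybe Double
estimate = progress . markers

-- | Running maximum, so a displayed figure never drops when GHC is
-- invoked again and its counter restarts.
monotone :: [Double] -> [Double]
monotone = scanl1 max
```

Applying monotone to the successive estimates is one simple form of the
extra filtering mentioned above; it hides a restart of the counter rather
than accounting for it.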

best regards,

2011/3/30 Johan Tibell <johan.tibell at>

> Hi Frank,
>
> Thanks for reaching out and gathering input.
>
> On Tue, Mar 29, 2011 at 9:54 PM, Frank Murphy <anirishduck at> wrote:
> > - Parallelize executeInstallPlan. When given a target load average as a
> >   flag it will determine whether it should spawn a worker (if below the
> >   target load average) or wait. If waiting, it will listen to all worker
> >   status channels and print out their current build status and the load
> >   average. Once a worker exits, it will again check the load average and
> >   spawn a new thread if necessary.
> I think the most important setting is the number of worker threads
> (e.g. -jX). Load average sounds like a cool idea but I don't know how
> well it'll work in practice. Gentoo's Portage uses it so you might
> snoop around there for more info.
> > - Rewrite install.*Package and their callees to use the CHP
> >   (Communicating Haskell Process) monad where possible. Use channels to
> >   communicate build status back to the main thread.
> CHP might be a bit overkill, an MVar and a Chan or two should be
> enough. At least start simple.
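
A minimal sketch of the "MVar and a Chan" idea, using only base. The names
Status and runJobs are hypothetical and not part of cabal-install. The
sketch assumes the packages are independent of each other (no dependency
ordering) and that the build action does not throw; a real version would
need to handle both.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (newChan, readChan, writeChan)
import Control.Concurrent.MVar (modifyMVar, newMVar)
import Control.Monad (replicateM, replicateM_)

-- | Hypothetical status messages sent from workers to the main thread.
data Status = Started String | Finished String
  deriving (Eq, Show)

-- | Run 'build' on every package using at most 'jobs' worker threads
-- (the -jX setting). Returns the status messages in arrival order.
runJobs :: Int -> (String -> IO ()) -> [String] -> IO [Status]
runJobs jobs build pkgs = do
  queue  <- newMVar pkgs   -- shared queue of packages still to build
  status <- newChan        -- workers report here, main thread reads
  let worker = do
        -- Atomically take the next package off the queue, if any.
        next <- modifyMVar queue $ \q -> case q of
          []       -> return ([], Nothing)
          (p : ps) -> return (ps, Just p)
        case next of
          Nothing -> return ()   -- queue empty: this worker exits
          Just p  -> do
            writeChan status (Started p)
            build p
            writeChan status (Finished p)
            worker
  replicateM_ (max 1 jobs) (forkIO worker)
  -- Every package yields exactly two messages: Started and Finished.
  replicateM (2 * length pkgs) (readChan status)
```

The main thread could print each message as it is read instead of
collecting them all; collecting keeps the sketch short.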
> > - It might be necessary to parse the output of external builds in some
> >   way so that meaningful status can be communicated back to the user.
> I'm not sure this is worth it, or even possible in the general case. See
> below.
> > - Add a default parallel build log path template. Allow the user to
> >   specify one on the command line to override the default.
> I'm not quite sure what you mean here. Do you mean that we'd write
> "cabal install" logs to e.g. .cabal/logs or something along those
> lines?
> > - On single-threaded (sequential) builds, revert to the old output style.
> Sounds good. One possible policy would be: If you run "cabal build",
> you get the old output format (and a single-threaded build). If you
> run "cabal install", you get the new output format, regardless of
> whether the build runs in parallel or not.
> What do people think? Is it worth displaying all the build output for
> "cabal install" in the single-threaded case? Does the user care to see
> it? Perhaps it's good for debugging to let single-threaded "cabal
> install" show the old output (i.e. if a parallel build fails, run the
> single-threaded one to get more output).
> >   On multi-threaded builds, display the current status of all running
> >   builds, load averages and nothing else. Possible output:
> >
> > Resolving dependencies...
> > Building derive-                                  [17 of 58]
> > Building regex-base-0.93.1...                       [1 of 4]
> > Building dyre-0.8.6...                              [5 of 7]
> > Configuring xdg-basedir-0.2...                 [in progress]
> >
> >                               Dependencies Built:  [0 of 9]
> >                                     Load Average:  [3.4/4.0]
> >                                              Running 4 Jobs.
> Cabal allows packages to use any build system they want (e.g. make),
> which means that we can't know the progress of a single build in the
> general case. Today, Cabal simply shows the stdout of the build
> process, whatever it is. This means that we cannot show progress of
> individual packages. I suggest something like (taken from Gentoo's
> Portage):
> Building (1 of 9) derive-
> Building (2 of 9) regex-base-0.93.1...
> Building (3 of 9) dyre-0.8.6...
> Building (4 of 9) aeson-
> Building (5 of 9) binary-
> Installing derive-
> Installing regex-base-0.93.1
> Building (6 of 9) text-
> Installing dyre-0.8.6
> Jobs: 3 of 9 complete, 3 running               Load avg: 3.44, 1.46, 0.69
> We could perhaps make a special case for the Simple build type and
> parse the GHC output and show progress on individual builds. I don't
> think it's worth it, at least not initially.
> > A possible error message might look like:
> >
> > derive- failed during the building phase.
> > Log stored in /home/frank/cabal/logs/build/derive-
> For build failures I think we should output the content of the log
> file to stdout (as one chunk, using a lock to avoid interleaving).
> This will make it quicker for users to get to the build failure. For
> successful builds I don't think we need to output more than in the
> example above.
> Cheers,
> Johan