Plan of Attack for Parallel Builds
iridcode at gmail.com
Wed Mar 30 10:37:03 CEST 2011
I'm very much looking forward to a future where cabal install exercises all
my cores with some heavy duty Haskell work ;-) Thanks Frank for taking this
I personally like progress reports on the individual builds very much. I
agree that they are not super important, but nevertheless I think that a
progress report significantly improves the user experience. I also have a
simple, ad-hoc scheme that should result in an OK progress report for most
Gather patterns of the form
"["<integer>" of "<integer>"]"
in the program output and interpret the resulting sequence such that the
second to last "measurement" is a conservative estimate of the real
progress:

    import Control.Monad (mzero)

    progress :: [(Int,Int)] -> Maybe Double
    progress xs = case reverse xs of
      (_:(i,n):_) -> return (fromIntegral i / fromIntegral n)
      _ -> mzero

Probably some more filtering of this sequence is required to cater for
repeated calls to GHC. I guess that, as long as progress never goes from
100% to something below, the user will be happy about the progress estimate.
Moreover, the chance that such a pattern occurs where it doesn't indicate
some interesting progress is reasonably low.
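
To make the scheme a bit more concrete, here is a rough sketch of the
gathering step together with the estimate. The names markerAt, markers and
monotone are made up for illustration, and the running maximum is only my
guess at the kind of filtering that would keep the estimate from dropping
when GHC is invoked more than once. It also assumes the marker may carry
padding after the opening bracket, as in "[ 1 of 58]".

```haskell
import Data.Char (isDigit, isSpace)
import Data.List (stripPrefix, tails)
import Data.Maybe (mapMaybe)

-- | Try to read "[<integer> of <integer>]" at the very start of a string.
-- (Hypothetical helper; whitespace after '[' is skipped.)
markerAt :: String -> Maybe (Int, Int)
markerAt s = do
  s1 <- stripPrefix "[" s
  let (i, s2) = span isDigit (dropWhile isSpace s1)
  s3 <- stripPrefix " of " s2
  let (n, s4) = span isDigit s3
  _ <- stripPrefix "]" s4
  if null i || null n then Nothing else Just (read i, read n)

-- | Gather every marker that occurs anywhere in the program output.
markers :: String -> [(Int, Int)]
markers = mapMaybe markerAt . tails

-- | The second to last measurement, as a fraction between 0 and 1.
progress :: [(Int, Int)] -> Maybe Double
progress xs = case reverse xs of
  (_ : (i, n) : _) | n > 0 -> Just (fromIntegral i / fromIntegral n)
  _ -> Nothing

-- | Assumed filtering step: never let a reported estimate go down.
monotone :: [Double] -> [Double]
monotone = scanl1 max
```

Feeding the output seen so far through markers and then progress gives the
current estimate; monotone would be applied to the sequence of estimates
before they are shown to the user.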
2011/3/30 Johan Tibell <johan.tibell at gmail.com>
> Hi Frank,
> Thanks for reaching out and gathering input.
> On Tue, Mar 29, 2011 at 9:54 PM, Frank Murphy <anirishduck at gmail.com>
> > - Parallelize executeInstallPlan. When given a target load average as a
> > flag, it will determine whether it should spawn a worker (if below the target
> > average) or wait. If waiting, it will listen to all worker status
> > and print out their current build status and the load average. Once a
> > exits, it will again check the load average and spawn a new thread if
> > necessary.
> I think the most important setting is the number of worker threads
> (e.g. -jX). Load average sounds like a cool idea but I don't know how
> well it'll work in practice. Gentoo's Portage uses it so you might
> snoop around there for more info.
> > - Rewrite install.*Package and their callees to use the CHP (Communicating
> > Haskell Process) monad where possible. Use channels to communicate build
> > status back to the main thread.
> CHP might be a bit overkill, an MVar and a Chan or two should be
> enough. At least start simple.
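
For what it's worth, the "MVar and a Chan" variant could look roughly like
the sketch below. It is only an illustration: Status, nextJob, worker and
runJobs are hypothetical names, the build action is passed in by the caller,
and dependencies between packages are ignored completely, which the real
install plan obviously cannot do. It also assumes at least one worker and
that the build action does not throw.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (MVar, newMVar, modifyMVar)
import Control.Monad (replicateM, replicateM_)

-- | Hypothetical status messages sent from workers to the main thread.
data Status = Started String | Finished String
  deriving (Eq, Show)

-- | Take the next package name off the shared queue, if there is one.
nextJob :: MVar [String] -> IO (Maybe String)
nextJob queue = modifyMVar queue $ \jobs -> case jobs of
  []       -> return ([], Nothing)
  (j : js) -> return (js, Just j)

-- | One worker: build packages until the queue is empty.
worker :: (String -> IO ()) -> MVar [String] -> Chan Status -> IO ()
worker build queue statusChan = loop
  where
    loop = do
      mjob <- nextJob queue
      case mjob of
        Nothing  -> return ()
        Just pkg -> do
          writeChan statusChan (Started pkg)
          build pkg
          writeChan statusChan (Finished pkg)
          loop

-- | Run the builds on n worker threads (the -jX setting) and collect
-- every status message in the order the main thread received it.
runJobs :: Int -> (String -> IO ()) -> [String] -> IO [Status]
runJobs n build pkgs = do
  queue      <- newMVar pkgs
  statusChan <- newChan
  replicateM_ n (forkIO (worker build queue statusChan))
  -- Each package produces exactly two messages: Started and Finished.
  replicateM (2 * length pkgs) (readChan statusChan)
```

In a real implementation the main thread would print each Status as it
arrives instead of collecting them, and a failing build would have to send
its own message so that the main thread does not wait forever.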
> > - It might be necessary to parse the output of external builds in some way
> > so that meaningful status can be communicated back to the user.
> I'm not sure this is worth it and even possible in the general case. See
> > - Add a default parallel build log path template. Allow the user to specify
> > one on the command line to override the default.
> I'm not quite sure what you mean here. Do you mean that we'd write
> "cabal install" logs to e.g. .cabal/logs or something along those
> > - On single-threaded (sequential) builds, revert to the old output style.
> Sounds good. One possible policy would be: If you run "cabal build",
> you get the old output format (and a single threaded build). If you
> run "cabal install", you get the new output format, regardless of whether
> the build runs in parallel or not.
> What do people think? Is it worth displaying all the build output for
> "cabal install" in the single threaded case? Does the user care to see
> it? Perhaps it's good for debugging to let single threaded "cabal
> install" show the old output (i.e. if a parallel build fails, run the
> single threaded one to get more output).
> > On multi-threaded builds, display the current status of all running builds,
> > load averages and nothing else. Possible output:
> > Resolving dependencies...
> > Building derive-188.8.131.52... [17 of 58]
> > Building regex-base-0.93.1... [1 of 4]
> > Building dyre-0.8.6... [5 of 7]
> > Configuring xdg-basedir-0.2... [in
> > Dependencies Built: [0 of 9]
> > Load Average:
> > Running 4
> Cabal allows packages to use any build system they want (e.g. make),
> which means that we can't know the progress of a single build in the
> general case. Today, Cabal simply shows the stdout of the build
> process, whatever it is. This means that we cannot show progress of
> individual packages. I suggest something like (taken from Gentoo's
> Building (1 of 9) derive-184.108.40.206...
> Building (2 of 9) regex-base-0.93.1...
> Building (3 of 9) dyre-0.8.6...
> Building (4 of 9) aeson-0.3.2.1...
> Building (5 of 9) binary-0.5.0.2...
> Installing derive-220.127.116.11
> Installing regex-base-0.93.1
> Building (6 of 9) text-0.11.0.6...
> Installing dyre-0.8.6
> Jobs: 3 of 9 complete, 3 running Load avg: 3.44, 1.46, 0.69
> We could perhaps make a special case for the Simple build type and
> parse the GHC output and show progress on individual builds. I don't
> think it's worth it, at least not initially.
> > A possible error message might look like:
> > derive-18.104.22.168 failed during the building phase.
> > Log stored in /home/frank/cabal/logs/build/derive-22.214.171.124.log
> For build failures I think we should output the content of the log
> file to stdout (as one chunk, using a lock to avoid interleaving).
> This will make it quicker for users to get to the build failure. For
> successful builds I don't think we need to output more than in the
> example above.
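
The "one chunk, using a lock" part is cheap to do with an MVar. A minimal
sketch, assuming every worker shares the same lock and that the caller
supplies the package name and the path of its log file (newOutputLock and
dumpFailureLog are names I made up):

```haskell
import Control.Concurrent.MVar (MVar, newMVar, withMVar)
import System.IO (hFlush, stdout)

-- | A lock that guards stdout; create one and share it among all workers.
newOutputLock :: IO (MVar ())
newOutputLock = newMVar ()

-- | Print a failed build's log as one uninterrupted chunk.
dumpFailureLog :: MVar () -> String -> FilePath -> IO ()
dumpFailureLog lock pkg logPath = do
  contents <- readFile logPath
  -- Force the whole file before taking the lock, so the lock is only
  -- held while writing and not while reading from disk.
  length contents `seq` withMVar lock (\_ -> do
    putStrLn (pkg ++ " failed during the building phase.")
    putStr contents
    hFlush stdout)
```

Anything else that writes to stdout during a parallel build (the status
lines, for instance) would need to take the same lock for this to avoid
interleaving.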