Tests with compilation errors

Sun Nov 2 22:17:03 UTC 2014

Thanks for the detailed explanations. A few thoughts here:

Having multiple "configurations" of the source tree (in which some parts
may or may not be present) does not sound like a good idea; it seems
like additional complexity for no particular reason. AFAIU, it means that
if I check out additional libraries into my repository (or build those
libraries somehow?), tests for other packages might start failing, which is
weird.

It sounds like the current decision of keeping "random" and a few other
packages not built is rather ad hoc. Is that the case?

Ideally there would be one ghc testsuite that would always include all
tests. When a faster test run is desired, a more generic mechanism of test
filtering should be used (some test suites react to a "fast" flag, right?).
If there are tests that we do not want to run as part of the global test
suite, it seems that they should live together with the library
implementation then and be maintained there, separately from ghc.
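
To illustrate what I mean by a generic filtering mechanism, here is a rough
sketch in Python. All names in it (Test, select_tests, the speed levels, and
the test names other than stm052) are made up for illustration and are not
the actual testsuite driver's API. The point is only that the full set of
tests is always known, and a run selects a subset by an explicit criterion
rather than by what happens to be built.

```python
from dataclasses import dataclass

# Hypothetical speed levels, ordered from quickest to most thorough.
SPEED_LEVELS = {"fast": 0, "normal": 1, "slow": 2}


@dataclass(frozen=True)
class Test:
    name: str
    speed: str = "normal"  # cheapest run level at which the test is included


def select_tests(tests, run_speed):
    """Split tests into (selected, skipped) for a run at the given level."""
    limit = SPEED_LEVELS[run_speed]
    selected = [t for t in tests if SPEED_LEVELS[t.speed] <= limit]
    skipped = [t for t in tests if SPEED_LEVELS[t.speed] > limit]
    return selected, skipped


# Illustrative registry: every test is always listed, whatever the run level.
ALL_TESTS = [
    Test("cheap_test", speed="fast"),
    Test("stm052", speed="normal"),
    Test("expensive_test", speed="slow"),
]
```

With something like this, skipped tests are reported explicitly as skipped
because of the chosen run level, instead of silently depending on which
libraries were checked out or built.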

What's the compilation cost of the additional libraries relative to the
complete build? (If you don't know off the bat, how do I get them built to
measure the overhead?) Is it really significant? If it is, can we split off
related tests? If it isn't, let's just enable them by default.
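
Once I know how to build them, the measurement I have in mind is roughly the
sketch below. The build commands are deliberately left as parameters, since I
do not know the right invocations; time_command and relative_overhead are
names I made up for this example.

```python
import subprocess
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def relative_overhead(base_seconds, extended_seconds):
    """Extra time of the extended build as a fraction of the base build."""
    if base_seconds <= 0:
        raise ValueError("base_seconds must be positive")
    return (extended_seconds - base_seconds) / base_seconds
```

A result of 0.25 would mean the extra libraries add 25% to the build time.
Both builds should start from a clean tree so the two timings are comparable.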

On Thu, Oct 30, 2014 at 10:19 PM, Austin Seipp <austin at well-typed.com>
wrote:

> On Thu, Oct 30, 2014 at 6:48 AM, Gintautas Miliauskas
> <gintautas.miliauskas at gmail.com> wrote:
> > Going through some validate.sh results, I found compilation errors due to
> > missing libraries, like this one:
> >
> > =====> stm052(normal) 4088 of 4108 [0, 21, 0]
> > cd ../../libraries/stm/tests &&
> > 'C:/msys64/home/Gintas/ghc/bindisttest/install   dir/bin/ghc.exe'
> > -fforce-recomp -dcore-lint -dcmm-lint -dno-debug-output -no-user-package-db
> > -rtsopts -fno-warn-tabs -fno-ghci-history
> > -o stm052 stm052.hs  -package stm >stm052.comp.stderr 2>&1
> > Compile failed (status 256) errors were:
> >
> > stm052.hs:10:8:
> >     Could not find module ‘System.Random’
> >     Use -v to see a list of the files searched for.
> >
> > I was surprised to see that these are not listed in the test summary at the
> > end of the test run, but only counted towards the "X had missing libraries"
> > row. That setup makes it really easy to miss them, and I can't think of a
> > good reason to sweep such tests under the rug; a broken test is a failing
> > test.
>
> Actually, these tests aren't broken in the way you think :) It's a bit
> long-winded to explain...
>
> Basically, GHC can, if you let it, build extra dependencies in its
> build process, one of which is the 'random' library. 'random' was not
> ever a true requirement to build GHC (aka a 'bootlib' as we call
> them). So why is this test here?
>
> Because 'random' was actually a dependency of the Data Parallel
> Haskell package, and until not too long ago (earlier this year),
> `./validate` built and compiled DPH - with all its dependencies;
> random, vector, primitive - by default. This actually adds a pretty
> noticeable time to the build (you are compiling 5-8 more libraries
> after all), and at the time, DPH was also not ready for the
> Applicative-Monad patch. So we turned it off, as well as the
> dependencies.
>
> Additionally, GHC does have some 'extra' libraries which you can
> optionally build during the build process, but which are turned off by
> default. Originally this was because the weirdo './sync-all' script
> used to not need everything, and 'stm' was a library that wasn't
> cloned by default.
>
> Now that we've submoduleified everything though, these tests and the
> extra libraries could be built by default. Which we could certainly
> do.
>
> > How about at least listing such failed tests in the list of failed tests
> > at the end?
>
> I'd probably be OK with this.
>
> > At least in this case the error does not seem to be due to some missing
> > external dependencies (which probably would not be a great idea anyway...).
> > The test does pass if I remove the "-no-user-package-db" argument. What was
> > the intention here? Does packaging work somehow differently on Linux? (I'm
> > currently testing on Windows.)
>
> I'm just guessing but, I imagine you really don't want to remove
> '-no-user-package-db' at all, for any platform, otherwise Weird Things
> Might Happen, I'd assume.
>
> The TL;DR here is that when you build a copy of GHC and all the
> libraries, it actually *does* register the built packages for the
> compiler... this always happens, *even if you do not install it*. The
> primary 'global' package DB just sits in tree instead, under
> ./inplace.
>
> When you run ./validate, what happens is that after the build, we
> actually create a binary distribution and then test *that* compiler
> instead, as you can see (obviously for a good reason - broken bindists
> would be bad). The binary distribution obviously has its own set of
> binary packages it came with; those are the packages you built into it
> after all. The reason we tell GHC to ignore the user package db here
> is precisely because we *do not* want to pick it up! We only want to
> test the binary distribution with the packages *it* has.
>
> Now you might say, well, Austin, the version numbers are different!
> How would it pick that up? Not always... What if I built a copy of GHC
> HEAD today, then built something with it using Cabal? Then that will
> install into my user package database. Now I go back to my GHC tree
> and hack away _on the same day_ and run './validate'... the version
> number hasn't changed *at all* because it's date based, meaning the
> binary distribution could certainly pick up the previously installed
> libraries, which I installed via the older compiler. But I don't want
> that! I only want to run those tests with the compiler I'm validating
> *now*.
>
> I imagine the reason you see this test pass if you remove this
> argument is precisely for this reason: it doesn't fail because it's
> picking up a package database in your existing environment. But that's
> really, really not what you want (I'd be surprised if it worked and
> didn't result in some horrible error or crash).
>
> > On a related note, how about separating test failures from failing
> > performance tests ("stat too good" / "stat not good enough")? The latter are
> > important, but they seem to be much more prone to fail without good reason.
> > Perhaps do some color coding of the test runner output? That would also
> > help.
>
> I also think this is a good idea.
>
> > --
> > Gintautas Miliauskas
> >
>
> --
> Regards,
>
> Austin Seipp, Haskell Consultant
> Well-Typed LLP, http://www.well-typed.com/
>

-- 
Gintautas Miliauskas