A dashboard for fragile tests

Bryan Richter bryan at haskell.foundation
Mon Jun 10 13:42:59 UTC 2024


Hi all,

This weekend at ZuriHac, I wanted to start making it possible to explore
fragile tests.

Fragile tests are tests that pass nondeterministically. Marking them
fragile means that they do not influence the overall success of the
testsuite. It's been a way of sweeping problems under the rug.

For a few years, the GHC test infrastructure has been recording fragile
test results into a database. We record whether the test passed or failed.

Now we can start peeking under the rug to see what kind of patterns have
developed.

I was mostly interested in whether or not fragile tests are truly fragile.
It looks like many are not.

The dashboard has bucketed results into months. Every cell shows the pass
rate for a given month (column) and test (row).
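
As a rough illustration of the bucketing, here is a minimal Python sketch. It
assumes each recorded result is a (test name, timestamp, passed) triple; that
shape, the function name, and the sample test names are hypothetical and do
not reflect the actual database schema or the dashboard's query.

```python
from collections import defaultdict
from datetime import datetime


def monthly_pass_rates(results):
    """Return {(test, "YYYY-MM"): pass_rate} for an iterable of
    (test_name, timestamp, passed) triples (assumed record shape)."""
    # (test, month) -> [number of passes, number of runs]
    counts = defaultdict(lambda: [0, 0])
    for test, when, passed in results:
        key = (test, when.strftime("%Y-%m"))
        if passed:
            counts[key][0] += 1
        counts[key][1] += 1
    return {key: passes / runs for key, (passes, runs) in counts.items()}


# Hypothetical sample data, not real GHC test results.
sample = [
    ("T1", datetime(2024, 5, 3), True),
    ("T1", datetime(2024, 5, 17), False),
    ("T1", datetime(2024, 6, 1), True),
    ("T2", datetime(2024, 6, 2), False),
]
rates = monthly_pass_rates(sample)
```

Each key of the resulting dictionary corresponds to one cell: the test picks
the row and the month picks the column.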

Red cells mean a 0% pass rate; green cells mean 100%. Shades of yellow are
everything in between.
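
The colour rule can be sketched as a small Python function. The function name
is hypothetical, and it collapses all intermediate shades into a single
"yellow" label, so it only approximates what the dashboard renders.

```python
def cell_colour(pass_rate):
    """Map a pass rate in [0.0, 1.0] to the colour family of its cell."""
    if not 0.0 <= pass_rate <= 1.0:
        raise ValueError("pass_rate must be between 0.0 and 1.0")
    if pass_rate == 0.0:
        return "red"      # the test never passed that month
    if pass_rate == 1.0:
        return "green"    # the test always passed that month
    return "yellow"       # anything in between
```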

It's remarkable how much green there is. One would assume even fragile
tests would fail for legitimate reasons sometimes!

At this point, the visualization is mostly good for seeing large trends.
Any other observations would be appreciated!

The dashboard is interactive at
https://grafana.gitlab.haskell.org/goto/XcuO3ZUIg?orgId=2

[image: image.png]

-Bryan


More information about the ghc-devs mailing list