[Haskell-cafe] Is there some tool for inferring wide dependency ranges?
Oleg Grenrus
oleg.grenrus at iki.fi
Thu May 27 17:30:08 UTC 2021
I (kind of) have such tool, it's not feasible for continuous integration
usage.
For an example let consider lens-package. It has some dependencies,
plenty for
manual check to be infeasible, but not extremely many.
My tool has a mode where it tries to find an /effective/ lower bound
constrained by present `build-depends` constraints. And additional flag, to
build-verify it. I.e. this is not even a proper search.
An algorithm is simple:
foreach dependency
check which existing dependency versions are in range
splitting the set (with some heuristics considering major versions etc).
figure out the effective lower bound for each dependency.
Note: the lower bound verification is "complete" in per-dependency approach,
but full range verification is more expensive due combinatorical explosion.
Another note: if yourr lower bound is wrong you have to figure out why by
your own. Looking at the error and scanning through changelogs has often
proven to be faster approach (humans are good at combining information).
And still, this "O(dependecies x versions in average range)" algorithm
is slow
if we decide to build/verify the found bounds.
I started with somewhat outdated cabal nix-style cache, i.e. I haven't run
this command previously. (When just working on stuff cabal picks ~latest
versions, i.e. skewed dependency install plans as we need here are not
present in the cache).
After the run we get a nice output (for current `lens` master), which
has colors
in the terminal telling some information beyond just versions.
8.0.2 8.2.2 8.4.4
8.6.5 8.8.4 8.10.4 9.0.1
array 0.5.1.1 0.5.2.0 0.5.2.0
0.5.3.0 0.5.4.0 0.5.4.0 0.5.4.0
assoc 1.0.2 1.0.2 1.0.2
1.0.2 1.0.2 1.0.2 1.0.2
base 4.9.1.0 4.10.1.0 4.11.1.0
4.12.0.0 4.13.0.0 4.14.1.0 4.15.0.0
base-orphans 0.5.2 0.5.4 0.5.4
0.5.4 0.8.1 0.8.1 0.8.1
bifunctors 5.5.7 5.5.7 5.5.7
5.5.7 5.5.7 5.5.7 5.5.8
bytestring 0.10.6.0 0.10.6.0 0.10.8.0
0.10.8.0 0.10.8.1 0.10.8.1 0.10.9.0
call-stack 0.1.0 0.1.0 0.1.0
0.1.0 0.1.0 0.1.0 0.1.0
comonad 5.0.7 5.0.7 5.0.7
5.0.7 5.0.7 5.0.7 5.0.7
containers 0.5.7.0 0.5.7.0 0.5.7.0
0.5.7.0 0.5.7.0 0.5.7.0 0.5.7.0
contravariant 1.4 1.4 1.4
1.5 1.5 1.5 1.5
distributive 0.5.1 0.5.2 0.5.3
0.5.3 0.5.3 0.5.3 0.5.3
exceptions 0.8.2.1 0.8.3 0.8.3
0.10.0 0.10.2 0.10.4 0.10.4
filepath 1.2.0.0 1.2.0.0 1.2.0.0
1.2.0.0 1.2.0.0 1.2.0.0 1.2.0.0
free 5.1.5 5.1.5 5.1.5
5.1.5 5.1.5 5.1.5 5.1.5
ghc-prim 0.5.0.0 0.5.1.1 0.5.2.0
0.5.3 0.5.3 0.6.1 0.7.0
hashable 1.2.7.0 1.2.7.0 1.2.7.0
1.2.7.0 1.3.0.0 1.3.0.0 1.3.1.0
indexed-traversable 0.1.1 0.1.1 0.1.1
0.1.1 0.1.1 0.1.1 0.1.1
indexed-traversable-instances 0.1 0.1 0.1
0.1 0.1 0.1 0.1
kan-extensions 5.1 5.1 5.1
5.1 5.1 5.1 5.1
mtl 2.2.1 2.2.1 2.2.1
2.2.1 2.2.1 2.2.1 2.2.1
parallel 3.2.1.0 3.2.1.1 3.2.1.1
3.2.2.0 3.2.2.0 3.2.2.0 3.2.2.0
profunctors 5.5.2 5.5.2 5.5.2
5.5.2 5.5.2 5.5.2 5.6
reflection 2.1 2.1 2.1.3
2.1.4 2.1.4 2.1.4 2.1.4
semigroupoids 5.0.1 5.2 5.2.2
5.2.2 5.3.2 5.3.2 5.3.4
strict 0.4 0.4 0.4
0.4 0.4 0.4 0.4
tagged 0.8.6 0.8.6 0.8.6
0.8.6 0.8.6 0.8.6 0.8.6
template-haskell 2.11.1.0 2.12.0.0 2.13.0.0
2.14.0.0 2.15.0.0 2.16.0.0 2.17.0.0
text 1.2.3.0 1.2.3.0 1.2.3.0
1.2.3.0 1.2.4.0 1.2.3.2 1.2.4.1
th-abstraction 0.4.1.0 0.4.1.0 0.4.1.0
0.4.1.0 0.4.1.0 0.4.1.0 0.4.1.0
these 1.1.1.1 1.1.1.1 1.1.1.1
1.1.1.1 1.1.1.1 1.1.1.1 1.1.1.1
transformers 0.5.0.0 0.5.0.0 0.5.5.0
0.5.5.2 0.5.6.0 0.5.6.0 0.5.6.0
transformers-compat 0.5.0.4 0.5.0.4 0.5.0.4
0.5.0.4 0.5.0.4 0.5.0.4 0.5.0.4
unordered-containers 0.2.10.0 0.2.10.0 0.2.10.0
0.2.10.0 0.2.10.0 0.2.10.0 0.2.10.0
vector 0.12.1.2 0.12.1.2 0.12.1.2
0.12.1.2 0.12.1.2 0.12.1.2 0.12.2.0
... after two hours of wall time, or a day of CPU time.
dry dep build
counts 1147 238 237
q50 3.932s 84.624s 166.224s
q90 5.223s 251.763s 269.377s
total 4583.699s 25147.444s 43101.014s
usr: 80390.500s sys: 6018.400s cpu: 1204.249% time: 7175.338s
Note that solving alone takes hour and a half of CPU time,
fortunately that is highly parallelisable task.
Building dependencies takes plenty of time,
there I have only single GHC building dependencies,
to avoid rebuilding the same ones and avoding store corruption
(which shouldn't happen, but happens).
Finally the building of package itself in various combinations
is again parallizable, as store is only read.
For `lens` that has considerable cost on its own.
(After all the package is built 7 times 30 times).
---
Funny coincidence is that kan-extensions-5.1 had wrong upper bound on
base/GHC, as it doesn't build with it.
For that my tool has a differnt mode of operations for building multiple
package versions at once, a bit like matrix.hackage.haskell.org, but
locally.
That's what I use to locally test that mass revision edits are ok. We get
9.2.0.20210422 9.0.1 8.10.4 8.8.4 8.6.5
8.4.4 8.2.2 8.0.2 7.10.3 7.8.4 7.6.3 7.4.2 7.2.2 7.0.4
kan-extensions-5.2.2 NO-IP OK OK OK OK
OK OK OK OK OK OK OK NO-IP NO-IP
kan-extensions-5.2.1 NO-IP OK OK OK OK
OK OK OK OK OK OK OK NO-IP NO-IP
kan-extensions-5.2 NO-IP NO-IP OK OK OK
OK OK OK OK OK OK OK NO-IP NO-IP
kan-extensions-5.1 NO-IP FAIL OK OK OK
OK OK OK OK OK OK OK NO-IP NO-IP
kan-extensions-5.0.2 NO-IP NO-IP NO-IP NO-IP NO-IP
NO-IP OK OK OK OK OK OK NO-IP NO-IP
That operation **is fast**, as dependencies are reused well,
so there is no cold cabal nix-style store cache issues.
dry dep build
counts 70 48 48
q50 0.707s 0.128s 0.632s
q90 4.844s 0.169s 14.290s
total 163.226s 31.414s 184.551s
usr: 342.020s sys: 37.060s cpu: 905.090% time: 41.883s
Under a minute of wall clock.
That hopefully highlights why `matrix.hackage.haskell.org` is feasible, but
"bounds check" isn't. The difference in need computational resources is
huge.
So after making a revision to turn FAIL into NO-IP (no install plan),
using Hackage Trustees'
super powers, and we can return to `lens`.
The result is the same, except the minimum kan-extensions version
on GHC-9.0 is 5.2.1
This time the run is significantly faster:
dry dep build
counts 1148 238 238
q50 6.278s 0.210s 1.000s
q90 7.225s 0.284s 1.268s
total 6937.215s 69.420s 389.536s
usr: 6825.120s sys: 554.600s cpu: 2163.272% time: 341.137s
but still two hours of CPU time. (I hope solving could be a lot faster if
cabal-install-solver were available as usable library, but we are quite far
from that). Also note that cpu usage is 2200%, it's a very good idea to
have
many cores for this kind of work.
Here we benefit from local cache of dependencies, but also the local
`dist-newstyle/` directories previous run left behind. As you can see,
build-step is quite fast. The previous many hour work would need to be
repeated, if we would make a change to `lens` itself (especially in a
way such
incremental build is not very fast).
There is 11GB of build products i.e. these 7 x 30 versions of lens.
That is a
thing ot be aware of too.
In summary, if you try something like this on a GitHub-hosted GitHub Actions
runner
https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners
Hardware specification for Windows and Linux virtual machines:
2-core CPU
7 GB of RAM memory
14 GB of SSD disk space
... it just won't work.
---
Do I have suggestions what to do instead?
For example haskell-ci allows to specify constraint sets,
and I have one for example in quickcheck-instances,
https://github.com/haskellari/qc-instances/blob/master/cabal.haskell-ci
That is a (rare) package where I do care about having as wide dependency
range as possible, as I don't want to run into problems due it.
That cost is just one additional build per CI run (note: also dependencies).
Honestly, I don't remember how I found those versions, whether using my tool
or by trial-an-error. As you can see from the history of that file, I don't
need to edit them often. (last modification is August 2020).
For ordinary packages, I'd just suggest using picking the current
versions as
lower bounds (e.g. what gen-bounds suggests), and sticking to them until
there
is a reason to not too. Don't spend too much time pushing them down from
what's currently possible, if there isn't real reason. You may pick older
Stackage LTS snapshot (for the oldest GHC you want to support) for some
inspiration, though I often don't do that, as dependency versions tend to be
quite old. For example there is no value in allowing older QuickCheck
versions in your test-suite, it's unluckily they will find you bugs that
newer
versions won't.
- Oleg
On 27.5.2021 16.07, Askar Safin via Haskell-Cafe wrote:
> Hi. Is there some tool for inferring dependency ranges for my cabal packages? I know about "cabal gen-bounds", but this command gives too short ranges. I want to get as wide range as possible. I. e. I want tool, which tries to build and test my package on various versions of dependencies, and reports resulting range. Successful testing should mean that the version of dependency is compatible. That tool should work also on "base" package, this means that it should test various ghc versions, too.
>
> If there is no such tool, then at least I want some tool, which will check already specified dependency range. Such tool should do this:
> - Extract dependencies from cabal file
> - Install smallest versions of specified dependencies (this will include "base" and thus ghc)
> - Test on this versions
> - Install largest versions
> - Test on this versions
>
> I will put invocation of such tool to continuous integration.
>
> I hope there are a lot of package maintainers here. How do you manage dependencies?
>
> ==
> Askar Safin
> http://safinaskar.com
> https://sr.ht/~safinaskar
> https://github.com/safinaskar
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.
More information about the Haskell-Cafe
mailing list