[Haskell-cafe] Proposal: Technique to handle package dependency gridlock

Thu Aug 2 07:29:28 CEST 2012

We've got a problem with dependencies:
http://cdsmith.wordpress.com/2011/01/21/a-recap-about-cabal-and-haskell-libraries/
http://cdsmith.wordpress.com/2011/01/17/the-butterfly-effect-in-cabal/
http://www.reddit.com/r/haskell/comments/x4knd/what_is_the_reason_for_haskells_cabal_package/

I'd like to present a proto-proposal for another arrow in our quiver.

First, a few...
Principles:

  This problem isn't uniquely Haskell's
    ...Although it may be uniquely Haskell's to solve. Lots of
languages have a problem managing their package dependencies. To quote
Chris Smith, "it’s fair to say that perhaps Haskell is one of the few
environments where we’ve got a prayer at solving the problem."

  It's a magnitude problem
    The trick is not to find one perfect solution; it's to use enough
effective solutions that the remaining tricky cases can be swept away
individually. Solving the problem will most likely involve several
techniques (good versioning policy, a little work on the part of
package maintainers, smarter Cabal, new tools, etc.). Once we get the
problem down to a manageable level, then it's a problem that can be
dealt with package-by-package.

  Haskell packages are not black boxes
    Inheriting - from the imperative world - the idea of packages as
indivisible units may be a mistake. A language like Ruby may have to
import a full library, because it's nearly impossible to reason about
the behavior of part of it in isolation from the rest of it. Haskell's
not like that, though! Referential transparency gives us awesome
powers to reason about a pure function's behavior in nearly all cases.
    We could define package dependency in terms of providing
functions-that-behave-like-the-ones-we-used.
    I mean, for god's sake, we can write QuickCheck properties like
`quickSort == bubbleSort` and test it with a very high degree of
assurance. The ability to pass the same suite of QuickCheck tests, or
even the ability to pass a test like `fmap-3.1.2 == fmap-3.1.3`, could
be strong enough proof - in the vast majority of cases - of functions'
equivalence and therefore interchangeability. This could ease a
significant number of package version constraints. Call it
Property-Based Versioning - where your package's dependencies are based
on which other versions of the package can provide the same
functionality *for the functions and types that your package uses*.
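
    To make that concrete, here's a minimal sketch of such a
property. The two sort implementations are illustrative stand-ins
written for this example (they aren't taken from any package), and it
assumes the QuickCheck package is installed.

```haskell
-- Sketch only: requires the QuickCheck package.
import Test.QuickCheck (quickCheck)

-- Illustrative quicksort: partition around the head element.
quickSort :: Ord a => [a] -> [a]
quickSort []     = []
quickSort (p:xs) = quickSort smaller ++ [p] ++ quickSort larger
  where
    smaller = [x | x <- xs, x <  p]
    larger  = [x | x <- xs, x >= p]

-- Illustrative bubble sort: n passes are enough for a list of length n.
bubbleSort :: Ord a => [a] -> [a]
bubbleSort xs = iterate pass xs !! length xs
  where
    -- One pass swaps adjacent out-of-order elements left to right.
    pass (a:b:rest)
      | a > b     = b : pass (a:rest)
      | otherwise = a : pass (b:rest)
    pass rest = rest

-- The property: both functions agree on every input list.
prop_sortsAgree :: [Int] -> Bool
prop_sortsAgree xs = quickSort xs == bubbleSort xs

main :: IO ()
main = quickCheck prop_sortsAgree
```

    The same shape would apply to two versions of one function: put
the old and new implementations on either side of the `==`. Passing is
evidence of equivalence on the generated inputs, not a proof.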

  Programmer time is precious
    Thus far, the package dependency issue has been more or less a
tug-of-war between package maintainers not wanting to do more work,
and package users not wanting to do more work. Users just want to use
the packages. Maintainers have a potentially-endless list of versions
to check compatibility for ("I'm using foo-2.5.3. I assume anything
2.3 or greater is fine. Should I check through and see if 2.4 works?
2.3? 2.2? [...]"). Not to mention they can't know what the future will
hold for later versions of the packages they depend on, and they have to
make a usually-pretty-uneducated guess about the future.

    Instead of maintainers and users, maybe computers should lift some
of the load. Huge swaths of the most-used packages on Hackage are pure
functions and data types. These are relatively easy to reason about,
and if we could come close to reducing our dependency problem by, say,
30-50%, that would in my view put us within striking distance of
having a very smoothly-running package ecosystem.

  These are decisions the community has to make
    There are lots of decisions to make and tradeoffs to balance. The
main reason that I'm presenting such an embryonic description of this
technique is to see what the Haskell community values most.
    In my view the ideal would be to be able to write a dependency
description like "This package needs functions that behave like the
ones I used in foo-3.1.2". A potentially less hair-shirty way to get
some of the same functionality is for package authors to have a
cabal-install or pre-cabal-install tool that determines which
functions and data types are used from a package, finds equivalent
ones in other packages, and writes a dependency spec based on its
findings (this wouldn't have anything to say about compatibility of
not-yet-written packages, of course).
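
    As a rough sketch of the last step of such a tool: suppose it has
already run the equivalence checks against each released version of
foo. The pass/fail results below are made up, `compatibleRange` and
`renderConstraint` are hypothetical names, and versions are modelled as
plain lists of Ints (assumed unique) rather than Cabal's own types. It
picks the widest unbroken run of passing versions around the one you
developed against and prints it in build-depends syntax.

```haskell
import Data.List (intercalate, sortOn)

-- Simplified stand-in for a package version, e.g. [2,5,3] for 2.5.3.
type Version = [Int]

showVersion :: Version -> String
showVersion = intercalate "." . map show

-- Widest contiguous run of passing versions that contains the reference
-- version. Nothing if the reference itself is missing or did not pass.
compatibleRange :: Version -> [(Version, Bool)] -> Maybe (Version, Version)
compatibleRange ref results
  | lookup ref results /= Just True = Nothing
  | otherwise                       = Just (last lowerRun, last upperRun)
  where
    sorted              = sortOn fst results
    (below, atAndAbove) = span ((< ref) . fst) sorted
    above               = dropWhile ((== ref) . fst) atAndAbove
    -- Walk down from the reference while versions keep passing.
    lowerRun = ref : map fst (takeWhile snd (reverse below))
    -- Walk up from the reference while versions keep passing.
    upperRun = ref : map fst (takeWhile snd above)

-- Render the range in Cabal's build-depends constraint syntax.
renderConstraint :: String -> (Version, Version) -> String
renderConstraint pkg (lo, hi) =
  pkg ++ " >= " ++ showVersion lo ++ " && <= " ++ showVersion hi

-- Hypothetical results: True means the functions we use behaved the same.
sampleResults :: [(Version, Bool)]
sampleResults =
  [ ([2,2], False), ([2,3], True), ([2,4], True)
  , ([2,5,3], True), ([2,6], False) ]

main :: IO ()
main = case compatibleRange [2,5,3] sampleResults of
  Nothing    -> putStrLn "reference version did not pass its own checks"
  Just range -> putStrLn (renderConstraint "foo" range)
  -- prints: foo >= 2.3 && <= 2.5.3
```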

Tom