Request for feedback on spec/proposal for distributing package collections via hackage

Edward Z. Yang ezyang at mit.edu
Tue Jul 14 19:02:12 UTC 2015


Hello Duncan,

In my eyes, this proposal looks like some sort of generalization of
Stackage, with "special purpose" collections as one further use case.  My
big question: how composable are these collections really?  I can't put
two collections with conflicting versions together (or can I? Do I
union?); and is there any point to having a collection without versions
in it? (If Cabal syntax is extended to support depending on collections
as well as packages, yes?)

The classic use-case for package collections is deployment settings, à la
Stack, or even Cargo lockfiles / Bundler Gemfile.lock (versioned
collections). In all these use-cases package collections are treated as
non-compositional things (see http://doc.crates.io/guide.html and
http://bundler.io/v1.7/rationale.html#checking-your-code-into-version-control):
libraries (compositional) do NOT publish lockfiles; only executables
(non-compositional) DO.

Re the file format, it seems fine; suitable for the lockfile use-case
and the Stackage use-case.  Less sure about the unioning semantics.
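
To make the unioning question concrete, here is a minimal sketch, assuming
a collection is modelled as a map from package name to a set of versions
and that a client combines two collections by per-package union. The names
`lts` and `web` and both helper functions are hypothetical; nothing here is
taken from cabal, stack, or the proposal itself.

```python
def union(a, b):
    """Per-package union of version sets (one possible combining rule)."""
    names = a.keys() | b.keys()
    return {n: a.get(n, set()) | b.get(n, set()) for n in names}


def conflicts(a, b):
    """Packages listed in both collections with no version in common."""
    return sorted(n for n in a.keys() & b.keys() if not (a[n] & b[n]))


# Two hypothetical single-version collections that disagree about foo.
lts = {"foo": {"1.0"}, "bar": {"3.1"}}
web = {"foo": {"1.1"}, "baz": {"0.2"}}

combined = union(lts, web)       # foo now admits both 1.0 and 1.1
disagreements = conflicts(lts, web)
```

Under this reading the union is always well defined, but it silently drops
the "one version per package" property that a lockfile-style collection
relies on, which is why a client would probably want something like
`conflicts` as well.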

Edward

Excerpts from Duncan Coutts's message of 2015-07-14 05:52:46 -0700:
> Hi folks,
> 
> I'd like to get feedback on a spec/proposal for distributing package
> collections via hackage. This is currently somewhere beyond vapourware
> but certainly not a fait accompli and hopefully it is at an appropriate
> point to get feedback.
> 
> The basic idea is that package collections are:
>       * useful (IMHO, one of the top two solutions to dependency hell,
>         alongside nix-style package management); and
>       * just as we distribute packages via hackage, we should also be
>         able to easily distribute package collections.
> 
> One would then use them with tools like cabal and stack. Distributing
> via hackage (both in the sense of the format/protocol and in the sense
> of the central community hackage instance) seems natural, and allows
> taking advantage of much of the infrastructure we have for packages
> already like:
>       * existing user accounts and management infrastructure on the
>         hackage website
>       * allowing anyone to host collections on their own servers, just
>         as they can host their own package archives currently (either as
>         static file sets or with smart servers)
>       * low barrier for distribution, potentially encouraging more
>         collections to be created potentially covering more use cases
>       * security infrastructure (currently in alpha)
>       * automatic mirroring (currently in alpha)
> 
> Two obvious examples are stackage-lts and stackage-nightly but if we
> lower the barrier for distribution then there may well be many more. For
> example, the existing Linux distros put a lot of effort into selecting
> and maintaining package collections, and some of these collections could
> be distributed via hackage. In fact, several Linux distributions already
> use Hackage's "distro" feature to advertise which versions of packages
> are provided by that distro. One can also imagine special-purpose
> collections, and there are probably cases we've not thought of yet.
> 
> Package collections are different things from packages, not like "meta
> packages" that one gets in some package systems. A package collection at
> its simplest is just a set of source package identifiers (i.e.
> name-version pairs). Like packages, package collections have names and
> versions and are immutable once distributed.
> 
> The intention is that users can configure their tool to use
> collection(s), either by nailing down a specific collection version or
> by not specifying a version, in which case the tool would default to
> the latest version of the named collection. (But the specific behaviour
> is up to the tool.)
> 
> Use cases:
> 
>       * versioned collections. For some collections the policy by which
>         they are defined naturally uses meaningful versions.
>       * daily collections. These can have a date-form version number
>         imposed on them.
>       * "live" or "rolling" collections. These could have a simple
>         monotonically increasing version with no particular meaning
>         attached. For such collections, clients might be configured to
>         use the latest (by not specifying a version), but it's always
>         possible to pick a specific revision.
>       * special-purpose collections. Not necessarily collections aiming
>         to cover a large number of common packages, but aiming to cover
>         some application area, or related stack of packages (e.g. some
>         of the web frameworks).
>       * negative collections. Collections of packages you may
>         specifically want to avoid (e.g. deprecated by their authors, or
>         known-broken). Using such collections would rely on clients that
>         can be configured to treat it negatively.
> 
> Specifics:
> 
> A package collection specifies a set of source package ids (an id being
> a name-version pair). It also optionally specifies a (partial) flag
> assignment for any package name.
> 
> A collection does not specify how tools should treat it. That is, a
> collection does not specify whether it should be treated as a strong or
> a soft constraint, inclusive or exclusive, positive or negative. Such
> things are completely up to the client's policy and configuration.
> Similarly for flag assignments, collections do not specify whether tools
> should interpret these as strong or soft constraints.
> 
> Syntax:
> 
> Package collection names and versions exactly follow those of package
> names (but they live in a different namespace). For example,
> "stackage-lts-2.9", or "deprecated-343" (the latter being a "rolling"
> collection with a meaningless monotonically increasing version).
> 
> A collection distributed in the archive format is just a text file with
> one entry per line, such as:
> 
>         foo-1.0
>         foo-1.1
>         bar >= 3 && < 4
>         bar +this -that
> 
> So each line can be one of:
>       * a simple package id
>       * a package version range, using Cabal version range syntax
>       * a package name with a flag assignment, + for on, - for off
> 
> The interpretation of the above is that:
>       * both foo-1.0 and foo-1.1 are in the collection (ie union not
>         intersection)
>       * all versions of bar between 3 and 4 are in the collection
>       * the package bar has flag 'this' as True, and flag 'that' as
>         False
> 
> Of course for some collections the policy is that only one version of
> any package is included, but this is a policy question and the format
> itself does not impose this constraint.
> 
> Hackage archive format:
> 
> Collection files live under a different prefix from package tarballs
> (but are still considered part of the archive) and are named after the
> collection id. The collection files are not compressed (but of course
> HTTP clients and servers can negotiate transport compression). The
> collection files are neither included nor listed in the existing
> 00-index.tar.gz, but there is other JSON-format metadata for a client to
> enumerate the available collections and versions. And, as with package
> tarballs, a client that wants a specific collection version can
> construct the URL and fetch it directly.
> 
> Security:
> 
> The hackage security system that's currently in alpha testing can easily
> be extended to cover collections, similarly to how it covers package
> tarballs.
> 
> Misc notes:
> 
> There is no requirement that a hackage-format repo containing
> collections be closed. That is, the collections may refer to packages
> not in that archive. This could be useful for private hackage repos that
> host a small number of private packages, but also host collections that
> refer both to the private packages and public ones from the community
> central hackage. The resolution of package names is done by the clients,
> and some clients may be configured to union/overlay multiple repos.
> 
> On the other hand, for the central community hackage it may be sensible
> to enforce a policy that the collections it distributes be closed (ie
> refer only to packages distributed via hackage).
> 
> Questions:
> 
> Is this sufficiently flexible to fully cover the obvious use cases? Are
> there any interesting use cases that are excluded?
> 
> Anything else?
> 
> 
> Duncan
> 
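
The three line forms described in the quoted proposal (a simple package id,
a package version range, and a package name with a flag assignment) can be
sketched as a small parser. This is only an illustration under stated
assumptions: the `Collection` class, the `parse_collection` function, and
the regular expressions are hypothetical, version ranges are kept as
uninterpreted strings rather than parsed as full Cabal syntax, and the
package-name pattern is a simplification.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Collection:
    """Hypothetical in-memory form of one collection file."""
    ids: set = field(default_factory=set)       # exact (name, version) pairs
    ranges: dict = field(default_factory=dict)  # name -> list of range strings
    flags: dict = field(default_factory=dict)   # name -> {flag name: bool}


# Simplified patterns; real Cabal names and ranges are richer than this.
_RANGE = re.compile(r"^([A-Za-z0-9-]+)\s*([<>=].*)$")
_FLAGS = re.compile(r"^([A-Za-z0-9-]+)((?:\s+[+-][A-Za-z0-9_-]+)+)$")
_PKG_ID = re.compile(r"^([A-Za-z0-9-]+)-(\d+(?:\.\d+)*)$")


def parse_collection(text):
    """Parse one entry per line; entries accumulate (union, not intersection)."""
    coll = Collection()
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = _RANGE.match(line)
        if m:  # e.g. "bar >= 3 && < 4"
            coll.ranges.setdefault(m.group(1), []).append(m.group(2).strip())
            continue
        m = _FLAGS.match(line)
        if m:  # e.g. "bar +this -that": + means on, - means off
            assignment = coll.flags.setdefault(m.group(1), {})
            for tok in m.group(2).split():
                assignment[tok[1:]] = tok[0] == "+"
            continue
        m = _PKG_ID.match(line)
        if m:  # e.g. "foo-1.0"
            coll.ids.add((m.group(1), m.group(2)))
            continue
        raise ValueError(f"unrecognised collection entry: {line!r}")
    return coll


example = parse_collection("foo-1.0\nfoo-1.1\nbar >= 3 && < 4\nbar +this -that\n")
```

Here `example.ids` holds both foo-1.0 and foo-1.1, matching the "union not
intersection" interpretation, while the range and the flag assignment for
bar are recorded separately and left for the client's policy to interpret.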

