[Haskell-cafe] [haskell-infrastructure] Improvements to package hosting and security

Michael Snoyman michael at snoyman.com
Wed Apr 15 04:47:56 UTC 2015


I'd like to ignore features of Hackage like "browsing code" for purposes of
this discussion. That's clearly something that can be a feature layered on
top of a real package store by a web interface. I'm focused on just that
lower level of actually creating a coherent set of packages.

In that realm, I think you've understated what trust we're putting in
Hackage today. We have to trust it to:

* Properly authenticate users
* Keep authorization lists of who can make uploads/revisions (and who can
grant those rights)
* Allow safe uploads of packages and metadata
* Distribute packages and metadata to users safely

I think we agree, but I'll say it outright: Hackage currently *cannot*
succeed at the last two points, since all interactions with it from
cabal-install are occurring over non-secure HTTP connections, making it
vulnerable to MITM attacks on both upload and download. The package signing
work, if completely adopted by the community, would address that.
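
To make the download side concrete, here is a minimal Python sketch of checking a fetched tarball against a SHA-512 digest. It assumes the expected digest comes from metadata that was itself obtained through a trusted (for example, signed) channel. The helper name `verify_tarball` and the sample bytes are hypothetical and are not part of cabal-install or Hackage. A digest check alone does not stop a MITM who can also rewrite the metadata.

```python
import hashlib
import hmac


def sha512_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-512 digest of data."""
    return hashlib.sha512(data).hexdigest()


def verify_tarball(data: bytes, expected_hex: str) -> bool:
    """True only if data hashes to expected_hex (hypothetical helper).

    expected_hex must come from a source the client already trusts,
    e.g. signed metadata; otherwise the check proves nothing.
    """
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(sha512_hex(data), expected_hex.lower())


# Illustrative bytes standing in for an sdist tarball.
original = b"contents of foo-1.0.tar.gz"
trusted_digest = sha512_hex(original)

# The same bytes after an attacker appended something in transit.
tampered = original + b" plus injected code"
```

Calling `verify_tarball(original, trusted_digest)` accepts the untouched bytes, while `verify_tarball(tampered, trusted_digest)` rejects the modified ones.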

What I'm raising here is the first two points. And even those points have
an impact on the other two points. To draw this out a bit more clearly:

* Currently, authorized uploaders are identified by a user name and a
password on Hackage. How do we correlate that to a GPG key? Ideally, the
central upload authority would be collecting GPG public keys for all
uploaders so that signature verification can happen correctly.
* There's no way for an outside authority to vet the 00-index.tar.gz file
downloaded from Hackage; it's a completely opaque black box. Having the
set of authorization rules be publicly viewable, auditable, and verifiable
overcomes that.
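
The second bullet can be illustrated with a small Python sketch of an authorization table that anyone could download and check. Everything here is an assumption made for illustration: the table layout, the names `AUTHORIZED_KEYS` and `is_authorized`, and the fingerprints are invented, and none of it reflects how Hackage or the strawman proposal actually stores this data.

```python
from typing import Dict, FrozenSet

# Hypothetical public table: package name -> fingerprints of the keys
# allowed to release it. In an auditable design this would live somewhere
# anyone can read, such as a Git repository.
AUTHORIZED_KEYS: Dict[str, FrozenSet[str]] = {
    "foo": frozenset({"AAAA1111", "BBBB2222"}),
    "bar": frozenset({"CCCC3333"}),
}


def is_authorized(package: str, signer_fingerprint: str,
                  table: Dict[str, FrozenSet[str]] = AUTHORIZED_KEYS) -> bool:
    """Return True if the signer's key is listed for the package.

    Signature verification itself (e.g. by GPG) is assumed to have
    happened already; this only checks the authorization rule.
    """
    return signer_fingerprint in table.get(package, frozenset())
```

Because the table is plain data, an outside party can run the same check as the central authority and compare results, which is the property an opaque index does not offer.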

I'd really like to make sure that we're separating two questions here: (1)
Is there a problem with the way we're trusting Hackage today? (2) Is the
strawman proposal I sent anywhere close to a real solution? I feel strongly
about (1), and very weakly about (2).

On Wed, Apr 15, 2015 at 7:07 AM Gershom B <gershomb at gmail.com> wrote:

> So I want to focus just on the idea of a “trust model” to hackage packages.
>
> I don’t think we even have a clear spec of the problem we’re trying to
> solve here in terms of security. In particular, the basic thing hackage is
> a central authority for is “packages listed on hackage” — it provides a
> namespace, and on top of that provides the ability to explore the code
> under each segment of the namespace, including docs and code listings.
> Along with that it provides the ability to search through that namespace
> for things like package descriptions and names.
>
> Now, how does security fit into this? Well, at the moment we can prevent
> packages from being uploaded by people who are not authorized. And whoever
> is authorized is the first person who uploaded the package, or people they
> delegate to, or people otherwise added by hackage admins via e.g. the
> orphaned package takeover process. A problem is this is less a guarantee
> than we would like since e.g. accounts may be compromised, we could be
> MITMed (or the upload could be) etc.
>
> Hence comes the motivation for some form of signing. Now, I think the
> proposal suggested is the wrong one — it says “this is a trustworthy
> package” for some notion of a web of trust or something. Webs of trust are
> hard and work poorly except in the small. It would be better, I think, to
> have something _orthogonal_ to hackage or any other package distribution
> system that attempts a _much simpler_ guarantee — that e.g. the person who
> signed a package as being “theirs” is either the same person that signed
> the prior version of the package, or was delegated by them (or hackage
> admins). Now, on top of that, we could also have a system that allowed for
> individual users, if they had some notion of “a person’s signature” such
> that they believed it corresponded to a person, to verify that _actual_
> signature was used. But there is no web of trust, no idea given of who a
> user does or doesn’t believe is who they say they are or anything like
> that. We don’t attempt to guarantee anything more than a “chain of
> custody,” which is all we now have (weaker) mechanisms to enforce.
>
> In my mind, the key elements of such a system are that it is orthogonal to
> how code is distributed and that it is opt-in/out.
>
> One model to look at might be Apple’s — distribute signing keys widely,
> but allow centralized revocation if a malicious actor is found. Another
> notion, somewhat similar, is ssl certificates. Anybody, including a
> malicious actor, can get such a certificate. But at least we have the
> guarantee that once we start talking to some party, malicious or otherwise,
> no other party will “swap in” for them midstream.
>
> In general, what I’m urging is to limit the scope of what we aim for. We
> need to give users the tools to enforce the level of trust that they want
> to enforce, and to verify certain specific claims. But if we shoot for
> more, we will either have a difficult-to-use system, or will fail in some
> fashion. And furthermore I think we should have this discussion
> _independent_ of hackage, which serves a whole number of functions, and
> until recently hasn’t even _purported_ to even weakly enforce any
> guarantees about who uploaded the code it hosts.
>
> Cheers,
> Gershom
>
>
> On April 14, 2015 at 10:57:00 PM, Carter Schonwald (
> carter.schonwald at gmail.com) wrote:
> > any use of cryptographic primitives of any form NEEDS to articulate what
> > the trust model is, and what the threat model is
> >
> > likewise, i'm trying to understand who the proposed feature set is meant
> > to serve.
> >
> > Several groups are in the late stages of building prototypes at varying
> > points in the design space for improving package hosting right now for
> > haskell, and I'm personally inclined to let those various parties release
> > the tools, and then experiment with them all, before trying to push
> > heavily for any particular design that hasn't had larger community
> > experimentation.
> >
> > I actually care most about being able to have the full package set be git
> > cloneable, both for low pain on premise hackage hosting for corporate
> > intranets, and also for when i'm on a plane or boat and have no wifi. At
> > my current job, ANY "host packages via s3" approach is totally untenable,
> > and i'm sure among haskell using teams/organizations, this isn't a unique
> > problem!
> >
> > The author authentication/signing model question is an important one, but
> > I'm uncomfortable with just saying "SHA512 and GPG address that". There's
> > A LOT of subtlety to designing a signing protocol that's properly auditable
> > and secure! Indeed, GPG isn't even a darn asymmetric crypto algorithm,
> > it's a program that happens to IMPLEMENT many of these algorithms. If we
> > are serious about having robust auditing/signing, handwaving about the
> > cryptographic parts while saying it's important is ... kinda
> > irresponsible. And frustrating because it makes it hard to evaluate the
> > hardest parts of the whole engineering problem! The rest of the design is
> > crucially dependent on details of these choices, and yet it's that part
> > which isn't specified.
> >
> > to repeat myself: there is a pretty rich design space for how we can
> > evolve future hackage, and i worry that speccing things out and design by
> > committee is going to be less effective than encouraging various parties
> > to build prototypes for their own visions of future hackage, and THEN come
> > together to combine the best parts of everyone's ideas/designs. There's so
> > much diversity in how different people use hackage, i worry that any
> > other way will fail to serve the full range of haskell users!
> >
> > cheers
> >
> > On Tuesday, April 14, 2015 at 1:01:17 AM UTC-4, Michael Snoyman wrote:
> > >
> > > That could work in theory. My concern with such an approach is that,
> > > AFAIK, the tooling around that kind of stuff is not very well developed,
> > > as opposed to an approach using Git, SHA512, and GPG, which should be
> > > easy to combine. But I could be completely mistaken on this point; if
> > > well-vetted technology already exists for this, I'm not opposed to using
> > > it.
> > >
> > > On Mon, Apr 13, 2015 at 6:04 PM Arnaud Bailly | Capital Match <
> > > arn... at capital-match.com > wrote:
> > >
> > >> Just thinking aloud but wouldn't it be possible to take advantage of
> > >> cryptographic ledgers a la Bitcoin for authenticating packages and
> > >> tracking the history of change? This would provide redundancy as the
> > >> transaction log is distributed and "naturally" create a web of trust or
> > >> at least authenticate transactions. People uploading or modifying a
> > >> package would have to sign a transaction with someone having enough
> > >> karma to allow this.
> > >> Then packages themselves could be completely and rather safely
> > >> distributed through standard p2p file sharing.
> > >>
> > >> I am not a specialist of crypto money, though.
> > >>
> > >> My 50 cts
> > >> Arnaud
> > >>
> > >> On Monday, April 13, 2015, Dennis J. McWherter, Jr. wrote:
> > >>
> > >>> This proposal looks great. The one thing I am failing to understand (and
> > >>> I recognize the proposal is in early stages) is how to ensure redundancy
> > >>> in the system. As far as I can tell, much of this proposal discusses the
> > >>> centralized authority of the system (i.e. ensuring secure distribution)
> > >>> and only references (with little detail) the distributed store. For
> > >>> instance, say I host a package on a personal server and one day I decide
> > >>> to shut that server down; is this package now lost forever? I do see this
> > >>> line: "backup download links to S3" but this implies that someone is
> > >>> willing to pay for S3 storage for all of the packages.
> > >>>
> > >>> Are there plans to adopt a P2P-like model or something similar to
> > >>> support any sort of replication? Public resources like this seem to come
> > >>> and go, so it would be nice to avoid some of the problems associated
> > >>> with high churn in the network. That said, there is an obvious cost to
> > >>> replication. Likewise, the central authority would have to be updated
> > >>> with new, relevant locations to find the file (as it is currently
> > >>> proposed).
> > >>>
> > >>> In any case, as I said before, the proposal looks great! I am looking
> > >>> forward to this.
> > >>>
> > >>> On Monday, April 13, 2015 at 5:02:46 AM UTC-5, Michael Snoyman wrote:
> > >>>>
> > >>>> Many of you saw the blog post Mathieu wrote[1] about having more
> > >>>> composable community infrastructure, which in particular focused on
> > >>>> improvements to Hackage. I've been discussing some of these ideas with
> > >>>> both Mathieu and others in the community working on some similar
> > >>>> thoughts. I've also separately spent some time speaking with Chris about
> > >>>> package signing[2]. Through those discussions, it's become apparent to me
> > >>>> that there are in fact two core pieces of functionality we're relying on
> > >>>> Hackage for today:
> > >>>>
> > >>>> * A centralized location for accessing package metadata (i.e., the
> > >>>> cabal files) and the package contents themselves (i.e., the sdist
> > >>>> tarballs)
> > >>>> * A central authority for deciding who is allowed to make releases of
> > >>>> packages, and make revisions to cabal files
> > >>>>
> > >>>> In my opinion, fixing the first problem is in fact very straightforward
> > >>>> to do today using existing tools. FP Complete already hosts a full
> > >>>> Hackage mirror[3] backed by S3, for instance, and having the metadata
> > >>>> mirrored to a Git repository as well is not a difficult technical
> > >>>> challenge. This is the core of what Mathieu was proposing as far as
> > >>>> composable infrastructure, corresponding to next actions 1 and 3 at the
> > >>>> end of his blog post (step 2, modifying Hackage, is not a prerequisite).
> > >>>> In my opinion, such a system would far surpass in usability, reliability,
> > >>>> and extensibility our current infrastructure, and could be rolled out in
> > >>>> a few days at most.
> > >>>>
> > >>>> However, that second point, the central authority, is the more
> > >>>> interesting one. As it stands, our entire package ecosystem is placing a
> > >>>> huge level of trust in Hackage, without any serious way to vet what's
> > >>>> going on there. Attack vectors abound, e.g.:
> > >>>>
> > >>>> * Man in the middle attacks: as we are all painfully aware,
> > >>>> cabal-install does not support HTTPS, so a MITM attack on downloads from
> > >>>> Hackage is trivial
> > >>>> * A breach of the Hackage Server codebase would allow anyone to upload
> > >>>> nefarious code[4]
> > >>>> * Any kind of system level vulnerability could allow an attacker to
> > >>>> compromise the server in the same way
> > >>>>
> > >>>> Chris's package signing work addresses most of these vulnerabilities,
> > >>>> by adding a layer of cryptographic signatures on top of Hackage as the
> > >>>> central authority. I'd like to propose taking this a step further:
> > >>>> removing Hackage as the central authority, and instead relying entirely
> > >>>> on cryptographic signatures to release new packages.
> > >>>>
> > >>>> I wrote up a strawman proposal last week[5] which clearly needs work to
> > >>>> be a realistic option. My question is: are people interested in moving
> > >>>> forward on this? If there's no interest, and everyone is satisfied with
> > >>>> continuing with the current Hackage-central-authority, then we can
> > >>>> proceed with having reliable and secure services built around Hackage.
> > >>>> But if others, like me, would like to see a more secure system built
> > >>>> from the ground up, please say so and let's continue that conversation.
> > >>>>
> > >>>> [1] https://www.fpcomplete.com/blog/2015/03/composable-community-infrastructure
> > >>>> [2] https://github.com/commercialhaskell/commercialhaskell/wiki/Package-signing-detailed-propsal
> > >>>> [3] https://www.fpcomplete.com/blog/2015/03/hackage-mirror
> > >>>> [4] I don't think this is just a theoretical possibility for some point
> > >>>> in the future. I have reported an easily triggerable DoS attack on the
> > >>>> current Hackage Server codebase, which has been unresolved for 1.5 months
> > >>>> now
> > >>>> [5] https://gist.github.com/snoyberg/732aa47a5dd3864051b9
> > >>>>
> > >>> --
> > >>> You received this message because you are subscribed to the Google Groups
> > >>> "Commercial Haskell" group.
> > >>> To unsubscribe from this group and stop receiving emails from it, send
> > >>> an email to commercialhaskell+unsubscribe at googlegroups.com.
> > >>> To post to this group, send email to commercialhaskell at googlegroups.com.
> > >>> To view this discussion on the web visit
> > >>> https://groups.google.com/d/msgid/commercialhaskell/4487776e-b862-429c-adae-477813e560f3%40googlegroups.com.
> > >>> For more options, visit https://groups.google.com/d/optout.
> > >>>
> > >>
> > >>
> > >> _______________________________________________
> > haskell-infrastructure mailing list
> > haskell-infrastructure at community.galois.com
> > http://community.galois.com/mailman/listinfo/haskell-infrastructure
> >
>