[Haskell-cafe] [haskell-infrastructure] Improvements to package hosting and security

Wed Apr 15 05:18:29 UTC 2015

Ok, to narrow it down, you are concerned about the ability to

> * Properly authenticate users
> * Keep authorization lists of who can make uploads/revisions (and who can grant those rights)

and more specifically:

> * Currently, authorized uploaders are identified by a user name and a
> password on Hackage. How do we correlate that to a GPG key? Ideally, the
> central upload authority would be collecting GPG public keys for all
> uploaders so that signature verification can happen correctly.
> * There's no way for an outside authority to vet the 00-index.tar.gz file
> downloaded from Hackage; it's a completely opaque, black box. Having the
> set of authorization rules be publicly viewable, auditable, and verifiable
> overcomes that.

On 1) now you have the problem “what if the central upload authority’s store of GPG keys is compromised”. You’ve just kicked the can down the road. “Web of Trust” is not a tractable answer. My answer is simpler: I can verify that the signer of version 1 of a package is the same as the signer of version 0.1. This is no small trick. And I can do so independently of Hackage. Now, if I really want to verify that the signer of version 1 is the person who is “Michael Snoyman” and is in fact the exact Michael Snoyman I intend, then I need to get your key by some entirely separate mechanism. And that is my problem, and, by definition, no centralized store can help me in that regard unless I trust it absolutely, which is precisely what I don’t want to do.
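To make the signer-continuity check concrete, here is a minimal sketch in Python. It assumes the signature on each release has already been verified cryptographically (for example by GPG) and reduced to a key fingerprint; the function name, the in-memory `pins` store, and the example fingerprints are all hypothetical and are not part of cabal or Hackage.

```python
from typing import Dict

# Hypothetical sketch: `pins` stands in for a local, client-side store that
# maps a package name to the fingerprint of the key that signed the first
# release we saw. Nothing here is a cabal or Hackage API.


def check_signer_continuity(pins: Dict[str, str], package: str, fingerprint: str) -> str:
    """Compare a release's signer fingerprint against the locally pinned one.

    Returns "first-seen" (and records the pin), "same-signer", or
    "signer-changed". A changed signer is never re-pinned silently.
    """
    pinned = pins.get(package)
    if pinned is None:
        pins[package] = fingerprint  # first release seen: remember its signer
        return "first-seen"
    if pinned == fingerprint:
        return "same-signer"
    return "signer-changed"  # leave the old pin in place; the user must decide


pins: Dict[str, str] = {}
print(check_signer_continuity(pins, "warp", "AAAA1111"))  # version 0.1
print(check_signer_continuity(pins, "warp", "AAAA1111"))  # version 1, same key
print(check_signer_continuity(pins, "warp", "BBBB2222"))  # a different key appears
```

The check needs no server-side key store: the only state is local, so a compromised central authority cannot rewrite it. It says nothing about who the signer is, only that the signer has not changed.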

On 2) I would like to understand more about what your concern with regard to “auditing” is. What specific information would you like to know that you currently cannot? Improved audit logs again seem orthogonal to any of these other security concerns, unless you are simply worried about a “metadata only” attack vector. In any case, we can incorporate the same signing practices for metadata as for packages, independently of Hackage or any other particular storage mechanism. It is simply an unrelated question. And, honestly, compared to all the other issues we face I feel it is relatively minor (the signing component, not a better audit trail).

In any case, your account of the first two points reveals some of the confusion that I think remains:

> * Allow safe uploads of packages and metadata
> * Distribute packages and metadata to users safely

What is the definition of “safe” here? My understanding is that in the field of security one doesn’t talk about “safe” in general, but with regard to a particular kind of attacker, and always only as a difference of degree, not kind.

So who do we want to prevent from doing what? How “safe” is “safe”? Safe from what? From a malicious script-kid, from a malicious collective “in it for the lulz,” from a targeted attack against a particular end-client, from just poorly/incompetently written code? What are we “trusting”? What concrete guarantees would we like to make about user interactions with packages and package repositories?

While I’m interrogating language, let me pick out one other thing I don’t understand: “creating a coherent set of packages”. What do you mean by “coherent”? Is this something we can specify? Hackage isn’t supposed to be coherent; it is supposed to be everything. Within that “everything” we are now attempting to manage metadata to provide accurate dependency information, at a local level. But we make no claims about any global coherence conditions on the resultant graphs. Certainly we intend to be coherent in the sense that the combination of a name/version/revision should indicate one and only one thing (and that all revisions of a version should differ at most in dependency constraints in their cabal file), but this is a fairly minimal criterion. And in fact, it is one that is nearly orthogonal to security concerns altogether.
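The minimal coherence condition above can be written down as two small checks. This is only a sketch under loud assumptions: the index keyed by (name, version, revision), the use of a SHA-256 digest to stand for "one and only one thing", and the flattening of a cabal file into a field-to-value mapping with a single `build-depends` entry are all simplifications invented for illustration, not how Hackage or cabal represents anything.

```python
import hashlib
from typing import Dict, Tuple

# Hypothetical layout: neither the index shape nor the flattened .cabal
# representation below is how Hackage or cabal actually stores things.
PackageId = Tuple[str, str, int]  # (name, version, revision)
DEPENDENCY_FIELD = "build-depends"


def register(index: Dict[PackageId, str], pkg: PackageId, content: bytes) -> bool:
    """Record content under pkg; return False if pkg already names different content."""
    digest = hashlib.sha256(content).hexdigest()
    # setdefault keeps the first digest ever recorded for this identifier.
    return index.setdefault(pkg, digest) == digest


def differs_only_in_dependencies(old: Dict[str, str], new: Dict[str, str]) -> bool:
    """True when two revisions agree on every field except the dependency field."""
    def without_deps(fields: Dict[str, str]) -> Dict[str, str]:
        return {k: v for k, v in fields.items() if k != DEPENDENCY_FIELD}
    return without_deps(old) == without_deps(new)


index: Dict[PackageId, str] = {}
print(register(index, ("warp", "1.0", 0), b"original tarball"))
print(register(index, ("warp", "1.0", 0), b"tampered tarball"))

rev0 = {"name": "warp", "version": "1.0", "build-depends": "base <5"}
rev1 = {"name": "warp", "version": "1.0", "build-depends": "base <4.9"}
print(differs_only_in_dependencies(rev0, rev1))
```

Both checks are purely local properties of individual packages; neither says anything about whether the dependency graph as a whole can be solved.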

What I’m driving at is this: it sounds like we _mainly_ want new decentralized security mechanisms, at the cabal level, but we also want, potentially, a few centralized mechanisms. However, centralization is a weakness from a security standpoint. So, ideally, we want as few centralized mechanisms as possible, and we want the consequences of those mechanisms being broken to be “recoverable” at the point of local verification.

Let me spell out a threat model where that makes sense. An adversary takes control of the entire Hackage server through some zero-day Linux exploit we have no control over, or perhaps they are an employee at the datacenter where we host Hackage and secure control via more direct means, etc. They have total and complete control over the box. They can accept anything they want, and they can serve anything they want. And they are sophisticated enough to remain undetected for, say, a week.

Now, we want it to be the case that _whatever_ this adversary does, they cannot “trick” someone who types “cabal install warp” into instead cabal installing something malicious. How do we do so? _Now_ we have a security problem that is concrete enough to discuss. And furthermore, I would claim that if we don’t have at least some story for this threat model, then we haven’t established anything much “safer” at all.

This points towards a large design space, and a lot of potential ideas, all of which feel entirely different from the “strawman” proposal, since the emphasis there is on changes to a centralized mechanism (even if, in turn, the product of that mechanism is itself distributed and git-cloneable or whatever).

Cheers,
Gershom

On April 15, 2015 at 12:47:58 AM, Michael Snoyman (michael at snoyman.com) wrote:
> I'd like to ignore features of Hackage like "browsing code" for purposes of
> this discussion. That's clearly something that can be a feature layered on
> top of a real package store by a web interface. I'm focused on just that
> lower level of actually creating a coherent set of packages.
>  
> In that realm, I think you've understated what trust we're putting in
> Hackage today. We have to trust it to:
>  
> * Properly authenticate users
> * Keep authorization lists of who can make uploads/revisions (and who can
> grant those rights)
> * Allow safe uploads of packages and metadata
> * Distribute packages and metadata to users safely
>  
> I think we agree, but I'll say it outright: Hackage currently *cannot*
> succeed at the last two points, since all interactions with it from
> cabal-install are occurring over non-secure HTTP connections, making it
> vulnerable to MITM attacks on both upload and download. The package signing
> work- if completely adopted by the community- would address that.
>  
> What I'm raising here is the first two points. And even those points have
> an impact on the other two points. To draw this out a bit more clearly:
>  
> * Currently, authorized uploaders are identified by a user name and a
> password on Hackage. How do we correlate that to a GPG key? Ideally, the
> central upload authority would be collecting GPG public keys for all
> uploaders so that signature verification can happen correctly.
> * There's no way for an outside authority to vet the 00-index.tar.gz file
> downloaded from Hackage; it's a completely opaque, black box. Having the
> set of authorization rules be publicly viewable, auditable, and verifiable
> overcomes that.
>  
> I'd really like to make sure that we're separating two questions here: (1)
> Is there a problem with the way we're trusting Hackage today? (2) Is the
> strawman proposal I sent anywhere close to a real solution? I feel strongly
> about (1), and very weakly about (2).