[Haskell-cafe] [haskell-infrastructure] Improvements to package hosting and security

Wed Apr 15 04:07:25 UTC 2015

So I want to focus just on the idea of a “trust model” for hackage packages.

I don’t think we even have a clear spec of the problem we’re trying to solve here in terms of security. In particular, the basic thing hackage is a central authority for is “packages listed on hackage” — it provides a namespace, and on top of that provides the ability to explore the code under each segment of the namespace, including docs and code listings. Along with that it provides the ability to search through that namespace for things like package descriptions and names.

Now, how does security fit into this? Well, at the moment we can prevent packages from being uploaded by people who are not authorized. And whoever is authorized is the first person who uploaded the package, or people they delegate to, or people otherwise added by hackage admins via e.g. the orphaned package takeover process. A problem is that this is less of a guarantee than we would like, since e.g. accounts may be compromised, or we (or the upload) could be MITMed, etc.

Hence comes the motivation for some form of signing. Now, I think the proposal suggested is the wrong one: it says “this is a trustworthy package” for some notion of a web of trust or something. Webs of trust are hard and work poorly except in the small. It would be better, I think, to have something _orthogonal_ to hackage or any other package distribution system that attempts a _much simpler_ guarantee: that, e.g., the person who signed a package as being “theirs” is either the same person that signed the prior version of the package, or was delegated by them (or hackage admins). Now, on top of that, we could also have a system that allowed individual users, if they had some notion of “a person’s signature” such that they believed it corresponded to a person, to verify that that _actual_ signature was used. But there is no web of trust, no idea given of who a user does or doesn’t believe is who they say they are or anything like that. We don’t attempt to guarantee anything more than a “chain of custody,” which is all we now have (weaker) mechanisms to enforce.
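
To make the “chain of custody” guarantee concrete, here is a rough sketch, in Python purely for illustration. Everything in it is an assumption rather than part of any proposal: the names `Release`, `Delegation`, and `custody_breaks` are hypothetical, custody is modeled as a set of authorized key IDs that grows through delegation, and the cryptographic verification of each signature is assumed to have happened elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    version: str
    signer: str    # key ID whose signature on this release was already verified


@dataclass(frozen=True)
class Delegation:
    grantor: str   # key ID granting the right to sign releases
    grantee: str   # key ID receiving that right


def custody_breaks(log, admins=frozenset()):
    """Walk one package's event log in order; return the events that break custody."""
    authorized = set()
    breaks = []
    for event in log:
        if isinstance(event, Release):
            if not authorized:
                # First release: its signer becomes the original owner.
                authorized.add(event.signer)
            elif event.signer not in authorized:
                # Signed by someone who never received custody.
                breaks.append(event)
        elif isinstance(event, Delegation):
            # Only a current custodian or an admin may extend custody.
            if event.grantor in authorized or event.grantor in admins:
                authorized.add(event.grantee)
            else:
                breaks.append(event)
    return breaks


log = [
    Release("1.0", "alice"),
    Delegation("alice", "bob"),
    Release("1.1", "bob"),
    Release("1.2", "mallory"),      # never delegated, so this is flagged
    Delegation("admin", "carol"),   # e.g. an orphaned-package takeover
    Release("2.0", "carol"),
]
broken = custody_breaks(log, admins={"admin"})
print([event.version for event in broken])
```

Note that the check says nothing about whether “alice” is trustworthy or is who she claims to be. It only reports that release 1.2 was signed by a key that never received custody, which is the limited claim argued for above.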

In my mind, the key elements of such a system are that it is orthogonal to how code is distributed and that it is opt-in/out.

One model to look at might be Apple’s: distribute signing keys widely, but allow centralized revocation if a malicious actor is found. Another notion, somewhat similar, is SSL certificates. Anybody, including a malicious actor, can get such a certificate. But at least we have the guarantee that once we start talking to some party, malicious or otherwise, no other party will “swap in” for them midstream.

In general, what I’m urging is to limit the scope of what we aim for. We need to give users the tools to enforce the level of trust that they want to enforce, and to verify certain specific claims. But if we shoot for more, we will either have a difficult-to-use system, or will fail in some fashion. And furthermore I think we should have this discussion _independent_ of hackage, which serves a whole number of functions, and until recently hasn’t even _purported_ to even weakly enforce any guarantees about who uploaded the code it hosts.

Cheers,
Gershom

On April 14, 2015 at 10:57:00 PM, Carter Schonwald (carter.schonwald at gmail.com) wrote:
> any use of cryptographic primitives of any form NEEDS to articulate what
> the trust model is, and what the threat model is
>  
> likewise, i'm trying to understand who the proposed feature set is meant to
> serve.
>  
> Several groups are in the late stages of building prototypes at varying
> points in the design space for improving package hosting right now for
> haskell, and I'm personally inclined to let those various parties release
> the tools, and then experiment with them all, before trying to push heavily
> for any particular design that hasn't had larger community experimentation.
>  
> I actually care most about being able to have the full package set be git
> cloneable, both for low pain on premise hackage hosting for corporate
> intranets, and also for when i'm on a plane or boat and have no wifi. At
> my current job, ANY "host packages via s3" approach is totally untenable,
> and i'm sure among haskell using teams/organizations, this isn't a unique
> problem!
>  
> The Author authentication/signing model question is an important one, but
> I'm uncomfortable with just saying "SHA512 and GPG address that". There's A
> LOT of subtlety to designing a signing protocol that's properly auditable
> and secure! Indeed, GPG isn't even a darn asymmetric crypto algorithm, it's
> a program that happens to IMPLEMENT many of these algorithms. If we are
> serious about having robust auditing/signing, handwaving about the
> cryptographic parts while saying it's important is ... kinda irresponsible.
> And frustrating because it makes it hard to evaluate the hardest parts of
> the whole engineering problem! The rest of the design is crucially
> dependent on details of these choices, and yet it's that part which isn't
> specified.
>  
> to repeat myself: there is a pretty rich design space for how we can evolve
> future hackage, and i worry that speccing things out and design by
> committee is going to be less effective than encouraging various parties to
> build prototypes for their own visions of future hackage, and THEN come
> together to combine the best parts of everyone's ideas/designs. There's so
> much diversity in how different people use hackage, i worry that any other
> way will run into failing to serve the full range of haskell users!
>  
> cheers
>  
> On Tuesday, April 14, 2015 at 1:01:17 AM UTC-4, Michael Snoyman wrote:
> >
> > That could work in theory. My concern with such an approach is that-
> > AFAIK- the tooling around that kind of stuff is not very well developed, as
> > opposed to an approach using Git, SHA512, and GPG, which should be easy to
> > combine. But I could be completely mistaken on this point; if existing,
> > well vetted technology exists for this, I'm not opposed to using it.
> >
> > On Mon, Apr 13, 2015 at 6:04 PM Arnaud Bailly | Capital Match <
> > arn... at capital-match.com > wrote:
> >
> >> Just thinking aloud but wouldn't it be possible to take advantage of
> >> cryptographic ledgers a la Bitcoin for authenticating packages and tracking
> >> the history of change ? This would provide redundancy as the transactions
> >> log is distributed and "naturally" create a web of trust or at least
> >> authenticate transactions. People uploading or modifying a package would
> >> have to sign a transaction with someone having enough karma to allow this.
> >>
> >> Then packages themselves could be completely and rather safely
> >> distributed through standard p2p file sharing.
> >>
> >> I am not a specialist of crypto money, though.
> >>
> >> My 50 cts
> >> Arnaud
> >>
> >> On Monday, April 13, 2015, Dennis J. McWherter, Jr. wrote:
> >>
> >>> This proposal looks great. The one thing I am failing to understand (and
> >>> I recognize the proposal is in early stages) is how to ensure redundancy in
> >>> the system. As far as I can tell, much of this proposal discusses the
> >>> centralized authority of the system (i.e. ensuring secure distribution) and
> >>> only references (with little detail) the distributed store. For instance,
> >>> say I host a package on a personal server and one day I decide to shut that
> >>> server down; is this package now lost forever? I do see this line: "backup
> >>> download links to S3" but this implies that someone is willing to pay
> >>> for S3 storage for all of the packages.
> >>>
> >>> Are there plans to adopt a P2P-like model or something similar to
> >>> support any sort of replication? Public resources like this seem to come
> >>> and go, so it would be nice to avoid some of the problems associated with
> >>> high churn in the network. That said, there is an obvious cost to
> >>> replication. Likewise, the central authority would have to be updated with
> >>> new, relevant locations to find the file (as it is currently proposed).
> >>>
> >>> In any case, as I said before, the proposal looks great! I am looking
> >>> forward to this.
> >>>
> >>> On Monday, April 13, 2015 at 5:02:46 AM UTC-5, Michael Snoyman wrote:
> >>>>
> >>>> Many of you saw the blog post Mathieu wrote[1] about having more
> >>>> composable community infrastructure, which in particular focused on
> >>>> improvements to Hackage. I've been discussing some of these ideas with both
> >>>> Mathieu and others in the community working on some similar thoughts. I've
> >>>> also separately spent some time speaking with Chris about package
> >>>> signing[2]. Through those discussions, it's become apparent to me that
> >>>> there are in fact two core pieces of functionality we're relying on Hackage
> >>>> for today:
> >>>>
> >>>> * A centralized location for accessing package metadata (i.e., the
> >>>> cabal files) and the package contents themselves (i.e., the sdist tarballs)
> >>>> * A central authority for deciding who is allowed to make releases of
> >>>> packages, and make revisions to cabal files
> >>>>
> >>>> In my opinion, fixing the first problem is in fact very straightforward
> >>>> to do today using existing tools. FP Complete already hosts a full Hackage
> >>>> mirror[3] backed by S3, for instance, and having the metadata mirrored to a
> >>>> Git repository as well is not a difficult technical challenge. This is the
> >>>> core of what Mathieu was proposing as far as composable infrastructure,
> >>>> corresponding to next actions 1 and 3 at the end of his blog post (step 2,
> >>>> modifying Hackage, is not a prerequisite). In my opinion, such a system
> >>>> would far surpass in usability, reliability, and extensibility our current
> >>>> infrastructure, and could be rolled out in a few days at most.
> >>>>
> >>>> However, that second point- the central authority- is the more
> >>>> interesting one. As it stands, our entire package ecosystem is placing a
> >>>> huge level of trust in Hackage, without any serious way to vet what's going
> >>>> on there. Attack vectors abound, e.g.:
> >>>>
> >>>> * Man in the middle attacks: as we are all painfully aware,
> >>>> cabal-install does not support HTTPS, so a MITM attack on downloads from
> >>>> Hackage is trivial
> >>>> * A breach of the Hackage Server codebase would allow anyone to upload
> >>>> nefarious code[4]
> >>>> * Any kind of system level vulnerability could allow an attacker to
> >>>> compromise the server in the same way
> >>>>
> >>>> Chris's package signing work addresses most of these vulnerabilities,
> >>>> by adding a layer of cryptographic signatures on top of Hackage as the
> >>>> central authority. I'd like to propose taking this a step further: removing
> >>>> Hackage as the central authority, and instead relying entirely on
> >>>> cryptographic signatures to release new packages.
> >>>>
> >>>> I wrote up a strawman proposal last week[5] which clearly needs work to
> >>>> be a realistic option. My question is: are people interested in moving
> >>>> forward on this? If there's no interest, and everyone is satisfied with
> >>>> continuing with the current Hackage-central-authority, then we can proceed
> >>>> with having reliable and secure services built around Hackage. But if
> >>>> others- like me- would like to see a more secure system built from the
> >>>> ground up, please say so and let's continue that conversation.
> >>>>
> >>>> [1] https://www.fpcomplete.com/blog/2015/03/composable-community-infrastructure
> >>>> [2] https://github.com/commercialhaskell/commercialhaskell/wiki/Package-signing-detailed-propsal
> >>>> [3] https://www.fpcomplete.com/blog/2015/03/hackage-mirror
> >>>> [4] I don't think this is just a theoretical possibility for some point
> >>>> in the future. I have reported an easily triggerable DoS attack on the
> >>>> current Hackage Server codebase, which has been unresolved for 1.5 months
> >>>> now
> >>>> [5] https://gist.github.com/snoyberg/732aa47a5dd3864051b9
> >>>>
> >> _______________________________________________
> haskell-infrastructure mailing list
> haskell-infrastructure at community.galois.com
> http://community.galois.com/mailman/listinfo/haskell-infrastructure
>