[Haskell-cafe] Ticking time bomb

Alexander Kjeldaas alexander.kjeldaas at gmail.com
Fri Feb 1 10:28:29 CET 2013


Forgot the list.


On Fri, Feb 1, 2013 at 10:21 AM, Alexander Kjeldaas <
alexander.kjeldaas at gmail.com> wrote:

>
> Trying to avoid the wrath of Ketil, I'll refrain from suggesting that we do
> anything; I'll just explain why git is good at this, and not arbitrary. :-)
>
> Most systems that I know of that verify *anything* use Merkle trees, or
> something very similar.
> http://en.wikipedia.org/wiki/Hash_tree
>
> For example, the TPM chip on your motherboard, which is used to ensure
> the integrity of the Google Chromebook and Windows BitLocker
> http://en.wikipedia.org/wiki/Trusted_Platform_Module
> (simplified example: in secure memory it stores H1=hash(microcode), then
> H2=hash(H1 || BIOS), then H3=hash(H2 || MBR), then H4=hash(H3 || kernel),
> ...).
>
> Or the integrity of the bitcoin currency.
> https://en.bitcoin.it/wiki/Protocol_specification#Merkle_Trees
>
> So these are pretty different systems, but it all boils down to computing
> cryptographically secure hashes over a previous hash + new data to ensure
> integrity of the new combined data.  Given only one verified hash in such a
> system, neither the data nor its history of mutation can be forged without
> detection.  "History" can mean which software runs on your computer (TPM),
> which transactions are valid (Bitcoin), or which commits have been made in
> an SCM (git, mercurial).
>
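A minimal sketch of the "hash over previous hash + new data" idea, in Python.
It mirrors the simplified H1..H4 example above and is not how a real TPM or
git stores things; the function names and the stage contents are made up for
illustration, and SHA-256 is an arbitrary choice of hash.

```python
import hashlib


def extend(prev_digest, data):
    """Return hash(prev_digest || data), i.e. one link of the chain."""
    return hashlib.sha256(prev_digest + data).digest()


def chain_head(stages):
    """Fold every stage into one head digest, starting from an empty digest."""
    digest = b""
    for stage in stages:
        digest = extend(digest, stage)
    return digest


# Hypothetical boot stages standing in for microcode, BIOS, MBR and kernel.
boot = [b"microcode", b"BIOS", b"MBR", b"kernel"]
trusted_head = chain_head(boot)

# Changing any earlier stage changes the head, so one verified head
# covers the data and the order in which it was added.
tampered = [b"microcode", b"evil BIOS", b"MBR", b"kernel"]
print(chain_head(boot) == trusted_head)      # True
print(chain_head(tampered) == trusted_head)  # False
```

Only trusted_head has to be stored or signed; everything else can be
recomputed and compared against it.
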
> So git is not magical, it is just a practical implementation of something
> that works.  Any other *general* solution will be based on similar basic
> principles.  Mercurial does this and there is a GPG extension for it.
>
> Bazaar does not use a SHA1-based content addressable storage, so while a
> signed commit signs the tree, it does not represent the history (no "hash
> of hash", only "hash" if you look at it as a merkle tree), but it does
> chain commits. To verify a tree + history, *all* commits must be signed,
> which is fragile IMO.
>
> Regarding Darcs, my understanding is that it deliberately rejects hashing
> the tree, so it is not clear to me how to verify tree+history.  Patches can
> be signed, but as long as patches are independent, there is no "hash of
> hash" component which makes it difficult to see how one can verify the
> tree.  My understanding of darcs is very limited though.
>
> But to be *practical* the rest of the workflow should be secure as well,
> so you need:
>
> 1. A way to distribute the merkle tree (git pull/clone/push).
>     Distribution of the data that is to be signed is required for
> security, because otherwise the representation of the data itself (web view
> or 'git diff') can be compromised.  Signatures have no meaning if you
> cannot trust that you know what you sign.
> 2. A way to sign a change to the merkle tree (git commit -S, git tag -s etc)
> 3. A way to have multiple signatures on a given hash (i.e. commit, or tag,
> or whatever it is called in a particular merkle tree implementation).
>     This is required to avoid catastrophic "owning" of core developers.
>  If required, I do think that multiple signatures can be emulated by a
> structured set of commits that have single signatures though.
> 4. A way to reliably do code reviews on the changes to the data (git diff)
>     This is really the same as 1).  We cannot reliably do 'git diff'
> unless the developers do it on their own equipment, thus the system must be
> distributed.
> 5. Given the requirement for a distributed merkle tree, some merge
> strategy is needed.  It is thus practical, though not required, to have
> good support for this.
>     (Btw, even the bitcoin hash chain has a merge strategy - the tree with
> the most compute power will win, and others are forced to "rebase" their
> transactions on that tree)
>
>
> So my choice of git is not arbitrary.  The way git works is pretty
> fundamental to verifying the integrity of stuff.
>
> Though having looked through the other options, mercurial might be a
> better fit since it is supported on Windows.
>
> Trying to solve this problem from scratch might not be such a good idea,
> because it might be very close to a reimplementation of git or mercurial.
>  Or maybe it is a good idea for someone who has some time on their hands.
>  Just be aware that the requirements for verifying anything are very close
> to what existing distributed SCM systems do.
>
> Alexander
>
>
> On Fri, Feb 1, 2013 at 3:32 AM, Kevin Quick <quick at sparq.org> wrote:
>
>>> Git has the ability to solve all of this.
>>>
>> ...
>>
>>> 2. Uploads to hackage either happen through commits to the git
>>> repository, or an old-style upload to hackage automatically creates a
>>> new anonymous branch in the git repository.
>>> 3. The git repository is authoritative.  Signing releases, code reviews
>>> etc. all happen through the git repositories.  This gives us all the
>>> flexibility of a git-style trust model.
>>>
>> ...
>>
>>> 5. Who owns which package names can be held in a separate meta-tree git
>>> repository, and can have consensus requirements on commits.
>>> 6. This special meta-tree can also contain suggested verification keys
>>> for commits to the other hackage git trees.  It can even contain keys
>>> that protect Haskell namespaces in general, so that no hackage package
>>> can overwrite a protected Haskell namespace.
>>> 7. As backward compatibility, the meta-tree can sign simple hashes of
>>> already existing packages on hackage.
>>>
>> ...
>>
>>> 1. There could be some git magic script that downloads the signed git tag
>>> objects only (small data set).  Then another script would generate a
>>> git-compatible SHA1 of the extracted tarball, given that the tarball was
>>> fetched from hackage.
>>> 2. Or cabal-install could fetch directly from git repositories and use
>>> standard git verification.
>>> 3. Or a trusted machine creates tarballs from the git repositories, signs
>>> them and uploads them to hackage.
>>>
>>
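One ingredient of the "git-compatible SHA1" script mentioned above can be
sketched in Python: git names a file's content (a blob) by the SHA-1 of a
short header followed by the bytes.  This covers single files only; checking
a whole extracted tarball would also need git's tree objects, which are not
shown here.  The function name is hypothetical.

```python
import hashlib


def git_blob_sha1(content):
    """SHA-1 that git assigns to a blob: hash of b'blob <size>' + NUL + content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


print(git_blob_sha1(b"hello\n"))  # ce013625030ba8dba906f756967f9e9ca394464a
```

The result should match what 'git hash-object <file>' prints for a file with
the same contents, so it can be compared against a blob id reachable from a
signed tag.
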
>> Without details of git's trust/verification model, it's difficult to see
>> how this particular SCM tool provides the trust capabilities being
>> discussed any better than a more focused solution.  Additionally, the use
>> of git is also difficult for many Windows users (80MB installed footprint,
>> last I tried).  git has a much broader solution space than simply ensuring
>> the integrity of package downloads, especially when "there could be some
>> git magic script" that is still not identified and appears to have the same
>> insecurities as the package download/upload itself.
>>
>> Instead of using the "git" solution and looking for problems to solve
>> with it, IMHO we should work from a clearly defined problem to a solution
>> in general terms as our class, and then determine what specific tools
>> represent an instance of that solution class.
>>
>> --
>> -KQ
>>
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>
>
>