Libraries and hierarchies

Fri, 1 Aug 2003 09:39:38 -0700

On Fri, Aug 01, 2003 at 05:00:55PM +0100, Keith Wansbrough wrote:
> [stuff about GUIDs]
> 
> Careful!  There are two things one might tie GUIDs to.
> 
> You could compute a hash of (or associate a GUID with) the *interface*, 
> or the *implementation*.  These are different things.  Until we know 
> which we want, we should support both.

Yes, which is why each library can export an arbitrary number of GUIDs.
Oddly enough, I considered doing what you mention below (a hash of the
interface), but chose not to, so that the same interface can be exported
under several GUIDs. This reflects the fact that there could be hidden
assumptions (such as relying on bugs) behind an interface which should be
reflected in the GUID. I also wanted the tool to have simple semantics
relative to the Haskell base language: thinking of it as modules just
having multiple module declarations and export lists should make it
easy to adopt.

plus, this was much easier to implement as a first stab at the problem :)

> One might reasonably say "this program needs version X of the 
> interface".  That is, we don't care how it's implemented, but it had 
> better export these functions at these types.

> But one might also reasonably say "this program needs version Y of the 
> implementation".  That is, it requests a particular *implementation* - 
> maybe it depends on certain bugs that are fixed in that version (or are 
> not fixed!).  Or it's only been tested against that implementation, and 
> it wants to provide some certification guarantees ("This software is 
> DoD-certified to give appropriate results with specified inputs").

The idea then is that you would run hsguid -g twice to generate an
implementation ID and an interface ID. Old implementations could be
archived and their interface ID deleted.
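
A minimal sketch of that scheme, modelled as plain Haskell data rather
than hsguid pragmas. Every name here (Guid, Library, provides, resolve)
and every GUID string is hypothetical and only illustrates the idea; none
of it is hsguid's actual syntax or API.

```haskell
import Data.List (find)

-- Hypothetical model: a GUID is just an opaque string here.
newtype Guid = Guid String deriving (Eq, Show)

data Library = Library
  { libName  :: String
  , libGuids :: [Guid]   -- every GUID this build exports
  } deriving Show

-- A client request is satisfied if the library exports the GUID asked for.
provides :: Library -> Guid -> Bool
provides lib g = g `elem` libGuids lib

-- Placeholder values, not real generated GUIDs.
ifaceV1, implA, implB :: Guid
ifaceV1 = Guid "iface-v1"
implA   = Guid "impl-a"
implB   = Guid "impl-b"

-- The archived build keeps its implementation ID but has had its
-- interface ID deleted; the current build exports both kinds.
oldBuild, newBuild :: Library
oldBuild = Library "foo-old" [implA]
newBuild = Library "foo-new" [ifaceV1, implB]

-- Pick the first library that exports the requested GUID.
resolve :: Guid -> [Library] -> Maybe Library
resolve g = find (`provides` g)
```

Under this model a client asking for ifaceV1 gets the current build,
while one pinned to implA still finds the archived build.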

> A side point is that GUIDs can be generated in two ways:
> 
> 1. The traditional way: make up a 128-bit random number and insert it 
> into the interface or implementation (as appropriate).  Use this as the 
> name for that thing.
> 
> 2. The nifty way: have the compiler compute a hash (SHA-1 or MD5) of 
> the interface or implementation (as appropriate).  Use this as the name 
> for that thing.

Yeah, that might be a good way to do it if I could have compiler
support, but I would need to be able to append some 'salt' to take care
of hidden assumptions not reflected in the raw interface. I actually
implemented something like this for datatypes, based on XDR... hmm.. 

http://haskell.org/pipermail/haskell-cafe/2001-September/002215.html
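
A rough sketch of the 'salt' idea, assuming the interface is available as
a string. FNV-1a stands in for SHA-1 or MD5 only so the example needs
nothing outside base; fnv1a and saltedId are made-up names, and this is
not the code from the post linked above.

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.List (foldl')
import Data.Word (Word64)

-- 64-bit FNV-1a, folded over character code points rather than bytes.
-- A stand-in for a real digest; it is not cryptographically strong.
fnv1a :: String -> Word64
fnv1a = foldl' step 14695981039346656037
  where
    step h c = (h `xor` fromIntegral (ord c)) * 1099511628211

-- Hash the interface text together with a salt that names the hidden
-- assumptions. The NUL separator keeps (salt, iface) pairs unambiguous.
saltedId :: String -> String -> Word64
saltedId salt iface = fnv1a (salt ++ "\0" ++ iface)
```

The same interface text with a different salt yields a different ID,
which is how an assumption such as "relies on bug X" would show up in
the name.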

> Option (2) has the advantage that it's one step shorter, and it's 
> safer: with (1), you can generate one GUID but accidentally use it on 
> two distinct interfaces / implementations; with (2), this is 
> (essentially) impossible (although it may be possible to achieve a 
> collision intentionally; malice is a separate issue that should be 
> addressed in other ways).

Eh. I don't think a random collision is a real concern. At worst,
hsguid spits out an error and you angrily email the authors for doing
the equivalent of winning the lottery 10 times in a row :) The problem
is resolved just by copying the offending pragma with a new ID and
using that one.

> For (2), we need to agree what to hash.  The options are basically "the 
> source text of the module" (or some sub-part of it for the interface 
> case), or "the abstract syntax tree of the module".  The latter is 
> probably nicer, but requires some agreement between compiler writers if 
> it is to be valid across compilers.

What I did was implement a Hash class with a single function which
returned a hash. The hash was constant if the type was a terminal, and
was the type's own hash combined with the hashes of all its children's
types if it was a node. Recursive types were detected by DrIFT and
explicitly broken.
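
A reconstruction of the shape of such a class, not the original code. It
assumes GHC with ScopedTypeVariables, uses FNV-1a as a placeholder hash,
and writes by hand the instances that a tool would normally generate. The
names TypeHash, typeHash, node and recMarker are invented for this sketch.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Data.Bits (xor)
import Data.Char (ord)
import Data.List (foldl')
import Data.Proxy (Proxy (..))
import Data.Word (Word64)

-- Placeholder hash (64-bit FNV-1a over code points).
fnv1a :: String -> Word64
fnv1a = foldl' step 14695981039346656037
  where step h c = (h `xor` fromIntegral (ord c)) * 1099511628211

class TypeHash a where
  typeHash :: Proxy a -> Word64

-- Terminals: the hash is a constant derived from the type's name.
instance TypeHash Int  where typeHash _ = fnv1a "Int"
instance TypeHash Bool where typeHash _ = fnv1a "Bool"

-- Nodes: the type's own name hashed together with its children's hashes.
node :: String -> [Word64] -> Word64
node name kids = fnv1a (name ++ concatMap (\k -> ':' : show k) kids)

instance (TypeHash a, TypeHash b) => TypeHash (a, b) where
  typeHash _ = node "(,)" [typeHash (Proxy :: Proxy a), typeHash (Proxy :: Proxy b)]

instance TypeHash a => TypeHash (Maybe a) where
  typeHash _ = node "Maybe" [typeHash (Proxy :: Proxy a)]

-- Fixed marker used wherever a type refers back to itself.
recMarker :: Word64
recMarker = fnv1a "<rec>"

-- Lists are recursive ([a] = [] | a : [a]); the self-reference is
-- replaced by the marker instead of being followed, so this terminates.
instance TypeHash a => TypeHash [a] where
  typeHash _ = node "[]" [typeHash (Proxy :: Proxy a), recMarker]
```

For example, typeHash (Proxy :: Proxy [(Int, Bool)]) hashes the list
node over the pair node over the two terminals.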

Although for this, the easiest thing would probably be to hash the
canonicalized export list (with types). But not for hsguid; perhaps
for some other tool which does a better job of generating
hsguid-compatible pragmas.
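
One way that could look, as a sketch only. It assumes the export list is
already available as (name, type) strings and takes "canonical" to mean
whitespace-normalized and sorted; a real tool would also have to
normalize things like type variable names. Export, canonical and
interfaceHash are hypothetical names, and FNV-1a is again a placeholder
for a real digest.

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.List (foldl', intercalate, sort)
import Data.Word (Word64)

-- Placeholder hash (64-bit FNV-1a over code points).
fnv1a :: String -> Word64
fnv1a = foldl' step 14695981039346656037
  where step h c = (h `xor` fromIntegral (ord c)) * 1099511628211

-- An exported name paired with its type, both as source text.
type Export = (String, String)

-- One "name :: type" line per export, with whitespace in the type
-- collapsed and the lines sorted so declaration order does not matter.
canonical :: [Export] -> String
canonical = intercalate "\n" . sort . map render
  where render (n, t) = n ++ " :: " ++ unwords (words t)

interfaceHash :: [Export] -> Word64
interfaceHash = fnv1a . canonical
```

With this, reordering exports or changing spacing leaves the hash alone,
while adding, removing, or retyping an export changes it.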

> For background to this discussion, see our forthcoming ICFP paper,
> 
> James J. Leifer, Gilles Peskine, Peter Sewell, Keith Wansbrough (2003). 
> Global Abstraction-Safe Marshalling with Hash Types
> 
> which is available at
> 
> http://www.cl.cam.ac.uk/~kw217/research/paper-abstracts.html#Leifer*03:Global

Ooh, looks interesting. Will read. I have really wanted something like
this for Haskell for a while.

        John

-- 
---------------------------------------------------------------------------
John Meacham - California Institute of Technology, Alum. - john@foo.net
---------------------------------------------------------------------------