FW: First Attempt at Crypto Library

Matt Harden matth@mindspring.com
Tue, 22 Apr 2003 23:00:05 -0500


David Roundy wrote:
> On Tue, Apr 22, 2003 at 12:15:41PM +0100, Malcolm Wallace wrote:
> 
>>I concur that Codec is a better name than FileFormat.  It is more
>>general, since the encodings may in fact never be stored in files -
>>they could just be transmitted directly to a decoder over a network
>>for instance.
> 
> 
> Me too.

Me too, too.

>>>If FileFormat became Codec, then the existing FileFormat.Encoding looks
>>>a bit odd.  But moving FileFormat.Encoding.Base64 up to Codec.Base64
>>>(similarly for FileFormat.Encoding.Yenc) would seem to make sense.
>>
>>Since we already have a little hierarchy based on the codec's purpose:
>>    Codec.Image			(e.g. Codec.Image.Jpeg)
>>    Codec.Video			(e.g. Codec.Video.Mpeg2)
>>then how about adding
>>    Codec.Text			(e.g. Codec.Text.Base64)
>>because its role, like uuencoding, is to convert binary streams to
>>7-bit ASCII text i.e. transmissible by email.
> 
> 
> I would have thought Codec.Binary.Base64, since in the other cases (Image
> and Video), the name is for the content you wish to code, not the format in
> which it is encoded.  e.g. Jpeg is a format for encoding an image in
> binary, while xpm is a format for encoding an image as text, but both would
> go under Codec.Image.

I would agree with Codec.Binary.Base64.  Codec.Text should be reserved 
for codecs that deal with text data; i.e. translating text between 
various encodings like Unicode, ISO 8859-1, ASCII, EBCDIC, etc.
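
As a rough illustration of what a text codec in that sense could look like, here is a minimal sketch of an ISO 8859-1 codec. The function names and the module name in the comment are hypothetical, not part of any existing library. It relies only on the fact that ISO 8859-1 byte values coincide with the first 256 Unicode code points.

```haskell
-- Hypothetical home for this code: Codec.Text.Latin1 (an assumption).
import Data.Char (chr, ord)
import Data.Word (Word8)

-- | Decode ISO 8859-1 bytes to Unicode characters. Every byte is valid,
-- because byte n maps directly to code point n.
decodeLatin1 :: [Word8] -> String
decodeLatin1 = map (chr . fromIntegral)

-- | Encode Unicode characters as ISO 8859-1 bytes. Returns Nothing if
-- any character lies outside the 0..255 range.
encodeLatin1 :: String -> Maybe [Word8]
encodeLatin1 = mapM toByte
  where
    toByte c
      | ord c < 256 = Just (fromIntegral (ord c))
      | otherwise   = Nothing
```

Note the asymmetry: decoding always succeeds, but encoding can fail, which is typical of codecs that translate between character sets of different sizes.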

>>A codec that is truly multi-purposed would sit directly under Codec,
>>but I can't immediately think of an example that might fit.

I would prefer Codec.General for any truly general-purpose codecs, but 
we shouldn't need it.  It seems to me that any codec would have to be 
designed for some purpose, and it should be possible to classify them 
based on that purpose.

> gzip? Although that would probably go under Compression or something.

Yes, but it should probably be called by the generic term Deflate.  I 
would suggest Codec.Compression.Deflate.  The zip file format is a 
different story; I would suggest FileFormat.Archive.Zip for that.  We 
could also have FileFormat.Archive.Tar and FileFormat.Archive.Cab, as 
some other examples.  FileFormat.Archive could be defined as providing 
methods to access any file format designed to store other files of 
arbitrary type; Gzip files would fit that definition even though they 
can only store a single file.  So we could still have a 
FileFormat.Archive.Gzip, which would presumably use 
Codec.Compression.Deflate to do the bulk of the work.

>>>Question 2: what should the insides of the FileFormat.Encryption (or
>>>Codec.Encryption) hierarchy look like?
>>
>>I would have thought that its population would look something like:
>>
>>    Codec.Encryption.DES
>>    Codec.Encryption.RSA
>>    Codec.Encryption.Blowfish
>>
>>etc.  No need for any deeper structure.  Any attempt to further
>>classify crypto schemes by method or purpose would be confusing I
>>think.  Crypto is just crypto, i.e. binary to binary.

What about ROT13, or Enigma?  Those are text to text.  Not terribly 
secure today, but I can think of a more secure recent example: the 
Solitaire cipher (see http://www.counterpane.com/solitaire.html).  I 
don't think that justifies another hierarchy level, though.
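
To show what "text to text" means here, a minimal ROT13 sketch follows. The function name and the module name in the comment are hypothetical. The input and output are both String, not a byte stream, and applying the function twice gives back the original text.

```haskell
-- Hypothetical home for this code: Codec.Encryption.Rot13 (an assumption).
import Data.Char (chr, isAsciiLower, isAsciiUpper, ord)

-- | Rotate ASCII letters by 13 places; all other characters pass through
-- unchanged. ROT13 is its own inverse, so this both encodes and decodes.
rot13 :: String -> String
rot13 = map rotate
  where
    rotate c
      | isAsciiLower c = shift 'a' c
      | isAsciiUpper c = shift 'A' c
      | otherwise      = c
    -- Move c thirteen letters along, wrapping around within its case.
    shift base c = chr (ord base + (ord c - ord base + 13) `mod` 26)
```

For example, rot13 "Hello, World!" yields "Uryyb, Jbeyq!".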

> I would imagine you'd want a Codec.Encryption.PublicKey and
> Codec.Encryption.Symmetric, since public key encryption supports a
> different set of operations from simple symmetric ciphers.  I'd think that
> all symmetric encryption could use a common interface of some sort, and
> perhaps all PK encryption could do the same, and it would be nice to
> reflect this in the hierarchy.  Then you'd probably also want a
> Codec.Encryption.Hash for cryptographic hashes.  I'd think you'd also want
> to create somewhere a module for converting a passphrase into a key of an
> arbitrary number of characters (which would probably be used by all the
> symmetric ciphers if they want to use a common interface), but here I'm
> revealing my ignorance of cryptography, since I don't know what such a
> thing is called.  Some sort of variable-sized hash I guess...

I'm not sure public key vs. symmetric deserves an additional level in 
the hierarchy.  Each encryption codec is inherently either symmetric or 
public-key, and should implement the appropriate interface for its class.  
I think I'd like Codec.Encryption.PublicKey to be the name of the 
public-key class, which each public-key algorithm would implement.
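
One way the two interfaces could be sketched is below. Everything here is an assumption for illustration: the class and method names, the phantom Public/Private tags, and the two toy instances (a repeating-key XOR and textbook RSA on single Integer blocks with tiny numbers). Neither instance is secure, and a real interface would work on byte strings with proper padding.

```haskell
import Data.Bits (xor)
import Data.Word (Word8)

-- | Symmetric ciphers: the same key encrypts and decrypts.
class Symmetric k where
  encryptS :: k -> [Word8] -> [Word8]
  decryptS :: k -> [Word8] -> [Word8]

-- Phantom tags marking which half of a key pair a value holds.
data Public
data Private

-- | Public-key ciphers: encrypt with the public half, decrypt with the
-- private half. Messages are single Integer blocks to keep the toy small.
class PublicKey k where
  encryptP :: k Public  -> Integer -> Integer
  decryptP :: k Private -> Integer -> Integer

-- Toy symmetric instance: repeating-key XOR (key must be non-empty).
newtype XorKey = XorKey [Word8]

instance Symmetric XorKey where
  encryptS (XorKey ks) = zipWith xor (cycle ks)
  decryptS = encryptS

-- Toy public-key instance: textbook RSA holding a modulus and an exponent.
data ToyRSA tag = ToyRSA Integer Integer

instance PublicKey ToyRSA where
  encryptP (ToyRSA n e) m = powMod m e n
  decryptP (ToyRSA n d) c = powMod c d n

-- | Modular exponentiation by repeated squaring.
powMod :: Integer -> Integer -> Integer -> Integer
powMod _ 0 m = 1 `mod` m
powMod b e m
  | even e    = half * half `mod` m
  | otherwise = b * powMod b (e - 1) m `mod` m
  where half = powMod b (e `div` 2) m
```

With the usual small example key (n = 3233, e = 17, d = 2753), decryptP (ToyRSA 3233 2753) (encryptP (ToyRSA 3233 17) 65) gives back 65. The phantom tag means the type checker rejects an attempt to decrypt with a public key, which is the kind of distinction a separate class per family can express without an extra level in the module hierarchy.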

By the way, crypto hashes should go somewhere else, because they're not 
codecs.  Same goes for crypto random-number generators.

Regards,
Matt Harden