DRBG pre-announce and a discussion on RNG/Crypto infrastructure

Thomas DuBuisson thomas.dubuisson at gmail.com
Sun May 30 19:06:02 EDT 2010


All,
So I've done some initial work on the type classes and module layout.

I feel the digest class is in good shape.  DRGB hasn't been ported to
this interface, which would be decent validation, but I don't see any
major issues with either making instances or using the interface for
any generic algorithm that can operate with any hash function.  I've
previously required "(Pretty d) =>", but that's probably excessive and
adds another dependency.  Still, Digest authors __should__ have Pretty
instances for the digests!

===============
class (Binary d, Serialize d)
    => Hash ctx d | d -> ctx, ctx -> d where
  outputLength  :: d -> BitLength  -- outputLength must ignore its argument
  blockLength   :: d -> BitLength   -- blockLength must ignore its argument
  hash          :: ByteString -> d
  initialCtx    :: ctx
  updateCtx     :: ctx -> ByteString -> ctx
  finalize      :: ctx -> d
  strength      :: d -> Int  -- strength must ignore its argument
===============

The Cipher instance is in somewhat worse shape.  Aside from me using
the English term of 'Cipher' for the class then dropping to the
American terms of 'encrypt' and 'decrypt' for the functions, there is
also the issue of how to represent the cipher text type.  Using the
type system to enforce (or aiding analysis of) information flow has
always been tempting, but I've never found a convenient way to express
this.  Below is one attempt that falls apart if you A) want to receive
information (ByteString) and decrypt it or B) want your higher level
cryptographic operation implementations (cbc mode, etc) to look sane
(as I found out).

===============
class (Monoid ct, Binary ct, Serialize ct) => Cipher ct k | k -> ct where
  blockSize     :: k -> BitLength
  encrypt       :: k -> B.ByteString -> ct
  decrypt       :: k -> ct -> B.ByteString
  buildKey      :: B.ByteString -> Maybe k
  keyLength     :: k -> BitLength
================

Instead I'm thinking of just forcing all cipher text to be a strict
ByteString (below).  Higher level operations, such as modes, then can
use these and produce results :: lazy ByteString.  If someone wants to
build a Cipher instance for an algorithm that is fundamentally a
stream cipher (vs a block cipher + stream mode) then this won't
suffice, but I don't see any such algorithm in common use.

================
class Cipher k where
  blockSize     :: k -> BitLength       -- ^ blockSize must ignore its argument
  encrypt       :: k -> B.ByteString -> B.ByteString
  decrypt       :: k -> B.ByteString -> B.ByteString
  buildKey      :: B.ByteString -> Maybe k
  keyLength     :: k -> BitLength       -- ^ keyLength may inspect its
argument to return the length
================

I'm still debating how to implement "strength" for both Digest and
Cipher.  Two obvious choices are A) use different names in each class
"cipherStrength" and "digestStrength" B) Have another type class that
defines "strength".  I'm ambivalent here - if someone has an argument
one way or another please speak up.

The Module and Package breakdown:

PACKAGE: crypto
Data
        Crypto
                Types   (type Salt = Nonce, newtype Nonce, BitLength...)
                Classes (Digest, Cipher, Asymmetric?)  -- what would
Asymmetric look like?
                HMAC (hmac)
                Modes (cbc, ofb, cfb)
                Random?  Still considering
System
        Crypto
                Random (getEntropy, getTrueEntropy)
                -- urandom vs random.  On Windows just use
cryptGenRandom for both
Test
        Crypto
                Hash (KATs, Props)
                Cipher (KATs, Props)
                Asymmetric (KATs, Props)
                -- statistical tests for RNG if someone wants to do the work

PACKAGE: crypto-algs

Data.Crypto.Algs (all algorithms are exported from the base module)
                .PBKDF2 -- from pbkdf2
                .AES  -- from simpleAES
                .DES  -- from ???
                .SHA  -- from SHA2 pkg
                .MD5  -- perhaps from pureMD5, improved with FFI for accumulator
                .arc4  -- ???
                .DRBG.HMAC  -- from DRBG
                .DRBG.Hash    -- from DRBG


PACKAGE: Cryptography
Data.Crypto exports all of crypto-algs and crypto.  It's a metapackage.


Cheers,
Thomas


On Tue, May 25, 2010 at 3:11 PM, Thomas DuBuisson
<thomas.dubuisson at gmail.com> wrote:
> The DRBG package [1] is a partial implementation of NIST SP 800-90
> section 10, specifically the hash and HMAC based cryptographically
> secure random number generators are implemented.  Before doing further
> work I feel the state of affairs of cryptography on hackage needs to
> change.
>
> As has became too common, DRBG implements (or should implement)
> several features that could be done once for all time in a quality
> Crypto package.  Features such as type classes for hashes and ciphers,
> an hmac definition, and a MonadRandom or something similar for random
> number generators that can fail.  What existing code implements one or
> more of these features that should be able to use a central Crypto
> package?  Happstack, SHA2, DRBG, SHA, hS3 (actually uses crypto
> despite [Word8] where ByteString should be used), hoauth, and a few
> partial-bindings to OpenSSL satisfy several needs.
>
> Future Crypto Library Principles
> I propose two "new" libraries be built and receive some higher level
> of community attention.  One is "Crypto" which will define type
> classes and a limited set of common generic algorithms (ex: HMAC,
> cipher modes of operation).  The other is "crypto-algs" which
> implements (and instantiates type classes for) as many common
> algorithms as we can reasonably group together with similar
> interfaces.  I feel the breakdown is important - if the
> community-accepted interface is separated from any algorithms then
> maintainers of alternate implementations are more likely to accept the
> interface.
>
> Enumerating principles I support:
> * Lazy ByteStrings should be used for all input data
> * While "over-typeclassification" should be avoided, two or three
> basic classes of "Digest", "Cipher", and "CryptoRandom" make good
> sense.
> * CryptoRandom, and related infrastructure would closely follow Random
> and MonadRandom except for allowing clean failure.
> * Digest result types should be unique to the digest (a la pureMD5,
> contrary to SHAs use of "Digest" for 4 different hashs and contrary to
> SHA2s use of ByteString for all digests)
> * For Crypto-Algs pure Haskell code may be preferred but should not be
> used when costing over ~20% performance degradation when compared to
> other available implementations.
> * No dependence on external libraries (e.g. no OpenSSL bindings)
> * Keys should receive their own data type declarations (no
> cryptographic information floating around in a naked ByteString)
> * Data types should have Binary and Serialize instances.
> * All code should be licensed BSD3 or equivalent
>
> Open Questions on Class Design
> What should be made instances of our classes.  Should an empty data
> declaration have an instance, and the class have a way to "extract"
> the actual digest [2]?  Perhaps it should be the data type containing
> the digest that is the instance, so the hash algorithm can be inferred
> based on the result type.
>
> Required methods:  I'd propose we follow [3] (see [2] for instances)
> except removing the empty data declarations, using the digest/key
> types for the instance.  In brief, this includes encrypt/decrypt (or
> 'hash), strength, outputLength, blockLength,
>
> What algorithms are generic enough to include in Crypto?  I'd say
> anything that works for all instances of {Cipher, Digest}.
>
> Minimum quality for inclusion: A suite of KATs and some straight
> forward QuickCheck tests.
>
> Anticipating Concerns
> 1) (Re-) Implementing Crypto is hard / dangerous / a waste of time
> Most the benefit I aim for is in the common interface, not the
> algorithms.  That said, the -Algs package can largely be a gathering
> of best-of-hackage with sanitized interface and continually-updated as
> higher-performance implementations come available.  This might have
> been how the current "Crypto" package started, I don't know, but it is
> no longer in a sane or useful state.
>
> 2) OpenSSL is fine
> For some people, yes - absolutely.  Not just fine but insane to
> replace, seeing as it's (partly) FIPS certified.  For other people,
> such as basically all the packages I listed above, it's obviously
> lacking something be it portability, footprint, or interface.
>
> 3) The Monad class /  MonadRandom can fail - it's called "error" and "fail".
> This is a matter of taste.  I personally wish Haskell didn't have
> exceptions or had an explicitly exceptionless type.  At any rate, the
> random work would probably come last, especially if I'm doing the
> implementation.
>
> I intend to tackle this once I get some personal time freed up (around
> mid June).  If you're interested then please feel free to lay claim to
> some part of this problem or comment on the proposal.
>
> Cheers,
> Thomas
>
> [1] http://code.haskell.org/~tommd/DRBG/
> Remaining tasks include optimization, test suite (NIST provides a
> large body of KATs), self test procedures, and the higher level
> interface (section 9).
> [2] http://code.haskell.org/~tommd/DRBG/Test/KAT.hs
> [3] http://code.haskell.org/~tommd/DRBG/Data/Crypto/Classes.hs
>


More information about the Libraries mailing list