crypto-api Status + Semantics (was: DRBG pre-announce and a discussion on RNG/Crypto infrastructure)

Thomas DuBuisson thomas.dubuisson at
Mon Jun 28 15:10:33 EDT 2010

This is a status update and a discussion on semantics.

Given classes that imply a blockSize, how are instances of those
classes expected to behave?  Some options (I prefer option 1.1):

1.1 Single block (bare bones implementation)
Each call to 'update' (for Hash) or 'blockEncrypt/Decrypt' (for
BlockCipher) will consume exactly 'blockSize/8' bytes.  That is, this
follows most hash/cipher definitions quite strictly by performing
exactly one block encrypt or hash update.  Data beyond blockSize is
ignored.  If the input is empty, return Data.ByteString.empty or the
current context (for BlockCipher and Hash respectively).  Obvious
options for non-empty input of length less than blockSize are: throw
an exception or make all returns something like "Maybe a"... I'll
probably throw an exception.
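
A minimal sketch of the option 1.1 input handling, to make the three
cases concrete.  The name 'oneBlock' is hypothetical and not part of
crypto-api; it only selects the bytes a single-block method would
consume, and it assumes blockSize is given in bits and is at least 8.

```haskell
import qualified Data.ByteString as B

-- Hypothetical helper (not from crypto-api): pick the single block that
-- an option-1.1 style method would consume.  blockBits is in bits.
oneBlock :: Int -> B.ByteString -> B.ByteString
oneBlock blockBits input
  | B.null input           = B.empty             -- empty in, empty out
  | B.length input < bytes = error "oneBlock: input shorter than one block"
  | otherwise              = B.take bytes input  -- data past one block is ignored
  where
    bytes = blockBits `div` 8
```

Returning "Maybe ByteString" instead of calling 'error' would be the
other choice mentioned above.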

Users wouldn't / shouldn't care as they will be accessing these
instances via the "hash", "hash'", "hmac", "hmac'" or mode-specific
encryption methods - all of which are part of crypto-api and handle
the chunking of data already. IOW this saves duplicate work at the
slight risk of a dev doing something dumb and having a runtime exception.

1.2 Multiple of blockSize bytes
Implementations are encouraged to consume data (continue updating,
encrypting, or decrypting) until there is less than blockSize bits
available.  There is duplicate work here as both the "hash"
implementation and the "update" instance check the bytestring length
and chunkify.

1.3 Any input size works
Much like accepting any number of blocks, except that for Hashes the
'update' function must track a remainder (bytes < blockSize).  Such
accounting is done by "hash" already and only matters if, for some odd
reason, people start using the class methods directly.  For
encryption/decryption we would just truncate.

Because the example instances below were built from pre-existing
packages this is exactly what they do now.  I like this option least
of all.
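
The remainder tracking that option 1.3 requires of a hash 'update' can
be sketched as follows.  'Buffered' and 'updateAny' are hypothetical
names, the single-block step function is passed in as an argument, and
blockSize is assumed to be in bits and at least 8.

```haskell
import qualified Data.ByteString as B

-- Hypothetical wrapper: the algorithm's context plus the bytes that
-- did not yet fill a whole block.
data Buffered ctx = Buffered ctx B.ByteString

-- Accept input of any length: prepend the pending bytes, run the
-- single-block step over every full block, keep the tail for later.
updateAny :: Int -> (ctx -> B.ByteString -> ctx)
          -> Buffered ctx -> B.ByteString -> Buffered ctx
updateAny blockBits step (Buffered ctx0 pending) input =
    go ctx0 (B.append pending input)
  where
    bytes = blockBits `div` 8
    go ctx bs
      | B.length bs < bytes = Buffered ctx bs
      | otherwise =
          let (blk, rest) = B.splitAt bytes bs
              ctx'        = step ctx blk
          in ctx' `seq` go ctx' rest
```

Every instance would need this bookkeeping, which is the duplication
with "hash" noted above.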

Addressing each item in the todo list:

> - Decide on package name, replace "Crypto" or select a new name?
I think "crypto-api" is best, which means this would exist in parallel
with the "Crypto" package.  I still encourage people to move away from
/ eventually deprecate Crypto!

> - Look harder at the other classes including "BlockCipher", "AsymCipher", "StreamCipher"
A stab at these classes is now in the repo.  Nothing is in stone -
chiseling is hard work compared to typing.

> NEW ITEM: Copy Data.LargeWord from Crypto into Crypto-API

Word{128,192,256} seem useful enough and I don't think people should
rely on both Crypto and crypto-api. Perhaps I should be asking where
it truly belongs - with FFI and directly in Data.Word?  I'm ok with it
being in crypto-api but it should be in FFI - pulled into Data.Word.

> - example instances of each class

from the pureMD5 repo:
--- BEGIN Hash instance ---
instance Hash MD5Context MD5Digest where
        outputLength = Tagged 128
        blockLength  = Tagged 512
        initialCtx   = md5InitialContext
        updateCtx    = md5Update
        finalize     = md5Finalize
        strength     = Tagged 24
--- END Hash instance ---

For block ciphers I built an instance for twofish (see below).  I
realized part way in that making TwofishCipher have an extra type tag
(TwofishCipher ⍺, to allow separate instance for each key length) is
entirely a hack-job wrt the rest of the twofish code. A better long
term solution is to make a field in the TwofishCipher data declaration
which indicates the size of the key (keySize :: Int) and have a single
BlockCipher instance for "TwofishCipher" (keyLength = keySize).
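
A rough sketch of that longer-term shape, assuming nothing about the
real twofish internals: the record below is a hypothetical stand-in
(the 'keyBytes' field is a placeholder for the actual key schedule),
and 'mkTwofish' is an invented constructor that derives keySize from
the key material.

```haskell
import qualified Data.ByteString as B

-- Hypothetical stand-in for the real TwofishCipher declaration.
data TwofishCipher = TwofishCipher
  { keySize  :: Int           -- key length in bits: 128, 192 or 256
  , keyBytes :: B.ByteString  -- placeholder for the real key schedule
  }

-- Hypothetical smart constructor: one type for all key lengths, with
-- the length recorded as a value instead of a type tag.
mkTwofish :: B.ByteString -> Maybe TwofishCipher
mkTwofish bs
  | bits `elem` [128, 192, 256] = Just (TwofishCipher bits bs)
  | otherwise                   = Nothing
  where
    bits = 8 * B.length bs

-- What keyLength in a single BlockCipher instance could reduce to.
keyLength :: TwofishCipher -> Int
keyLength = keySize
```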

--- BEGIN BlockCipher instance ---
instance BlockCipher (TwofishCipher Word128) where
        blockSize = Tagged 128
        encryptBlock t = encode . liftCryptor (eb t) . decode
        decryptBlock t = encode . liftCryptor (db t) . decode
        buildKey bs    = mkStdCipher (decode bs :: Word128)
        keyLength _    = 128
--- END BlockCipher instance ---

I'm not aware of any Haskell implementation of / bindings to a stream
cipher.  If someone points one out I might try to make an instance.

WRT AsymCipher, I'll probably make an example instance for the RSA
package (PKCS1) later... feel free to beat me to it

> - example uses of each class
A Hash example can be found in DRBG.  Also, crypto-api uses Hash to
build the functions hash and hmac.

> - Collecting tests, building a test framework
I'll be adding both agnostic and algorithm-specific tests to
crypto-api.  These will be included/excluded depending on a cabal flag.

> - Move "for" and (.::.) into the Tagged library (?)
Helper functions were added to tagged, but they didn't seem to suffice
in place of "for".  I'd be happy to be corrected.
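
For readers unfamiliar with the helpers in question, here is a sketch
of what "for" and (.::.) are assumed to do: pick the Tagged constant
for a type by passing any value of that type.  The Tagged newtype is
redefined locally so the snippet stands alone, and 'HasBlockSize',
'ToyCipher' and 'blockBytes' are hypothetical names for illustration.

```haskell
-- Local stand-in for the Tagged type from the tagged package.
newtype Tagged s b = Tagged { unTagged :: b }

-- The value argument is used only for its type; it is never inspected.
for :: Tagged a b -> a -> b
for t _ = unTagged t

-- Infix spelling of the same helper.
(.::.) :: Tagged a b -> a -> b
(.::.) = for

-- Hypothetical class with a per-type constant, in the style of blockSize.
class HasBlockSize k where
  blockSize :: Tagged k Int

data ToyCipher = ToyCipher

instance HasBlockSize ToyCipher where
  blockSize = Tagged 128

-- Block size in bytes for whatever cipher value is at hand.
blockBytes :: HasBlockSize k => k -> Int
blockBytes k = (blockSize `for` k) `div` 8
```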

> - Decide what we want on padding
Could someone who needed the padding in Crypto tell me their use
cases?  Are people mainly wanting PKCS5?  I'm tempted to leave this
out unless there is some need for them in the generalization / API.
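
For reference while weighing this, the PKCS5-style scheme (generalized
to other block sizes as in PKCS#7) appends n bytes, each of value n,
where n is the number of bytes needed to reach the next block
boundary.  A minimal sketch, with hypothetical function names and
assuming a block size between 1 and 255 bytes:

```haskell
import qualified Data.ByteString as B

-- Append n bytes of value n, with 1 <= n <= blockBytes, so that the
-- result is a whole number of blocks.
pkcsPad :: Int -> B.ByteString -> B.ByteString
pkcsPad blockBytes bs = B.append bs (B.replicate n (fromIntegral n))
  where
    n = blockBytes - (B.length bs `mod` blockBytes)

-- Strip the padding again; Nothing if the padding is malformed.
pkcsUnpad :: Int -> B.ByteString -> Maybe B.ByteString
pkcsUnpad blockBytes bs
  | B.null bs                                  = Nothing
  | n < 1 || n > blockBytes || n > B.length bs = Nothing
  | B.any (/= B.last bs) padding               = Nothing
  | otherwise                                  = Just body
  where
    n               = fromIntegral (B.last bs) :: Int
    (body, padding) = B.splitAt (B.length bs - n) bs
```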

> - Decide what we want with crypto-related items that aren't directly a
> cipher or hash (ex: pbkdf2).
This will be put off till I consider crypto-algs more carefully.

> - Implement modes
Trying to figure out the API here. I think we want:
cbc, cbc', unCbc, unCbc'
ecb, ecb', unEcb, unEcb'
ctr, ctr', unCtr, unCtr'
ofb, ofb', unOfb, unOfb'

These would have types in the form of:
cbc :: (Cipher k) => k -> IV -> ByteString -> (ByteString, IV)

I've considered a suggestion to make a custom State monad to track
IVs, but I don't see the need, nor why users of crypto-api wouldn't be
better served by simpler definitions that they can wrap in their own
State monad for a given program.  At any rate, a State wrapper can be
made (part of crypto-api or not) using these underlying functions.
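
To illustrate the shape proposed above, here is a sketch of a cbc
function with that type, plus IV threading across several messages
without any State monad.  Everything here is an assumption made for
illustration: the 'Cipher' class is cut down to two hypothetical
methods, IV is taken to be a plain ByteString, 'XorCipher' is a toy
with no security value, and a trailing partial block is dropped.

```haskell
import           Data.Bits       (xor)
import qualified Data.ByteString as B
import           Data.List       (mapAccumL)
import           Data.Word       (Word8)

type IV = B.ByteString  -- assumption: an IV is one block of raw bytes

-- Hypothetical cut-down class; the real classes carry more methods.
class Cipher k where
  blockBytes   :: k -> Int
  encryptBlock :: k -> B.ByteString -> B.ByteString

-- Toy "cipher" that XORs every byte with a constant.  NOT secure.
newtype XorCipher = XorCipher Word8

instance Cipher XorCipher where
  blockBytes _               = 16
  encryptBlock (XorCipher w) = B.map (xor w)

-- CBC encryption over whole blocks.  Returns the ciphertext and the
-- last ciphertext block, which serves as the IV for the next call.
cbc :: Cipher k => k -> IV -> B.ByteString -> (B.ByteString, IV)
cbc k iv0 = go iv0 []
  where
    n = blockBytes k
    go iv acc bs
      | B.length bs < n = (B.concat (reverse acc), iv)
      | otherwise =
          let (p, rest) = B.splitAt n bs
              c = encryptBlock k (B.pack (B.zipWith xor iv p))
          in go c (c : acc) rest

-- Thread the IV through a list of messages with a plain accumulator.
cbcMany :: Cipher k => k -> IV -> [B.ByteString] -> (IV, [B.ByteString])
cbcMany k = mapAccumL step
  where
    step iv m = let (c, iv') = cbc k iv m in (iv', c)
```

A State wrapper would amount to the same 'step' function run inside
the monad, so it can sit on top of these definitions.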

3.0 Next Time
Within about 2 weeks I hope to:
- implement modes
- Copy Data.LargeWord unless there are convincing objections such as
"These will be in FFI Data.Word next week".
- Implement CTR based DRBG from NIST SP 800-90 (use BlockCipher class).
- clean pureMD5 - make it jibe with semantics from section 1.1

And later (~4 weeks)
- Define, document exceptions (if we choose to throw any as a result
of non-blockSized inputs)
- Test cases

And even later (~6 weeks)
- Release on hackage
- Talk about crypto-algs

Comments, patches, and suggestions are all welcome.

