[Haskell-cafe] Encrypting streamed data

David Turner dct25-561bs at mythic-beasts.com
Tue Jul 11 14:35:56 UTC 2017


I can't think of a terribly good way to achieve GPG/PGP-compatibility
without simply using GPG/PGP, since the file format is quite involved.

That said, here is how to implement CBC-mode block cipher encryption
using Conduit, which is suitable for something like AES256 encryption. It
is almost certainly vulnerable to side-channel attacks (timing,
cache-poisoning, etc) but as a pure function from input to output it is
equivalent to

    openssl aes-256-cbc -e -K <KEY-IN-HEX> -iv <IV-IN-HEX> -in data/plain-text.txt

which I should hope would be standard enough for your purposes.

This leaves you with the problem of storing the key and IV securely,
encrypted using the asymmetric key that you first thought of, but hopefully
that problem is surmountable!
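
As a rough illustration of that key-wrapping step, here is a minimal sketch
that encrypts a random AES256 key under an RSA public key and recovers it
again. It assumes cryptonite's Crypto.PubKey.RSA.OAEP module; the choice of
RSA-OAEP with SHA-256, the throwaway key pair generated on the spot, and the
names wrapKey and unwrapKey are all assumptions made for the example, not
anything the thread settled on.

```haskell
import           Crypto.Hash.Algorithms (SHA256 (..))
import qualified Crypto.PubKey.RSA      as RSA
import qualified Crypto.PubKey.RSA.OAEP as OAEP
import           Crypto.Random          (getRandomBytes)
import qualified Data.ByteString        as B

-- OAEP with SHA-256; this parameter choice is an assumption of the sketch.
oaepParams :: OAEP.OAEPParams SHA256 B.ByteString B.ByteString
oaepParams = OAEP.defaultOAEPParams SHA256

-- | Encrypt the symmetric key with the recipient's public key.
wrapKey :: RSA.PublicKey -> B.ByteString -> IO B.ByteString
wrapKey pub key =
  either (error . show) return =<< OAEP.encrypt oaepParams pub key

-- | Recover the symmetric key with the recipient's private key.
unwrapKey :: RSA.PrivateKey -> B.ByteString -> IO B.ByteString
unwrapKey priv wrapped =
  either (error . show) return =<< OAEP.decryptSafer oaepParams priv wrapped

main :: IO ()
main = do
  -- Throwaway 2048-bit key pair (the size is given in bytes), demo only.
  (pub, priv) <- RSA.generate 256 65537
  key         <- getRandomBytes 32 :: IO B.ByteString  -- AES256 key
  wrapped     <- wrapKey pub key
  recovered   <- unwrapKey priv wrapped
  -- The round trip should give back the original key.
  print (recovered == key)
```

In real use the recipient's public key would be loaded from wherever it is
published, and only the wrapped key would travel alongside the ciphertext.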



{-# LANGUAGE LambdaCase #-}

import           Control.Monad
import           Control.Monad.IO.Class
import           Control.Monad.Trans.Resource
import           Crypto.Cipher.AES
import           Crypto.Cipher.Types
import           Crypto.Data.Padding
import           Crypto.Error
import qualified Data.ByteString              as B
import           Data.Conduit
import           Data.Conduit.Binary
import           Data.Monoid

loadKey :: IO B.ByteString
loadKey = B.readFile "data/key.dat"

loadIV :: IO (IV AES256)
loadIV = do
  bytes <- B.readFile "data/iv.dat"
  maybe (error "makeIV failed") return $ makeIV bytes

loadCipher :: IO AES256
loadCipher = throwCryptoErrorIO =<< cipherInit <$> loadKey

loadPlainText :: IO B.ByteString
loadPlainText = B.readFile "data/plain-text.txt"

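-- | Encrypt a stream of arbitrarily sized chunks in CBC mode. The third
-- argument carries bytes left over from the previous chunk that did not
-- fill a whole block.
encryptConduit :: (BlockCipher c, Monad m)
               => c -> IV c -> B.ByteString -> Conduit B.ByteString m B.ByteString
encryptConduit cipher iv partialBlock = await >>= \case
  -- End of input: pad whatever is left over and emit the final block(s).
  Nothing -> yield $ cbcEncrypt cipher iv $ pad (PKCS7 (blockSize cipher)) partialBlock
  Just moreBytes -> let
          fullBlocks           = (B.length moreBytes + B.length partialBlock) `div` blockSize cipher
          (thisTime, nextTime) = B.splitAt (fullBlocks * blockSize cipher) (partialBlock <> moreBytes)
    in do
      -- Only encrypt once at least one whole block is available.
      iv' <- if B.null thisTime then return iv else do
        let cipherText            = cbcEncrypt cipher iv thisTime
            lastBlockOfCipherText = B.drop (B.length cipherText - blockSize cipher) cipherText
        yield cipherText
        -- CBC chaining: the last ciphertext block is the IV for the next chunk.
        maybe (error "makeIV failed") return $ makeIV lastBlockOfCipherText
      encryptConduit cipher iv' nextTime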

go :: IO ()
go = do
  c <- loadCipher
  iv <- loadIV
  pt <- loadPlainText
  let padded = pad (PKCS7 (blockSize c)) $ pt
      encrypted = cbcEncrypt c iv padded
  B.writeFile "data/haskell-oneshot.dat" encrypted

  runResourceT $ runConduit
     $  sourceFile "data/plain-text.txt"
    =$= encryptConduit c iv mempty
    =$= sinkFile   "data/haskell-streaming.dat"
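
The streaming version relies on one property of CBC: encrypting chunk by
chunk, with the last ciphertext block of each chunk used as the IV for the
next, gives the same bytes as encrypting everything in one call. Here is a
minimal check of that property, assuming cryptonite; cbcChunks is a
hypothetical helper written for this example, and the all-ones key and
all-twos IV are dummy values.

```haskell
import           Crypto.Cipher.AES   (AES256)
import           Crypto.Cipher.Types (BlockCipher (..), Cipher (..), IV, makeIV)
import           Crypto.Error        (throwCryptoError)
import qualified Data.ByteString     as B
import           Data.Maybe          (fromMaybe)

-- | Encrypt block-aligned chunks one at a time, chaining the IV.
cbcChunks :: BlockCipher c => c -> IV c -> [B.ByteString] -> B.ByteString
cbcChunks _ _ [] = B.empty
cbcChunks cipher iv (chunk : rest)
  | B.null chunk = cbcChunks cipher iv rest
  | otherwise    = cipherText <> cbcChunks cipher iv' rest
  where
    cipherText = cbcEncrypt cipher iv chunk
    -- The last ciphertext block becomes the IV for the next chunk.
    lastBlock  = B.drop (B.length cipherText - blockSize cipher) cipherText
    iv'        = fromMaybe (error "makeIV failed") (makeIV lastBlock)

main :: IO ()
main = do
  let cipher = throwCryptoError (cipherInit (B.replicate 32 0x01)) :: AES256
      iv     = fromMaybe (error "makeIV failed")
                 (makeIV (B.replicate 16 0x02)) :: IV AES256
      plain  = B.pack (take 64 (cycle [0 .. 255]))  -- four whole blocks
      (a, b) = B.splitAt 16 plain                   -- split on a block boundary
  -- Chunked and one-shot encryption should agree, so this prints True.
  print (cbcChunks cipher iv [a, b] == cbcEncrypt cipher iv plain)
```

The chunks here must be whole multiples of the block size, which is why the
conduit carries a partial block forward between calls.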

On 6 July 2017 at 23:29, Ivan Lazar Miljenovic <ivan.miljenovic at gmail.com> wrote:

> On 7 July 2017 at 01:44, Viktor Dukhovni <ietf-dane at dukhovni.org> wrote:
> >
> >> On Jul 6, 2017, at 12:58 AM, Ivan Lazar Miljenovic <
> ivan.miljenovic at gmail.com> wrote:
> >>
> >> I have a use case for needing to use public key cryptography to
> >> encrypt a large amount of data in a streaming fashion (get it out of a
> >> DB, encrypt, put into an AWS S3 bucket).
> >
> > What are the data-format requirements?  Do you need (binary) CMS output?
> > GPG-compatible output?  Or just roll your own?
> The intent is to be able to transfer data between two parties such
> that only the recipient is able to view it (hence the usage of public
> key cryptography).  GPG/PGP compatibility is preferable as it's
> common, but anything that is sufficiently standardised (as this will
> potentially be used by others that aren't me doing so with Haskell and
> thus can't just use a library to do so) will suffice.
> (The other advantage of GPG/PGP is that the security testing team is
> more familiar with it and thus likely to sign off on it.)
> >
> > Integrity protection can be tricky with large data streams.  Most data
> > formats for enveloped data have a single MAC at the end, which means
> > that the decoder has to consume all the data before it is known to be
> > valid!
> >
> > So if you're in a position to avoid a standard all-in-one format, it
> > makes sense to "packetize" the stream, with integrity protection for
> > each "packet", and packet sequence numbers to preserve overall stream
> > integrity.  With vast amounts of data, you'll want to be careful with
> > the symmetric cipher modes, AEAD (AES-GCM, for example) protects only
> > a limited amount of data before you need to rekey.  It may be simplest
> > to just generate a new symmetric key for every N megabytes of data.
> >
> > With a careful design of the "packet" format, you can use in-memory
> > crypto for each packet.  Don't forget to include an "end-of-stream"
> > packet to defeat truncation attacks.
> This sounds good in theory, but in practice I'm not versed enough in
> security to want to try and roll my own if I could avoid it, and
> trying to document such a format for others to use could be
> problematic.
> --
> Ivan Lazar Miljenovic
> Ivan.Miljenovic at gmail.com
> http://IvanMiljenovic.wordpress.com