[Haskell-cafe] Encrypting streamed data
David Turner
dct25-561bs at mythic-beasts.com
Tue Jul 11 14:35:56 UTC 2017
Hi,
I can't think of a terribly good way to achieve GPG/PGP-compatibility
without simply using GPG/PGP, since the file format is quite involved.
That said, here is how to implement a CBC-mode block cipher encryption
using Conduit, which is suitable for something like AES256 encryption. It
is almost certainly vulnerable to side-channel attacks (timing,
cache-poisoning, etc) but as a pure function from input to output it is
equivalent to `openssl aes-256-cbc -e -K <KEY-IN-HEX> -iv <IV-IN-HEX> -in
data/plain-text.txt` which I should hope would be standard enough for
analysis.
This leaves you with the problem of storing the key and IV securely,
encrypted using the asymmetric key that you first thought of, but hopefully
that problem is surmountable!
Cheers,
David
import Control.Monad
import Control.Monad.IO.Class
import Control.Monad.Trans.Resource
import Crypto.Cipher.AES
import Crypto.Cipher.Types
import Crypto.Data.Padding
import Crypto.Error
import qualified Data.ByteString as B
import Data.Conduit
import Data.Conduit.Binary
import Data.Monoid
loadKey :: IO B.ByteString
loadKey = B.readFile "data/key.dat"
loadIV :: IO (IV AES256)
loadIV = do
bytes <- B.readFile "data/iv.dat"
maybe (error "makeIV failed") return $ makeIV bytes
loadCipher :: IO AES256
loadCipher = throwCryptoErrorIO =<< cipherInit <$> loadKey
loadPlainText :: IO B.ByteString
loadPlainText = B.readFile "data/plain-text.txt"
encryptConduit :: (BlockCipher c, Monad m) => c -> IV c -> B.ByteString ->
Conduit B.ByteString m B.ByteString
encryptConduit cipher iv partialBlock = await >>= \case
Nothing -> yield $ cbcEncrypt cipher iv $ pad (PKCS7 (blockSize cipher))
partialBlock
Just moreBytes -> let
fullBlocks = (B.length moreBytes + B.length
partialBlock) `div` blockSize cipher
(thisTime, nextTime) = B.splitAt (fullBlocks * blockSize cipher)
(partialBlock <> moreBytes)
in do
iv' <- if B.null thisTime then return iv else do
let cipherText = cbcEncrypt cipher iv thisTime
lastBlockOfCipherText = B.drop (B.length cipherText - blockSize
cipher) cipherText
yield cipherText
maybe (error "makeIV failed") return $ makeIV lastBlockOfCipherText
encryptConduit cipher iv' nextTime
go :: IO ()
go = do
c <- loadCipher
iv <- loadIV
pt <- loadPlainText
let padded = pad (PKCS7 (blockSize c)) $ pt
encrypted = cbcEncrypt c iv padded
B.writeFile "data/haskell-oneshot.dat" encrypted
runResourceT $ runConduit
$ sourceFile "data/plain-text.txt"
=$= encryptConduit c iv mempty
=$= sinkFile "data/haskell-streaming.dat"
On 6 July 2017 at 23:29, Ivan Lazar Miljenovic <ivan.miljenovic at gmail.com>
wrote:
> On 7 July 2017 at 01:44, Viktor Dukhovni <ietf-dane at dukhovni.org> wrote:
> >
> >> On Jul 6, 2017, at 12:58 AM, Ivan Lazar Miljenovic <
> ivan.miljenovic at gmail.com> wrote:
> >>
> >> I have a use case for needing to use public key cryptography to
> >> encrypt a large amount of data in a streaming fashion (get it out of a
> >> DB, encrypt, put into an AWS S3 bucket).
> >
> > What are the data-format requirements? Do you need (binary) CMS output?
> > GPG-compatible output? Or just roll your own?
>
> The intent is to be able to transfer data between two parties such
> that only the recipient is able to view it (hence the usage of public
> key cryptography). GPG/PGP compatability is preferable as it's
> common, but anything that is sufficiently standardised (as this will
> potentially be used by others that aren't me doing so with Haskell and
> thus can't just use a library to do so) will suffice.
>
> (The other advantage of GPG/PGP is that the security testing team is
> more familiar with it and thus likely to sign off on it.)
>
> >
> > Integrity protection can be tricky with large data streams. Most data
> > formats for enveloped data have a single MAC at the end, which means
> > that the decoder has to consume all the data before it is known to be
> > valid!
> >
> > So if you're in a position to avoid a standard all-in-one format, it
> > makes sense to "packetize" the stream, with integrity protection for
> > each "packet", and packet sequence numbers to preserve overall stream
> > integrity. With vast amounts of data, you'll want to be careful with
> > the symmetric cipher modes, AEAD (AES-GCM, for example) protects only
> > a limited amount of data before you need to rekey. It may be simplest
> > to just generate a new symmetric key for every N megabytes of data.
> >
> > With a careful design of the "packet" format, you can use in-memory
> > crypto for each packet. Don't forget to include an "end-of-stream"
> > packet to defeat truncation attacks.
>
> This sounds good in theory, but in practice I'm not versed enough in
> security to want to try and roll my own if I could avoid it, and
> trying to document such a format for others to use could be
> problematic.allowed to post.
>
> --
> Ivan Lazar Miljenovic
> Ivan.Miljenovic at gmail.com
> http://IvanMiljenovic.wordpress.com
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/haskell-cafe/attachments/20170711/8d71383b/attachment.html>
More information about the Haskell-Cafe
mailing list