[Haskell-cafe] haskell crypto is reaaaaaaaaaally slow
Donald Bruce Stewart
dons at cse.unsw.edu.au
Wed Jun 20 00:54:09 EDT 2007
dons:
> aeyakovenko:
> > $ time ./md5sum ./md5sum
> > [105,252,52,138,187,192,216,17,225,123,185,3,124,101,86,132]
> >
> > real 0m4.790s
> > user 0m3.688s
> > sys 0m0.492s
> >
> > $ time md5sum ./md5sum
> > 69fc348abbc0d811e17bb9037c655684 ./md5sum
> >
> > real 0m0.023s
> > user 0m0.000s
> > sys 0m0.008s
> >
> > this is my implementation using crypto
> > (http://www.haskell.org/crypto/). Am I doing something wrong?
> >
> > module Main where
> >
> > import System
> > import qualified Data.Digest.MD5 as MD5
> > import qualified Data.ByteString as BS
> >
> > main = do
> > args <- getArgs
> > dt <- BS.readFile $ head args
> > putStrLn $ show $ MD5.hash . BS.unpack $ dt
> ^^^^ ^^^^^^^^^
>
> not a good idea.
>
> You need an MD5 over bytestrings, not [Word8].
>
> -- Don
I note a couple of other issues:
crypto is compiled with:
Ghc-options: -fglasgow-exts
that is, no optimisations, although it would certainly benefit from
Ghc-options: -O2 -fexcess-precision -funbox-strict-fields
So I'd recompile the crypto package with that first.
Then be sure to compile your code with ghc -O2 -- bytestrings love -O2.
Finally, to actually get C speed, use a C md5. Here's an example Haskell
binding to the OpenSSL libraries 'md5' function, which you can compile
and run like so:
$ ghc MD5.hs -lcrypto -o hsmd5
$ time ./hsmd5 /usr/share/dict/words
MD5 (/usr/share/dict/words) = e5c152147e93b81424c13772330e74b3
./hsmd5 /usr/share/dict/words 0.01s user 0.02s system 80% cpu 0.029 total
versus my system's 'md5' program:
$ time md5 /usr/share/dict/words
MD5 (/usr/share/dict/words) = e5c152147e93b81424c13772330e74b3
md5 /usr/share/dict/words 0.02s user 0.00s system 29% cpu 0.052 total
Oh huh, that's interesting...
And the code:
{-# OPTIONS -O2 -fffi -#include "openssl/md5.h" #-}
--
-- A few imports, should tidy these up one day.
--
import System.Environment
import qualified Data.ByteString.Base as B (unsafeUseAsCStringLen)
import qualified Data.ByteString as B
import Foreign
import Foreign.C.Types
import Numeric
import Text.Printf
main = do
(f:_) <- getArgs
src <- B.readFile f
printf "MD5 (%s) = %s \n" f (md5sum src)
-- ---------------------------------------------------------------------
--
-- Fast md5 using OpenSSL and zero-copying bytestrings
--
--
-- The md5 hash should be referentially transparent..
--
md5sum :: B.ByteString -> String
md5sum p = unsafePerformIO $ B.unsafeUseAsCStringLen p $ \(ptr,n) -> do
digest <- c_md5 ptr (fromIntegral n) nullPtr
go digest 0 []
where
-- print it in 0-padded hex format
go :: Ptr Word8 -> Int -> [String] -> IO String
go p n acc
| n >= 16 = return $ concat (reverse acc)
| otherwise = do w <- peekElemOff p n
go p (n+1) (draw w : acc)
draw w = case showHex w [] of
[x] -> ['0', x]
x -> x
-- unsigned char *MD5(const unsigned char *d, unsigned long n, unsigned char *md);
--
foreign import ccall "openssl/md5.h MD5" c_md5
:: Ptr CChar -> CULong -> Ptr CChar -> IO (Ptr Word8)
Happy hacking.
-- Don
More information about the Haskell-Cafe
mailing list