[Haskell-cafe] Performance: MD5
Don Stewart
dons at galois.com
Tue May 20 14:57:34 EDT 2008
andrewcoppin:
> Salvatore Insalaco wrote:
> >Hi Andrew,
> >just a profiling suggestion: did you try to use the SCC cost-centre
> >annotations for profiling?
> >If you want to know precisely what takes 60% of time, you can try:
> > bn = {-# SCC "IntegerConversion" #-} 4 * fromIntegral wn
> > b0 = {-# SCC "ByteStringIndexing" #-} B.index bi (bn+0)
> > b1 = {-# SCC "ByteStringIndexing" #-} B.index bi (bn+1)
> > b2 = {-# SCC "ByteStringIndexing" #-} B.index bi (bn+2)
> > b3 = {-# SCC "ByteStringIndexing" #-} B.index bi (bn+3)
> > w = foldl' (\w b -> shiftL w 8 .|. fromIntegral b) 0
> > [b3,b2,b1,b0]
> > in {-# SCC "ArrayWriting" #-} unsafeWrite temp wn w
> >
> >In profiling the time of all expressions with the same SCC "name"
> >will be summed.
> >You can get more information about SCC here:
> >http://www.haskell.org/ghc/docs/latest/html/users_guide/profiling.html#cost-centres
> >
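For readers who want to try the SCC suggestion above in isolation, here is a minimal, self-contained sketch. It is not Andrew's code: packLE, sumWords and the cost-centre names "WordPacking" and "Summation" are hypothetical, chosen only to illustrate the pragma. It reuses the same foldl'/shiftL byte-packing idea from the quoted fragment, but over a plain list rather than a ByteString.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.List (foldl')
import Data.Word (Word8, Word32)

-- Pack four bytes into one 32-bit word, with b0 as the least
-- significant byte. "WordPacking" is a hypothetical cost-centre name.
packLE :: Word8 -> Word8 -> Word8 -> Word8 -> Word32
packLE b0 b1 b2 b3 =
  {-# SCC "WordPacking" #-}
  foldl' (\w b -> shiftL w 8 .|. fromIntegral b) 0 [b3, b2, b1, b0]

-- Add up the packed words of a byte list, four bytes at a time.
-- Trailing bytes that do not fill a whole word are ignored.
-- "Summation" is a second, separately reported cost centre.
sumWords :: [Word8] -> Word32
sumWords = go 0
  where
    go acc (a:b:c:d:rest) =
      let acc' = {-# SCC "Summation" #-} (acc + packLE a b c d)
      in acc' `seq` go acc' rest
    go acc _ = acc

main :: IO ()
main = print (sumWords (take 4000000 (cycle [0 .. 255])))
```

Assuming the profiling libraries are installed, compiling with "ghc -prof" and running the binary with "+RTS -p" should write a .prof report in which each named cost centre appears on its own line.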
>
> OK, I'll give that a try...
>
> >One piece of advice: I've seen various md5sum implementations in C,
> >all using about the same algorithms and data structures, and their
> >running times differed by as much as 10x; md5summing fast is not a
> >really simple problem. If I were you I would drop the comparison with
> >ultra-optimized C and concentrate on "does my
> >high-level-good-looking-super-readable implementation perform
> >acceptably?".
> >
> >What "acceptably" means is left as an exercise to the reader.
> >
>
> Well, so long as it can hash 500 MB of data in a few minutes without
> using absurd amounts of RAM, that'll do for me. ;-)
>
> [I actually wanted to do this for a project at work. When I
> discovered that none of the available Haskell implementations are
> fast enough,
How hard did you look?

    import System.Environment
    import Data.Digest.OpenSSL.MD5
    import System.IO.Posix.MMap

    main = do
        [f] <- getArgs
        putStrLn . md5sum =<< mmapFile f

Take the md5 of a 600M file:

    $ time ./A /home/dons/tmp/600M
    24a04fdf3f629a42b5baed52ed793a51
    ./A /home/dons/tmp/600M  3.61s user 1.65s system 20% cpu 25.140 total

Easy.
-- Don