[Haskell-cafe] MD5 performance optimizations,
and GHC -via-C producing segfaulting binary
Salvatore Insalaco
kirby81 at gmail.com
Tue May 20 05:41:32 EDT 2008
2008/5/17 Andrew Coppin <andrewcoppin at btinternet.com>:
> Hi folks.
>
> OK, try this:
>
> darcs get http://darcs.orphi.me.uk/MyMD5
> cd MyMD5
> ghc -O2 --make md5sum
> md5sum <some large filename>
I've got some time to take a look at the code. It's very nice,
readable and declarative, but obviously not optimized for "raw speed".
There're a few things to do to have a speed-up of 2x, without going "low-level".
1) There's a lazyness leak in Pad.hs. The sum in:
else make_block_bs bs0 : work (n + block_size_bytes) bs1
is not strict. With a very large file (e.g. 100 mb) it means stack overflow.
[roxas:~/Desktop/test2/MyMD5] kirby% ghc -O2 --make md5sum.hs
[roxas:~/Desktop/test2/MyMD5] kirby% time ./md5sum ../../jboss-4.2.2.GA.zip
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize' to increase it.
To solve it just add a bang (!) before the n parameter:
work !n bs =
You've got to add {-# LANGUAGE BangPatterns #-} at the top of the file
too. There're solutions that don't imply BangPatterns, and use only
seq, but I like bang patterns!
Now it works:
[roxas:~/Desktop/test2/MyMD5] kirby% time ./md5sum ../../jboss-4.2.2.GA.zip
E6542028D538BAEC180A96A5D1E6EC3A
11.958u 0.210s 0:12.19 99.7% 0+0k 0+2io 0pf+0w
2) make_block_bs is sub-optimal, and very critical to performance. I
decided to use Data.Binary for it (it's also more readable, and you
get rid of the unsafeWrite):
import Data.Binary
import Data.Binary.Get
import Control.Monad
// ...
instance Binary Block where
put _ = undefined
get = do
xs <- replicateM 16 getWord32le
return $ Block $ listArray (0, 15) xs
make_block_bs :: B.ByteString -> Block
make_block_bs = decode
Now we are getting faster:
[roxas:~/Desktop/test2/MyMD5] kirby% time ./md5sum ../../jboss-4.2.2.GA.zip
E6542028D538BAEC180A96A5D1E6EC3A
8.063u 0.174s 0:08.23 100.0% 0+0k 0+0io 0pf+0w
3) You are doing a lot of access to fields of a strict data type
(State). You can at least ask the compiler to help you a bit with
-funbox-strict-fields.
Now we have:
[roxas:~/Desktop/test2/MyMD5] kirby% time ./md5sum ../../jboss-4.2.2.GA.zip
E6542028D538BAEC180A96A5D1E6EC3A
6.780u 0.133s 0:06.91 100.0% 0+0k 0+1io 0pf+0w
We have got a good, nearly 2x, speed-up with very few optimizations,
and we run in a very small constant amount of memory.
Probably compiling with -fvia-C could help even more, but strangely:
[roxas:~/Desktop/test2/MyMD5] kirby% ghc -fvia-C -funbox-strict-fields
-O2 --make md5sum.hs
[roxas:~/Desktop/test2/MyMD5] kirby% time ./md5sum ../../jboss-4.2.2.GA.zip
Segmentation fault
Is this supposed to happen? There're no "unsafe" functions or imports
used in the program. Maybe a bug in GHC?
Salvatore
More information about the Haskell-Cafe
mailing list