[Haskell-cafe] Re: MD5 performance optimizations,
and GHC -via-C producing segfaulting binary
Andrew Coppin
andrewcoppin at btinternet.com
Tue May 20 15:05:52 EDT 2008
Salvatore Insalaco wrote:
> I've got some time to take a look at the code. It's very nice,
> readable and declarative, but obviously not optimized for "raw speed".
> There're a few things to do to have a speed-up of 2x, without going "low-level".
>
You have my individed attention...
> 1) There's a lazyness leak in Pad.hs. The sum in:
> else make_block_bs bs0 : work (n + block_size_bytes) bs1
> is not strict. With a very large file (e.g. 100 mb) it means stack overflow.
>
Oops! Well spotted...
> To solve it just add a bang (!) before the n parameter:
>
> work !n bs =
>
> You've got to add {-# LANGUAGE BangPatterns #-} at the top of the file
> too. There're solutions that don't imply BangPatterns, and use only
> seq, but I like bang patterns!
>
Ah, so *that's* what bang patterns do?
> 2) make_block_bs is sub-optimal, and very critical to performance.
Yeah, so I see. (60% time spent here... ouch!) I'll bet that's where C
is beating me...
> I decided to use Data.Binary for it
I'm not familiar with that library (i.e. what it does or how you use it).
> import Data.Binary
> import Data.Binary.Get
> import Control.Monad
>
> // ...
>
> instance Binary Block where
> put _ = undefined
> get = do
> xs <- replicateM 16 getWord32le
> return $ Block $ listArray (0, 15) xs
>
> make_block_bs :: B.ByteString -> Block
> make_block_bs = decode
>
Mmm, OK. Doesn't look too bad...
> 3) You are doing a lot of access to fields of a strict data type
> (State). You can at least ask the compiler to help you a bit with
> -funbox-strict-fields.
>
I did try that, but it didn't seem to make any difference for me. [Maybe
it does now because of your other improvements? Which version of GHC and
which platform are you running on? I'm GHC 6.8.2 on Windows...]
> We have got a good, nearly 2x, speed-up with very few optimizations,
> and we run in a very small constant amount of memory.
>
I'm liking the sound of that. :-D
> Probably compiling with -fvia-C could help even more
...oh yeah, that's no longer default for -O2. I forgot about that!
> but strangely:
>
> Segmentation fault
>
> Is this supposed to happen? There're no "unsafe" functions or imports
> used in the program. Maybe a bug in GHC?
>
GAH! o_O That's not good(tm)...
Any chance you could email me a Darcs patch (or 2 seperate ones?) I'll
add it to the repo on my website, and test to see what numbers I get on
my PC.
More information about the Haskell-Cafe
mailing list