[Haskell-cafe] More fun with micro-benchmarks and optimizations. (GHC vs Perl)

Wed Jul 23 15:37:00 EDT 2008

And against gawk 3.1.5:

$ time awk -F: '{sum += 1 / $2} END{print sum}' test.out
3155.63

real    0m0.197s
user    0m0.184s
sys     0m0.004s

compared to Don's Haskell version:

$ time ./fastSum < test.out
3155.626666664377

real    0m0.072s
user    0m0.056s
sys     0m0.004s

compared to Corey's Perl version:

$ time perl Sum.pl
Duration (sec): 3155.62666666438

real    0m0.181s
user    0m0.164s
sys     0m0.012s
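
For anyone skimming the thread: all three programs appear to compute the same
thing, the sum of 1/x over the number that follows the first ':' on each
line. Below is a minimal sketch of that computation in plain Haskell using
only base. It assumes lines of the form "label: number" (inferred from the
awk -F: invocation and the S.drop 2 in Don's code), and the names
fieldAfterColon and sumReciprocals are made up for illustration. It is not
Don's fastSum: it uses String rather than ByteString, so expect it to be much
slower, and it skips unparsable lines instead of stopping at the first one.

```haskell
import Data.Maybe (mapMaybe)
import Text.Read (readMaybe)

-- Parse the text after the first ':' as a Double.
-- Returns Nothing when the line has no ':' or the field is not a number.
fieldAfterColon :: String -> Maybe Double
fieldAfterColon line =
  case break (== ':') line of
    (_, ':' : rest) -> readMaybe (dropWhile (== ' ') rest)
    _               -> Nothing

-- Sum 1/x over every line whose field parses; other lines are ignored.
sumReciprocals :: String -> Double
sumReciprocals = sum . map (1 /) . mapMaybe fieldAfterColon . lines

-- Read the log from stdin and print the total.
main :: IO ()
main = interact (\input -> show (sumReciprocals input) ++ "\n")
```

Run it the same way as the others, e.g. ./SlowSum < test.out (the file name
SlowSum is hypothetical).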

Regards,
Brad Larsen

On Wed, 23 Jul 2008 15:01:24 -0400, Don Stewart <dons at galois.com> wrote:

> coreyoconnor:
>> I have the need to regularly write tiny programs that analyze output
>> logs. The output logs don't have a consistent formatting so I
>> typically choose Perl for these tasks.
>>
>> The latest instance of having to write such a program was simple
>> enough I figured I'd try my hand at using Haskell instead. The short
>> story is that I could quickly write a Haskell program that achieved
>> the goal. Yay! However, the performance was ~8x slower than a
>> comparable Perl implementation. With a lot of effort I got the Haskell
>> version to only 2x slower. A lot of the optimization was done with
>> guesses that the performance difference was due to how each line was
>> being read from the file. I couldn't determine much using GHC's
>> profiler.
>
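>     {-# LANGUAGE BangPatterns #-}
>
>     import qualified Data.ByteString.Char8 as S
>     import Data.ByteString.Lex.Double (readDouble)
>
>     main :: IO ()
>     main = print . go 0 =<< S.getContents
>       where
>         -- Strict accumulator loop: add 1/k for the Double k that follows
>         -- the next ':' in the remaining input; stop when none parses.
>         go !n !s = case readDouble str of
>                 Nothing     -> n
>                 Just (k,t)  -> let delta = 1.0 / k in go (n+delta) t
>             where
>                 (_, xs) = S.break (== ':') s
>                 -- skip the ':' and the character after it (presumably a space)
>                 str     = S.drop 2 xs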
>
> This uses the bytestring-lexing package from Hackage to parse the Doubles.
> Compiled and run:
>
>     $ ghc --make Fast.hs -O2 -fvia-C -optc-O2
>     $ time  ./Fast < test.out
>     3155.626666664377
>     ./Fast < test.out  0.07s user 0.01s system 97% cpu 0.078 total
>
> So that's roughly twice as fast as the Perl entry on my box:
>
>     $ time perl Sum.pl < test.out
>     Duration (sec): 3155.62666666438
>     perl Sum.pl < test.out  0.15s user 0.03s system 100% cpu 0.180 total
>
> Note that the safe Double lexer uses a bit of copying, so
> we can in fact do better still with a non-copying Double parser,
> but that's only for the hardcore.
>
> -- Don