[Haskell-cafe] More fun with micro-benchmarks and optimizations. (GHC vs Perl)

Wed Jul 23 14:18:49 EDT 2008

coreyoconnor:
> I have the need to regularly write tiny programs that analyze output
> logs. The output logs don't have a consistent formatting so I
> typically choose Perl for these tasks.
> 
> The latest instance of having to write such a program was simple
> enough I figured I'd try my hand at using Haskell instead. The short
> story is that I could quickly write a Haskell program that achieved
> the goal. Yay! However, the performance was ~8x slower than a
> comparable Perl implementation. With a lot of effort I got the Haskell
> version to only 2x slower. A lot of the optimization was done with
> guesses that the performance difference was due to how each line was
> being read from the file. I couldn't determine much using GHC's
> profiler.
> 
> I still have two questions after all this:
>  - Can I get a Haskell implementation as fast as the Perl?
>  - What do I need to do to get GHC's profiler to provide me usable
> information? Telling me that 98% of the time was in "main" is not very
> enlightening ;-)
> 
> All the code and data for this little experiment I've placed here:
> http://tothepowerofdisco.com/repo/sum_optimization/
> My first, and shortest, version is SumTiny.hs. The optimized version
> is SumFast.hs.
> 
> The long version for the curious:
> 
> The (cleaned up) data was a 78k line file consisting of lines like the
> following:
> framerate (prev == no pts): 15
> framerate (delta): 25
> framerate (invalid timebase): 12.5
> ... and so on.
> 
> The need was for a program that calculated 1.0 / framerate for each
> line and produced the sum. Easy no?
> 
> My straightforward Haskell solution was:
> -------------------------------------------------------------------
> import Text.Regex.Posix
> 
> main = do
>     f_lines <- readFile "test.out" >>= return . lines
>     let duration = foldl add_line 0.0 f_lines
>         add_line sum line =
>             let [[_,m]] = line =~ "([0-9.]+)"
>                 framerate = read m
>                 delta = 1.0 / framerate
>             in sum + delta
>     putStrLn $ "Duration (sec): " ++ show duration

Could you try using lazy and/or strict bytestrings. For *any* IO
performance issue, that should be your first stop.

See, e.g., this sum-file example,

    http://shootout.alioth.debian.org/gp4/benchmark.php?test=sumcol&lang=ghc&id=6

which outperforms C on the benchmark.

The second point would be to avoid Text.Regex.Posix, which is
algorithmically much worse than the PCRE regexes used by perl. Instead,
try the pcre-light or regex-pcre packages, which have both better
complexity, and operate on efficient bytestrings.

Finally, there's some small details about writing efficient loops. Give
type declarations for atomic types like Int and Double, and use strict
folds or explicit recursion, as in this post,

    http://cgi.cse.unsw.edu.au/~dons/blog/2008/05/16#fast

For your task here, with those simple changes, it should be relatively
easy to produce competitive-with-C performance.

-- Don