[Haskell-cafe] More fun with micro-benchmarks and optimizations.
(GHC vs Perl)
Corey O'Connor
coreyoconnor at gmail.com
Wed Jul 23 14:10:05 EDT 2008
I regularly need to write tiny programs that analyze output
logs. The logs don't have consistent formatting, so I typically
choose Perl for these tasks.
The latest instance of having to write such a program was simple
enough I figured I'd try my hand at using Haskell instead. The short
story is that I could quickly write a Haskell program that achieved
the goal. Yay! However, the performance was ~8x slower than a
comparable Perl implementation. With a lot of effort I got the Haskell
version down to only 2x slower. Most of the optimization was
guesswork; my hunch was that the performance difference came from how
each line was being read from the file. I couldn't determine much
using GHC's profiler.
I still have two questions after all this:
- Can I get a Haskell implementation as fast as the Perl?
- What do I need to do to get GHC's profiler to provide me usable
information? Telling me that 98% of the time was in "main" is not very
enlightening ;-)
All the code and data for this little experiment I've placed here:
http://tothepowerofdisco.com/repo/sum_optimization/
My first, and shortest, version is SumTiny.hs. The optimized version
is SumFast.hs.
The long version for the curious:
The (cleaned up) data was a 78k line file consisting of lines like the
following:
framerate (prev == no pts): 15
framerate (delta): 25
framerate (invalid timebase): 12.5
... and so on.
The need was for a program that calculated 1.0 / framerate for each
line and produced the sum. Easy, no?
My straightforward Haskell solution was:
-------------------------------------------------------------------
import Text.Regex.Posix
main = do
    f_lines <- readFile "test.out" >>= return . lines
    let duration = foldl add_line 0.0 f_lines
        add_line sum line =
            let [[_,m]] = line =~ "([0-9.]+)"
                framerate = read m
                delta = 1.0 / framerate
            in sum + delta
    putStrLn $ "Duration (sec): " ++ show duration
-------------------------------------------------------------------
Just for fun I decided to compare it to a Perl implementation:
-------------------------------------------------------------------
#!/usr/bin/perl
my @f_lines = split(/\n/,`cat test.out`);
my $sum = 0.0;
foreach(@f_lines)
{
    /([0-9.]+)/;
    my $delta = 1.0 / $1;
    $sum += $delta;
}
print("Duration (sec): ", $sum, "\n");
-------------------------------------------------------------------
(I'm sure there are other ways to write this same program in both languages.)
I was pretty happy with how the Haskell implementation's code compared
to the Perl implementation's just in terms of looks, though I think
the Perl version is easier to understand at the point where the regex
match is extracted.
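For what it's worth, the extraction can be made to read more directly
in Haskell too by matching with an explicit case. This is a
hypothetical variation of my own, not code from either program above:
-------------------------------------------------------------------
import Text.Regex.Posix

-- Hypothetical helper, not part of SumTiny.hs: pull the first
-- "[0-9.]+" match out of a line, failing loudly if there is none.
firstNumber :: String -> Double
firstNumber line =
    case (line =~ "([0-9.]+)" :: [[String]]) of
        [[_, m]] -> read m
        _        -> error ("no number found in: " ++ line)
-------------------------------------------------------------------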
For fun I compared the performance of the two:
(64-bit Linux running on a 2.4GHz Core 2 Duo)
$ time perl ./Sum.pl
Duration (sec): 3155.62666666438
real 0m0.121s
user 0m0.103s
sys 0m0.016s
$ time ./SumTiny
Duration (sec): 3155.626666664377
real 0m1.099s
user 0m1.073s
sys 0m0.009s
Youch! ~1s is fast enough that I don't care, but I'm still curious why
there is an 8x performance difference. Profiling with manual cost
center annotations (see SumTinyProf.hs) indicated the "do" expression
that main is bound to was responsible for 90% of the time. Which isn't
revealing at all!
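For reference, here is the flavor of annotation I mean; this is a
guess at what SumTinyProf.hs looks like, the real file is at the URL
above:
-------------------------------------------------------------------
-- Sketch: finer-grained SCC cost centers on add_line (a guess at
-- the kind of annotations in SumTinyProf.hs, not the actual file).
add_line sum line =
    let [[_,m]]   = {-# SCC "regex_match" #-} (line =~ "([0-9.]+)" :: [[String]])
        framerate = {-# SCC "read_double" #-} (read m :: Double)
    in  {-# SCC "accumulate" #-} (sum + 1.0 / framerate)
-------------------------------------------------------------------
Compiling with "ghc -prof" and running with "+RTS -p" should then give
each named cost center its own line in the .prof report.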
Some experimenting led me to an implementation that is only 2x slower
(SumFast.hs). Quick notes on the changes I made (a sketch of the
resulting loop is below):
- Used ByteStrings instead of String.
- Used hGetLine instead of reading the entire file lazily and splitting on "\n".
- Fused the foldl' with the loop that reads each line from the file.
- Used bang patterns to make the fused loop strict in the
accumulator argument.
I think the largest performance gain was from changing how each line
was read from the file. The lazy read and split seemed very slow
compared to a loop that used hGetLine.
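For the curious, here is a rough sketch of what that fused loop looks
like, reconstructed from the notes above; the number extraction is a
crude stand-in for the regex, and the real code is in SumFast.hs at
the URL:
-------------------------------------------------------------------
{-# LANGUAGE BangPatterns #-}
import qualified Data.ByteString.Char8 as B
import Data.Char (isDigit)
import System.IO

main :: IO ()
main = do
    h <- openFile "test.out" ReadMode
    -- The foldl' and the line-reading loop are fused into one
    -- recursive function; the bang pattern keeps the sum strict.
    let go !acc = do
            eof <- hIsEOF h
            if eof
                then return acc
                else do
                    line <- B.hGetLine h
                    -- Crude stand-in for the regex: skip to the first
                    -- "[0-9.]" character and read the rest as a Double.
                    let numText   = B.dropWhile (\c -> not (isDigit c || c == '.')) line
                        framerate = read (B.unpack numText) :: Double
                    go (acc + 1.0 / framerate)
    duration <- go 0.0
    putStrLn $ "Duration (sec): " ++ show duration
-------------------------------------------------------------------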
--
-Corey O'Connor