GHC 6.8.1 is impressive!
peter at syncad.com
Fri Nov 9 12:47:21 EST 2007
Each test I mention here is actually 3 or 4 application runs.
If there were 4 runs, the first one was discarded, so each
test still has only 3 results. The idea is that I discard
the first run if it had a significantly higher page fault
count.
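
The discard rule could be sketched roughly as follows (Python, only for the bookkeeping). The 1% tolerance and the function names are assumptions for illustration, since no exact cut-off for "significantly higher" is given, and the 85000 figure is a made-up cold first run, not a measurement.

```python
from statistics import median

def first_run_is_outlier(page_faults, tolerance=0.01):
    """Return True if the first run's page fault count exceeds the
    median of the later runs by more than `tolerance` (hypothetical 1%)."""
    first, rest = page_faults[0], page_faults[1:]
    return first > median(rest) * (1 + tolerance)

def results_for_test(page_faults, tolerance=0.01):
    """Keep 3 results per test: drop the first of 4 runs if it is an outlier."""
    if len(page_faults) == 4 and first_run_is_outlier(page_faults, tolerance):
        return page_faults[1:]
    return page_faults[:3]

# Made-up cold first run, followed by three page fault counts from the results.
print(results_for_test([85000, 79315, 79314, 79302]))
```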
OK, it is no longer 100% certain that -O2 is slower than
-O with my application. I reran the earlier tests 3 more
times, and in one case the -O2 variant was quicker. I had
done the comparison about 3 times before, so overall -O2
was slower in 5 of 6 comparisons, i.e. with about 83%
probability :-)
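
The 83% figure can be reproduced with a line of arithmetic, assuming exactly 3 earlier comparisons (the text says "about 3") in which -O2 was slower every time. The variable names are illustrative only.

```python
# Earlier comparisons: -O2 was slower each time (assumed to be exactly 3).
slower_before = 3
# This rerun: 3 comparisons, -O2 was quicker in one of them.
slower_now = 2
total_comparisons = 6

share_percent = round(100 * (slower_before + slower_now) / total_comparisons)
print(share_percent)  # 83
```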
The time differences are minuscule, but they do not seem
to be just good or bad luck. I optimized the code only
once, and that was to reduce memory consumption. Speed
was always good enough for me.
It is a Gtk2Hs application which draws charts. Data are
read from a text file, preprocessed, and a chart is shown.
Ignore the real times in the results: I need to fill in
one edit box to make it run longer (to process more input
data), and differences in my typing speed account for most
of the real time differences.
Before each compile the project was cleaned.
Options were always like this:
--make -Wall <theOptimizationOptions> <fileList>
Windows XP 64bit (running 32 bit Haskell and the app.)
2GiB DDR400 RAM
Athlon 64 X2 4800+
C&Q (Cool'n'Quiet) disabled (but it does not seem to have an impact)
real 25.391 user 19.109 system 0.359 cpu 19.469 page_faults 79315
real 25.188 user 19.141 system 0.453 cpu 19.594 page_faults 79314
real 25.000 user 19.031 system 0.375 cpu 19.406 page_faults 79302
real 24.922 user 19.141 system 0.438 cpu 19.578 page_faults 78550
real 25.266 user 18.984 system 0.484 cpu 19.469 page_faults 78538
real 25.000 user 19.109 system 0.563 cpu 19.672 page_faults 78539
-O2 -fno-liberate-case -fexcess-precision
real 24.516 user 18.844 system 0.453 cpu 19.297 page_faults 79310
real 24.219 user 18.875 system 0.438 cpu 19.313 page_faults 78203
real 24.375 user 18.656 system 0.516 cpu 19.172 page_faults 79305
-O2 -fno-spec-constr -fexcess-precision
real 24.203 user 18.641 system 0.719 cpu 19.359 page_faults 78543
real 24.719 user 18.781 system 0.625 cpu 19.406 page_faults 78536
real 24.688 user 19.000 system 0.500 cpu 19.500 page_faults 78536
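
To compare the groups at a glance, the cpu columns above can be averaged per group. This is only a sketch: the first two groups carry no option label in the listing, so they are named "unlabelled 1" and "unlabelled 2" here rather than guessed at, and with 3 runs per group the means are indicative at best.

```python
from statistics import mean

# cpu seconds copied from the result lines above, 3 runs per group.
cpu_times = {
    "unlabelled 1": [19.469, 19.594, 19.406],
    "unlabelled 2": [19.578, 19.469, 19.672],
    "-O2 -fno-liberate-case -fexcess-precision": [19.297, 19.313, 19.172],
    "-O2 -fno-spec-constr -fexcess-precision": [19.359, 19.406, 19.500],
}

# Mean cpu time per group, then the group with the lowest mean.
cpu_means = {options: mean(times) for options, times in cpu_times.items()}
fastest = min(cpu_means, key=cpu_means.get)

for options, avg in cpu_means.items():
    print(f"{avg:.3f}  {options}")
print("lowest mean cpu:", fastest)
```

The group built with -fno-liberate-case comes out with the lowest mean cpu time.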
So it looks like liberate-case hurts my app a bit and
something else in -O2 helps a bit. But I do not mind,
since it is quick enough. I just found it interesting
that -O2 is not helping. If you would like some more
tests, let me know.
Simon Peyton-Jones wrote:
> O2 mainly switches on two transformations: "liberate case" and "call-pattern specialisation". (I think it also gets passed on to gcc.)
> Trying -O2 -fno-liberate-case,
> and -O2 -fno-spec-constr
> might tell which was making the difference.