[Haskell-cafe] parsec2 vs. parsec3... again

Sat Jan 15 02:54:37 CET 2011

> Now attoparsec-text is more than twice as fast, allocates even less
> memory and the total memory figures seem right.
>
> Bottom line: I think this benchmark doesn't really represent the kind
> of workload your parser has.  Can you reproduce these results on your
> system?

I spent quite a bit of time trying to reduce this down to a minimal
reproduction and getting confusing results.  Then I found out that
compiling with profiling enabled makes attoparsec slow and parsec
fast.  When I compile without any profiling, here's what I get, in CPU
time:

parsec run 1000000 - time: 1.22s -
atto bs run 1000000 - time: 0.38s -
atto text run 1000000 - time: 0.78s -

This looks more like what I expect.  I didn't understand the parsec
result at first: one of the first things I did was recompile and
reinstall parsec2, making sure to pass -p to configure, and verify
that there is a
/usr/local/lib/parsec-2.1.0.1/ghc-6.12.3/libHSparsec-2.1.0.1_p.a.
However, on closer inspection, I believe I've found the culprit.
Compiling attoparsec with 'build -v' reveals a ghc command line
containing '-prof -hisuf p_hi -osuf p_o -auto-all'.  Compiling parsec
has: '-prof -hisuf p_hi -osuf p_o'.  And indeed, attoparsec's cabal
file has 'ghc-prof-options: -auto-all', which parsec's does not.  And
in fact, parsec3 also has this -auto-all, which explains both why the
profile is full of internal functions and why parsec3 was so much
slower than parsec2.
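
To make the difference concrete, here is a minimal sketch of what a
cost centre looks like in source.  The function names are made up for
illustration and are not taken from parsec or attoparsec.  As far as I
understand it, -auto-all makes GHC treat every top-level binding as if
it carried an SCC annotation like the hand-written one below, which is
why a library built that way shows up function by function in the
profile and can lose optimisations such as inlining.

```haskell
module Main where

import Data.Char (digitToInt, isDigit)

-- Hypothetical library-internal helper (illustrative only).
-- It has no annotation, so with plain -prof its cost is attributed to
-- whichever cost centre encloses the call.  With -prof -auto-all it
-- would get a cost centre of its own.
digitValue :: Char -> Maybe Int
digitValue c
  | isDigit c = Just (digitToInt c)
  | otherwise = Nothing

-- A hand-written cost centre: this is roughly what -auto-all adds
-- automatically to every top-level binding in the module.
sumDigits :: String -> Int
sumDigits s = {-# SCC "sumDigits" #-} (sum [n | Just n <- map digitValue s])

main :: IO ()
main = print (sumDigits "2.34")  -- prints 9
```

Without -prof the SCC pragma is ignored, so the same source builds
unchanged in a non-profiling build.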

I'm glad to have finally tracked this down, but unhappy that I spent
so much time on it.  It seems like a trap waiting to be sprung if
various libraries are compiled with their individually specified
flags, which have major effects on performance.  Maybe I should have
noticed, but it seems pretty subtle to me.  GHC will refuse to compile
non-profiling libs against a profiling build, but that check doesn't
go down to the level of the flags each lib was built with.

I think my short term solution is going to be to remove -auto-all from
attoparsec's cabal file.  I'm not profiling attoparsec, so I don't
want my entire profile output to be internal attoparsec functions.
But presumably the flag was added there for a reason, so maybe there
are people who really want that.  Is there a better solution?  Should
GHC warn when linking a profiling lib compiled with different
profiling flags?  A separate .p_auto-all_o suffix?  Removal of
ghc-prof-options from cabal?  A consensus to standardize on a set of
flags?
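
For reference, the local workaround would look something like the
fragment below.  Only the ghc-prof-options field is what I actually
saw; the stanza layout around it is assumed and every other field is
omitted.

```cabal
-- attoparsec.cabal, library stanza (layout assumed, other fields omitted)
library
  -- Upstream sets this field; commenting it out is the local workaround:
  -- ghc-prof-options: -auto-all
```

After that change the library has to be rebuilt and reinstalled with
profiling enabled (-p to configure, as above) for the profiling
archive to pick it up.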

BTW, yes, my situation is a little different from your test.  It's
lots and lots of little expressions for a simple language, stored in
an in-memory structure, that get parsed individually.  So I don't care
about file reading speed, but I do care about parser startup overhead,
since each parse is tiny.  The numbers above are how long it takes to
parse "2.34" 1,000,000 times.