[Haskell-cafe] Testing parsing failures with Parsec

Mon Jul 9 19:02:10 EDT 2007

All -
	I tried emailing this to Daan, but I can't find an up-to-date
email address, so I hope this is an acceptable alternative :).  I'm
looking for someone who knows the guts of Parsec a bit, or has done some
automated testing of a Parsec parser.

	I'm interested in using Parsec for a number of projects, so I'm
giving it a try.  So far, it has really served me well.  However, I've
run into an issue with testing for proper parsing failures.  As an
example, I've been parsing a simplified version of Scheme's
s-expressions.  When parsing "(1 2n3)", I expect a parse error
"unexpected \"n\"".  Therefore, I've made a test case which attempts to
inspect the resulting ParseError to make sure it's correct.  However,
I'm confused by this list of Messages:

[SysUnExpect "\"n\"",SysUnExpect "\"n\"",Expect "space",Expect "\")\""]

Why is SysUnExpect "\"n\"" in the list twice?  I can't figure that out
from looking over the source for Parsec.  It seems like the following
rules should hold, but I can't be sure:
1) There should only be one occurrence of SysUnExpect *or* UnExpect in
the list
2) UnExpect is only used for an unexpected reserved word or operator,
and SysUnExpect is used for parse errors
3) There can be any number of Expect in the list, which indicate what
was expected to occur in the stream

If the above rules are correct, I was thinking it might make more sense
to use a type with more structure than a list to hold those data.
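
To make the idea of a more structured type concrete, here is a minimal
sketch that assumes the three rules above hold. ErrorSummary, its fields,
and summarize are hypothetical names for illustration and are not part of
Parsec; only Message and its constructors come from the library.

```haskell
import Text.ParserCombinators.Parsec.Error (Message (..))

-- Hypothetical structured view of a Parsec message list; none of these
-- names exist in Parsec itself.
data ErrorSummary = ErrorSummary
  { summaryUnexpected :: Maybe String  -- rule 1: at most one unexpected item
  , summaryExpected   :: [String]      -- rule 3: any number of expectations
  , summaryOther      :: [String]      -- raw messages, e.g. from 'fail'
  } deriving (Eq, Show)

-- Collapse a message list into the structured form. If several
-- SysUnExpect/UnExpect entries are present, the first one in the list wins.
summarize :: [Message] -> ErrorSummary
summarize = foldr step (ErrorSummary Nothing [] [])
  where
    step (SysUnExpect s) acc = acc { summaryUnexpected = Just s }
    step (UnExpect s)    acc = acc { summaryUnexpected = Just s }
    step (Expect s)      acc = acc { summaryExpected = s : summaryExpected acc }
    step (Message s)     acc = acc { summaryOther = s : summaryOther acc }
```

Applied to the four-element list quoted above, summarize yields one
unexpected item ("\"n\"") and two expected items ("space" and "\")\""),
so the duplicate SysUnExpect entry disappears. Because ErrorSummary
derives Eq, a test can compare such a value directly.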

Also, the fact that neither Message nor ParseError derives Eq makes it
difficult to test values of those types.  Why not derive Eq?

Finally, do you have any suggestions as to how I might better test for
correct parse errors from Parsec?
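
One possible way to test for parse errors without an Eq instance on
Message or ParseError is to map the messages into a small comparable type
owned by the test suite. The sketch below is only an illustration:
Problem, toProblem, problems, numbers, and failureOf are hypothetical
names, and the toy grammar is a stand-in, not the s-expression parser
described above.

```haskell
import Data.List (nub)
import Text.ParserCombinators.Parsec
  (ParseError, Parser, between, char, digit, many1, parse, sepBy, space)
import Text.ParserCombinators.Parsec.Error (Message (..), errorMessages)

-- Hypothetical comparable mirror of Parsec's Message type.
data Problem = Unexpected String | Expected String | Other String
  deriving (Eq, Show)

-- Both "unexpected" constructors map to the same tag.
toProblem :: Message -> Problem
toProblem (SysUnExpect s) = Unexpected s
toProblem (UnExpect s)    = Unexpected s
toProblem (Expect s)      = Expected s
toProblem (Message s)     = Other s

-- Deduplicated, comparable view of a ParseError's messages.
problems :: ParseError -> [Problem]
problems = nub . map toProblem . errorMessages

-- Toy stand-in grammar (an assumption for this sketch): a parenthesised
-- list of integers separated by single whitespace characters.
numbers :: Parser [Integer]
numbers = between (char '(') (char ')') (number `sepBy` space)
  where
    number = fmap read (many1 digit)

-- Run the toy parser; return the error's problems, or [] on success.
failureOf :: String -> [Problem]
failureOf input = either problems (const []) (parse numbers "" input)
```

A test could then check membership rather than exact list equality, for
instance that Unexpected "\"n\"" is an element of failureOf "(1 2n3)".
That keeps the test independent of how many times Parsec repeats a
message and of the order in which it reports them.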

Thanks very much for your help!

- Sean Smith

P.S. I'm more than willing to dig into the Parsec source and provide
patches for any changes I might make, if someone will accept them!