[Haskell-cafe] attoparsec and backtracking
Erik de Castro Lopo
mle+hs at mega-nerd.com
Sat Mar 16 02:54:08 CET 2013
Evan Laforge wrote:
> The first is that it's hard to get the right error msg out. For
> instance, I have a parser that tries to parse a number with an
> optional type suffix. It's an error if the suffix is unrecognized:
>
> p_num :: A.Parser Score.TypedVal
> p_num = do
> num <- p_untyped_num
> suffix <- A.option "" ((:"") <$> A.letter_ascii)
> case Score.code_to_type suffix of
> Nothing -> fail $ "p_num expected suffix in [cdsr]: " ++ show suffix
> Just typ -> return $ Score.Typed typ num
I think the mistake here is to parse something and then decide if
its it valid. It should be the parser which decides whether its
valid. So rather than:
suffix <- A.option "" ((:"") <$> A.letter_ascii)
try:
typ <- A.choice [ {- list or valid suffix parsers -} ]
return $ Score.Typed typ num
> However, which error msg shows up depends on the order of the (<|>)
> alternatives, and in general the global structure of the entire
> parser, because I think it just backtracks and then picks the last
> failing backtrack.
I'm not sure if what I've offered will help, but its worth a try.
> Even after carefully rearranging all the parsers
> it seems impossible to get this particular error to bubble up to the
> top.
Yes, I've found it impossible to force attoparsec to fail a parse.
I think that is intended as a feature.
> The thing is, as soon as I see an unexpected suffix I know I can
> fail entirely right there, with exactly that error msg, but since
> there's no way to turn off backtracking I think there's no way to do
> that.
Yes, that's my impression.
<snip>
> I
> originally used parsec, but parsing speed is my main bottleneck, so I
> don't want to give ground there.
We you using Parsec as a token parser or as a Char parser. Obviously
the second is going to be slow in comparison to the first.
> I've heard some good things about traditional alex+happy...
> of course it would mean a complete rewrite but might be interesting.
I've user Alex with both Parsec and Happy, where speed was strong
secondary goal. Personally I much prefer Parsec; IMO its easier to debug
and extend and modify that Happy based parsers. I also know some people
prefer Happy.
> Has anyone compared the performance of attoparsec vs. alex+happy?
I haven't, nor have I compared those two with alex+parsec. It would
be an interesting experiment.
Erik
--
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/
More information about the Haskell-Cafe
mailing list