# [Haskell-cafe] [Parsec] A combinator to match between M and N times?

Tue Aug 29 19:07:08 EDT 2006

Udo Stenzel wrote:
> Chris Kuklewicz wrote:
>> Again, Parsec requires you to put "try" where you need it
>
> I'm pretty sure it does, although this
>
>> Udo Stenzel wrote:
>>> countBetween 0 n p = p <:> countBetween   0   (n-1) p <|> return []
>
> is a place where it's not needed in general.

No, see my example below.  It is not needed in the specific case where p can
only consume a single token.

> You should know what
> you're doing, though.  And I like ReadP better in general, exactly
> because 'try' is a bit trippy.

And I have not used ReadP enough to be good with it.

>
>
> Udo.

As long as you test with 'p' that consumes a single token, there is no
difference using try

(<:>) = liftM2 (:)

countBetween 0 0 _ = return []
countBetween 0 n p = p <:> countBetween   0   (n-1) p <|> return []
countBetween m n p = p <:> countBetween (m-1) (n-1) p

Expand

countBetween 0 n p = (do value_p <- p
value_rest <- countBetween   0   (n-1) p
return (value : value_rest)
<|> (return [])

If p fails without consuming tokens, then the <|> runs (return []).
If p fails after consuming tokens, then the <|> will fail instead of running
(return []).

By replacing p with (try p) this will work.

> test = countBetween 3 5 (string "ab")
> go = runParser test () "" "abababac"

Without try this fails to parse with an error message:
Left (line 1, column 7):
unexpected "c"
expecting "ab"

With the try this succeeds:
Right ["ab","ab","ab"]

I assume you think the former is problem.  If you do not then we are trying to
define different operations.

The problem is that string "ab" consumes the 'a' character/token before it
fails.  The 'try' wraps this failure, pretending p did not consume any input, so
the <|> will then run (return []) as we want.

It is not needed in the line
countBetween m n p = p <:> countBetween (m-1) (n-1) p
because if p fails then the whole operation should fail.

There is a benefit to this difficult behavior! The <|> choice point for
backtracking is expensive.  By committing to the first branch when it consumes a
token, the Parsec monad can completely forget the other branch of <|> which
allows it to be garbage collected.  This default is handy for preventing memory
leaks, but it has to be overridden by 'try'.  One can package <|> and 'try' in
useful combinators such as countBetween so the use of them becomes slightly easier.

--
Chris