[Haskell-cafe] [Parsec] A combinator to match between M and N
times?
Chris Kuklewicz
haskell at list.mightyreason.com
Tue Aug 29 19:07:08 EDT 2006
Udo Stenzel wrote:
> Chris Kuklewicz wrote:
>> Again, Parsec requires you to put "try" where you need it
>
> I'm pretty sure it does, although this
>
>> Udo Stenzel wrote:
>>> countBetween 0 n p = p <:> countBetween 0 (n-1) p <|> return []
>
> is a place where it's not needed in general.
No, see my example below. It is not needed in the specific case where p can
only consume a single token.
> You should know what
> you're doing, though. And I like ReadP better in general, exactly
> because 'try' is a bit trippy.
And I have not used ReadP enough to be good with it.
>
>
> Udo.
As long as you test with 'p' that consumes a single token, there is no
difference using try
(<:>) = liftM2 (:)
countBetween 0 0 _ = return []
countBetween 0 n p = p <:> countBetween 0 (n-1) p <|> return []
countBetween m n p = p <:> countBetween (m-1) (n-1) p
Expand
countBetween 0 n p = (do value_p <- p
value_rest <- countBetween 0 (n-1) p
return (value : value_rest)
<|> (return [])
If p fails without consuming tokens, then the <|> runs (return []).
If p fails after consuming tokens, then the <|> will fail instead of running
(return []).
By replacing p with (try p) this will work.
> test = countBetween 3 5 (string "ab")
> go = runParser test () "" "abababac"
Without try this fails to parse with an error message:
Left (line 1, column 7):
unexpected "c"
expecting "ab"
With the try this succeeds:
Right ["ab","ab","ab"]
I assume you think the former is problem. If you do not then we are trying to
define different operations.
The problem is that string "ab" consumes the 'a' character/token before it
fails. The 'try' wraps this failure, pretending p did not consume any input, so
the <|> will then run (return []) as we want.
It is not needed in the line
countBetween m n p = p <:> countBetween (m-1) (n-1) p
because if p fails then the whole operation should fail.
There is a benefit to this difficult behavior! The <|> choice point for
backtracking is expensive. By committing to the first branch when it consumes a
token, the Parsec monad can completely forget the other branch of <|> which
allows it to be garbage collected. This default is handy for preventing memory
leaks, but it has to be overridden by 'try'. One can package <|> and 'try' in
useful combinators such as countBetween so the use of them becomes slightly easier.
--
Chris
More information about the Haskell-Cafe
mailing list