[Haskell-cafe] Parsec bug, or...?

Derek Elkins derek.a.elkins at gmail.com
Mon Oct 12 22:33:09 EDT 2009


On Mon, Oct 12, 2009 at 9:28 PM, Uwe Hollerbach <uhollerbach at gmail.com> wrote:
> a brain fart?
>
> Hi, cafe, I've been playing a little bit with a small command
> processor, and I decided it'd be nice to allow the user to not have to
> enter a complete command, but to recognize a unique prefix of it. So I
> started with the list of allowed commands, used filter and isPrefixOf,
> and was happy. But then I increased the complexity a little bit and it
> got hairier, so I decided to rewrite the parser for this bit in
> parsec. The function I came up with is
>
> parsePrefixOf n str =
>  string (take n str) >> opts (drop n str) >> return str
>  where opts [] = return ()
>        opts (c:cs) = optional (char c >> opts cs)
>
> which I call as
>
> parseFoo = parsePrefixOf 1 "foo"
>
> and it recognizes all of "f", "fo", and "foo" as "foo".
>
> OK so far, this also seems to work fine. But during the course of
> writing this, I made a stupid mistake at one point, and the result of
> that seemed odd. Consider the following program. It's stupid because
> the required prefix of "frito" is only 2 characters, which isn't
> enough to actually distinguish this from the next one, "fromage". (And
> if I change that to 2 to 3 characters, everything works fine.) So
> here's the complete program
>
> module Main where
>
> import Prelude
> import System
> import Text.ParserCombinators.Parsec as TPCP
>
> myPrefixOf n str =
>  string (take n str) >> opts (drop n str) >> return str
>  where opts [] = return ()
>        opts (c:cs) = optional (char c >> opts cs)
>
> myTest = myPrefixOf 1 "banana"
>      <|> myPrefixOf 1 "chocolate"
>      <|> TPCP.try (myPrefixOf 2 "frito")
>      <|> myPrefixOf 3 "fromage"
>
> myBig = spaces >> myTest >>= (\g -> spaces >> eof >> return g)
>
> parseTry input =
>  case parse myBig "test" input of
>       Left err -> return (show err)
>       Right val -> return ("success: '" ++ val ++ "'")
>
> main = getArgs >>= (\a -> parseTry (a !! 0)) >>= putStrLn
>
> If I compile this, say as program "opry", and run it as shown below, I
> expect the results I get for all but the last one:
>
> % ./opry b
> success: 'banana'
>
> % ./opry c
> success: 'chocolate'
>
> % ./opry fr
> success: 'frito'
>
> % ./opry fri
> success: 'frito'
>
> % ./opry fro
> "test" (line 1, column 3):
> unexpected "o"
> expecting "i", white space or end of input
>
> Sooo... why do I get that last one? My expectation was that  parsec
> would try the string "fro" with the parser for "frito", it would fail,
> having consumed 2 characters, but then the TPCP.try which is wrapped
> around all of that should restore everything, and then the final
> parser for "fromage" should succeed. The same reasoning seems to me to
> apply if I specify 3 characters as the required initial portion for
> "frito", and if I do that it does succeed as I expect.
>
> So is this a bug in parsec, or a bug in my brain?

Move the try to the last alternative.


More information about the Haskell-Cafe mailing list