[Haskell-beginners] Clearing Parsec error messages

Sat Jun 6 08:26:14 EDT 2009

On Sat, Jun 6, 2009 at 8:36 AM, Daniel Fischer <daniel.is.fischer at web.de>wrote:

> Am Samstag 06 Juni 2009 02:05:13 schrieb Giuliano Vilela:
> > Hi all,
> >
> > In a Parsec project I used the *fail* parser, wanting to show a message
> to
> > the user and halt the parsing process. That's okay, but the error message
> > showed included some other "unexpected" and "expecting" messages that did
> > not seem related to the fail.
>
> I suppose your parser is not
>
> do return 'a'
>   fail "No dice"
>
> but rather something like
>
> do commonPreamble
>   foo <|> bar <|> baz <|> fail message
>
> and fail is only called if none of the possibilities succeed?
>

Close, but not quite. I'm actually using the Parsec monad with my own state
to build a symbol table during parsing (for a Pascal sub-language I
mentioned earlier in this list). So the fail is deep in the "recursion
chain", but it's something like what you mentioned above. It seemed to me
that *fail* was the best way to report a error, like "undefined type
identifier used".

> Then each of the failing parsers foo, bar and baz may add messages what
> input would have
> allowed them to proceed:
> ---------------------------------
> module FailTest where
>
> import Text.ParserCombinators.Parsec
>
> pa = char 'a'
>
> pb = char 'b'
>
> pc = char 'c'
>
> parser1 = pa <|> pb <|> pc <|> fail "Sorry, no parse"
>
> test1 = parse parser1 "test1" "d'oh"
>
> ------------------------------------------------------------
>
> *FailTest> test1
> Left "test1" (line 1, column 1):
> unexpected "d"
> expecting "a", "b" or "c"
> Sorry, no parse
>
> That's probably the kind of output you get. But I'd say the messages are
> very much related
> to the fail, most likely it's better to keep them.

Nice, I understand now how those messages are built. But, as you can see,
the errors I mentioned won't be related to the parsing itself. That's why
those "expected" and "unexpected" messages are undesirable.

>
> But if you absolutely want to get rid of them, you need a custom fail that
> consumes some
> input to remove the earlier expect messages. To avoid breaking the actual
> input or falling
> afoul of end of input, first inject a dummy token into the input, then
> consume that, and
> only thereafter fail:
> -----------------------------------------------------------------
>
> myfail msg = do
>    inp <- getInput
>    setInput ('x':inp)
>    anyToken
>    fail msg
>
> parser2 = pa <|> pb <|> pc <|> myfail "sorry, doesn't parse"
>
> test2 = parse parser2 "test2" "d'oh"
> --------------------------------------------------------------------------
>
> *FailTest> test2
> Left "test2" (line 1, column 1):
> sorry, doesn't parse

That worked :)

But thinking about it now, I see my solution probably isn't optimal. In some
cases, I will report type errors even when there is bad syntax further in
the source, which is not common behavior. You got any suggestions for my use
case?

The source code for the interpreter is here:
http://code.google.com/p/hpascal/ (pretty immature, my group just started
writing it) if you want to take a look. Important files are: Parsing.hs (the
parser itself) and TypeChecker.hs (parsers that access the internal monad
state and build the table).

-- 
[]'s

Giuliano Vilela.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/beginners/attachments/20090606/6fb995ef/attachment.html