[Haskell-cafe] Parsec line number off-by-one
Roman Cheplyaka
roma at ro-che.info
Wed Sep 21 08:37:40 CEST 2011
Hi Ben,
This is indeed a bug in parsec.
I have written a patch that fixes this. Currently Antoine Latter (current
parsec's maintainer) and I are working on getting these patches into the
next parsec release.
As a workaround until then, you can apply the attached patch manually.
darcs get http://code.haskell.org/parsec3
cd parsec3
darcs apply parsec.dpatch
cabal install
With this patch, the error message is:
Left "(unknown)" (line 18, column 1):
expecting space or atom name
* Ben Gamari <bgamari.foss at gmail.com> [2011-09-20 23:32:34-0400]
> Recently I've been playing around with Parsec for a simple parsing
> project. While I was able to quickly construct my grammar (simplified
> version attached), getting it working has been a bit tricky. In
> particular, I am now stuck trying to figure out why Parsec is
> mis-reporting line numbers. Parsec seems convinced that line 12 of my
> input (also attached) has a "%" character,
>
> $ runghc Test.hs
> Left "(unknown)" (line 12, column 1):
> unexpected "%"
> expecting space or atom name
>
> while my file clearly disagrees,
>
> 10 %FLAG ATOM_NAME
> 11 %FORMAT(20a4)
> 12 C1 H1 C2 H2 C3 H3 C4 H4 C5 C6 C7 C8 N1 C9 H9 C10 H10 C11 H11 C12
> 13 H12 C13 H13 C14 C15 N2 C16 C17 C29 H18 C19 H19 C20 H20 C21 H21 C22 H221H222H223
> ...
> 18 %FLAG CHARGE
> 19 %FORMAT(5E16.8)
>
> The task here is to identify the block of data lines (lines 12-17),
> ending at the beginning of the next block (starting with "%"). It seems
> likely that my problem stems from the fact that I use "try" to
> accomplish this but this is as far as I can reason.
>
> Any ideas what might cause this sort of off-by-one? Does anyone see a
> better (i.e. working) way to formulate my grammar? Any and all help
> would be greatly appreciated. Thanks.
>
> Cheers,
>
> - Ben
>
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
--
Roman I. Cheplyaka :: http://ro-che.info/
-------------- next part --------------
4 patches for repository http://code.haskell.org/parsec3:
Sun Feb 20 18:24:22 EET 2011 Roman Cheplyaka <roma at ro-che.info>
* Choose the longest match when merging error messages
Sun Feb 20 18:24:49 EET 2011 Roman Cheplyaka <roma at ro-che.info>
* try: do not reset the error position
Sun Feb 20 18:29:20 EET 2011 Roman Cheplyaka <roma at ro-che.info>
* lookAhead: do not consume input on success; update documentation
Sun Feb 20 19:30:26 EET 2011 Roman Cheplyaka <roma at ro-che.info>
* Improve <?>
New patches:
[Choose the longest match when merging error messages
Roman Cheplyaka <roma at ro-che.info>**20110220162422
Ignore-this: 54e2733159a1574abb229e09ff6935c1
] hunk ./Text/Parsec/Error.hs 137
= ParseError pos (msg : filter (msg /=) msgs)
mergeError :: ParseError -> ParseError -> ParseError
-mergeError (ParseError pos msgs1) (ParseError _ msgs2)
- = ParseError pos (msgs1 ++ msgs2)
+mergeError (ParseError pos1 msgs1) (ParseError pos2 msgs2)
+ = case pos1 `compare` pos2 of
+ -- select the longest match
+ EQ -> ParseError pos1 (msgs1 ++ msgs2)
+ GT -> ParseError pos1 msgs1
+ LT -> ParseError pos2 msgs2
instance Show ParseError where
show err
[try: do not reset the error position
Roman Cheplyaka <roma at ro-che.info>**20110220162449
Ignore-this: 8508bc41fc6dcd9b7c06aac762f12c71
] hunk ./Text/Parsec/Prim.hs 435
try :: ParsecT s u m a -> ParsecT s u m a
try p =
- ParsecT $ \s@(State _ pos _) cok _ eok eerr ->
- let pcerr parseError = eerr $ setErrorPos pos parseError
- in unParser p s cok pcerr eok eerr
+ ParsecT $ \s cok _ eok eerr ->
+ unParser p s cok eerr eok eerr
-- | The parser @tokenPrim showTok posFromTok testTok@ accepts a token @t@
-- with result @x@ when the function @testTok t@ returns @'Just' x at . The
[lookAhead: do not consume input on success; update documentation
Roman Cheplyaka <roma at ro-che.info>**20110220162920
Ignore-this: e884771490209b93e9fec044543a18ef
] {
hunk ./Text/Parsec/Combinator.hs 279
<|>
do{ x <- p; xs <- scan; return (x:xs) }
--- | @lookAhead p@ parses @p@ without consuming any input.
-
-lookAhead :: (Stream s m t) => ParsecT s u m a -> ParsecT s u m a
-lookAhead p = do{ state <- getParserState
- ; x <- p
- ; setParserState state
- ; return x
- }
-
hunk ./Text/Parsec/Prim.hs 40
, (<|>)
, label
, labels
+ , lookAhead
, Stream(..)
, tokens
, try
hunk ./Text/Parsec/Prim.hs 439
ParsecT $ \s cok _ eok eerr ->
unParser p s cok eerr eok eerr
+-- | @lookAhead p@ parses @p@ without consuming any input.
+--
+-- If @p@ fails and consumes some input, so does @lookAhead at . Combine with 'try'
+-- if this is undesirable.
+
+lookAhead :: (Stream s m t) => ParsecT s u m a -> ParsecT s u m a
+lookAhead p = do{ state <- getParserState
+ ; x <- p'
+ ; setParserState state
+ ; return x
+ }
+ where
+ p' = ParsecT $ \s cok cerr eok eerr ->
+ unParser p s eok cerr eok eerr
+
-- | The parser @tokenPrim showTok posFromTok testTok@ accepts a token @t@
-- with result @x@ when the function @testTok t@ returns @'Just' x at . The
-- source position of the @t@ should be returned by @posFromTok t@ and
}
[Improve <?>
Roman Cheplyaka <roma at ro-che.info>**20110220173026
Ignore-this: b147dad8729f0a8f22b7149559d6b243
] hunk ./Text/Parsec/Prim.hs 315
-- letter', which is less friendly.
(<?>) :: (ParsecT s u m a) -> String -> (ParsecT s u m a)
-p <?> msg = label p msg
+p <?> msg = ParsecT $ \s cok cerr eok eerr ->
+ let parseError = newErrorMessage (Expect msg) (statePos s)
+ eok' x s' _ = eok x s' parseError
+ eerr' _ = eerr parseError
+ in unParser p s cok cerr eok' eerr'
-- | This combinator implements choice. The parser @p \<|> q@ first
-- applies @p at . If it succeeds, the value of @p@ is returned. If @p@
Context:
[TAG 3.1.1
Antoine Latter <aslatter at gmail.com>**20110129160030
Ignore-this: 42ddc9e7316d68945c2c1260c2acd403
]
Patch bundle hash:
52077d12002c2820c33c39701e01ead8de62c61e
More information about the Haskell-Cafe
mailing list