[Haskell-cafe] Parsec question (new user): unexpected end of input

Antoine Latter aslatter at gmail.com
Wed Sep 29 00:23:01 EDT 2010


On Tue, Sep 28, 2010 at 10:35 PM, Peter Schmitz <ps.haskell at gmail.com> wrote:
> I am a new Parsec user, and having some trouble with a relatively
> simple parser.
>
> The grammar I want to parse contains tags (not html) marked by
> angle brackets (e.g., "<some tag>"), with arbitrary text (no angle
> brackets allowed) optionally in between tags.
>
> Tags may not nest, but the input must begin and end with a tag.
>
> Whitespace may occur anywhere (beginning/end of input,
> inside/between tags, etc.), and is optional.
>
> I think my problem may be a lack of using "try", but I'm not sure
> where.
>
> At runtime I get:
>
> Error parsing file: "...\sampleTaggedContent.txt" (line 4, column 1):
> unexpected end of input
> expecting "<"
>
> The input was:
>
> <tag1>stuff<tag 2>
> more stuff < tag 3 > even more
> <lastTag>
>
> The code is below. (I'm using Parsec-2.1.0.1.) I don't really want
> to return anything meaningful yet; just parse okay.
>
> Any advice about the error (or how to simplify or improve the code)
> would be appreciated.
>
> Thanks much,
> -- Peter
>
>
>> -- Parsers:
>> taggedContent = do
>>    optionalWhiteSpace
>>    aTag
>>    many tagOrContent
>>    aTag
>>    eof
>>    return "Parse complete."
>>
>> tagOrContent = aTag <|> someContent <?> "tagOrContent"
>>
>> aTag = do
>>    tagBegin
>>    xs <- many (noneOf [tagEndChar])
>>    tagEnd
>>    optionalWhiteSpace
>>    return ()
>>
>> someContent = do
>>    manyTill anyChar tagBegin
>>    return ()
>>
>> optionalWhiteSpace = spaces   -- i.e., any of " \v\f\t\r\n"
>> tagBegin = char tagBeginChar
>> tagEnd = char tagEndChar
>>
>> -- Etc:
>> tagBeginChar = '<'
>> tagEndChar = '>'
>

Here's something I put together:
http://hpaste.org/40201/parsec_question_new_user_un?pid=40201&lang_40201=Haskell

It doesn't have the whitespace handling you want.

The big difference in my version is that the content parser stops at
EOF as well as at the signal char ('<'). Otherwise it won't allow the
document to end :-)

Antoine

