[web-devel] [Newbie][Parsec] Skipping to desired phrase
David McBride
dmcbride at neondsl.com
Fri Jul 29 14:58:06 CEST 2011
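Your parser skips characters one at a time, and when it gets to
"nicelinks" it skips that too, because skipMany keeps going for as
long as either alternative matches. It then continues to skip
characters and any further "nicelinks" until it hits eof. At that
point it says hey, I need either "nicelinks" or some more characters
to continue skipping, or at least one character for many1 anyChar to
capture, but there aren't any, so it barfs.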
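The failure described above can be reproduced on a short input. This is only a sketch: the name p_broken and the sample string are made up for illustration, and the parser body is the one from the question with a type signature added.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- The parser from the question (p_broken is a hypothetical name).
-- skipMany keeps consuming until both alternatives fail, which only
-- happens at end of input, so many1 anyChar has nothing left to read.
p_broken :: Parser [String]
p_broken = do
  skipMany (try (string "nicelinks") <|> anyHeadString)
  text <- many1 anyChar
  return [text]
  where
    anyHeadString = do
      c <- anyChar
      return [c]

main :: IO ()
main = print (parse p_broken "" "<head>nicelinks:123</head>")
```

Running it should give a Left with an "unexpected end of input" error, even though the phrase is present in the input.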
Using manyTill instead stops at the first match:

p_rest = do
  manyTill anyChar (try (string "nicelinks")) <?> "fdsa"
  text <- many1 anyChar <?> "asdf"
  return [text]

This works. However, I have a funny feeling you want the anyChar to be
something more complex than a single character, which is why you went
down this route. I had the same problem, and someone on Stack Overflow
helped me with a solution. This is a case where you pretty much have
to use recursion to get what you want.

import Text.Parsec

html = "<head>nicelinks:123</head>"

-- Consume one known token, then either recurse or capture the text
-- up to the next head tag.
p_rest = do
  string "nicelinks" <|> anyHeadString <?> "fdsa"
  p_rest <|> manyTill anyChar (try anyHeadString) <?> "asdf"

anyHeadString = try (string "<head>") <|> string "</head>"

main = do
  print $ parse p_rest [] html

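The same recursive idea can be pulled out into a small reusable combinator. This is a minimal sketch, not something Parsec provides: skipUntil, tagOrChar and afterPhrase are hypothetical names, and the notion of a "unit" to skip (a whole tag, or else a single character) is an assumption about what you might want in place of anyChar.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical combinator (not part of Parsec): try `target` first; if it
-- fails, consume one `unit` of input and recurse.
skipUntil :: Parser a -> Parser b -> Parser b
skipUntil unit target = try target <|> (unit >> skipUntil unit target)

-- Hypothetical skip unit: a whole tag such as "<head>", or else one character.
tagOrChar :: Parser ()
tagOrChar = () <$ try (char '<' >> manyTill anyChar (char '>')) <|> () <$ anyChar

-- Skip to the phrase, then capture everything after it.
afterPhrase :: String -> Parser String
afterPhrase phrase = skipUntil tagOrChar (string phrase) >> many1 anyChar

main :: IO ()
main = print (parse (afterPhrase "nicelinks") "" "<head>nicelinks:123</head>")
```

Because the skip unit here swallows whole tags, a phrase that only occurs inside a tag is skipped along with that tag. If the phrase never appears, the recursion runs out of input and the parse fails instead of looping.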
On Fri, Jul 29, 2011 at 4:27 AM, Kamil Ciemniewski
<ciemniewski.kamil at gmail.com> wrote:
> Hi all,
> I've got a String containing html and I'd like
> to extract some information from it.
> Specifically, this information starts at the point
> "after" some phrase (let's say "nicelinks").
> How do I skip all the html up to the point
> of this phrase?
> I've done that much already:
> p_rest = do
>   skipMany ((try (string "nicelinks")) <|> anyHeadString)
>   text <- many1 anyChar
>   return [text]
> anyHeadString = do
>   c <- anyChar
>   return [c]
> But after doing:
> parse p_rest [] html
> I get:
> Left (line 112, column 15):
> unexpected end of input
> expecting "nicelinks"
> What am I doing wrong?
> Best regards