[Haskell-cafe] Parsec and destructuring HTML content

Joel Reymont joelr1 at gmail.com
Sun Jun 18 08:30:35 EDT 2006


Has anyone explored destructuring HTML with Parsec? Any other ideas
on how best to do this?

I'm looking to scrape bits of information from more or less
unstructured HTML pages, and then to structure, tag, and classify
the content afterwards.

I think that developing HTML scrapers requires short
tweak-compile-run cycles and is probably best done in a dynamic
language such as Perl, Python, or Ruby, but I wonder if someone has
found otherwise.

	Thanks, Joel

--
http://wagerlabs.com/