[Haskell-cafe] Parse HTML that contains JavaScript

Andras Slemmer 0slemi0 at gmail.com
Tue Dec 24 19:52:36 UTC 2013


The html-conduit package (http://hackage.haskell.org/package/html-conduit)
can parse the above snippet easily: http://lpaste.net/97491
The code reads HTML from stdin and prints out the parsed document. Try it
out! For documentation on the returned AST, take a look at xml-conduit
(http://hackage.haskell.org/package/xml-conduit).
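
For readers who cannot reach the paste, here is a minimal sketch of a program
of that shape. It is not necessarily the code behind the link. It assumes the
html-conduit and xml-conduit packages are installed and uses parseLBS from
Text.HTML.DOM. The helper name parseHtml is made up for illustration.

```haskell
-- Sketch only: reads HTML from stdin and prints the parsed document.
-- Assumes the html-conduit and xml-conduit packages are available.
import qualified Data.ByteString.Lazy as L
import qualified Text.HTML.DOM as HTML  -- from html-conduit
import Text.XML (Document)              -- AST type, from xml-conduit

-- | Hypothetical helper: parse a lazy ByteString as HTML.
-- The parser is lenient, so it returns a Document instead of
-- reporting an error on malformed markup.
parseHtml :: L.ByteString -> Document
parseHtml = HTML.parseLBS

main :: IO ()
main = do
  input <- L.getContents   -- read all of stdin
  print (parseHtml input)  -- Document has a Show instance
```

Something like `runghc Main.hs < page.html` should then dump the AST. The
Text.XML.Cursor module in xml-conduit is one way to navigate the result.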


On 24 December 2013 19:42, Brandon Allbery <allbery.b at gmail.com> wrote:

> On Tue, Dec 24, 2013 at 2:20 PM, akira kawata <a.kawashiro at gmail.com> wrote:
>>
>> Did you mean HaXmL?
>>
>
> Pick an XML parser. CDATA is an XML construct. Well-formed HTML *should*
> be XML compatible, although it's very rare to find proper well-formed HTML
> these days....
>
> --
> brandon s allbery kf8nh                               sine nomine
> associates
> allbery.b at gmail.com
> ballbery at sinenomine.net
> unix, openafs, kerberos, infrastructure, xmonad
> http://sinenomine.net