[Haskell-cafe] Parse HTML that contains JavaScript

akira kawata a.kawashiro at gmail.com
Tue Dec 24 19:20:54 UTC 2013


Did you mean HaXml?
I am sorry that I could not explain what I want clearly.
I think that module cannot parse an HTML file like this one.
I don't care about the JavaScript code itself.

I want to translate the following code

<html>
<p> hogehoge </p>
<script>if(window.mw){
mw.loader.state({"<script>":"</script>","user":"ready","user.groups":"ready"});
}
</script>
</html>


into something like this:

<html>
     <p>
         hogehoge
     <script>

In short, I want the structure of the HTML, excluding the JavaScript.
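
To make the goal concrete, here is a minimal sketch using only `base`. It is an illustration under stated assumptions, not a real HTML parser: the names `Token`, `tokenize`, `skipScript`, `outline` and `sample` are hypothetical, HTML comments and attribute values containing `>` are not handled, and the script body is skipped with a simple heuristic that ignores a `</script>` appearing inside a quoted JavaScript string. As far as I know, browsers end a script element at the first `</script>` they see, so this heuristic deliberately differs from them.

```haskell
module Main where

import Data.Char (isSpace, toLower)
import Data.List (dropWhileEnd)

-- | One piece of the simplified document structure.
data Token
  = Open String   -- opening tag name, e.g. "p"
  | Close String  -- closing tag name
  | Text String   -- trimmed text between tags
  deriving (Eq, Show)

-- | Strip leading and trailing whitespace.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- | Lower-cased tag name, ignoring attributes and a trailing slash.
tagName :: String -> String
tagName = map toLower . takeWhile (\c -> not (isSpace c) && c /= '/')

-- | Split the input into tokens, dropping the body of every <script> element.
tokenize :: String -> [Token]
tokenize [] = []
tokenize ('<' : '/' : rest) =
  let (name, after) = break (== '>') rest
  in Close (tagName name) : tokenize (drop 1 after)
tokenize ('<' : rest) =
  let (inside, after) = break (== '>') rest
      name = tagName inside
      body = drop 1 after
  in if name == "script"
       then Open name : Close name : tokenize (skipScript body)
       else Open name : tokenize body
tokenize input =
  let (txt, rest) = break (== '<') input
      trimmed = trim txt
  in [Text trimmed | not (null trimmed)] ++ tokenize rest

-- | Drop everything up to and including the closing "</script>".
-- Assumption: a "</script>" inside a quoted JavaScript string does not count.
-- Quotes inside JavaScript comments or regex literals would confuse this.
skipScript :: String -> String
skipScript = go Nothing
  where
    closing = "</script>"
    go _ [] = []
    -- Inside a string literal: skip escaped characters, stop at the quote.
    go (Just q) ('\\' : _ : cs) = go (Just q) cs
    go (Just q) (c : cs)
      | c == q    = go Nothing cs
      | otherwise = go (Just q) cs
    -- Outside a string literal: look for the closing tag or an opening quote.
    go Nothing s@(c : cs)
      | map toLower (take (length closing) s) == closing = drop (length closing) s
      | c == '"' || c == '\'' = go (Just c) cs
      | otherwise = go Nothing cs

-- | Render opening tags and text as an indented outline.
outline :: [Token] -> [String]
outline = go 0
  where
    go _ [] = []
    go d (Open n : ts)  = (indent d ++ "<" ++ n ++ ">") : go (d + 1) ts
    go d (Close _ : ts) = go (max 0 (d - 1)) ts
    go d (Text t : ts)  = (indent d ++ t) : go d ts
    indent d = replicate (4 * d) ' '

-- | The example document from this message.
sample :: String
sample = unlines
  [ "<html>"
  , "<p> hogehoge </p>"
  , "<script>if(window.mw){"
  , "mw.loader.state({\"<script>\":\"</script>\",\"user\":\"ready\",\"user.groups\":\"ready\"});"
  , "}"
  , "</script>"
  , "</html>"
  ]

main :: IO ()
main = mapM_ putStrLn (outline (tokenize sample))
```

Running `main` prints the tag outline with four spaces per nesting level; the script body never reaches the output. The earlier CDATA example goes through the same `skipScript` path, since it contains no quotes.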

2013/12/25 Brandon Allbery <allbery.b at gmail.com>

> On Tue, Dec 24, 2013 at 2:03 PM, akira kawata <a.kawashiro at gmail.com> wrote:
>
>> <html>
>> <script>
>> //<![CDATA[
>> <!-- -->
>> //]]>
>> </script>
>> </html>
>>
>
> An XML parser might help with CDATA blocks.
>
> --
> brandon s allbery kf8nh                     sine nomine associates
> allbery.b at gmail.com
> ballbery at sinenomine.net
> unix, openafs, kerberos, infrastructure, xmonad
> http://sinenomine.net
>