[Haskell] ANN: TagSoup library 0.1

Neil Mitchell ndmitchell at gmail.com
Wed Apr 11 09:18:38 EDT 2007


Hi

TagSoup is a library for extracting information out of unstructured
HTML code, sometimes known as tag-soup. The HTML does not have to be
well formed or render properly within any particular framework. This
library is for situations where the author of the HTML is not
cooperating with the person trying to extract the information, but is
also not trying to hide the information.

The library provides a basic data type for a list of unstructured
tags, a parser to convert HTML into this tag type, and useful
functions and combinators for finding and extracting information.
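
To make the three pieces above concrete, here is a minimal, self-contained
sketch of the general idea: a flat tag type, a tolerant tokeniser, and one
small extraction combinator. It does not use the TagSoup API; the names
Tag, parseSoup, mkTag, tagName and textAfter are hypothetical and chosen
only for illustration, and attributes are ignored to keep it short.

```haskell
module Main where

import Data.Char (isSpace, toLower)

-- Hypothetical tag type for illustration; not taken from the TagSoup source.
data Tag
  = TagOpen String    -- opening tag, name only (attributes are dropped here)
  | TagClose String   -- closing tag
  | TagText String    -- text between tags
  deriving (Eq, Show)

-- Tokenise a string into a flat list of tags without checking nesting,
-- so unclosed or misnested markup is accepted as-is.
parseSoup :: String -> [Tag]
parseSoup [] = []
parseSoup ('<' : rest) =
  case break (== '>') rest of
    (body, _ : after) -> mkTag body : parseSoup after
    (body, [])        -> [TagText ('<' : body)]  -- unterminated '<': keep as text
parseSoup s =
  let (txt, rest) = break (== '<') s
  in TagText txt : parseSoup rest

-- Classify the contents of a "<...>" pair as an opening or closing tag.
mkTag :: String -> Tag
mkTag ('/' : name) = TagClose (tagName name)
mkTag body         = TagOpen (tagName body)

-- Lower-cased first word of the tag body; anything after it is discarded.
tagName :: String -> String
tagName = map toLower . takeWhile (not . isSpace) . dropWhile isSpace

-- Text that directly follows each opening tag with the given name.
textAfter :: String -> [Tag] -> [String]
textAfter name tags =
  [ txt | (TagOpen n, TagText txt) <- zip tags (drop 1 tags), n == name ]

main :: IO ()
main = do
  let html = "<ul><li>first<li>second</ul><p>unclosed"
  print (parseSoup html)
  print (textAfter "li" (parseSoup html))
```

Because the tags are kept as a flat list rather than a tree, the missing
</li> and </p> in the sample input cause no trouble: textAfter "li" yields
["first","second"].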

Home page: http://www-users.cs.york.ac.uk/~ndm/tagsoup/
darcs: darcs get --partial http://www.cs.york.ac.uk/fp/darcs/tagsoup/
Haddock: http://www.cs.york.ac.uk/fp/haddock/tagsoup/
Manual: http://www.cs.york.ac.uk/fp/darcs/tagsoup/tagsoup.htm -
"Drinking TagSoup by Example"

If you are interested in this library I suggest reading the manual,
which contains four examples, including getting the haskell.org hit
count and listing Simon Peyton Jones' recent papers.

I am unable to upload this library to Hackage until the Cabal sdist
command works on my machine, so if someone else wants to, I would be
grateful.

Hoogle 4 requires this library, which is why it was written.

Thanks

Neil

