[Haskell-cafe] Stripping text of xml tags and special symbols
Benja Fallenstein
benja.fallenstein at gmail.com
Tue Aug 5 17:48:28 EDT 2008
Hi Pieter,
2008/8/5 Pieter Laeremans <pieter at laeremans.org>:
> But the sphinx indexer complains that the xml isn't valid. When I look at
> the errors this seems due to some documents containing not well formed
> html.
If you need to cope with non-well-formed HTML, try HTML Tidy, which can
repair the markup and write it back out as well-formed XHTML:
http://tidy.sourceforge.net/
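In case it helps, here is a rough sketch of calling Tidy from Haskell before handing the documents to the indexer. It assumes the `tidy` binary is installed and on your PATH and that the `process` package is available. The names `tidyToXhtml` and `tidyArgs` are made up for illustration, and you may want different options for your data.

```haskell
import System.Exit (ExitCode (..))
import System.Process (readProcessWithExitCode)

-- Options passed to the external tidy binary (an assumed choice):
--   -quiet               suppress non-essential output
--   -asxml               convert HTML to well-formed XHTML
--   -numeric             emit numeric rather than named entities
--   --show-warnings no   keep warnings out of stderr
tidyArgs :: [String]
tidyArgs = ["-quiet", "-asxml", "-numeric", "--show-warnings", "no"]

-- | Pipe an HTML string through tidy on stdin and collect its stdout.
-- tidy exits with 0 on success, 1 if there were warnings and 2 if there
-- were errors, so only exit codes above 1 are treated as failure here.
tidyToXhtml :: String -> IO (Either String String)
tidyToXhtml html = do
  (code, out, err) <- readProcessWithExitCode "tidy" tidyArgs html
  pure $ case code of
    ExitSuccess   -> Right out
    ExitFailure 1 -> Right out
    ExitFailure _ -> Left err

main :: IO ()
main = do
  -- A deliberately malformed fragment: unclosed <b> and <p> elements.
  result <- tidyToXhtml "<p>unclosed <b>bold<p>next"
  either (putStrLn . ("tidy failed: " ++)) putStrLn result
```

The `-numeric` option is there because named entities such as `&nbsp;` are not defined in plain XML, which may be another source of the validity complaints. Note that `String` I/O uses the locale encoding, so documents in mixed encodings would need extra care.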
All the best,
- Benja