[Haskell-cafe] Parsing unstructured data

Olivier Boudry olivier.boudry at gmail.com
Wed Dec 5 09:22:14 EST 2007

On Nov 29, 2007 5:31 AM, Reinier Lamers <reinier.lamers at phil.uu.nl> wrote:

> Especially in the fuzzy cases like this one, NLP often turns to machine
> learning models. One could try to train a hidden Markov model or support
> vector machines to label parts of the string as "name", "street",
> "number", "city", etc. These techniques work very well for
> part-of-speech tagging in natural language, and this seems similar. However, you
> need a manually annotated set of examples to train the models. If you
> really have a big load of data and it seems like a good solution, you
> could use an off-the-shelf part-of-speech tagger like SVMTool
> (http://www.lsi.upc.edu/~nlp/SVMTool/)
> to do it.
> Reinier

Hi Reinier,

Thanks for the link to SVMTool. I don't have the background to understand
most of the NLP articles I found, and I get stuck on the first bits of NLP
jargon. For me, using an existing tool will be easier than building a new
one. I'm currently reading the tool's documentation and it looks quite
promising: it seems to be very generic and highly reusable.


