[Haskell-cafe] NLP libraries and tools?

wren ng thornton wren at freegeek.org
Thu Jul 7 04:46:53 CEST 2011


On 7/6/11 8:46 PM, Richard O'Keefe wrote:
>> I've been working over the last year+ on an optimized HMM-based POS
>> tagger/supertagger with online tagging and anytime n-best tagging. I'm
>> planning to release it this summer (i.e., by the end of August), though
>> there are a few things I'd like to polish up before doing so. In
>> particular, I want to make the package less monolithic. When I release it
>> I'll make announcements here and on the nlp@ list.
>
> One of the issues I've had with a POS tagger I've been using is that it
> makes some really stupid decisions which can be patched up with a few
> simple rules, but since it's distributed as a .jar file I cannot add
> those rules.

How horrid. I assume the problem is really that the trained model is in
the jar and you can't do your own training? Or is this a Brill-like tagger
where you really mean to add new rules?
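
If it is the latter, a handful of contextual patches can be applied as a
post-processing pass over the tagger's output, without touching the jar. A
rough sketch in Python, assuming the tagger hands back (word, tag) pairs;
the Rule type, apply_rules, and the sample sentence are all made up for
illustration and are not part of any existing tagger's API. The one rule
shown follows the classic Brill template "change tag A to B when the
previous tag is C":

```python
from typing import List, NamedTuple, Tuple

Tagged = List[Tuple[str, str]]  # (word, tag) pairs as emitted by some tagger


class Rule(NamedTuple):
    """Hypothetical Brill-style patch: retag from_tag as to_tag
    when the preceding token carries prev_tag."""
    from_tag: str
    to_tag: str
    prev_tag: str


def apply_rules(tagged: Tagged, rules: List[Rule]) -> Tagged:
    """Apply each rule in order over the whole sentence.

    The input list is copied, so the tagger's original output is untouched.
    """
    out = list(tagged)
    for rule in rules:
        # Start at 1: the rule needs a preceding token to look at.
        for i in range(1, len(out)):
            word, tag = out[i]
            if tag == rule.from_tag and out[i - 1][1] == rule.prev_tag:
                out[i] = (word, rule.to_tag)
    return out


# Illustrative rule: a noun reading right after "to" is retagged as a verb.
RULES = [Rule(from_tag="NN", to_tag="VB", prev_tag="TO")]

tagged = [("to", "TO"), ("run", "NN"), ("fast", "RB")]
patched = apply_rules(tagged, RULES)
```

Because the rules run in order, a later rule sees the tags already changed
by an earlier one, so the order of the list matters.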

If an HMM-based tagger would work for you, you could try switching to
Daniël de Kok's Java port of TnT:

    https://github.com/danieldk/jitar


The tagger I'm working on does support being hooked up to a Java client
(i.e., consumer of tagging info), but it's fairly ugly due to Java's
refusal to believe in IPC.

-- 
Live well,
~wren



