java.lang.Object
  ca.uottawa.balie.NamedEntityRecognition
Named Entity Recognition (NER)

NER operates on the Balie TokenList. It consists of two stages:
1) Lexicon lookup: identifying matches in large, automatically generated lexicons of entities.
2) Ambiguity resolution: applying simple heuristics to disambiguate and classify entities.
2.1) Entity-noun ambiguity, e.g. Jobs (person) vs. jobs (noun)
2.2) Entity boundary detection, e.g. AAAA Stevenson (person), where AAAA is an unknown first name
2.3) Entity-entity ambiguity, e.g. France (location) vs. France (person)

This class is not optimized for speed. It is optimized for readability and conformance to published experiments:
Nadeau, D., Turney, P. D. and Matwin, S. (2006) Unsupervised Named-Entity Recognition: Generating Gazetteers and Resolving Ambiguity. Proc. Canadian Conference on Artificial Intelligence. (submitted)
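The two stages described above can be illustrated with a small, self-contained sketch. It is not Balie's implementation: the class `TwoStageNerSketch`, the in-memory `LEXICON` map, the `"O"` non-entity label, and all three heuristics (capitalization for 2.1, one-token leftward extension for 2.2, a fixed preference order for 2.3) are assumptions chosen only to make the idea concrete.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class TwoStageNerSketch {

    // Hypothetical toy lexicon: lowercased surface form -> candidate entity
    // types, most preferred first. This is not Balie's lexicon format.
    static final Map<String, List<String>> LEXICON = Map.of(
            "jobs", List.of("person"),
            "stevenson", List.of("person"),
            "france", List.of("location", "person"));

    static boolean isCapitalized(String token) {
        return !token.isEmpty() && Character.isUpperCase(token.charAt(0));
    }

    // Stage 1: case-insensitive lexicon lookup. An empty list means no match.
    static List<List<String>> lookup(List<String> tokens) {
        List<List<String>> candidates = new ArrayList<>();
        for (String token : tokens) {
            String key = token.toLowerCase(Locale.ROOT);
            candidates.add(LEXICON.getOrDefault(key, List.of()));
        }
        return candidates;
    }

    // Stage 2: resolve ambiguity with three toy heuristics (assumptions).
    // Returns one label per token; "O" marks a non-entity.
    static List<String> resolve(List<String> tokens, List<List<String>> candidates) {
        String[] labels = new String[tokens.size()];
        for (int i = 0; i < labels.length; i++) {
            List<String> types = candidates.get(i);
            if (types.isEmpty() || !isCapitalized(tokens.get(i))) {
                // 2.1 Entity-noun: a lowercase match such as "jobs" stays a noun.
                labels[i] = "O";
            } else {
                // 2.3 Entity-entity: take the most preferred candidate type.
                labels[i] = types.get(0);
            }
        }
        // 2.2 Boundary detection: extend a person one token to the left over
        // a capitalized token that has no lexicon match (an unknown first name).
        for (int i = 1; i < labels.length; i++) {
            boolean unknownName = candidates.get(i - 1).isEmpty()
                    && isCapitalized(tokens.get(i - 1));
            if (labels[i].equals("person") && unknownName) {
                labels[i - 1] = "person";
            }
        }
        return Arrays.asList(labels);
    }

    // Runs both stages in order.
    static List<String> tag(List<String> tokens) {
        return resolve(tokens, lookup(tokens));
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("AAAA", "Stevenson", "has", "jobs", "in", "France");
        System.out.println(tag(tokens)); // prints [person, person, O, O, O, location]
    }
}
```

Keeping lookup and resolution as separate passes mirrors the description: the first pass only proposes candidates, and every classification decision is deferred to the heuristics in the second pass.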
Constructor Summary
NamedEntityRecognition(LexiconOnDisk pi_Lexicon, TokenList pi_TokenList)
    Initialize NER on a tokenlist.
Method Summary
TokenList GetTokenList()
    Get the resulting tokenlist.
void RecognizeEntities()
    Process the token list and tag entities.
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Constructor Detail
public NamedEntityRecognition(LexiconOnDisk pi_Lexicon, TokenList pi_TokenList)
    Parameters:
    pi_Lexicon - lexicons used for NER. See LexiconOnDisk.
    pi_TokenList - a tokenlist (must be English)
Method Detail
public void RecognizeEntities()
public TokenList GetTokenList()
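The signatures above suggest a three-step call sequence: construct the recognizer, call RecognizeEntities(), then read the result with GetTokenList(). The sketch below shows only that ordering. The nested `LexiconOnDisk`, `TokenList`, and `NamedEntityRecognition` types are empty stand-ins declared so the snippet compiles without Balie on the classpath; their bodies, the `tagged` flag, the `tagEntities` helper, and the assumption that GetTokenList() returns the same instance that was passed in are all hypothetical. In real use, import the `ca.uottawa.balie` classes and build the lexicon and token list as the library documents.

```java
final class NerUsageSketch {

    // Stand-ins for the Balie types (assumptions, not the real classes).
    static final class LexiconOnDisk { }

    static final class TokenList {
        boolean tagged = false; // hypothetical marker so the effect is observable
    }

    static final class NamedEntityRecognition {
        private final LexiconOnDisk lexicon;
        private final TokenList tokenList;

        NamedEntityRecognition(LexiconOnDisk pi_Lexicon, TokenList pi_TokenList) {
            this.lexicon = pi_Lexicon;
            this.tokenList = pi_TokenList;
        }

        // Placeholder body: the real method tags entities in the token list.
        void RecognizeEntities() {
            tokenList.tagged = true;
        }

        TokenList GetTokenList() {
            return tokenList;
        }
    }

    // Hypothetical helper showing the documented call order.
    static TokenList tagEntities(LexiconOnDisk lexicon, TokenList tokens) {
        NamedEntityRecognition ner = new NamedEntityRecognition(lexicon, tokens);
        ner.RecognizeEntities();   // returns void, so results are read back separately
        return ner.GetTokenList(); // the token list after tagging
    }

    public static void main(String[] args) {
        TokenList result = tagEntities(new LexiconOnDisk(), new TokenList());
        System.out.println(result.tagged);
    }
}
```

Because RecognizeEntities() returns void, calling GetTokenList() before it would presumably yield an untagged list, so the order of the two calls matters.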