ca.uottawa.balie
Class NamedEntityRecognition

java.lang.Object
  extended by ca.uottawa.balie.NamedEntityRecognition

public class NamedEntityRecognition
extends java.lang.Object

Named Entity Recognition (NER). NER operates on the Balie TokenList. It consists of two stages:

1) Lexicon lookup: identifying matches in large, automatically generated lexicons of entities.
2) Ambiguity resolution: applying simple heuristics to disambiguate and classify entities.
   2.1) Entity-Noun Ambiguity: e.g., Jobs (person) vs. jobs (noun)
   2.2) Entity Boundary Detection: e.g., AAAA Stevenson (person), where AAAA is an unknown first name
   2.3) Entity-Entity Ambiguity: e.g., France (location) vs. France (person)

This class is not optimized for speed. It is optimized for readability and conformance to published experiments:

Nadeau, D., Turney, P. D. and Matwin, S. (2006) Unsupervised Named-Entity Recognition: Generating Gazetteers and Resolving Ambiguity. Proc. Canadian Conference on Artificial Intelligence. (submitted)
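
The two stages described above can be illustrated with a toy sketch. It is not Balie's implementation: the class name TwoStageNerSketch, the three-entry lexicon, the list of titles, and each heuristic (capitalization for 2.1, absorbing an unknown capitalized predecessor for 2.2, a preceding title for 2.3) are assumptions chosen only to make the idea concrete on plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy illustration only; every name and heuristic here is hypothetical.
class TwoStageNerSketch {

    // Hypothetical miniature lexicon: lower-cased surface form -> candidate types.
    static final Map<String, List<String>> LEXICON = Map.of(
        "jobs", List.of("person"),
        "stevenson", List.of("person"),
        "france", List.of("location", "person"));

    // Hypothetical cue words used to prefer "person" in stage 2.3.
    static final Set<String> TITLES = Set.of("Mr.", "Mrs.", "Ms.", "Dr.");

    static boolean isCapitalized(String token) {
        return !token.isEmpty() && Character.isUpperCase(token.charAt(0));
    }

    // Returns one tag per token: an entity type, or "O" for "not an entity".
    static List<String> tag(List<String> tokens) {
        int n = tokens.size();

        // Stage 1: lexicon lookup (case-insensitive, so "jobs" also matches).
        List<List<String>> candidates = new ArrayList<>();
        for (String token : tokens) {
            candidates.add(LEXICON.getOrDefault(token.toLowerCase(Locale.ROOT), List.of()));
        }

        // Stage 2: resolve ambiguity with simple heuristics.
        String[] tags = new String[n];
        for (int i = 0; i < n; i++) {
            List<String> types = candidates.get(i);
            boolean afterTitle = i > 0 && TITLES.contains(tokens.get(i - 1));
            if (types.isEmpty() || !isCapitalized(tokens.get(i))) {
                tags[i] = "O";              // 2.1: lower-case match is read as a noun
            } else if (types.size() > 1 && types.contains("person") && afterTitle) {
                tags[i] = "person";         // 2.3: a title tips the choice to person
            } else {
                tags[i] = types.get(0);     // 2.3: otherwise take the first listed type
            }
        }

        // 2.2: extend a person entity leftwards over an unknown capitalized token.
        for (int i = 1; i < n; i++) {
            String previous = tokens.get(i - 1);
            if (tags[i].equals("person") && tags[i - 1].equals("O")
                    && isCapitalized(previous)
                    && candidates.get(i - 1).isEmpty()
                    && !TITLES.contains(previous)) {
                tags[i - 1] = "person";
            }
        }
        return Arrays.asList(tags);
    }

    public static void main(String[] args) {
        // prints [person, person, O, location]
        System.out.println(tag(List.of("Aaaa", "Stevenson", "visited", "France")));
        // prints [O, person, O, O, O]
        System.out.println(tag(List.of("Mrs.", "France", "has", "two", "jobs")));
    }
}
```

The sketch keeps lookup and disambiguation as separate passes, mirroring the stage split in the description. A capitalization test alone would mislabel a sentence-initial common noun, which is one reason a real system needs richer heuristics than these.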

Author:
nadeaud

Constructor Summary
NamedEntityRecognition(LexiconOnDisk pi_Lexicon, TokenList pi_TokenList)
          Initialize NER on a tokenlist.
 
Method Summary
 TokenList GetTokenList()
          Get the resulting tokenlist.
 void RecognizeEntities()
          Process the token list and tag entities.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

NamedEntityRecognition

public NamedEntityRecognition(LexiconOnDisk pi_Lexicon,
                              TokenList pi_TokenList)
Initialize NER on a tokenlist.

Parameters:
pi_Lexicon - lexicons used for NER (see LexiconOnDisk)
pi_TokenList - a tokenlist; the text must be English

Method Detail

RecognizeEntities

public void RecognizeEntities()
Process the token list and tag entities.


GetTokenList

public TokenList GetTokenList()
Get the resulting tokenlist. Named entities are tagged if RecognizeEntities() was called beforehand.

Returns:
The tokenlist.
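
A minimal sketch of the call order implied by the three members documented on this page: construct, call RecognizeEntities(), then read the result with GetTokenList(). The nested LexiconOnDisk, TokenList and NamedEntityRecognition types below are empty stand-ins so that the sketch compiles without Balie; their bodies are placeholders, not the library's code. How a real LexiconOnDisk or TokenList is built is not covered on this page, so the hypothetical helper tagEntities simply takes both as arguments.

```java
// Call-order sketch. The nested types are stand-ins that only mirror the
// signatures documented on this page; with Balie on the classpath, remove
// them and import the real classes from ca.uottawa.balie instead.
class NerCallOrderSketch {

    static class LexiconOnDisk { }   // stand-in, no behavior

    static class TokenList { }       // stand-in, no behavior

    static class NamedEntityRecognition {   // stand-in with placeholder bodies
        private final TokenList tokenList;

        NamedEntityRecognition(LexiconOnDisk pi_Lexicon, TokenList pi_TokenList) {
            this.tokenList = pi_TokenList;
        }

        void RecognizeEntities() {
            // Placeholder: the real class tags entities in the token list here.
        }

        TokenList GetTokenList() {
            return tokenList;
        }
    }

    // Hypothetical helper showing the documented sequence of calls.
    static TokenList tagEntities(LexiconOnDisk lexicon, TokenList tokens) {
        NamedEntityRecognition ner = new NamedEntityRecognition(lexicon, tokens);
        ner.RecognizeEntities();      // without this call, no entities are tagged
        return ner.GetTokenList();    // the resulting tokenlist
    }

    public static void main(String[] args) {
        TokenList result = tagEntities(new LexiconOnDisk(), new TokenList());
        System.out.println(result != null);   // prints true
    }
}
```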