java.lang.Object
  ca.uottawa.balie.TokenList
A list of Tokens that represents a text. Provides methods for manipulating the tokens and an XML representation of the list.
Constructor Summary
TokenList(boolean pi_DetectSentenceBoundaries) | Construct an empty TokenList.
Method Summary
boolean | Add(Token pi_Token, SentenceBoundariesRecognition pi_SBR, WekaLearner pi_SBRModel) | Add a token at the end of the TokenList.
boolean | equals(java.lang.Object pi_Obj) |
Token | Get(int pi_Index) | Gets the token at the given index.
int | getSentenceCount() | Gets the number of sentences found.
java.util.Hashtable | HashAccess() |
int | hashCode() |
TokenListIterator | Iterator() | Gets an iterator for the TokenList.
java.lang.String | SentenceText(int pi_Index, boolean pi_Canonic, boolean pi_PrintNewLines) | Gets the text version of the sentence at the given index.
void | SetEntityType(int pi_Index, int pi_Type) |
void | SetPOS(int pi_Index, int pi_POS) | Sets the part-of-speech of the token at the given index.
int | Size() | Gets the size (number of tokens) of the TokenList.
java.util.Hashtable | TermFrequencyTable() | Gets the TF (term frequency) table.
java.lang.String | TokenRangeText(int pi_Start, int pi_Stop, boolean pi_Canonic, boolean pi_PrintNewLines, boolean pi_TagEntities) |
java.lang.StringBuffer | ToXML() | Gets the TokenList in XML format.
java.util.ArrayList | WordList() |
Methods inherited from class java.lang.Object
getClass, notify, notifyAll, toString, wait, wait, wait
Constructor Detail
public TokenList(boolean pi_DetectSentenceBoundaries)
Parameters:
pi_DetectSentenceBoundaries - True if sentence boundaries must be detected
Method Detail
public boolean Add(Token pi_Token, SentenceBoundariesRecognition pi_SBR, WekaLearner pi_SBRModel)
Parameters:
pi_Token - A new token
pi_SBR - The SBR object
pi_SBRModel - The learned SBR model
public int Size()
public Token Get(int pi_Index)
Parameters:
pi_Index - Index of the token to get.
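The Size() and Get(int) pair suggests plain index-based traversal. The following is a minimal sketch of that pattern, not code from the library: IndexedTokenListSketch is a hypothetical stand-in that stores strings instead of Token objects, and zero-based indexing is an assumption that this page does not confirm.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in that mirrors only Size() and Get(int).
// The real ca.uottawa.balie.TokenList stores Token objects, not strings.
class IndexedTokenListSketch {
    private final List<String> tokens = new ArrayList<>();

    // Appends a token at the end of the list.
    public void add(String token) {
        tokens.add(token);
    }

    // Number of tokens currently stored.
    public int Size() {
        return tokens.size();
    }

    // Token at the given index (assumed zero-based in this sketch).
    public String Get(int index) {
        return tokens.get(index);
    }

    // Walks indices 0 .. Size() - 1 and joins the tokens with single spaces.
    public static String joinAll(IndexedTokenListSketch list) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < list.Size(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(list.Get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        IndexedTokenListSketch list = new IndexedTokenListSketch();
        list.add("A");
        list.add("small");
        list.add("example");
        System.out.println(joinAll(list)); // prints "A small example"
    }
}
```

With the real class, the loop body would presumably read a field or accessor of the returned Token rather than appending the element directly.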
public boolean equals(java.lang.Object pi_Obj)
public int hashCode()
public java.lang.String SentenceText(int pi_Index, boolean pi_Canonic, boolean pi_PrintNewLines)
Parameters:
pi_Index - Index of the sentence to get (in number of sentences)
pi_Canonic - True if the text must be returned in its canonical version
public java.lang.String TokenRangeText(int pi_Start, int pi_Stop, boolean pi_Canonic, boolean pi_PrintNewLines, boolean pi_TagEntities)
public int getSentenceCount()
public java.util.Hashtable TermFrequencyTable()
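This page does not document the key and value types of the returned table. As an illustration of what a term frequency table usually holds, here is a minimal sketch that counts terms into a java.util.Hashtable. TermFrequencySketch and countTerms are hypothetical names, and the String-to-Integer layout is an assumption rather than the library's documented format.

```java
import java.util.Arrays;
import java.util.Hashtable;
import java.util.List;

// Hypothetical stand-in: not part of ca.uottawa.balie.
class TermFrequencySketch {

    // Counts how often each term occurs in the input.
    // Assumption: terms map to Integer counts; the real TermFrequencyTable()
    // may use different key or value types.
    public static Hashtable<String, Integer> countTerms(List<String> terms) {
        Hashtable<String, Integer> table = new Hashtable<>();
        for (String term : terms) {
            // Insert 1 for a new term, otherwise add 1 to the stored count.
            table.merge(term, 1, Integer::sum);
        }
        return table;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("the", "cat", "saw", "the", "dog");
        Hashtable<String, Integer> tf = countTerms(terms);
        System.out.println("the -> " + tf.get("the")); // prints "the -> 2"
        System.out.println("cat -> " + tf.get("cat")); // prints "cat -> 1"
    }
}
```

Because the real method returns a raw Hashtable, callers would likely need to cast keys and values after checking their actual types.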
public java.util.Hashtable HashAccess()
public java.util.ArrayList WordList()
public void SetPOS(int pi_Index, int pi_POS)
Parameters:
pi_Index - Index of the token to update
pi_POS - Part-of-speech of this token (see TokenConsts for the enumeration)
See Also:
TokenConsts
public void SetEntityType(int pi_Index, int pi_Type)
public java.lang.StringBuffer ToXML()
public TokenListIterator Iterator()
See Also:
TokenListIterator