ca.uottawa.balie
Class LanguageSpecific

java.lang.Object
  extended by ca.uottawa.balie.LanguageSpecific
Direct Known Subclasses:
LanguageSpecificEnglish, LanguageSpecificFrench, LanguageSpecificGerman, LanguageSpecificRomanian, LanguageSpecificSpanish

public abstract class LanguageSpecific
extends java.lang.Object

Skeleton of language-specific routines.

Author:
nadeaud

Method Summary
abstract  java.lang.String[] Decompound(java.lang.String pi_Composed)
          Decomposes a word into its parts (e.g., French decomposition on apostrophes).
abstract  java.util.Hashtable GetAbbreviations()
          Gets the list of abbreviations (mainly for sentence boundary detection, SBD).
 java.util.Hashtable GetPOSLookup()
          Gets a table of hard-coded POS (those common to all languages).
abstract  java.util.Hashtable GetQTagEquivalence()
          Gets equivalence table that matches qTag output with Balie TokenConsts.
 qtag.Tagger GetQTagger()
          Gets the qTag instance (part-of-speech tagger).
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

Decompound

public abstract java.lang.String[] Decompound(java.lang.String pi_Composed)
Decomposes a word into its parts (e.g., French decomposition on apostrophes, German compound splitting).

Parameters:
pi_Composed - A string to decompose
Returns:
Array containing each word part
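
The following is a minimal sketch of what an apostrophe-based decomposition could look like. It is not Balie's implementation: the class ApostropheDecompoundSketch and its method are hypothetical, and the choice to keep the apostrophe attached to the left-hand part is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a language-specific Decompound; not part of Balie.
class ApostropheDecompoundSketch {

    // Splits after each apostrophe that is followed by more characters,
    // keeping the apostrophe with the left-hand part (assumed convention).
    static String[] decompound(String composed) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < composed.length(); i++) {
            if (composed.charAt(i) == '\'' && i + 1 < composed.length()) {
                parts.add(composed.substring(start, i + 1));
                start = i + 1;
            }
        }
        // The remainder (or the whole word when no split occurred).
        parts.add(composed.substring(start));
        return parts.toArray(new String[0]);
    }
}
```

Under these assumptions, "l'homme" yields the two parts "l'" and "homme", and a word without an apostrophe is returned as a single-element array. A naive rule like this also splits lexicalized forms such as "aujourd'hui", so a realistic implementation would likely need an exception list.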

GetQTagEquivalence

public abstract java.util.Hashtable GetQTagEquivalence()
Gets equivalence table that matches qTag output with Balie TokenConsts.

Returns:
A table that maps qTag tags to Balie tags (see TokenConsts for the enumeration)
See Also:
TokenConsts
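
As an illustration of the idea, the sketch below builds a tag equivalence table and uses it to translate a tagger's output. The integer constants are hypothetical placeholders, not the values defined in TokenConsts, and the tag strings ("NN", "VB", ...) are assumed examples rather than a confirmed qTag tagset.

```java
import java.util.Hashtable;

// Hypothetical stand-in; the constants below are NOT Balie's TokenConsts values.
class TagEquivalenceSketch {
    static final int POS_UNKNOWN = 0;
    static final int POS_NOUN = 1;
    static final int POS_VERB = 2;
    static final int POS_ADJECTIVE = 3;

    // Builds a table from tagger tag strings (assumed examples) to internal constants.
    static Hashtable<String, Integer> buildEquivalence() {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("NN", POS_NOUN);
        table.put("NNS", POS_NOUN);
        table.put("VB", POS_VERB);
        table.put("JJ", POS_ADJECTIVE);
        return table;
    }

    // Translates one tagger tag, falling back to POS_UNKNOWN for unmapped tags.
    static int toInternalTag(Hashtable<String, Integer> table, String taggerTag) {
        Integer mapped = table.get(taggerTag);
        return mapped != null ? mapped : POS_UNKNOWN;
    }
}
```

Several tagger tags can map to the same internal constant, which is one reason a lookup table fits better here than a one-to-one renaming.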

GetQTagger

public qtag.Tagger GetQTagger()
Gets the qTag instance (part-of-speech tagger).

Returns:
a Tagger

GetAbbreviations

public abstract java.util.Hashtable GetAbbreviations()
Gets the list of abbreviations (mainly for sentence boundary detection, SBD).

Returns:
Table of abbreviations (for fast lookup)
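
To show why an abbreviation table helps with sentence boundary detection, here is a rough sketch. The class AbbreviationSketch, the sample entries, the lowercase key convention, and the boundary rule are all assumptions for illustration; Balie's actual table contents and its boundary logic are not documented on this page.

```java
import java.util.Hashtable;
import java.util.Locale;

// Hypothetical stand-in; entries and rule are illustrative, not Balie's.
class AbbreviationSketch {

    // Keys are stored in lowercase (assumed convention); the value is unused.
    static Hashtable<String, Boolean> buildAbbreviations() {
        Hashtable<String, Boolean> table = new Hashtable<>();
        table.put("dr.", Boolean.TRUE);
        table.put("mr.", Boolean.TRUE);
        table.put("e.g.", Boolean.TRUE);
        return table;
    }

    // A token ending in a period is treated as a sentence end
    // only when it is not a known abbreviation.
    static boolean endsSentence(Hashtable<String, Boolean> abbreviations, String token) {
        if (!token.endsWith(".")) {
            return false;
        }
        return !abbreviations.containsKey(token.toLowerCase(Locale.ROOT));
    }
}
```

With this rule, "Dr." does not close a sentence while "home." does. The hash-based lookup keeps the check constant-time per token, which matches the "fast lookup" intent stated above.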

GetPOSLookup

public java.util.Hashtable GetPOSLookup()
Gets a table of hard-coded POS (those common to all languages). For instance, the token "C#" is a proper noun in any language.

Returns:
Table of POS (for fast lookup)
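
One plausible way to use such a table is to consult it before a statistical tagger, so that hard-coded tokens bypass tagging entirely. The sketch below assumes that order of precedence; the class PosLookupSketch, the string label PROPER_NOUN, and the fallback function are hypothetical and do not reflect Balie's actual types.

```java
import java.util.Hashtable;
import java.util.function.Function;

// Hypothetical stand-in; labels and precedence are assumptions, not Balie's.
class PosLookupSketch {
    static final String PROPER_NOUN = "PROPER_NOUN"; // placeholder label

    // Language-independent, hard-coded entries.
    static Hashtable<String, String> buildLookup() {
        Hashtable<String, String> table = new Hashtable<>();
        table.put("C#", PROPER_NOUN);
        return table;
    }

    // Returns the hard-coded tag when present, otherwise defers to the fallback tagger.
    static String tag(Hashtable<String, String> lookup,
                      Function<String, String> fallbackTagger,
                      String token) {
        String fixed = lookup.get(token);
        return fixed != null ? fixed : fallbackTagger.apply(token);
    }
}
```

Here the token "C#" always receives the hard-coded label, and any other token is passed to whatever tagger function the caller supplies.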