Alistair Kennedy


Enhancing Roget's Thesaurus with Labeled Relationships


 

Abstract:
 

Lexical resources are essentially large dictionaries that, instead of simply defining terms, indicate relationships between terms and often the type of each relationship.  They have been used for many tasks, such as word sense disambiguation and determining semantic similarity between terms.  In recent years, some research has gone into automatically building lexical resources from large corpora.  In this presentation I examine methods not for constructing a lexical resource from scratch, but for building onto an existing one.  Roget’s Thesaurus is a lexical resource that groups terms together based on different degrees of semantic similarity.  One of its weaknesses is that it does not specify the nature of the relationships between terms; it only indicates that a relationship exists.  I will attempt to label the relationships between terms in the thesaurus.  These relationships could include synonym, hyponym/hypernym and meronym/holonym.  Sources of these relationships include other lexical resources, such as WordNet, and large corpora.
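
As a rough illustration of the labeling idea only, and not the method presented in the talk, the sketch below takes one group of related terms and attaches a relationship label to each pair whenever a second resource supplies one.  Everything in it is an assumption made for the example: the small TOY_RELATIONS table is a hypothetical stand-in for relations that might be drawn from a resource such as WordNet, and the function names label_pair and label_group are invented here.

```python
from itertools import combinations

# Hypothetical stand-in for relations taken from another lexical resource.
# Each entry reads: "the first term is a <label> of the second term".
TOY_RELATIONS = {
    ("car", "automobile"): "synonym",
    ("car", "vehicle"): "hyponym",
    ("wheel", "car"): "meronym",
}

# Label to use when a pair is looked up in the reverse direction.
INVERSE = {
    "synonym": "synonym",
    "hyponym": "hypernym",
    "hypernym": "hyponym",
    "meronym": "holonym",
    "holonym": "meronym",
}


def label_pair(a, b, relations=TOY_RELATIONS):
    """Return the label describing what `a` is to `b`, or None if unknown."""
    if (a, b) in relations:
        return relations[(a, b)]
    if (b, a) in relations:
        return INVERSE[relations[(b, a)]]
    return None


def label_group(terms, relations=TOY_RELATIONS):
    """Label every unordered pair of terms in one thesaurus-style group."""
    return {(a, b): label_pair(a, b, relations) for a, b in combinations(terms, 2)}


# A made-up group of terms that a thesaurus might list together.
group = ["car", "automobile", "vehicle", "wheel"]
labels = label_group(group)
for pair, label in labels.items():
    print(pair, label if label is not None else "unlabeled")
```

Pairs that the second resource does not cover stay unlabeled (None) in this sketch; those are the cases where another source, such as a large corpus, would be needed.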