CSci4152 Student Presentations

CSci4152: Statistical Natural Language Processing

Student Presentations

Comparison of algorithms for keyphrase extraction
Atreya Basu

In my project I compare Microsoft Word's AutoSummarize feature, the NRC's Extractor algorithm, Teranet Software's Metabot program, the KEA program from the University of Waikato, and an algorithm of my own to see how they perform on three measures: precision, recall, and F-measure. The algorithms are tested on the text of two engineering books, and their output is marked by the author of the books.
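As a rough illustration of the three measures, the sketch below scores a list of extracted keyphrases against a reference list. The function name precision_recall_f, the exact-match comparison after lowercasing, and the sample phrases are all assumptions made for this example; they are not taken from the project or from any of the tools named above.

```python
def precision_recall_f(extracted, reference):
    """Score extracted keyphrases against a reference list.

    Assumption: a phrase counts as correct only on an exact match
    after lowercasing and trimming whitespace.
    """
    extracted_set = {p.strip().lower() for p in extracted}
    reference_set = {p.strip().lower() for p in reference}
    matched = len(extracted_set & reference_set)

    # Precision: fraction of extracted phrases that are correct.
    precision = matched / len(extracted_set) if extracted_set else 0.0
    # Recall: fraction of reference phrases that were found.
    recall = matched / len(reference_set) if reference_set else 0.0
    # F-measure: harmonic mean of precision and recall.
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical data, invented purely for illustration.
extracted = ["stress analysis", "beam", "load", "finite element"]
reference = ["stress analysis", "finite element", "fatigue"]
p, r, f = precision_recall_f(extracted, reference)
print(f"precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

Exact matching is the strictest choice; a real comparison might stem the phrases first so that near-identical forms count as the same keyphrase.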

Discovering Rules For The Use Of Locative Prepositions
Matthew Hogg

My project will attempt to discover instances of locative prepositions (on, at, in) and the words they reference. I will restrict myself to the preposition "on" in specific cases. Once a table of occurrences of "on" has been built, I will use similarity measures described in the book to classify these occurrences. One possible application of this knowledge is the development of a less naive translation system for text involving locative prepositions.
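The abstract does not say which similarity measures are meant. As one possibility, the sketch below compares occurrences of "on" by the cosine of their context-word count vectors. The window size, the helper names context_vector and cosine, and the sample sentences are assumptions made for this example only.

```python
import math
from collections import Counter


def context_vector(tokens, index, window=2):
    """Count the words within `window` positions of tokens[index]."""
    lo = max(0, index - window)
    hi = min(len(tokens), index + window + 1)
    return Counter(tokens[lo:index] + tokens[index + 1:hi])


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sentences, invented purely for illustration.
s1 = "the book is on the table".split()
s2 = "the cup is on the table".split()
s3 = "she arrived on monday morning".split()

v1 = context_vector(s1, s1.index("on"))
v2 = context_vector(s2, s2.index("on"))
v3 = context_vector(s3, s3.index("on"))

print(cosine(v1, v2))  # two locative uses share most of their context
print(cosine(v1, v3))  # a temporal use shares none of it
```

Occurrences whose context vectors are close could then be grouped into the same class of "on".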

Application of Non-Linear Auto-Associative Neural Networks to Discovering Word Similarity
Chris Maxwell

The objective of my project is to explore the effect of a non-linear transfer function on the efficacy of an auto-associative neural network in discovering similarities between words. The original inspiration was that Latent Semantic Indexing, which is equivalent in some ways to a linear autoassociator, can be thought of as squeezing similar words together to achieve its results; hence, this concept will also be mentioned briefly.

Word Sense Disambiguation (and Synonym Retrieval) Using WordNet
Adam Nickerson

The WordNet API is used to implement a dictionary-based word sense disambiguation method (Lesk, 86). The goal of this project is to obtain a faster synonym retrieval time than previous work (MacCara, 99) by directly accessing the WordNet API rather than calling the command-line version. Such dictionary-based methods often yield mediocre results; however, if they can be performed quickly enough, they may make a worthwhile contribution to a combined-method approach (Stevenson & Wilks, 99).
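To make the dictionary-based idea concrete, here is a minimal sketch of a simplified Lesk-style variant: it picks the sense whose definition shares the most words with the surrounding context. It does not call WordNet; the glosses in TOY_GLOSSES, the sense labels, the stopword list, and the function names are all invented for this example and are not the project's implementation.

```python
# Small stopword list, assumed for this example only.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "to", "that",
             "or", "and", "for", "is", "at"}


def content_words(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def simplified_lesk(word, context, sense_glosses):
    """Return the sense whose gloss overlaps most with the context.

    sense_glosses maps a sense label to its definition text.
    Ties go to the sense listed first.
    """
    context_set = content_words(context) - {word}
    best_sense, best_overlap = None, -1
    for sense, gloss in sense_glosses.items():
        overlap = len(context_set & content_words(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


# Hypothetical glosses written for this sketch, not copied from WordNet.
TOY_GLOSSES = {
    "bank/finance": "an institution that accepts deposits and lends money",
    "bank/river": "sloping land beside a body of water such as a river",
}

river_sense = simplified_lesk(
    "bank", "he sat on the bank of the river and watched the water",
    TOY_GLOSSES)
finance_sense = simplified_lesk(
    "bank", "she deposits money at the bank", TOY_GLOSSES)
print(river_sense, finance_sense)
```

Because the method is only a few set intersections per sense, its cost is dominated by looking up the definitions, which is where direct API access would be expected to help.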