Partial Word Sense Tagger #1: Dictionary Definitions
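Lesk's (1986) idea: choose the sense whose dictionary definition shares the most words with the ambiguous word’s context. Problem: too much computation when every word in a sentence is disambiguated at once, because the number of sense combinations to compare grows multiplicatively.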
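The overlap idea can be sketched for a single ambiguous word as follows. This is a minimal illustration, not Lesk's original program: the `SENSES` inventory, the sense labels, and the `tokenize` helper are hypothetical stand-ins for a real machine-readable dictionary, and no stop-word filtering is applied.

```python
def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    return {w.strip(".,;:!?()\"'").lower() for w in text.split()} - {""}

# Hypothetical toy sense inventory; a real system would read dictionary entries.
SENSES = {
    "bank/finance": "an institution that accepts deposits of money and makes loans",
    "bank/river": "the sloping land beside a river or stream",
}

def lesk(context, senses):
    """Return the sense whose definition shares the most words with the context."""
    context_words = tokenize(context)
    return max(senses, key=lambda s: len(tokenize(senses[s]) & context_words))

print(lesk("she sat on the grass beside the river", SENSES))
```

Scoring one word against a fixed context is cheap; the cost noted above arises when the senses of all words in a sentence are chosen jointly.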
Cowie et al.'s (1992) improvement: use simulated annealing, a hill-climbing style of optimization with a stochastic element governed by the system’s “temperature”, which lets the search move away from local minima. Results: 47% accuracy at the sense level; 72% accuracy at the homograph level. Problem: longer definitions are favored.
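A rough sketch of how simulated annealing can search over joint sense assignments is given below. It is an illustration under stated assumptions rather than Cowie et al.'s actual system: the scoring function (pairwise definition overlap), the cooling schedule, the parameter defaults, and the toy `SENSES` data are all hypothetical. The sketch maximizes overlap, which is equivalent to minimizing its negative as an energy.

```python
import math
import random

def tokenize(text):
    return set(text.lower().split())

def total_overlap(assignment, senses):
    """Score a full assignment: words shared between each pair of chosen definitions."""
    defs = [tokenize(senses[word][idx]) for word, idx in assignment.items()]
    return sum(len(a & b) for i, a in enumerate(defs) for b in defs[i + 1:])

def anneal(senses, steps=500, temperature=1.0, cooling=0.99, seed=0):
    """Search sense assignments; returns the best assignment seen and its score."""
    rng = random.Random(seed)
    words = sorted(senses)
    current = {w: 0 for w in words}  # start from the first listed sense of each word
    score = total_overlap(current, senses)
    best, best_score = dict(current), score
    for _ in range(steps):
        # Propose changing the sense of one randomly chosen word.
        candidate = dict(current)
        word = rng.choice(words)
        candidate[word] = rng.randrange(len(senses[word]))
        delta = total_overlap(candidate, senses) - score
        # Always accept non-worsening moves; accept a worse move with
        # probability exp(delta / T), which shrinks as the temperature cools.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, score = candidate, score + delta
            if score > best_score:
                best, best_score = dict(current), score
        temperature *= cooling
    return best, best_score

# Hypothetical toy data: two senses per word, listed in arbitrary order.
SENSES = {
    "bank": ["an institution that keeps money", "sloping land beside a river"],
    "deposit": ["sediment left by a river", "money placed in an institution"],
}
assignment, overlap = anneal(SENSES)
print(assignment, overlap)
```

Accepting occasional worse moves early on is what distinguishes this from plain hill-climbing: here the river/river pairing is a local optimum that a purely greedy search could get stuck in.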
The author’s improvements: 1) the contribution of each overlapping word is normalized by the number of words in the definition; 2) several candidate senses are returned, each with a confidence level, and then combined. Results: 65% accuracy.
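The effect of the length normalization can be illustrated with a small sketch. The details here are assumptions made for the example, not the author's implementation: overlap is counted over distinct words, the confidence is taken to be each sense's share of the total normalized score, and the `SENSES` data is hypothetical. How the ranked choices are later combined is not shown.

```python
def tokenize(text):
    return text.lower().split()

def ranked_senses(context, senses):
    """Rank senses by length-normalized overlap; return (sense, confidence) pairs."""
    context_words = set(tokenize(context))
    scores = {}
    for sense, definition in senses.items():
        words = tokenize(definition)
        overlap = len(set(words) & context_words)
        # Each overlapping word contributes 1 / (number of words in the definition).
        scores[sense] = overlap / len(words)
    total = sum(scores.values())
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    # Assumed confidence measure: share of the total normalized score.
    return [(s, sc / total if total else 0.0) for s, sc in ranked]

# Hypothetical toy data: the long definition picks up three incidental matches.
SENSES = {
    "bank/finance": "the place to which people bring money for the safety of it",
    "bank/river": "a river edge",
}
for sense, confidence in ranked_senses("the boat drifted to the bank of the river", SENSES):
    print(sense, round(confidence, 3))
```

With raw counts the long definition would win on three incidental matches ("the", "to", "of") against one; after normalization the short definition ranks first, while the alternative is still returned with its own confidence.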