Supervised Disambiguation: Bayesian Classification I
Idea (Gale et al., 1992): look at the words around an ambiguous word in a large context window. Each content word contributes potentially useful information about which sense of the ambiguous word is likely to be used with it. The classifier does no feature selection; instead, it combines the evidence from all features.
Bayes decision rule: Decide s’ if P(s’|C) > P(sk|C) for all sk ≠ s’.
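P(sk|C) is computed by Bayes’ Rule: P(sk|C) = P(C|sk) P(sk) / P(C). Since P(C) is the same for every sense, it can be ignored when comparing senses.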
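
The decision rule can be illustrated with a minimal sketch. It assumes that the context words are conditionally independent given the sense (the naive Bayes assumption) and uses add-one smoothing; neither choice is stated on this slide. The function names `train` and `disambiguate` and the toy training data for the word "bank" are hypothetical, made up for illustration only.

```python
import math
from collections import Counter, defaultdict

def train(labeled_contexts):
    """Count senses and context words from (sense, context_words) pairs."""
    sense_counts = Counter()            # how often each sense occurs
    word_counts = defaultdict(Counter)  # word frequencies per sense
    vocab = set()
    for sense, words in labeled_contexts:
        sense_counts[sense] += 1
        word_counts[sense].update(words)
        vocab.update(words)
    return sense_counts, word_counts, vocab

def disambiguate(context, sense_counts, word_counts, vocab):
    """Return the sense s' that maximizes log P(s) + sum of log P(w|s).

    P(C) is omitted because it is identical for all senses.
    Assumption: words are independent given the sense (naive Bayes),
    with add-one smoothing for unseen (word, sense) pairs.
    """
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, n in sense_counts.items():
        score = math.log(n / total)  # log prior P(sk)
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in context:
            score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

# Hypothetical toy corpus: contexts of the ambiguous word "bank".
TRAINING = [
    ("finance", ["money", "loan", "deposit"]),
    ("finance", ["interest", "loan", "account"]),
    ("river", ["water", "shore", "fishing"]),
    ("river", ["water", "mud", "boat"]),
]

model = train(TRAINING)
print(disambiguate(["loan", "money"], *model))
print(disambiguate(["water", "boat"], *model))
```

Working in log space avoids numerical underflow when the context window is large, and every context word contributes to the score, matching the "no feature selection" idea above.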