Questions Related to this Task and Tackled in the Paper

Should we design a general method that applies to a large vocabulary, or a method that applies only to a small trial selection of words?

Given that different people assign different senses to words in context (some people make finer, more inter-subjective sense distinctions than others), how can we reliably evaluate our design?

Word sense disambiguation seems to draw upon a number of apparently different information sources. How can we combine these sources?
