Likelihood Ratios I: Within a single corpus (Dunning, 1993)
Likelihood ratios are more appropriate for sparse data than the chi-square test. In addition, a likelihood ratio is easier to interpret than the chi-square statistic: it says how much more likely one hypothesis is than the other.
In applying the likelihood ratio test to collocation discovery, we examine the following two alternative explanations for the occurrence frequency of a bigram w1 w2:
- The occurrence of w2 is independent of the previous occurrence of w1
- The occurrence of w2 is dependent on the previous occurrence of w1
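
The two hypotheses above can be compared numerically. The following is a minimal sketch, assuming the usual binomial formulation of the test: under independence P(w2 | w1) = P(w2 | not w1) = c2/N, while under dependence the two probabilities are estimated separately as c12/c1 and (c2 - c12)/(N - c1). The function names and the count arguments (c1, c2, c12, n) are illustrative choices, not notation taken from these notes.

```python
import math


def log_binom_likelihood(k, n, x):
    """Log of x^k * (1 - x)^(n - k), treating 0 * log(0) as 0.

    The binomial coefficient is omitted because it cancels in the ratio.
    """
    def xlogy(a, b):
        return 0.0 if a == 0 else a * math.log(b)

    return xlogy(k, x) + xlogy(n - k, 1.0 - x)


def llr_bigram(c1, c2, c12, n):
    """Return -2 log(lambda) for the bigram w1 w2 (hypothetical helper).

    c1  = count of w1, c2 = count of w2,
    c12 = count of the bigram w1 w2, n = total number of tokens.
    """
    if not (0 <= c12 <= min(c1, c2) and 0 < c1 < n and c2 <= n):
        raise ValueError("inconsistent counts")

    p = c2 / n                    # independence: one shared probability
    p1 = c12 / c1                 # dependence: P(w2 | w1)
    p2 = (c2 - c12) / (n - c1)    # dependence: P(w2 | not w1)

    log_lambda = (
        log_binom_likelihood(c12, c1, p)
        + log_binom_likelihood(c2 - c12, n - c1, p)
        - log_binom_likelihood(c12, c1, p1)
        - log_binom_likelihood(c2 - c12, n - c1, p2)
    )
    return -2.0 * log_lambda


if __name__ == "__main__":
    # Made-up counts for illustration only.
    print(llr_bigram(c1=10, c2=20, c12=2, n=100))   # w2 equally likely after w1
    print(llr_bigram(c1=10, c2=20, c12=10, n=100))  # w2 always follows w1
```

Larger values favour the dependence hypothesis. The quantity -2 log(lambda) is asymptotically chi-square distributed with one degree of freedom, so it can be compared against the usual critical values.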