Statistical Estimators V: Robust Techniques: Held Out Estimation
For each n-gram w1,..,wn, we compute C1(w1,..,wn) and C2(w1,..,wn), the frequencies of w1,..,wn in the training and held-out data, respectively.
Let Nr be the number of n-grams with frequency r in the training text.
Let Tr be the total number of held-out occurrences of all n-grams that appeared r times in the training text.
An estimate for the probability of one of these n-grams is: Pho(w1,..,wn) = Tr/(Nr N), where C1(w1,..,wn) = r.
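
The computation above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name held_out_estimates and the toy data are hypothetical, and N is assumed here to be the number of n-gram tokens in the held-out data, since the slide does not define it. Only r >= 1 is handled; the r = 0 case would need the count of possible n-grams never seen in training.

```python
from collections import Counter

def held_out_estimates(train_ngrams, heldout_ngrams):
    """Return {r: Pho} for every training frequency r >= 1 (hypothetical helper)."""
    train_ngrams = list(train_ngrams)
    heldout_ngrams = list(heldout_ngrams)

    c1 = Counter(train_ngrams)    # C1: frequency in the training data
    c2 = Counter(heldout_ngrams)  # C2: frequency in the held-out data

    # Nr: number of distinct n-grams with training frequency r
    n_r = Counter(c1.values())

    # Tr: total held-out count of all n-grams with training frequency r
    t_r = Counter()
    for ngram, r in c1.items():
        t_r[r] += c2[ngram]       # a Counter returns 0 for unseen n-grams

    # Assumption: N is the number of n-gram tokens in the held-out data
    n = len(heldout_ngrams)

    return {r: t_r[r] / (n_r[r] * n) for r in n_r}

# Toy bigram data (made up for illustration)
train = [("a", "b"), ("a", "b"), ("b", "c"), ("c", "d")]
heldout = [("a", "b"), ("b", "c"), ("b", "c"), ("x", "y")]
estimates = held_out_estimates(train, heldout)
```

Here estimates[r] is the probability assigned to each individual n-gram whose training frequency is r, so every n-gram in the same frequency class receives the same estimate.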