Combining Estimators II: Simple Linear Interpolation
One way of addressing data sparseness in a trigram model is to mix that model with bigram and unigram models, which suffer less from it.
This can be done by linear interpolation (also called finite mixture models). When the functions being interpolated all use a subset of the conditioning information of the most discriminating function, this method is referred to as deleted interpolation.
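$$P_{li}(w_n \mid w_{n-2}, w_{n-1}) = \lambda_1 P_1(w_n) + \lambda_2 P_2(w_n \mid w_{n-1}) + \lambda_3 P_3(w_n \mid w_{n-1}, w_{n-2})$$
where $0 \le \lambda_i \le 1$ and $\sum_i \lambda_i = 1$.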
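The interpolation formula above can be sketched in a few lines of Python. This is only an illustration: the function name `interpolate`, its parameters, and the numeric values are assumptions made for the example and are not taken from any particular toolkit or corpus.

```python
def interpolate(p_uni, p_bi, p_tri, lambdas):
    """Linearly interpolate unigram, bigram and trigram estimates for one word.

    p_uni, p_bi, p_tri: component probabilities P1, P2, P3 for the same word.
    lambdas: three mixture weights, each in [0, 1], summing to 1.
    """
    if len(lambdas) != 3:
        raise ValueError("expected exactly three weights")
    if any(l < 0 or l > 1 for l in lambdas) or abs(sum(lambdas) - 1.0) > 1e-9:
        raise ValueError("weights must lie in [0, 1] and sum to 1")
    l1, l2, l3 = lambdas
    return l1 * p_uni + l2 * p_bi + l3 * p_tri


# Made-up component estimates: the trigram was never seen (probability 0),
# but the interpolated estimate is still non-zero thanks to the
# lower-order models.
p = interpolate(p_uni=0.01, p_bi=0.1, p_tri=0.0, lambdas=(0.2, 0.3, 0.5))
```

Because the weights are non-negative and sum to 1, the result is again a valid probability distribution over words whenever each component model is one.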
The weights can be set automatically using the Expectation-Maximization (EM) algorithm.