Statistical Estimators IV: Smoothing Techniques: Lidstone and Jeffreys-Perks
Since the add-one process may be adding too much, we can add a smaller value λ.
P_Lid(w1,..,wn) = (C(w1,..,wn) + λ)/(N + Bλ), where C(w1,..,wn) is the frequency of n-gram w1,..,wn, N is the number of training instances, B is the number of bins training instances are divided into, and λ > 0. ==> Lidstone’s Law
If λ = 1/2, Lidstone’s Law corresponds to the expectation of the likelihood and is called the Expected Likelihood Estimation (ELE) or the Jeffreys-Perks Law.
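As a rough illustration of the formula, the sketch below applies Lidstone's Law to a toy unigram sample. The function name `lidstone_prob`, the toy corpus, and the four-word vocabulary are assumptions made for this example only; for real n-grams, B would be the number of possible n-grams rather than the vocabulary size. Exact fractions are used so the arithmetic is easy to check by hand.

```python
from collections import Counter
from fractions import Fraction


def lidstone_prob(count, total, bins, lam=Fraction(1, 2)):
    """Return (C + lam) / (N + B * lam).

    count: C, the observed frequency of the event
    total: N, the number of training instances
    bins:  B, the number of bins (possible events)
    lam:   the added value; the default 1/2 gives ELE (Jeffreys-Perks)
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    return (count + lam) / (total + bins * lam)


# Hypothetical toy data: six training tokens, four possible words (bins).
tokens = "a b a c a b".split()
vocab = ["a", "b", "c", "d"]  # "d" never occurs in the sample

counts = Counter(tokens)
N = len(tokens)
B = len(vocab)

# ELE estimate for every bin, including the unseen one.
probs = {w: lidstone_prob(counts[w], N, B) for w in vocab}

for w in vocab:
    print(w, probs[w])
```

With λ = 1/2 the denominator is 6 + 4(1/2) = 8, so the unseen word "d" receives (0 + 1/2)/8 = 1/16, and the four estimates sum to 1. Passing `lam=Fraction(1)` instead reproduces the add-one estimate.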