Statistical Estimators III: Smoothing Techniques: Laplace
P_Lap(w1,...,wn) = (C(w1,...,wn) + 1) / (N + B), where C(w1,...,wn) is the frequency of the n-gram w1,...,wn, N is the number of training instances, and B is the number of bins the training instances are divided into. ==> Adding-One Process
The idea is to give a little bit of the probability space to unseen events.
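As a rough illustration, the formula above can be sketched for bigrams as follows. The toy corpus and the helper name `laplace_prob` are assumptions made for this example, and B is taken to be V^2 (every possible bigram over a vocabulary of size V), which is one common choice of bins rather than the only one.

```python
from collections import Counter


def laplace_prob(ngram, counts, n_total, n_bins):
    """Laplace (add-one) estimate: (C(ngram) + 1) / (N + B)."""
    return (counts.get(ngram, 0) + 1) / (n_total + n_bins)


# Toy corpus, invented for illustration only.
tokens = "the cat sat on the mat the cat ran".split()
bigrams = list(zip(tokens, tokens[1:]))
counts = Counter(bigrams)

N = len(bigrams)      # number of training instances (bigram tokens)
V = len(set(tokens))  # vocabulary size
B = V ** 2            # assumed bins: all possible bigrams over the vocabulary

p_seen = laplace_prob(("the", "cat"), counts, N, B)    # bigram seen twice
p_unseen = laplace_prob(("cat", "mat"), counts, N, B)  # bigram never seen

# Summing the estimate over every bin gives 1, so it is a proper distribution.
vocab = sorted(set(tokens))
total = sum(laplace_prob((a, b), counts, N, B) for a in vocab for b in vocab)
```

Here the unseen bigram gets a small non-zero probability, and the seen bigram gives up some of its relative-frequency mass to pay for it.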
However, in NLP applications, where the data are very sparse, Laplace’s Law actually gives far too much of the probability space to unseen events.
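To see why, note that each unseen bin receives 1/(N + B), so the unseen events together receive (B - S)/(N + B), where S is the number of distinct n-grams actually observed. The sketch below evaluates this for a bigram model; the function name `unseen_mass` and all the numbers are hypothetical, chosen only to show the order of magnitude, not taken from any real corpus.

```python
def unseen_mass(n_total, n_bins, n_seen_types):
    """Total probability Laplace's Law assigns to bins with zero count."""
    return (n_bins - n_seen_types) / (n_total + n_bins)


# Hypothetical figures for illustration only (not measured data).
V = 10_000        # vocabulary size
N = 1_000_000     # bigram tokens in the training text
S = 200_000       # distinct bigrams observed
B = V ** 2        # assumed bins: all possible bigrams

mass = unseen_mass(N, B, S)
print(f"share of probability given to unseen bigrams: {mass:.3f}")
```

Under these assumed figures, B is far larger than N, so nearly all of the probability space goes to bigrams that never occurred in training.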