Statistical Estimators VI: Related Approach: Good-Turing Estimator
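If C(w_1,...,w_n) = r > 0, P_GT(w_1,...,w_n) = r*/N, where r* = (r+1) S(r+1) / S(r), S(r) is a smoothed estimate of the expectation of N_r (the number of n-grams that occur exactly r times), and N is the total number of n-gram tokens observed.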
If C(w_1,...,w_n) = 0, P_GT(w_1,...,w_n) ≈ N_1/(N_0 N), where N_0 is the number of n-grams that were never observed.
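The two cases above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the raw frequency-of-frequency counts as S(r) (no smoothing), and the function name `good_turing`, the parameter `n_unseen` (standing in for N_0), and the toy counts are hypothetical, not taken from the slides.

```python
from collections import Counter

def good_turing(counts, n_unseen):
    """Unsmoothed Good-Turing estimates (assumes S(r) = N_r).

    counts   : dict mapping each observed n-gram to its count r
    n_unseen : assumed number of unseen n-grams (N_0)
    Returns (probabilities of seen n-grams, probability of one unseen n-gram).
    """
    N = sum(counts.values())                  # total observed tokens
    freq_of_freq = Counter(counts.values())   # N_r: how many n-grams have count r

    def adjusted(r):
        # r* = (r+1) * S(r+1) / S(r), with S(r) = N_r here
        return (r + 1) * freq_of_freq.get(r + 1, 0) / freq_of_freq[r]

    probs = {ngram: adjusted(r) / N for ngram, r in counts.items()}
    p_unseen = freq_of_freq.get(1, 0) / (n_unseen * N)   # N_1 / (N_0 * N)
    return probs, p_unseen

# Toy data: N_1 = 3, N_2 = 2, N_3 = 1, so N = 10.
toy_counts = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 3}
probs, p_unseen = good_turing(toy_counts, n_unseen=6)
```

On this toy data the seen n-grams receive total mass 0.7 and the unseen ones N_1/N = 0.3. Note that the most frequent n-gram gets r* = 0 because N_4 = 0, which is exactly why a smoothed S(r) is needed in place of the raw N_r.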
Simple Good-Turing [Gale & Sampson, 1995]: As a smoothing curve, use N_r = a r^b (with b < -1) and estimate a and b by simple linear regression on the logarithmic form of this equation: log N_r = log a + b log r. Use the fitted curve when r is large; for low values of r, use the measured N_r directly.
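The log-log regression described above might look like the following minimal sketch. It follows only what the slide states: an ordinary least-squares fit of log N_r against log r, with measured N_r used for small r. The published Gale & Sampson procedure includes further steps (such as averaging the N_r before fitting and choosing the switch point automatically) that are not shown here. The names `fit_power_law`, `make_smoother`, `adjusted_count` and the fixed threshold `switch_r` are assumptions for illustration.

```python
import math

def fit_power_law(freq_of_freq):
    """Least-squares fit of log N_r = log a + b log r; returns (a, b)."""
    xs = [math.log(r) for r in freq_of_freq]
    ys = [math.log(n_r) for n_r in freq_of_freq.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # slope of the regression line
    a = math.exp(mean_y - b * mean_x)    # intercept is log a
    return a, b

def make_smoother(freq_of_freq, switch_r=5):
    """Build S(r): measured N_r below switch_r (hypothetical cutoff), fitted curve otherwise."""
    a, b = fit_power_law(freq_of_freq)

    def S(r):
        if r < switch_r and r in freq_of_freq:
            return freq_of_freq[r]       # low r: trust the measured N_r
        return a * r ** b                # high r (or missing N_r): use a * r^b

    return S

def adjusted_count(r, S):
    """r* = (r+1) * S(r+1) / S(r)."""
    return (r + 1) * S(r + 1) / S(r)

# Toy frequency-of-frequency table lying exactly on N_r = 1000 * r^(-2).
toy_nr = {1: 1000.0, 2: 250.0, 5: 40.0, 10: 10.0}
S = make_smoother(toy_nr, switch_r=3)
r_star_10 = adjusted_count(10, S)
```

On the fitted part of the curve, r* = (r+1) ((r+1)/r)^b = r (1 + 1/r)^(b+1), so the condition b < -1 is what guarantees r* < r, i.e. that every seen count is discounted.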