The Entropy of English
We can model English using n-gram models (also known as Markov chains).
These models assume limited memory, i.e., we assume that the next word depends only on the previous k words [kth-order Markov approximation].
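The limited-memory assumption can be illustrated with a minimal Python sketch. It estimates P(next word | previous k words) from raw counts. The function names `train_markov` and `next_word_prob`, the toy corpus, and the use of unsmoothed maximum-likelihood estimates are all assumptions made for illustration, not part of the original notes. Note that a context of k words corresponds to an n-gram model with n = k + 1.

```python
from collections import Counter, defaultdict


def train_markov(words, k):
    """Count how often each word follows each k-word context."""
    counts = defaultdict(Counter)
    for i in range(k, len(words)):
        context = tuple(words[i - k:i])  # the previous k words
        counts[context][words[i]] += 1
    return counts


def next_word_prob(counts, context, word):
    """Maximum-likelihood estimate of P(word | context).

    Returns 0.0 for contexts never seen in training (no smoothing).
    """
    followers = counts.get(tuple(context))
    if not followers:
        return 0.0
    return followers[word] / sum(followers.values())


# Hypothetical toy corpus, used only to exercise the functions.
corpus = "the cat sat on the mat and the cat ran".split()

first_order = train_markov(corpus, k=1)   # bigram model
second_order = train_markov(corpus, k=2)  # trigram model

p_bigram = next_word_prob(first_order, ["the"], "cat")
p_trigram = next_word_prob(second_order, ["the", "cat"], "sat")
```

In this toy corpus "the" is followed by "cat" twice and "mat" once, so the first-order estimate of P(cat | the) is 2/3. A real model would need far more text and some form of smoothing for unseen contexts.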
What is the Entropy of English?