Training a PCFG

Restriction: We assume that the set of rules is given in advance; training only tries to find the optimal probabilities to assign to those rules.

As with HMMs, we use an EM training algorithm, called the Inside-Outside Algorithm, which allows us to estimate the parameters of a PCFG from unannotated sentences of the language.
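To make the procedure concrete, here is a minimal sketch of one Inside-Outside iteration. It is not a reference implementation: it assumes the grammar is in Chomsky normal form, that the start symbol is "S", and that rules are stored in plain dictionaries. The function names (inside, outside, em_step, log_likelihood) and the toy grammar and corpus are hypothetical, chosen only for illustration.

```python
import math
from collections import defaultdict

def inside(words, binary, lexical):
    """beta[i, j, A] = probability that A derives words[i:j]."""
    n = len(words)
    beta = defaultdict(float)
    for i, w in enumerate(words):
        for (A, word), p in lexical.items():
            if word == w:
                beta[i, i + 1, A] += p
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (A, B, C), p in binary.items():
                    beta[i, j, A] += p * beta[i, k, B] * beta[k, j, C]
    return beta

def outside(words, binary, beta, start="S"):
    """alpha[i, j, A] = probability of everything outside words[i:j], with A spanning it."""
    n = len(words)
    alpha = defaultdict(float)
    alpha[0, n, start] = 1.0
    for span in range(n, 1, -1):  # widest spans first, so parents are complete
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (A, B, C), p in binary.items():
                    a = alpha[i, j, A]
                    alpha[i, k, B] += p * a * beta[k, j, C]
                    alpha[k, j, C] += p * a * beta[i, k, B]
    return alpha

def em_step(corpus, binary, lexical, start="S"):
    """One EM iteration: expected rule counts (E-step), then renormalization (M-step)."""
    bin_count, lex_count = defaultdict(float), defaultdict(float)
    for words in corpus:
        n = len(words)
        beta = inside(words, binary, lexical)
        alpha = outside(words, binary, beta, start)
        z = beta[0, n, start]  # probability of the whole sentence
        if z == 0.0:
            continue  # sentence not covered by the grammar: skip it
        for i, w in enumerate(words):
            for (A, word), p in lexical.items():
                if word == w:
                    lex_count[A, word] += alpha[i, i + 1, A] * p / z
        for span in range(2, n + 1):
            for i in range(n - span + 1):
                j = i + span
                for k in range(i + 1, j):
                    for (A, B, C), p in binary.items():
                        bin_count[A, B, C] += (
                            alpha[i, j, A] * p * beta[i, k, B] * beta[k, j, C] / z
                        )
    totals = defaultdict(float)  # expected number of uses of each left-hand side
    for counts in (bin_count, lex_count):
        for rule, c in counts.items():
            totals[rule[0]] += c

    def renorm(rules, counts):
        # Keep the old probability if the left-hand side was never used.
        return {r: counts[r] / totals[r[0]] if totals[r[0]] > 0 else p
                for r, p in rules.items()}

    return renorm(binary, bin_count), renorm(lexical, lex_count)

def log_likelihood(corpus, binary, lexical, start="S"):
    """Log-likelihood of the corpus; every sentence must be parseable."""
    return sum(math.log(inside(s, binary, lexical)[0, len(s), start]) for s in corpus)

# Hypothetical toy grammar: the rule set is fixed, only the numbers get re-estimated.
BINARY = {("S", "NP", "VP"): 1.0, ("VP", "V", "NP"): 0.5, ("VP", "VP", "PP"): 0.5,
          ("NP", "NP", "PP"): 0.3, ("PP", "P", "NP"): 1.0}
LEXICAL = {("NP", "she"): 0.3, ("NP", "stars"): 0.2, ("NP", "telescopes"): 0.2,
           ("V", "saw"): 1.0, ("P", "with"): 1.0}
CORPUS = ["she saw stars with telescopes".split(), "she saw stars".split()]
```

Calling em_step(CORPUS, BINARY, LEXICAL) repeatedly, feeding each result back in, gives the usual EM loop. The corpus likelihood should never decrease from one iteration to the next, although EM may stop at a local maximum.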

Basic Assumption: a good grammar is one that makes the sentences in the training corpus likely to occur ==> we seek the grammar (here, the rule probabilities) that maximizes the likelihood of the training data.
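Written out, the objective could look as follows. The notation is an assumption made here for illustration: C is the training corpus, G_theta is the grammar with rule probabilities theta, and T(s) is the set of parse trees of sentence s.

```latex
\hat{\theta}
  = \arg\max_{\theta} \prod_{s \in C} P(s \mid G_{\theta})
  = \arg\max_{\theta} \sum_{s \in C} \log P(s \mid G_{\theta}),
\qquad
P(s \mid G_{\theta}) = \sum_{t \in T(s)} \; \prod_{(A \to \gamma) \in t} \theta_{A \to \gamma},
\qquad
\text{subject to } \sum_{\gamma} \theta_{A \to \gamma} = 1 \text{ for every nonterminal } A .
```

The product over rules in t counts a rule once for each time it is used in the tree. Because the sentences are unannotated, the trees t are hidden variables, which is why an EM procedure is used instead of simply counting rules.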
