Length-Based Methods I: General Approach
Goal: Find the alignment A with the highest probability given the two parallel texts S and T: argmax_A P(A | S, T) = argmax_A P(A, S, T). The two maximizations agree because P(S, T) does not depend on A.
To estimate these probabilities, the aligned text is decomposed into a sequence of aligned beads B_1, ..., B_K, where each bead is assumed to be independent of the others. Then P(A, S, T) ≈ ∏_{k=1}^{K} P(B_k).
The question, then, is how to estimate the probability of a certain type of alignment bead given the sentences in that bead.
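The decomposition into independent beads can be sketched in a few lines of Python. This is a minimal illustration, not a full length-based aligner: the bead probabilities in BEAD_PROB are made-up placeholder values, the function names are hypothetical, and each bead is scored by its type alone (number of source sentences, number of target sentences), ignoring the sentence lengths that an actual length-based method would use.

```python
import math

# Hypothetical probability for each bead type, written as
# (number of source sentences, number of target sentences).
# These values are illustrative placeholders, not corpus estimates.
BEAD_PROB = {
    (1, 1): 0.90,
    (2, 1): 0.04,
    (1, 2): 0.04,
    (1, 0): 0.01,
    (0, 1): 0.01,
}

def alignment_log_prob(beads, bead_prob=BEAD_PROB):
    """Return log P(A, S, T) under the independence assumption,
    i.e. the sum over beads of log P(B_k)."""
    return sum(math.log(bead_prob[bead]) for bead in beads)

def best_alignment(candidates, bead_prob=BEAD_PROB):
    """Return the candidate alignment with the highest log-probability."""
    return max(candidates, key=lambda beads: alignment_log_prob(beads, bead_prob))

# Three candidate alignments of 3 source sentences with 3 target sentences.
candidates = [
    [(1, 1), (1, 1), (1, 1)],
    [(2, 1), (1, 2)],
    [(1, 0), (1, 1), (1, 1), (0, 1)],
]
best = best_alignment(candidates)
```

Working in log space turns the product over beads into a sum, which avoids numerical underflow on long texts. Enumerating candidates explicitly is only feasible for toy inputs; in practice the best bead sequence would be found by dynamic programming.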