Things that can be done with Text Corpora III: Collocations
A collocation is any turn of phrase or accepted usage where somehow the whole is perceived as having an existence beyond the sum of its parts (e.g., disk drive, make up, bacon and eggs).
Collocations are important for machine translation, because they often cannot be translated word for word.
Collocations can be extracted from a text, for example by taking its most frequent bigrams. However, the most frequent bigrams are often uninformative pairs of function words (e.g., “at the”, “of a”), so these need to be filtered out.
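
The extract-then-filter idea above can be sketched in a few lines of Python. This is only an illustration, not a reference method: the stopword list, the sample text, and the function names (tokenize, top_bigrams) are all made up for the example, and filtering by stopwords is just one simple way to discard insignificant bigrams.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch);
# a realistic list would be much longer.
STOPWORDS = {"a", "an", "and", "at", "in", "of", "on", "the", "to", "was"}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_bigrams(text, n=5, stopwords=None):
    """Return the n most frequent adjacent word pairs with their counts.

    If stopwords is given, any pair containing a stopword is dropped.
    Sentence boundaries are ignored, so pairs may span two sentences.
    """
    tokens = tokenize(text)
    counts = Counter(zip(tokens, tokens[1:]))
    if stopwords is not None:
        counts = Counter(
            {pair: c for pair, c in counts.items() if not (set(pair) & stopwords)}
        )
    return counts.most_common(n)


# Made-up sample text, for illustration only.
sample = (
    "The disk drive failed at the start of a backup. "
    "At the office, the disk drive was replaced, and the new disk drive "
    "worked at the first attempt."
)

raw = top_bigrams(sample, n=3)
filtered = top_bigrams(sample, n=3, stopwords=STOPWORDS)
print("unfiltered:", raw)
print("filtered:  ", filtered)
```

On this sample, the unfiltered list still contains ("at", "the"), while the filtered list is led by ("disk", "drive"). Raw frequency plus a stopword filter is a crude heuristic; it says nothing about whether a pair occurs more often than chance would predict.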