Evaluating word alignment quality is difficult because, for many words, the correspondence to words in the other language is not straightforward. This is especially true of function words and of words that are part of idiomatic expressions or other phrasal constructions.

To better understand the word alignment problem, parallel corpora have been annotated with word alignments for language pairs such as German–English, French–English, and Romanian–English. These have been the basis for competitions on word alignment (Mihalcea and Pedersen, 2003; Martin et al., 2005).
The relationship between alignment quality and machine translation performance is under some discussion (Langlais et al., 1998; Fraser and Marcu, 2007). Vilar et al. (2006) point to mismatches between alignment error rate and machine translation performance. New measures have been proposed to overcome the weakness of alignment error rate (Carl and Fissaha, 2003). Giving less weight to alignment points that connect multiple aligned words improves the correlation with machine translation performance (Davis et al., 2007). Lopez and Resnik (2006) show the impact of word alignment quality on phrase-based models. Ayan and Dorr (2006) compare alignment quality and machine translation performance and also stress the interaction with the phrase extraction method. Although it is computationally very expensive, word alignment may also be optimized directly on machine translation performance (Lambert et al., 2007).



