Discriminative Word Alignment

Viewed from a machine learning perspective, word alignment is an interesting structured prediction problem, with the distinctive property of having small amounts of supervised and large amounts of unsupervised data.

Discriminative Word Alignment is the main subject of 22 publications.


Statistical machine translation systems achieve better quality with manually labeled word alignments (Callison-Burch et al., 2004), but such data does not exist in large quantities. Discriminative word alignment methods therefore typically gather statistics over a large unlabeled corpus, which may have been aligned with a baseline method such as the IBM models; these statistics form the basis for features whose weights are learned on a much smaller labeled corpus. Fraser and Marcu (2007) extend their generative model, which allows many-to-many alignments, with a discriminative optimization step that uses small amounts of labeled data.
Discriminative approaches may use the perceptron algorithm (Moore, 2005; Moore et al., 2006), maximum entropy models (Ittycheriah and Roukos, 2005), neural networks (Ayan et al., 2005), max-margin methods (Taskar et al., 2005), boosting (Wu and Wang, 2005; Wu et al., 2006), support vector machines (Cherry and Lin, 2006), conditional random fields (Blunsom and Cohn, 2006; Niehues and Vogel, 2008) or MIRA (Venkatapathy and Joshi, 2007).
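To make the perceptron-style approach concrete, here is a minimal sketch of a structured-perceptron aligner in the spirit of Moore (2005). The feature templates, the greedy one-link-per-source-word decoder, and the toy data in the usage note are illustrative assumptions, not details of any cited system.

```python
def features(src_word, tgt_word, i, j):
    """Indicator features for one candidate link (illustrative templates)."""
    return {f"pair={src_word}|{tgt_word}": 1.0,
            f"dist={abs(i - j)}": 1.0}       # distortion bucket

def score(weights, feats):
    return sum(weights.get(k, 0.0) * v for k, v in feats.items())

def decode(weights, src, tgt):
    """Greedy decoding: link each source word to its best target position."""
    return [max(range(len(tgt)),
                key=lambda j: score(weights, features(s, tgt[j], i, j)))
            for i, s in enumerate(src)]

def perceptron_train(data, epochs=5):
    """data: list of (src_words, tgt_words, gold_alignment) triples,
    where gold_alignment[i] is the target position linked to source i."""
    weights = {}
    for _ in range(epochs):
        for src, tgt, gold in data:
            pred = decode(weights, src, tgt)
            for i in range(len(src)):
                if pred[i] != gold[i]:
                    # reward the gold link, penalize the predicted link
                    for k, v in features(src[i], tgt[gold[i]], i, gold[i]).items():
                        weights[k] = weights.get(k, 0.0) + v
                    for k, v in features(src[i], tgt[pred[i]], i, pred[i]).items():
                        weights[k] = weights.get(k, 0.0) - v
    return weights
```

For example, training on the single toy pair `(["das", "haus"], ["the", "house"], [0, 1])` is enough for the lexical pair features to dominate, after which decoding recovers the gold alignment.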
Such methods allow the integration of features such as a more flexible fertility model and interactions between consecutive words (Lacoste-Julien et al., 2006). Smaller parallel corpora in particular benefit from more attention to less frequent words (Zhang et al., 2005). Discriminative models also open a path to adding further features, such as the ITG constraint (Chao and Li, 2007).
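Fertility and consecutive-word interactions of the kind described by Lacoste-Julien et al. (2006) can be encoded as global feature counts over a candidate alignment. The sketch below assumes an alignment represented as a set of (source, target) links; the feature names are illustrative, not those of any cited model.

```python
from collections import Counter

def global_features(links, src_len):
    """Global features over a candidate alignment.

    links: set of (i, j) pairs linking source position i to target
    position j. Returns counts of fertility buckets (how many target
    words each source word links to) and of monotone neighbours
    (consecutive source words linked to consecutive target words).
    """
    feats = Counter()
    fert = Counter(i for i, _ in links)
    for i in range(src_len):
        feats[f"fertility={fert[i]}"] += 1
    for i, j in links:
        if (i + 1, j + 1) in links:      # interaction of consecutive words
            feats["monotone-neighbour"] += 1
    return dict(feats)
```

A weight vector over such global features lets the learner prefer, say, low-fertility and monotone alignments without hard-coding either as a constraint.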
Related to the discriminative approach, posterior methods use agreement in the n-best alignments to adjust alignment points (Kumar and Byrne, 2002).
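A simple version of this idea, sketched in the style of Kumar and Byrne (2002) rather than as their exact model: estimate a posterior for each link from the weights of the n-best alignments, then keep the links whose posterior clears a threshold.

```python
from collections import Counter

def link_posteriors(nbest):
    """nbest: list of (links, weight) pairs, where links is a set of
    (i, j) alignment points. Returns the posterior of each link,
    i.e. the normalized total weight of the alignments containing it."""
    total = sum(weight for _, weight in nbest)
    post = Counter()
    for links, weight in nbest:
        for link in links:
            post[link] += weight / total
    return post

def adjust_alignment(nbest, threshold=0.5):
    """Keep only links on which the n-best alignments sufficiently agree."""
    return {link for link, p in link_posteriors(nbest).items()
            if p >= threshold}
```

With two hypotheses that agree on one link and disagree on another, only the links whose posterior reaches the threshold survive the adjustment.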



New Publications

  • Tomeh et al. (2010)
  • Liu et al. (2010)
  • Liu et al. (2009)
  • Setiawan et al. (2010)
  • Dyer et al. (2011)