Pre-Reordering

Since reordering is a hard problem, there have been efforts to handle it in a separate stage prior to translation, so that the main translation model can focus on the lexical aspects.

Publications

Reordering in pre-processing by a hand-crafted component has been explored for German–English (Collins et al., 2005), Japanese–English (Komachi et al., 2006), Chinese–English (Wang et al., 2007), and English–Hindi (Ramanathan et al., 2008). Zwarts and Dras (2007) point out that translation improvements are due to both a reduction of reordering needed during decoding and the increased learning of phrases of syntactic dependents. Nguyen and Shimazu (2006) also use manual rules for syntactic transformation in a preprocessing step. Such a reordering component may also be learned automatically from parsed training data, as shown for French–English (Xia and McCord, 2004), Arabic–English (Habash, 2007), and Chinese–English (Crego and Mariño, 2007); the latter work encodes different orderings in an input lattice to the decoder. Li et al. (2007) propose a maximum entropy pre-reordering model based on syntactic parse trees in the source language. It may be beneficial to train different pre-reordering models for different sentence types, such as questions (Zhang et al., 2008). Preprocessing the input to a machine translation system may also include splitting it up into smaller sentences (Lee et al., 2008).
Reordering patterns may also be learned over part-of-speech tags, allowing the input to be converted into a reordering graph (Crego and Mariño, 2006) or enabling a rescoring approach with the patterns as features (Chen et al., 2006). The reordering rules may also be integrated into an otherwise monotone decoder (Tillmann, 2008), or they may be used in a separate reordering model. Such rules may be based on automatic word classes (Costa-jussà and Fonollosa, 2006; Crego et al., 2006), which were shown to outperform part-of-speech tags (Costa-jussà and Fonollosa, 2007), or on syntactic chunks (Zhang et al., 2007; Zhang et al., 2007b; Crego and Habash, 2008). Scoring for rule applications may be encoded in the reordering graph, or done once the target word order is established, which allows for rewarding reorderings that result from phrase-internal reordering (Elming, 2008; Elming, 2008b).
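
As a rough illustration of reordering patterns over part-of-speech tags, the following Python sketch rewrites a tagged source sentence before it is handed to the translation system. It is not the method of any of the cited papers: the function name apply_rules, the rule format (a tag pattern plus a permutation of its positions), the tag set, and the example rule are all assumptions made for this example. A real system would learn such rules from aligned training data and would typically keep several alternative orderings, for instance in a reordering graph, rather than commit to a single one.

```python
from typing import List, Sequence, Tuple

Token = Tuple[str, str]                         # (word, part-of-speech tag)
Rule = Tuple[Tuple[str, ...], Tuple[int, ...]]  # (tag pattern, permutation of its positions)


def apply_rules(tokens: Sequence[Token], rules: Sequence[Rule]) -> List[Token]:
    """Scan left to right and apply the first rule whose tag pattern matches.

    Hypothetical helper for illustration only: it produces one deterministic
    reordering and does not score or rank alternative orderings.
    """
    out: List[Token] = []
    i = 0
    while i < len(tokens):
        for pattern, perm in rules:
            window = tokens[i:i + len(pattern)]
            if tuple(tag for _, tag in window) == pattern:
                # Emit the matched tokens in the order given by the permutation.
                out.extend(window[j] for j in perm)
                i += len(pattern)
                break
        else:
            # No rule matched at this position: copy the token unchanged.
            out.append(tokens[i])
            i += 1
    return out


# Assumed example rule: move a post-nominal adjective in front of its noun,
# so that the source order is closer to English.
rules = [(("NOUN", "ADJ"), (1, 0))]
tagged = [("la", "DET"), ("casa", "NOUN"), ("blanca", "ADJ")]
reordered = apply_rules(tagged, rules)
print(" ".join(word for word, _ in reordered))  # prints "la blanca casa"
```

The reordered source sentence would then be translated with little or no further reordering in the decoder, which is the division of labor that pre-reordering aims for.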

New Publications

  • Nguyen and Shimazu (2006)
  • Holmqvist et al. (2009)
  • Li et al. (2009)
  • Xu et al. (2009)
  • Elming and Habash (2009)
  • Genzel (2010)
  • Khalilov and Sima'an (2010)
  • Goh et al. (2011)
  • Bisazza and Federico (2010)
  • Isozaki et al. (2010)
  • Badr et al. (2009)
  • Jiang et al. (2010)
  • Katz-Brown et al. (2011)
  • Howlett and Dras (2011)
  • Andreas et al. (2011)
