Monolingual Data

Monolingual data is much more plentiful than parallel data and has proven valuable for informing models of fluency and the representation of words.

Monolingual Data is the main subject of 20 publications. 11 are discussed here.



Sennrich et al. (2016) back-translate the monolingual data into the input language and use the resulting synthetic parallel corpus as additional training data. Hoang et al. (2018) show that the quality of the machine translation system used for back-translation matters and that it can be improved by iterative back-translation. Burlot and Yvon (2018) also show that back-translation quality matters and carry out additional analysis. Edunov et al. (2018) observe gains when generating the back-translation data with sampling (Monte Carlo search), i.e., randomly selecting word translations based on the predicted probability distribution, instead of greedy search or beam search. Currey et al. (2017) show that in low-resource conditions simply copying target-side data to the source side also generates beneficial training data. Fadaee and Monz (2018) see gains with synthetic data generated by forward-translation (also called self-training). They also report gains when subsampling back-translation data to favor words that are rare or difficult to generate (words with high loss during training).
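The data construction described above can be illustrated with a minimal sketch. It is not the implementation of any of the cited papers: the word-by-word `REVERSE_MODEL` table, its probabilities, and the function names are hypothetical stand-ins for a trained target-to-source translation model. The sketch only shows how target-side monolingual sentences become synthetic sentence pairs, and how greedy selection differs from sampling from the predicted distribution.

```python
import random

# Hypothetical toy target-to-source "model": for each target word, a
# distribution over source words. A real system would use a trained
# neural translation model instead of this lookup table.
REVERSE_MODEL = {
    "Haus": (["house", "home", "building"], [0.6, 0.3, 0.1]),
    "kleines": (["small", "little"], [0.7, 0.3]),
}

def back_translate(target_sentence, rng=None):
    """Translate word by word: greedy if rng is None, otherwise sample."""
    source_words = []
    for word in target_sentence.split():
        candidates, probs = REVERSE_MODEL[word]
        if rng is None:
            # Greedy: always take the most probable source word.
            best = max(range(len(probs)), key=probs.__getitem__)
            source_words.append(candidates[best])
        else:
            # Sampling: draw a source word according to its probability.
            source_words.append(rng.choices(candidates, weights=probs, k=1)[0])
    return " ".join(source_words)

def synthetic_corpus(monolingual_target, rng=None):
    """Pair each real target sentence with a synthetic source sentence."""
    return [(back_translate(t, rng), t) for t in monolingual_target]

monolingual = ["kleines Haus", "Haus"]

greedy_pairs = synthetic_corpus(monolingual)
sampled_pairs = synthetic_corpus(monolingual * 500, random.Random(0))

# Copying baseline: the target sentence is reused as its own source.
copied_pairs = [(t, t) for t in monolingual]
```

Greedy selection always yields the same synthetic source for a given target sentence, whereas sampling produces varied source sides for the same target. In every case the target side of each pair remains genuine monolingual text.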

Dual Learning

He et al. (2016) use monolingual data in a dual learning setup. Machine translation engines are trained in both directions, and in addition to regular model training from parallel data, monolingual data is translated in a round trip (e to f to e). The intermediate translation is evaluated with a language model for language f, and the reconstruction is evaluated by how well it matches the original e; both scores serve as the cost function that drives gradient descent updates to the model. Tu et al. (2017) augment the translation model with a reconstruction step. The generated output is translated back into the input language, and the training objective is extended to include not only the likelihood of the target sentence but also the likelihood of the reconstructed input sentence. Niu et al. (2018) simultaneously train a model in both translation directions (with the identity of the source language indicated by a marker token). Niu et al. (2019) extend this work to round-trip translation training on monolingual data, allowing the forward translation and the reconstruction step to operate on the same model. They use Gumbel softmax to make the round trip differentiable.
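As a rough illustration of the Gumbel softmax mentioned above, the following sketch draws a relaxed (soft) one-hot vector from a word distribution. It uses only the standard formulation of the trick (add Gumbel noise to the log-probabilities, divide by a temperature, apply softmax) and is not taken from the cited paper. The function name, the toy distribution, and the temperature values are assumptions made for illustration. A real training setup would use an automatic differentiation framework so that gradients can flow through the relaxed sample.

```python
import math
import random

def gumbel_softmax(log_probs, temperature, rng):
    """Return a relaxed one-hot vector sampled from the given distribution."""
    noisy = []
    for lp in log_probs:
        u = max(rng.random(), 1e-12)       # guard against log(0)
        gumbel = -math.log(-math.log(u))   # standard Gumbel(0, 1) noise
        noisy.append((lp + gumbel) / temperature)
    peak = max(noisy)                      # subtract the max for stability
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical predicted distribution over a three-word vocabulary.
log_probs = [math.log(p) for p in (0.6, 0.3, 0.1)]

# The same seed gives the same noise, so only the temperature differs.
soft = gumbel_softmax(log_probs, temperature=1.0, rng=random.Random(0))
sharp = gumbel_softmax(log_probs, temperature=0.1, rng=random.Random(0))
```

With identical noise, lowering the temperature pushes the output toward a one-hot vector (a discrete word choice), while higher temperatures give a smoother vector. The output is a continuous function of the log-probabilities, which is what allows a round trip through an intermediate translation to be trained with gradient descent.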



New Publications

  • Domingo and Casacuberta (2018)
  • Marie and Fujita (2019)
  • Wu et al. (2019)
  • Wang et al. (2017)
  • Shen et al. (2017)
  • Ramesh and Sankaranarayanan (2018)
  • Imamura et al. (2018)
  • Prabhumoye et al. (2018)
  • Domhan and Hieber (2017)