Monolingual Data

Monolingual data is far more plentiful than parallel data and has proven valuable both for modeling fluency and for learning word representations.

Monolingual Data is the main subject of 6 publications.

Publications

Sennrich et al. (2016) back-translate target-language monolingual data into the input language and use the resulting synthetic parallel corpus as additional training data. Hoang et al. (2018) show that the quality of the back-translating machine translation system matters and can be improved by iterative back-translation. Edunov et al. (2018) observe gains when back-translating with sampling search instead of greedy search or beam search.
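The back-translation idea above can be sketched as follows. This is a minimal illustration, not the authors' implementation; `reverse_model` is a hypothetical stand-in for a trained target-to-source translation system.

```python
def reverse_model(target_sentence):
    """Hypothetical target->source translator (stub for illustration).

    A real system would be a trained NMT model decoding with beam,
    greedy, or -- per Edunov et al. (2018) -- sampling search.
    """
    return "<back-translated> " + target_sentence

def back_translate(monolingual_target):
    """Pair each monolingual target sentence with a synthetic source."""
    synthetic_parallel = []
    for tgt in monolingual_target:
        src = reverse_model(tgt)               # synthetic source side
        synthetic_parallel.append((src, tgt))  # (source, target) pair
    return synthetic_parallel

corpus = back_translate(["das Haus ist klein", "die Katze schläft"])
```

The synthetic pairs are then mixed with the genuine parallel data for another round of source-to-target training; the target side of each pair is real text, so the model still learns from fluent output.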
Xia et al. (2016) use monolingual data in a dual learning setup. Machine translation engines are trained in both directions; in addition to regular model training on parallel data, monolingual data is translated in a round trip (e to f to e). A language model for language f scores the intermediate translation, and the reconstruction match back to e serves as the cost function that drives gradient descent updates to both models.
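The round-trip reward in dual learning can be sketched as below. All model functions here are toy stubs introduced purely for illustration; in the actual method they are neural models whose parameters receive policy-gradient updates from the combined reward.

```python
def translate_e_to_f(sentence_e):
    """Stub forward model e -> f (toy: reverse the string)."""
    return sentence_e[::-1]

def translate_f_to_e(sentence_f):
    """Stub backward model f -> e (toy inverse of the above)."""
    return sentence_f[::-1]

def language_model_f(sentence_f):
    """Stub language-model score for f (higher = more fluent).

    A toy proxy for illustration only; a real setup uses a trained LM.
    """
    return 1.0 / (1.0 + abs(len(sentence_f) - 20))

def round_trip_reward(sentence_e, alpha=0.5):
    """Combine LM fluency of the f translation with the
    reconstruction match back to e, as in the dual learning setup."""
    f = translate_e_to_f(sentence_e)
    e_reconstructed = translate_f_to_e(f)
    lm_reward = language_model_f(f)
    reconstruction_reward = 1.0 if e_reconstructed == sentence_e else 0.0
    # This scalar reward drives gradient updates to both models.
    return alpha * lm_reward + (1 - alpha) * reconstruction_reward
```

Because the stub models are exact inverses, the reconstruction reward is always 1.0 here; with real models it measures how much of the original sentence survives the round trip.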


New Publications

  • Domhan and Hieber (2017)
  • Currey et al. (2017)
