Monolingual Data

Monolingual data is much more plentiful than parallel data and has proven valuable for informing models of fluency and the representation of words.

Monolingual Data is the main subject of 33 publications. 22 are discussed here.


Publications

Backtranslation

Sennrich et al. (2016) back-translate the monolingual data into the input language and use the obtained synthetic parallel corpus as additional training data. Hoang et al. (2018) show that the quality of the machine translation system used for backtranslation matters and can be improved by iterative backtranslation. Burlot and Yvon (2018) also show that backtranslation quality matters and carry out additional analysis. Edunov et al. (2018) show better results with Monte Carlo search to generate the backtranslation data, i.e., randomly selecting word translations based on the predicted probability distribution. Imamura et al. (2018) and Imamura and Sumita (2018) also confirm that better translation quality can be obtained when backtranslating with such sampling and offer some refinements. Caswell et al. (2019) argue that the noise introduced by this type of stochastic search signals to the model that the data is backtranslated, something that can also be accomplished with an explicit special token, to the same effect.
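The data flow of backtranslation can be illustrated with a minimal sketch. It is not the implementation of any of the cited papers: the toy lookup table `REVERSE_MODEL` stands in for a trained target-to-source translation model, and the marker `<BT>` and all function names are assumptions made for illustration. The `sample` flag contrasts picking the most probable translation with sampling from the predicted distribution, and the `tag` flag prepends an explicit special token to the synthetic source side.

```python
import random

# Hypothetical toy target-to-source "model": for each target word, a
# probability distribution over source words. A real system would use a
# trained translation model in the reverse direction.
REVERSE_MODEL = {
    "das": {"the": 0.9, "that": 0.1},
    "haus": {"house": 0.8, "home": 0.2},
    "klein": {"small": 0.7, "little": 0.3},
}

BT_TAG = "<BT>"  # assumed marker token for synthetic source sentences


def back_translate(target_tokens, sample=False, rng=None):
    """Produce a synthetic source sentence for one target-language sentence."""
    if rng is None:
        rng = random.Random(0)
    source = []
    for word in target_tokens:
        dist = REVERSE_MODEL.get(word, {word: 1.0})  # copy unknown words
        if sample:
            # Sampling: draw a translation according to its probability.
            choices, weights = zip(*dist.items())
            source.append(rng.choices(choices, weights=weights, k=1)[0])
        else:
            # Greedy: always take the most probable translation.
            source.append(max(dist, key=dist.get))
    return source


def build_synthetic_corpus(monolingual_target, sample=False, tag=False, seed=0):
    """Pair each target sentence with its (optionally tagged) backtranslation."""
    rng = random.Random(seed)
    corpus = []
    for target in monolingual_target:
        source = back_translate(target, sample=sample, rng=rng)
        if tag:
            source = [BT_TAG] + source
        corpus.append((source, target))
    return corpus
```

The resulting (synthetic source, real target) pairs would then be added to the real parallel data when training the forward model.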

Currey et al. (2017) show that in low-resource conditions simple copying of target-side data to the source side also generates beneficial training data. Fadaee and Monz (2018) see gains with synthetic data generated by forward-translation (also called self-training). They also report gains when subsampling backtranslation data to favor rare or difficult-to-generate words (words with high loss during training).
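The copying idea is simple enough to sketch in a few lines. The function names below are hypothetical and the sketch shows only the construction of the training data, not any details of the cited experiments: each target-side sentence is paired with an identical copy as its source, and the copied pairs are concatenated with the real parallel data.

```python
def copy_target_to_source(monolingual_target):
    """Pair each target sentence with an identical copy as its source side."""
    return [(list(sentence), list(sentence)) for sentence in monolingual_target]


def mix_training_data(parallel, monolingual_target):
    """Concatenate real parallel pairs with the copied pairs."""
    return list(parallel) + copy_target_to_source(monolingual_target)
```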

Dual Learning

He et al. (2016) use monolingual data in a dual learning setup. Machine translation engines are trained in both directions, and in addition to regular model training from parallel data, monolingual data is translated in a round trip (e to f to e). The round trip is scored both with a language model for language f and by how well the original sentence in e is reconstructed, and these scores serve as the cost function that drives gradient descent updates to the model. Tu et al. (2017) augment the translation model with a reconstruction step. The generated output is translated back into the input language, and the training objective is extended to include not only the likelihood of the target sentence but also the likelihood of the reconstructed input sentence. Niu et al. (2018) simultaneously train a model in both translation directions (with the identity of the source language indicated by a marker token). Niu et al. (2019) extend this work to round-trip translation training on monolingual data, allowing the forward translation and the reconstruction step to operate on the same model. They use Gumbel softmax to make the round trip differentiable.
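The Gumbel softmax mentioned above replaces a discrete word choice with a continuous relaxation: Gumbel noise is added to each log-probability and a temperature-scaled softmax is applied, so the output is a probability vector rather than a single word index. The following is a minimal pure-Python sketch of that general technique, not the code of Niu et al. (2019); the function name and default values are assumptions. In an actual system the same computation would run inside an automatic differentiation framework so that gradients can flow through the round trip.

```python
import math
import random


def gumbel_softmax(log_probs, temperature=1.0, rng=None):
    """Return a relaxed one-hot sample over the given log-probabilities."""
    if rng is None:
        rng = random.Random(0)
    perturbed = []
    for lp in log_probs:
        u = max(rng.random(), 1e-12)       # avoid log(0)
        gumbel = -math.log(-math.log(u))   # sample from Gumbel(0, 1)
        perturbed.append((lp + gumbel) / temperature)
    # Numerically stable softmax over the perturbed scores.
    top = max(perturbed)
    exps = [math.exp(x - top) for x in perturbed]
    total = sum(exps)
    return [e / total for e in exps]
```

Lower temperatures push the output closer to a one-hot vector, while higher temperatures make it smoother.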

Unsupervised Machine Translation

The idea of backtranslation is also crucial for the ambitious goal of unsupervised machine translation, i.e., the training of machine translation systems with monolingual data only. These methods typically start with multilingual word embeddings, which may also be induced from monolingual data. Given such a word translation model, Lample et al. (2018) propose to translate sentences in one language with a simple word-by-word translation model into another language, using a shared encoder and decoder for both languages involved. They define three different objectives in their setup: the ability to reconstruct a source sentence from its intermediate representation, even with added noise (randomly dropping words); the ability to reconstruct a source sentence from its translation into the target language; and an adversarial component that attempts to classify the identity of the language from the intermediate representation of a sentence in either language. Artetxe et al. (2018) use a similar setup, with a shared encoder and language-specific decoders, relying on the idea of a denoising auto-encoder (just like the first objective above) and the ability to reconstruct the source sentence from a translation into the target language. Sun et al. (2019) note that during training of the neural machine translation model, the bilingual word embedding deteriorates. They add the training objective for the induction of the bilingual word embeddings into the objective function of neural machine translation training. Yang et al. (2018) use language-specific encoders with some shared weights in a similar setup.

Artetxe et al. (2018) show better results when inducing phrase translations from phrase embeddings and using them in a statistical phrase-based machine translation model, which includes an explicit language model. They refine their model with synthetic data generated by iterative backtranslation. Lample et al. (2018) combine unsupervised statistical and neural machine translation models. Their phrase-based model is initialized with word translations obtained from multilingual word embeddings and then iteratively refined into phrase translations. Ren et al. (2019) more closely tie together training of unsupervised statistical and neural machine translation systems by using the statistical machine translation model as a regularizer for the neural model training. Artetxe et al. (2019) improve their unsupervised statistical machine translation model with a feature that favors similarly spelled translations and an unsupervised method to tune the weights for the statistical components.

Circling back to bilingual lexicon induction, Artetxe et al. (2019) use such an unsupervised machine translation model to synthesize a parallel corpus by translating monolingual data, process it with word alignment methods, and extract a bilingual dictionary using maximum likelihood estimation.
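The denoising objective described above needs training pairs in which the input is a corrupted version of the sentence and the output is the original. A minimal sketch of the word-dropping noise follows. The function names, the default drop probability, and the rule of keeping at least one word are assumptions for illustration and are not taken from the cited papers.

```python
import random


def drop_words(tokens, drop_prob=0.1, rng=None):
    """Randomly drop words, keeping the order of the words that remain."""
    if rng is None:
        rng = random.Random(0)
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= drop_prob]
    # Assumed safeguard: never return an empty sentence.
    return kept if kept else [rng.choice(tokens)]


def denoising_pairs(sentences, drop_prob=0.1, seed=0):
    """Build (noisy input, original output) pairs for auto-encoder training."""
    rng = random.Random(seed)
    return [(drop_words(s, drop_prob, rng), list(s)) for s in sentences]
```

Each pair asks the model to reconstruct the original sentence from its noisy version, using monolingual data only.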

Benchmarks

Discussion

New Publications

Unsupervised machine translation

Guillaume Lample and Alexis Conneau (2019): Cross-lingual Language Model Pretraining, CoRR
@article{DBLP:journals/corr/abs-1901-07291,
author = {Guillaume Lample and Alexis Conneau},
title = {Cross-lingual Language Model Pretraining},
journal = {CoRR},
volume = {abs/1901.07291},
url = {http://arxiv.org/abs/1901.07291},
archiveprefix = {arXiv},
eprint = {1901.07291},
timestamp = {Fri, 01 Feb 2019 13:39:59 +0100},
biburl = {https://dblp.org/rec/bib/journals/corr/abs-1901-07291},
bibsource = {dblp computer science bibliography, https://dblp.org},
year = 2019
}
Lample and Conneau (2019)

Backtranslation

Graça, Miguel and Kim, Yunsu and Schamper, Julian and Khadivi, Shahram and Ney, Hermann (2019): Generalizing Back-Translation in Neural Machine Translation, Proceedings of the Fourth Conference on Machine Translation
@InProceedings{graa-EtAl:2019:WMT,
author = {Gra{\c{c}}a, Miguel and Kim, Yunsu and Schamper, Julian and Khadivi, Shahram and Ney, Hermann},
title = {Generalizing Back-Translation in Neural Machine Translation},
booktitle = {Proceedings of the Fourth Conference on Machine Translation},
month = {August},
address = {Florence, Italy},
publisher = {Association for Computational Linguistics},
pages = {45--52},
url = {http://www.aclweb.org/anthology/W19-5205},
year = 2019
}
Graça et al. (2019)
Miguel Domingo and Francisco Casacuberta (2018): A Machine Translation Approach for Modernizing Historical Documents Using Back Translation, Proceedings of the International Workshop on Spoken Language Translation (IWSLT)
@inproceedings{iwslt18-Historical-Domingo,
author = {Miguel Domingo and Francisco Casacuberta},
title = {A Machine Translation Approach for Modernizing Historical Documents Using Back Translation},
booktitle = {Proceedings of the International Workshop on Spoken Language Translation (IWSLT)},
year = 2018
}
Domingo and Casacuberta (2018)
Prabhumoye, Shrimai and Tsvetkov, Yulia and Salakhutdinov, Ruslan and Black, Alan W (2018): Style Transfer Through Back-Translation, Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
@InProceedings{P18-1080,
author = {Prabhumoye, Shrimai and Tsvetkov, Yulia and Salakhutdinov, Ruslan and Black, Alan W},
title = {Style Transfer Through Back-Translation},
booktitle = {Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
publisher = {Association for Computational Linguistics},
pages = {866--876},
location = {Melbourne, Australia},
url = {http://aclweb.org/anthology/P18-1080},
year = 2018
}
Prabhumoye et al. (2018)

Other

Luo, Jiaming and Cao, Yuan and Barzilay, Regina (2019): Neural Decipherment via Minimum-Cost Flow: From Ugaritic to Linear B, Proceedings of the 57th Conference of the Association for Computational Linguistics
@inproceedings{luo-etal-2019-neural,
author = {Luo, Jiaming and Cao, Yuan and Barzilay, Regina},
title = {Neural Decipherment via Minimum-Cost Flow: From Ugaritic to Linear B},
booktitle = {Proceedings of the 57th Conference of the Association for Computational Linguistics},
month = {jul},
address = {Florence, Italy},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/P19-1303},
pages = {3146--3155},
year = 2019
}
Luo et al. (2019)
Pourdamghani, Nima and Aldarrab, Nada and Ghazvininejad, Marjan and Knight, Kevin and May, Jonathan (2019): Translating Translationese: A Two-Step Approach to Unsupervised Machine Translation, Proceedings of the 57th Conference of the Association for Computational Linguistics
@inproceedings{pourdamghani-etal-2019-translating,
author = {Pourdamghani, Nima and Aldarrab, Nada and Ghazvininejad, Marjan and Knight, Kevin and May, Jonathan},
title = {Translating Translationese: A Two-Step Approach to Unsupervised Machine Translation},
booktitle = {Proceedings of the 57th Conference of the Association for Computational Linguistics},
month = {jul},
address = {Florence, Italy},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/P19-1293},
pages = {3057--3062},
year = 2019
}
Pourdamghani et al. (2019)
Xia, Mengzhou and Kong, Xiang and Anastasopoulos, Antonios and Neubig, Graham (2019): Generalized Data Augmentation for Low-Resource Translation, Proceedings of the 57th Conference of the Association for Computational Linguistics
@inproceedings{xia-etal-2019-generalized,
author = {Xia, Mengzhou and Kong, Xiang and Anastasopoulos, Antonios and Neubig, Graham},
title = {Generalized Data Augmentation for Low-Resource Translation},
booktitle = {Proceedings of the 57th Conference of the Association for Computational Linguistics},
month = {jul},
address = {Florence, Italy},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/P19-1579},
pages = {5786--5796},
year = 2019
}
Xia et al. (2019)
Marie, Benjamin and Fujita, Atsushi (2019): Unsupervised Extraction of Partial Translations for Neural Machine Translation, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
@inproceedings{marie-fujita-2019-unsupervised,
author = {Marie, Benjamin and Fujita, Atsushi},
title = {Unsupervised Extraction of Partial Translations for Neural Machine Translation},
booktitle = {Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)},
month = {jun},
address = {Minneapolis, Minnesota},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/N19-1384},
pages = {3834--3844},
year = 2019
}
Marie and Fujita (2019)
Wu, Jiawei and Wang, Xin and Wang, William Yang (2019): Extract and Edit: An Alternative to Back-Translation for Unsupervised Neural Machine Translation, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
@inproceedings{wu-etal-2019-extract,
author = {Wu, Jiawei and Wang, Xin and Wang, William Yang},
title = {Extract and Edit: An Alternative to Back-Translation for Unsupervised Neural Machine Translation},
booktitle = {Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)},
month = {jun},
address = {Minneapolis, Minnesota},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/N19-1120},
pages = {1173--1183},
year = 2019
}
Wu et al. (2019)
Wang, Yining and Zhao, Yang and Zhang, Jiajun and Zong, Chengqing and Xue, Zhengshan (2017): Towards Neural Machine Translation with Partially Aligned Corpora, Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
@InProceedings{wang-EtAl:2017:I17-11,
author = {Wang, Yining and Zhao, Yang and Zhang, Jiajun and Zong, Chengqing and Xue, Zhengshan},
title = {Towards Neural Machine Translation with Partially Aligned Corpora},
booktitle = {Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)},
month = {November},
address = {Taipei, Taiwan},
publisher = {Asian Federation of Natural Language Processing},
pages = {384--393},
url = {http://www.aclweb.org/anthology/I17-1039},
year = 2017
}
Wang et al. (2017)
Shen, Tianxiao and Lei, Tao and Barzilay, Regina and Jaakkola, Tommi (2017): Style Transfer from Non-Parallel Text by Cross-Alignment, Advances in Neural Information Processing Systems 30
@incollection{NIPS2017-7259,
author = {Shen, Tianxiao and Lei, Tao and Barzilay, Regina and Jaakkola, Tommi},
title = {Style Transfer from Non-Parallel Text by Cross-Alignment},
booktitle = {Advances in Neural Information Processing Systems 30},
editor = {I. Guyon and U. V. Luxburg and S. Bengio and H. Wallach and R. Fergus and S. Vishwanathan and R. Garnett},
pages = {6830--6841},
publisher = {Curran Associates, Inc.},
url = {http://papers.nips.cc/paper/7259-style-transfer-from-non-parallel-text-by-cross-alignment.pdf},
year = 2017
}
Shen et al. (2017)

MT Research Survey Wiki

A Comprehensive Survey of Neural and Statistical Machine Translation Research Publications
