Alternative Architectures

While the attentional sequence-to-sequence model is currently the dominant architecture for neural machine translation, other architectures have been explored.

Alternative architectures are the main subject of 44 publications, of which 14 are discussed here.


Kalchbrenner and Blunsom (2013) build a machine translation model by first encoding the source sentence with a convolutional neural network, and then generating the target sentence by reversing the process. A refinement of this was proposed by Gehring et al. (2017), who use multiple convolutional layers in both the encoder and the decoder that do not reduce the length of the encoded sequence but incorporate wider context with each layer.
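
The key property of such a stacked convolutional encoder is that each layer preserves the sequence length while widening the receptive field. The following is a minimal NumPy sketch of one length-preserving layer, not the actual implementation of Gehring et al. (2017); the filter shape and the tanh nonlinearity are illustrative assumptions.

```python
import numpy as np

def conv_layer(X, W):
    """One length-preserving 1D convolution over a token sequence.

    X: (seq_len, d) token embeddings; W: (k, d, d) filter with odd width k.
    Zero-padding by (k-1)//2 on each side keeps the output the same length
    as the input, so layers can be stacked, each widening the context window.
    """
    k, d, _ = W.shape
    pad = (k - 1) // 2
    Xp = np.pad(X, ((pad, pad), (0, 0)))           # pad along the time axis only
    out = np.stack([sum(Xp[i + j] @ W[j] for j in range(k))
                    for i in range(len(X))])       # slide the width-k filter
    return np.tanh(out)

# toy usage: six tokens in, six positions out
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4))
W = rng.standard_normal((3, 4, 4)) * 0.1
H = conv_layer(X, W)
print(H.shape)  # (6, 4): same length as the input
```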

Self Attention (Transformer)

Vaswani et al. (2017) replace the recurrent neural networks used in attentional sequence-to-sequence models with multiple self-attention layers (a model called the Transformer), both in the encoder and in the decoder. Chen et al. (2018) compare different configurations of Transformer and recurrent neural networks in the encoder and decoder, report that many of the quality gains are due to a handful of training tricks, and show the best results with a Transformer encoder and an RNN decoder. Emelin et al. (2019) identify a representation bottleneck in the self-attention layers: they must carry through lexical features, which prevents them from focusing on more complex features. To address this, they add shortcut connections from the initial embedding layer to each of the self-attention layers, in both the encoder and the decoder.
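
The core operation of these models is scaled dot-product self-attention, in which every position attends to every other position in the same sequence. A minimal single-head NumPy sketch (the multi-head machinery, masking, and feed-forward sublayers of the full Transformer are omitted):

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention (Vaswani et al., 2017).

    X: (seq_len, d_model) input; Wq, Wk, Wv: (d_model, d_k) projections.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # (seq_len, seq_len) similarities
    return softmax(scores) @ V                # each position mixes all values

# toy usage: 5 tokens, d_model=8, projected down to d_k=4
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))
Wq = rng.standard_normal((8, 4))
Wk = rng.standard_normal((8, 4))
Wv = rng.standard_normal((8, 4))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (5, 4)
```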
Dehghani et al. (2019) propose a variant, called the Universal Transformer, that does not use a fixed stack of distinct processing layers but instead loops an arbitrary number of times through a single shared processing layer.
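
The structural difference is easy to sketch: instead of a stack of layers with separate parameters, one set of parameters is applied repeatedly. The residual tanh update below is a placeholder for the real attention-plus-transition step, and the fixed loop count stands in for the adaptive halting (ACT) used in the paper.

```python
import numpy as np

def shared_step(H, W):
    # placeholder for one shared "processing layer" (residual update)
    return H + np.tanh(H @ W)

def universal_transformer_encode(X, W, n_steps):
    """Sketch of the Universal Transformer idea (Dehghani et al., 2019):
    the *same* layer (same parameters W) is applied n_steps times,
    rather than stacking n_steps distinct layers."""
    H = X
    for _ in range(n_steps):
        H = shared_step(H, W)
    return H
```

Because the loop reuses one parameter matrix, the depth of processing can be varied at run time without changing the model size.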

Deeper Transformer Models

Naive implementations of deeper Transformer models that simply increase the number of encoder and decoder blocks lead to worse and sometimes catastrophic results. Wu et al. (2019) first train a model with n Transformer blocks, then keep their parameters fixed and add m additional blocks. Bapna et al. (2018) argue that information from earlier encoder layers may be lost, and connect all encoder layers to the attention computation of the decoder. Wang et al. (2019) successfully train deep Transformer models with up to 30 layers by relocating the normalization step to the beginning of each block and by adding residual connections to all previous layers, not just the directly preceding one.
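
The normalization relocation can be illustrated by contrasting the two block orderings. In the original post-norm arrangement, layer normalization sits after the residual addition; in the pre-norm arrangement it moves inside the sublayer branch, leaving an untouched identity path through the whole stack. A minimal NumPy sketch (the sublayer itself is passed in as a function; the real blocks contain attention and feed-forward sublayers):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # normalize each position's feature vector to zero mean, unit variance
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def postnorm_block(x, sublayer):
    # original Transformer ordering: residual add, then normalize
    return layer_norm(x + sublayer(x))

def prenorm_block(x, sublayer):
    # reordered variant: normalize inside the branch, keep the residual
    # path as a pure identity, which is what keeps very deep stacks trainable
    return x + sublayer(layer_norm(x))
```

With pre-norm blocks, the output of a deep stack always contains the input via the chain of identity connections, so gradients reach the lowest layers directly.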

Document Context

Maruf et al. (2018) consider the entire source document as context when translating a sentence: attention is computed over all input sentences, and the sentences are weighted accordingly. Miculicich et al. (2018) extend this work with hierarchical attention, which first computes attention over sentences and then over words; due to its computational cost, this is limited to a window of surrounding sentences. Maruf et al. (2019) also use hierarchical attention, but compute sentence-level attention over the entire document and filter for the most relevant sentences before extending attention to the word level. A gate distinguishes between words in the source sentence and words in the context sentences. Junczys-Dowmunt (2019) translates entire source documents (up to 1,000 words) at a time by concatenating all input sentences, showing significant improvements.
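
The two-level attention idea can be sketched as follows. This is a hypothetical toy illustration in the spirit of these hierarchical models, not any paper's actual architecture: sentences are scored first, then words within each sentence, and the word-level summaries are combined using the sentence weights.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def hierarchical_context(query, sent_vecs, word_vecs):
    """Toy two-level document attention.

    query:     (d,) decoder state
    sent_vecs: (n_sents, d) one summary vector per context sentence
    word_vecs: list of (n_words_i, d) arrays, one per sentence
    """
    sent_scores = softmax(sent_vecs @ query)      # attention over sentences
    ctx = np.zeros_like(query)
    for s, words in zip(sent_scores, word_vecs):
        word_scores = softmax(words @ query)      # attention over words
        ctx += s * (word_scores @ words)          # sentence-weighted summary
    return ctx
```

Restricting `sent_vecs` to a window of surrounding sentences, or pre-filtering it to the highest-scoring sentences, corresponds to the two efficiency strategies described above.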



Related Topics

New Publications


  • Xu et al. (2019)
  • Werlen et al. (2018)


  • Hao et al. (2019) - recurrence
  • Mino et al. (2017) - target attention
  • Zhang et al. (2018) - average attention

Multi-Layer Fusion

  • Wang et al. (2018)

Weakly Recurrent

  • Di Gangi and Federico (2018)

Weight Tying in Embeddings

  • Pappas et al. (2018)
  • Kuang et al. (2018)


  • Gu et al. (2018)
  • Wei et al. (2019)
  • Wang et al. (2018)
  • Libovický and Helcl (2018)

Phrase Model

  • Huang et al. (2018)


  • Kaiser et al. (2018)

Neural Hidden Markov

  • Wang et al. (2018)

Modelling Past and Future

  • Zheng et al. (2018)


  • Bahar et al. (2018)

Gated Memory

  • Cao and Xiong (2018)

Exploiting Deep Representations

  • Dou et al. (2018)


  • Zhang et al. (2018)


  • Jehl and Riezler (2018)
  • Zhang et al. (2018)
  • Voita et al. (2019)
  • Kuang and Xiong (2018)
  • Wang et al. (2018)
  • Tu et al. (2018)
  • Maruf and Haffari (2018)

Sentence-Level Context

  • Wang et al. (2019)


  • Pouget-Abadie et al. (2014)
  • Hill et al. (2014)