Neural Network Models

Neural network models received little attention in machine translation until a surge of research in the 2010s, prompted by their success in vision and speech recognition. Such models allow for clustering of related words and flexible use of context.
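
As a rough illustration of how related words cluster, the sketch below compares word vectors by cosine similarity. The three-dimensional vectors are hand-picked toy values, not learned embeddings, and the names `EMBEDDINGS`, `cosine`, and `nearest` are hypothetical.

```python
import math

# Hand-picked toy vectors (assumption: real embeddings are learned from data
# and typically have hundreds of dimensions).
EMBEDDINGS = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
    "truck": [0.2, 0.1, 0.8],
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(word, embeddings=EMBEDDINGS):
    """Return the other word whose vector is most similar to `word`."""
    query = embeddings[word]
    candidates = (w for w in embeddings if w != word)
    return max(candidates, key=lambda w: cosine(query, embeddings[w]))
```

With these toy values, `nearest("cat")` picks out the other animal and `nearest("car")` the other vehicle, which is the kind of grouping that lets a model share evidence between related words.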

Neural Network Models and its 15 sub-topics are the main subject of 800 publications.

Publications

Basic models for applying neural networks to machine translation were already proposed in the 20th century (Waibel et al., 1991), but were not seriously pursued due to a lack of computational resources. In fact, models quite similar to those currently in use date back to that era (Forcada and Ñeco, 1997; Castaño et al., 1997).
Schwenk et al. (2006) introduce neural language models to machine translation (also called "continuous space language models"), and use them in re-ranking, similar to the earlier work in speech recognition.
The first competitive fully neural machine translation system participated in the WMT evaluation campaign in 2015 (Jean et al., 2015). Neural systems went on to reach state-of-the-art performance at IWSLT 2015 (Luong and Manning, 2015) and WMT 2016 (Sennrich et al., 2016). In 2016, Systran (Crego et al., 2016), Google (Wu et al., 2016), and WIPO (Junczys-Dowmunt et al., 2016) also reported large-scale deployments.
Neubig (2017) presents a hands-on tutorial on neural machine translation models.
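
The re-ranking use of neural language models mentioned above can be sketched as follows. This is a minimal illustration, assuming an n-best list of (hypothesis, base score) pairs and a weighted sum of log-scores; it is not the setup of Schwenk et al. (2006). The function names, the weight, and the toy score table standing in for a neural language model are all hypothetical.

```python
def rerank(nbest, lm_score, lm_weight=0.5):
    """Re-rank (hypothesis, base_score) pairs, best first.

    base_score is the log-score assigned by the base translation system;
    lm_score is a callable returning a language model log-probability.
    """
    def combined(item):
        hypothesis, base_score = item
        return base_score + lm_weight * lm_score(hypothesis)

    return sorted(nbest, key=combined, reverse=True)


# Stand-in for a neural language model (assumption): a fixed table of
# made-up log-probabilities, with a low default for unseen sentences.
TOY_LM = {
    "the house is small": -2.0,
    "the house is little": -6.0,
}


def toy_lm_score(sentence):
    return TOY_LM.get(sentence, -20.0)


nbest = [
    ("the house is little", -4.0),  # preferred by the base system
    ("the house is small", -4.5),
]
best, _ = rerank(nbest, toy_lm_score, lm_weight=0.5)[0]
```

Here the base system prefers the first hypothesis, but the added language model score moves "the house is small" to the top. Setting `lm_weight=0.0` recovers the original ranking.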

Technical Background:

A good introduction to modern neural network research is the textbook Deep Learning (Goodfellow et al., 2016). There is also a book on neural network methods applied to natural language processing in general (Goldberg, 2017).

Toolkits: There are several toolkits that implement various neural translation models.

  • OpenNMT (Klein et al., 2017; Klein et al., 2018) is a more recent popular toolkit based on PyTorch
  • fairseq (Ott et al., 2019) is based on PyTorch and supported by Facebook
  • Sockeye (Hieber et al., 2018) is based on MXNet and supported by Amazon
  • Marian (Junczys-Dowmunt et al., 2018; Junczys-Dowmunt et al., 2018b) is a C++ implementation focused on fast training and decoding
  • XNMT (Neubig et al., 2018) is also a self-contained toolkit with Python and C++ hooks for extension
  • Tensor2Tensor (Vaswani et al., 2018) is the original implementation of the Transformer model by Google
  • Neural Monkey (Helcl et al., 2018) is based on TensorFlow
  • CytonMT (Wang et al., 2018) is an efficient toolkit implemented in C++
  • Nematus (Sennrich et al., 2017) is an early influential toolkit based on Theano
  • Kyoto-NMT (Cromieres, 2016) is an implementation in Chainer
  • SGNMT (Stahlberg et al., 2018) is a decoder that allows the combination of models implemented with different toolkits

New Publications

  • Effendi et al. (2018)
  • Takebayashi et al. (2018)
  • Pinnis et al. (2017)
  • Rikters et al. (2017)
  • Junczys-Dowmunt (2018)
  • Pinnis et al. (2018)
  • Weng et al. (2017)
  • Sperber et al. (2017)
  • Feng et al. (2017)
  • Dahlmann et al. (2017)
  • Wang et al. (2017)
  • Zhang et al. (2017)
  • Stahlberg and Byrne (2017)
  • Devlin (2017)
  • Wang et al. (2017)
  • Stahlberg et al. (2017)
  • Melo (2015)
  • Costa-jussá et al. (2017)
  • Gupta et al. (2015)
  • Müller et al. (2014)
  • Sennrich et al. (2015)
  • Sennrich et al. (2015)
  • Zhao et al. (2015)
  • Heyman et al. (2017)
  • Carvalho and Nguyen (2017)
  • Carpuat et al. (2017)
  • Denkowski and Neubig (2017)
  • Goto and Tanaka (2017)
  • Morishita et al. (2017)
  • Shu and Nakayama (2017)

Overcoming Low Resource

  • Fadaee et al. (2017)
  • Adams et al. (2017)
  • Chen et al. (2017)
  • Zhang and Zong (2016)

System Descriptions (incomplete)

  • Junczys-Dowmunt et al. (2016)
  • Chung et al. (2016)
  • Guasch and Costa-jussà (2016)
  • Sánchez-Cartagena and Toral (2016)
  • Bradbury and Socher (2016)

Other

  • Mallinson et al. (2017)
  • Jakubina and Langlais (2017)
  • Östling and Tiedemann (2017)
  • Yang et al. (2017)
  • Zhang et al. (2017)
  • Marie and Fujita (2017)
  • See et al. (2016)
  • Unresolved citation (key: NIPS2014_5344)
  • Zhang et al. (2016)
  • Pal et al. (2016)
  • Duong et al. (2016)
  • Clark et al. (2014)

Unpublished ArXiv

  • Pezeshki (2015)
  • Williams et al. (2015)
  • Zhang (2015)
  • Wang et al. (2015)
  • Tu et al. (2015)
  • Huang et al. (2015)
  • Unresolved citation (key: Gouws:2014:unpublished)
