WMT17 NMT Training Task

WMT17 Neural MT Training Task

NMT Training Task Important Dates

Release of NMT system and training data	February 15, 2017
Model submission deadline	May 1, 2017 (prior to news task test set release)
Start of manual evaluation	May 15, 2017
End of manual evaluation (provisional)	June 4, 2017
Paper submission, notification, camera-ready	same as the rest of the conference
Conference in Copenhagen	September 7-8, 2017

NMT Training Task Overview

Training a single configuration of a neural MT system takes a few weeks. The training criteria currently in use are rather basic and have not been systematically evaluated, and common models are not even optimized towards BLEU. By organizing the Neural MT Training Task, we hope to provide comparable conditions and encourage research into:

training criteria that lead to the best translation quality
organization of training data that speeds up the training, makes it more robust, or leads to better translation quality

We will provide the participants with:

a complete neural MT system to modify, specifically Neural Monkey
a fixed configuration file for running (this defines the neural model)
a fixed and pre-processed collection of training and validation data
a fixed revision number of the MT system which will be used to run the model
a baseline configuration file for training

The participants are expected to train the model and submit the parameters. While doing so, they can modify the training data and the training parts of the neural MT system in any possible way.

We will use the fixed version of the NMT system to run the trained models on the official news test sets. The translations will then be manually evaluated along with the submissions to the standard News Translation Task.

The task is aimed at both experienced NMT researchers and newcomers. A valid submission can be made with minimal modifications to the provided system, or none at all. Below, we provide suggestions for what people could experiment with.

The NMT Training Task was inspired by the past Tuning Tasks and serves as their replacement. In the last Tuning Task, where the phrase-based MT components were fixed, it turned out that the few parameters of a log-linear model do not leave enough room for interesting differences between setups, especially when the model components are realistically large. This is certainly not the case for a neural model.

Download

The package with the pre-processed training and validation data, together with instructions on how to start the baseline training, is available for download:

wmt17-nmt-training-task-package.tgz (1.9 GB)

Submission

There are two possible configurations (“tracks” of the task) to choose from, depending on how much GPU memory you have available: 4GB (the common size of GeForce GTX 980) and 8GB (the common size of GeForce GTX 1080). Accordingly, there are two versions of the main configuration file: config_4GB.ini and config_8GB.ini. The submissions will be evaluated with Neural Monkey 0.1.0 (included in the package above).

In essence, we need the variables.data file that Neural Monkey creates in its output-*GB/ directory. Neural Monkey stores intermediate versions of the model there, so you will probably want to send us the one linked under the name variables.data.best. It may also be interesting to compare the learning curves of your training runs, so if you can, please also provide the events.out.tfevents.* file created by TensorFlow during training.
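As a rough illustration of locating these files before uploading, here is a minimal Python sketch. It assumes that variables.data.best is either a regular file or a symbolic link inside the output directory; the function name collect_submission_files and the example directory name are hypothetical and not part of Neural Monkey.

```python
from pathlib import Path


def collect_submission_files(output_dir):
    """Return (model_path, event_paths) found in a training output directory.

    Assumption: "variables.data.best" is a regular file or a symbolic link
    pointing at one of the intermediate model files in the same directory.
    """
    out = Path(output_dir)
    best = out / "variables.data.best"
    if not best.exists():
        raise FileNotFoundError(f"no variables.data.best in {out}")
    # Follow the link (if any) so the real model file gets uploaded,
    # not a dangling pointer to it.
    model = best.resolve()
    # TensorFlow event logs are optional; the list may be empty.
    events = sorted(out.glob("events.out.tfevents.*"))
    return model, events


# Example call (the directory name is only an illustration):
# model, events = collect_submission_files("output-8GB")
```

Resolving the link first avoids uploading a pointer whose target is missing on the receiving side.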

To make a submission, upload your variables.data file (and optionally events.out.tfevents.* as well) to a location from which it can be downloaded. Send the location of your files in an e-mail with the subject “NMT Training Task Submission” to both bojar@ufal.mff.cuni.cz and musil@ufal.mff.cuni.cz. The e-mail should also contain the name under which the submission will be published (e.g. “Charles University, Glove Embeddings”). If you want, you can also include any information you find relevant for a meta-analysis of the task.

To sum up, your e-mail should contain for each submission you are making:

download link of your submission (required)
your submission short name (required, e.g. “cuni-glove”)
your submission name (required, e.g. “Charles University, Glove Embeddings”)
is it a primary or secondary submission (required, see below)
variables.data.best (required)
events.out.tfevents.* (nice to have)
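The checklist above can be turned into a small helper that composes the e-mail text. This is only a sketch using Python's standard email library: the function name build_submission_email and the field labels in the body are assumptions, not a format required by the organizers, and the message is built but not sent.

```python
from email.message import EmailMessage

# Recipient addresses as given in the submission instructions.
ORGANIZERS = ["bojar@ufal.mff.cuni.cz", "musil@ufal.mff.cuni.cz"]


def build_submission_email(link, short_name, name, primary):
    """Compose (but do not send) a submission e-mail for one model."""
    msg = EmailMessage()
    msg["Subject"] = "NMT Training Task Submission"
    msg["To"] = ", ".join(ORGANIZERS)
    kind = "primary" if primary else "secondary"
    # The field labels below are illustrative, not a prescribed format.
    msg.set_content(
        f"Download link: {link}\n"
        f"Short name: {short_name}\n"
        f"Name: {name}\n"
        f"Submission type: {kind}\n"
    )
    return msg
```

Calling the helper once per model keeps the required fields together for each submission.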

The submission deadline is May 1, 2017, before the Training Task test sets are released.

The number of submissions you can make is not limited, but participants are required to contribute to the evaluation process (see below). If you want to submit more models without increasing your commitments for the evaluation, you can submit some of your models as secondary. Secondary submissions will only be evaluated automatically, not manually.

Preliminary results:

name	BLEU-dev	BLEU-test
4 GB
pavel-denisov	15.98	13.63
ufal	16.24	14.07
baseline	16.74	14.74
AFRL	17.58	15.25
8 GB
baseline	17.47	14.87
AFRL	18.15	15.81
ufal	19.18	16.45

Other Requirements

For each primary run submitted to the training task, the team promises to join the WMT manual evaluation and annotate a share of the items. The exact involvement will be determined later but usually ranges from 4 to 8 hours of work. This contribution to the manual evaluation can be made in whichever language pair you are able to evaluate and where it is needed most.

You are invited to submit a short paper (4 to 6 pages) describing your NMT training technique. You are not required to submit a paper if you do not want to. If you don't, we ask that you give an appropriate description (a few paragraphs) or an appropriate reference describing your method to include or cite in the overview paper.

NMT Training Task Organizers

Ondřej Bojar (Charles University in Prague)
Jindřich Helcl (Charles University in Prague)
Tom Kocmi (Charles University in Prague)
Jindřich Libovický (Charles University in Prague)
Tomáš Musil (Charles University in Prague)

Acknowledgement

Supported by the European Commission under the project (grant number 645452)