WMT17 Neural MT Training Task

NMT Training Task Important Dates

Release of NMT system and training data: February 15, 2017
Model submission deadline: May 1, 2017 (prior to news task test set release)
Start of manual evaluation: May 15, 2017
End of manual evaluation (provisional): June 4, 2017
Paper submission, notification, camera-ready: same as the rest of the conference
Conference in Copenhagen: September 7-8, 2017

NMT Training Task Overview

Training a single configuration of a neural MT system takes a few weeks. The training criteria currently in use are rather basic and lack a systematic evaluation, and the common models are not even optimized towards BLEU. By organizing the Neural MT Training Task, we hope to provide comparable conditions and encourage research into:

We will provide the participants with:

The participants are expected to train the model and submit the parameters. While doing so, they can modify the training data and the training parts of the neural MT system in any possible way.

We will use the fixed version of the NMT system to run the trained models on the official news test sets. The translations will then be manually evaluated along with the submissions to the standard News Translation Task.

The task is aimed at both experienced NMT researchers and newcomers. A valid submission can be achieved with minimal or even no modifications of the provided system. Below, we provide suggestions for what people could experiment with.

The NMT Training Task was inspired by the past Tuning Tasks and serves as their replacement. In the last tuning task, where phrase-based MT components were fixed, it turned out that the few parameters of a log-linear model do not give enough room for interesting setup differences, especially when the model components are realistically large. This is certainly not the case for a neural model.

Download

The package with the pre-processed training and validation data, together with instructions on how to start the baseline training, is available for download:

Submission

There are two possible configurations (“tracks” of the task) to choose from, depending on the size of GPU memory you can use: 4GB (the common size of GeForce GTX 980) and 8GB (the common size of GeForce GTX 1080). There are two versions of the main configuration file: config_{4,8}GB.ini. The submissions will be evaluated with Neural Monkey 0.1.0 (included in the package above).

In essence, we will need your variables.data file that gets created in the output-*GB/ directory of Neural Monkey. Neural Monkey stores intermediate versions of the model there, so you will probably want to send us the one linked under the name variables.data.best. It may also be interesting to compare the learning curves of your training runs, so if you can, please also provide us with the events.out.tfevents.* file created by TensorFlow during training.
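The file selection described above can be sketched in a few lines of Python. This is only an illustrative helper, not part of Neural Monkey: the function name collect_submission_files is hypothetical, and the sketch assumes that variables.data.best is a symbolic link (or a plain file) sitting directly in the output directory, next to any events.out.tfevents.* files.

```python
from pathlib import Path


def collect_submission_files(output_dir):
    """Find the files to send for one training run (hypothetical helper).

    Assumes the layout described in the text: output_dir contains
    variables.data.best and, optionally, events.out.tfevents.* files.
    """
    out = Path(output_dir)
    best = out / "variables.data.best"
    if not best.exists():
        raise FileNotFoundError(f"no variables.data.best in {out}")
    # resolve() follows a symbolic link to the stored model file it points to;
    # for a regular file it simply returns the absolute path.
    model = best.resolve()
    # The TensorFlow event files are optional, so an empty list is fine.
    events = sorted(out.glob("events.out.tfevents.*"))
    return {"model": model, "events": events}
```

Calling collect_submission_files("output-4GB") (the directory name here is only an example) would return the path of the model file to upload and the list of event files, if there are any.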

To make a submission, upload your variables.data file (and optionally events.out.tfevents.* as well) to a location from which it can be downloaded. Send the location of your files in an e-mail with the subject “NMT Training Task Submission” to both bojar@ufal.mff.cuni.cz and musil@ufal.mff.cuni.cz. The e-mail should also contain the name under which the submission will be published (e.g. “Charles University, Glove Embeddings”). If you want, you can also include any information you find relevant for meta-analysis of the task.

To sum up, your e-mail should contain for each submission you are making:

The submission deadline is May 1, 2017, before the Training Task test sets are released.

The number of submissions you can make is not limited, but participants are required to contribute to the evaluation process (see below). If you want to submit more models without increasing your commitments for the evaluation, you can submit some of your models as secondary. Secondary submissions will only be evaluated automatically, not manually.

Preliminary results:

Track  Name           BLEU-dev  BLEU-test
4 GB   pavel-denisov  15.98     13.63
4 GB   ufal           16.24     14.07
4 GB   baseline       16.74     14.74
4 GB   AFRL           17.58     15.25
8 GB   baseline       17.47     14.87
8 GB   AFRL           18.15     15.81
8 GB   ufal           19.18     16.45
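
As a reading aid for the preliminary results, the following Python sketch computes how far each submission's BLEU-test score lies from the baseline of the same track. The scores are copied from the table; the names BLEU_TEST and delta_vs_baseline are introduced here for illustration only, and the differences say nothing about statistical significance or about the manual evaluation.

```python
# BLEU-test scores copied from the preliminary results, grouped by track.
BLEU_TEST = {
    "4GB": {"pavel-denisov": 13.63, "ufal": 14.07, "baseline": 14.74, "AFRL": 15.25},
    "8GB": {"baseline": 14.87, "AFRL": 15.81, "ufal": 16.45},
}


def delta_vs_baseline(scores):
    """Return each system's BLEU difference from the 'baseline' entry."""
    base = scores["baseline"]
    # Round to two decimals, the precision used in the table.
    return {
        name: round(bleu - base, 2)
        for name, bleu in scores.items()
        if name != "baseline"
    }


for track, scores in BLEU_TEST.items():
    deltas = delta_vs_baseline(scores)
    # List the systems of each track from lowest to highest difference.
    for name, delta in sorted(deltas.items(), key=lambda item: item[1]):
        print(f"{track} {name}: {delta:+.2f}")
```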

Other Requirements

For each primary run submitted to the training task, the team promises to join the WMT manual evaluation and annotate a share of the items. The exact involvement will be determined later but usually ranges from 4 to 8 hours of work. This contribution to the manual evaluation can be done in whichever language pair you can evaluate and where it is needed most.

You are invited to submit a short paper (4 to 6 pages) describing your NMT training technique. You are not required to submit a paper if you do not want to. If you don't, we ask that you give an appropriate description (a few paragraphs) or an appropriate reference describing your method to include or cite in the overview paper.

Suggested Topics for Experimenting

Neural MT is flexible and unexplored enough to offer quite a few interesting things to try:

More details and further ideas are listed in this live document.

We would very much appreciate it if prospective participants shared their plans and perhaps even code modifications (e.g. reinforcement learning into which anyone could plug in their MT metric) during the task. If you are interested in this more collaborative form of participation, please:

NMT Training Task Organizers

Ondřej Bojar (Charles University in Prague)
Jindřich Helcl (Charles University in Prague)
Tom Kocmi (Charles University in Prague)
Jindřich Libovický (Charles University in Prague)
Tomáš Musil (Charles University in Prague)

Acknowledgement

Supported by the European Commission under the QT21 project (grant number 645452).