This conference builds on a series of annual workshops and conferences on statistical machine translation, going back to 2006:


Release of training data for shared tasks: January/February, 2019
Evaluation periods for shared tasks: April, 2019
Paper submission deadline: May 17, 2019
Paper notification: June 7, 2019
Camera-ready version due: June 17, 2019
Conference in Florence: August 1-2, 2019


This year's conference will feature the following shared tasks:

In addition to the shared tasks, the conference will also feature scientific papers on topics related to MT. Topics of interest include, but are not limited to:

We encourage authors to evaluate their approaches to the above topics using the common data sets created for the shared tasks.


These will both be handled by ACL 2019.


This shared task will examine translation between the following language pairs:

The text for all the test sets will be drawn from news articles. Participants may submit translations for any or all of the language directions. In addition to the common test sets, the conference organizers will provide optional training resources.

Language Pairs
This year we introduce two low-resource language pairs (English to/from Kazakh and Gujarati) plus a further Baltic language pair (English to/from Lithuanian) and a non-English pair (French to/from German).
Document level MT
We encourage the use of document-level models for English to German and for Chinese to English. We will ensure that the English-German data includes document boundaries. Both of these pairs will be evaluated with the document context visible to evaluators.
Data sets
We will release parallel and monolingual data for all languages, updated where possible. For the low-resource language pairs, we encourage participants to explore additional data sets (sharing these with the community whenever possible).


In this fourth edition of the task, we will evaluate systems for the translation of biomedical documents for the following language pairs:

Parallel corpora will be available for all language pairs, and monolingual corpora for some languages. Evaluation will be carried out both automatically and manually.


This year we have a new task focusing on robustness of machine translation to noisy input text. We will evaluate translation of the following language pairs:

We release both parallel and monolingual data for all language pairs. You can find more details on the task page.


This shared task will focus on the translation between three pairs of similar languages:

For more information please visit this page.


This shared task will examine automatic methods for correcting errors produced by machine translation (MT) systems. Automatic post-editing (APE) aims to improve MT output in black-box scenarios, in which the MT system is used "as is" and cannot be modified. From an application point of view, APE components would make it possible to:

In this fifth edition of the task, the evaluation will focus on two subtasks:


See task page


Quality estimation systems aim to estimate the quality of a given translation at system run-time, without access to a reference translation. This topic is particularly relevant from a user perspective. Among other applications, quality estimation can:
      help decide whether a given translation is good enough for publishing as is,
      filter out sentences that are not good enough for post-editing,
      select the best translation among options from multiple MT and/or translation memory systems,
      inform readers of the target language whether or not they can rely on a translation, and
      spot parts (words or phrases) of a translation that are potentially incorrect.

This year's WMT shared task on quality estimation consists of three tracks according to the specific needs QE satisfies:
      Task 1: estimating post-editing effort on word and sentence level,
      Task 2: performing MT output diagnostics on document and word/phrase level and
      Task 3: scoring MT outputs just like metrics do, but without a reference.
We provide new training and test sets based on neural machine translation from English to Russian, German and French. We also supply participants with baseline systems and an automatic evaluation environment for submitting their results.

See the task page for further details.


See task page


Submissions will consist of regular full papers of 6-10 pages, plus additional pages for references, formatted following the ACL 2019 guidelines. Supplementary material can be added to research papers. In addition, shared task participants will be invited to submit short papers (suggested length: 4-6 pages, plus references) describing their systems or their evaluation metrics. Both submission and review processes will be handled electronically. Note that regular papers must be anonymized, while system descriptions should not be.

Research papers that have been or will be submitted to other meetings or publications must indicate this at submission time, and must be withdrawn from the other venues if accepted and published at WMT 2019. We will not accept for publication papers that overlap significantly in content or results with papers that have been or will be published elsewhere. It is acceptable to submit work that has been made available as a technical report (or similar, e.g. in arXiv) without citing it. This double submission policy only applies to research papers, so system papers can have significant overlap with other published work, if it is relevant to the system description.

We encourage individuals who are submitting research papers to evaluate their approaches using the training resources provided by this conference and past workshops, so that their experiments can be repeated by others using these publicly available corpora.




Subscribe to the announcement list for WMT by entering your e-mail address below. This list will be used to announce when the test sets are released, to indicate any corrections to the training sets, and to amend the deadlines as needed.

You can read past announcements on the Google Groups page for WMT. These also include an archive of announcements from earlier workshops.




Ondřej Bojar (Charles University in Prague)
Rajen Chatterjee (Apple)
Christian Federmann (MSR)
Mark Fishel (University of Tartu)
Yvette Graham (DCU)
Barry Haddow (University of Edinburgh)
Matthias Huck (LMU Munich)
Antonio Jimeno Yepes (IBM Research Australia)
Philipp Koehn (University of Edinburgh / Johns Hopkins University)
André Martins (Unbabel)
Christof Monz (University of Amsterdam)
Matteo Negri (FBK)
Aurélie Névéol (LIMSI, CNRS)
Mariana Neves (German Federal Institute for Risk Assessment)
Matt Post (Johns Hopkins University)
Marco Turchi (FBK)
Karin Verspoor (University of Melbourne)



WMT follows the ACL's anti-harassment policy.


For general questions, comments, etc. please send email to bhaddow@inf.ed.ac.uk.
For task-specific questions, please contact the relevant organisers.