This conference builds on a series of annual workshops and conferences on statistical machine
translation, going back to 2006.
|Release of training data for shared tasks|January/February, 2017|
|Evaluation periods for shared tasks|TBC|
|Paper submission deadline|TBC|
|Camera-ready version due|TBC|
|Conference in Brussels|October 31 - November 1, 2018|
This year's conference will feature the following shared tasks:
- a news translation task,
- a biomedical translation task,
- an automatic post-editing task,
- a metrics task (assess MT quality given a reference translation),
- a quality estimation task (assess MT quality without access to any reference),
- a multimodal translation task.
In addition to the shared tasks, the conference will also feature scientific papers on topics related to MT.
Topics of interest include, but are not limited to:
- word-based, phrase-based, syntax-based, semantics-based SMT
- neural machine translation
- using comparable corpora for MT
- incorporating linguistic information into SMT
- system combination
- error analysis
- manual and automatic methods for evaluating MT
- scaling MT to very large data sets
We encourage authors to evaluate their approaches to the above topics
using the common data sets created for the shared tasks.
Registration will be handled by EMNLP 2018.
NEWS TRANSLATION TASK
This shared task will examine translation between the
following language pairs:
- English-Chinese and Chinese-English
- English-Czech and Czech-English
- English-Estonian and Estonian-English
- English-Finnish and Finnish-English
- English-German and German-English
- English-Kazakh and Kazakh-English
- English-Russian and Russian-English
- English-Turkish and Turkish-English
The text for all the test sets will be drawn from news articles.
Participants may submit translations for any or all of the language
directions. In addition to the common test sets the conference organizers
will provide optional training resources.
All participants who submit entries will have their translations
evaluated. We will evaluate translation performance by human judgment. To
facilitate the human evaluation we will require participants in the
shared tasks to manually judge some of the submitted translations. For each team,
this will be about 8 hours per language pair submitted.
For 2018 we highlight the following innovations in the news task:
- NEW: Language pairs
- This year we introduce Estonian to/from
English and Kazakh to/from English as additional language pairs.
- NEW: Multilinguality
- We encourage participants
to exploit multilingual training resources. In other words, participants may use
third languages to improve translation. For example, participants could exploit the
similarity of Finnish and Estonian, or make use of the Russian-Kazakh data set that we will
provide.
- NEW: Deep analysis through additional test sets
- At no additional burden on the News Translation Task participants
(aside from having to translate much larger input data), we will collectively
provide a deeper analysis of various qualities of the translations.
- For more information or if you would like to join this activity, see the
WMT18 Additional Test Suites Google doc.
BIOMEDICAL TRANSLATION TASK
In this third edition of the task, we will evaluate systems for the translation of biomedical documents for the following language pairs:
- English-French and French-English
- English-Portuguese and Portuguese-English
- English-Spanish and Spanish-English
- English-German and German-English
- English-Chinese and Chinese-English (NEW)
Parallel corpora will be available for all language pairs, and monolingual corpora will also be available for some languages.
Evaluation will be carried out both automatically and manually.
AUTOMATIC POST-EDITING TASK
QUALITY ESTIMATION TASK
MULTIMODAL TRANSLATION TASK
PAPER SUBMISSION INFORMATION
Submissions will consist of regular full papers of 6-10 pages, plus
additional pages for references, formatted following the
EMNLP 2018 guidelines.
In addition, shared task participants will be invited to
submit short papers (suggested length: 4-6 pages, plus references) describing their systems or their
evaluation metrics. Both submission and review processes will be handled electronically.
Note that regular papers must be anonymized, while system descriptions
do not need to be.
Research papers that have been or will be submitted to other meetings or publications must indicate this at submission time, and must be withdrawn from the other venues if accepted and published at WMT 2018.
We will not accept for publication papers that overlap significantly in content or results with papers that have been or will be published elsewhere.
It is acceptable to submit work that has been made available as a technical report (or similar, e.g. in arXiv) without citing it.
This double submission policy only applies to research papers, so system papers can have significant overlap with other published work, if it is relevant to the system description.
We encourage individuals who are submitting research papers to evaluate
their approaches using the training resources provided by this conference
and past workshops, so that their experiments can be repeated by others
using these publicly available corpora.
Subscribe to the announcement list for WMT by entering your e-mail address below. This list will be used to announce when the test sets are released, to indicate any corrections to the training sets, and to amend the deadlines as needed.
You can read past
announcements on the Google Groups page for WMT. These also
include an archive of announcements from earlier workshops.
Ondřej Bojar (Charles University in Prague)
Rajen Chatterjee (FBK)
Christian Federmann (MSR)
Yvette Graham (DCU)
Barry Haddow (University of Edinburgh)
Matthias Huck (University of Edinburgh)
Antonio Jimeno Yepes (IBM Research Australia)
Philipp Koehn (University of Edinburgh / Johns Hopkins University)
Christof Monz (University of Amsterdam)
Matteo Negri (FBK)
Aurélie Névéol (LIMSI, CNRS)
Mariana Neves (German Federal Institute for Risk Assessment)
Matt Post (Johns Hopkins University)
Lucia Specia (University of Sheffield)
Marco Turchi (FBK)
Karin Verspoor (University of Melbourne)
WMT follows the ACL's anti-harassment policy.
For general questions, comments, etc. please send email
For task-specific questions, please contact the relevant organisers.
This conference has received funding
from the European Union’s Horizon 2020 research
and innovation programme under grant agreements
645452 (QT21) and 645357 (Cracker).
We thank Yandex for their donation of data for the Russian-English and Turkish-English news tasks.