Results and Collected Judgments

EMNLP 2015 TENTH WORKSHOP
ON STATISTICAL MACHINE TRANSLATION

17-18 September 2015
Lisbon, Portugal

The results of the shared task are summarized in the paper:

Findings of the 2015 Workshop on Statistical Machine Translation
Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Barry Haddow, Matthias Huck, Chris Hokamp, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Matt Post, Carolina Scarton, Lucia Specia and Marco Turchi (pdf, bib).

The following raw data is available:

All system submissions, source files, and reference files for all tasks/systems, in both plain text and SGM.
Human judgments (and documentation) collected during the manual evaluation, for all the system ranking tasks. Also see Matt Post's GitHub page.
Metrics Task data and results, including metrics submissions and all scripts to reproduce the results published in the metrics task paper (60 MB).
Tuning Task data and results, including tuning submissions (moses.ini files) and the scripts to prepare the unoptimized system (119 MB).
Automatic Post-Editing (APE) task: training and test data, evaluation scripts, submitted runs, and final results.
Quality Estimation task: submissions, baselines, and gold-standard data for the sentence-level, word-level, and document-level subtasks.
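
For readers working with the SGM versions of the submissions, source files, and reference files listed above, the following is a minimal sketch of pulling the segments out of such a file. It assumes the usual layout in which each document is a `<doc ... docid="...">` block containing `<seg id="N">...</seg>` elements; that layout, the function name `read_sgm_segments`, and the `SAMPLE` text are illustrative assumptions, not something specified on this page, so check them against the actual files.

```python
import html
import re

# Assumed layout: <doc ... docid="..."> blocks holding <seg id="N">text</seg>.
DOC_RE = re.compile(r'<doc\b([^>]*)>(.*?)</doc>', re.IGNORECASE | re.DOTALL)
SEG_RE = re.compile(r'<seg\b[^>]*\bid="(\d+)"[^>]*>(.*?)</seg>',
                    re.IGNORECASE | re.DOTALL)
DOCID_RE = re.compile(r'\bdocid="([^"]*)"', re.IGNORECASE)


def read_sgm_segments(sgm_text):
    """Return a list of (docid, seg_id, text) tuples found in SGM markup."""
    segments = []
    for doc_match in DOC_RE.finditer(sgm_text):
        attrs, body = doc_match.group(1), doc_match.group(2)
        docid_match = DOCID_RE.search(attrs)
        docid = docid_match.group(1) if docid_match else ""
        for seg_match in SEG_RE.finditer(body):
            # Undo entity escaping such as &amp; inside the segment text.
            text = html.unescape(seg_match.group(2).strip())
            segments.append((docid, int(seg_match.group(1)), text))
    return segments


# Hypothetical sample, made up for illustration only.
SAMPLE = """<srcset setid="example" srclang="any">
<doc docid="doc-1" genre="news" origlang="en">
<seg id="1">First sentence.</seg>
<seg id="2">Tom &amp; Jerry.</seg>
</doc>
</srcset>"""

for row in read_sgm_segments(SAMPLE):
    print(row)
```

A regular-expression pass is used here because SGM files of this kind are often not well-formed XML; if the files you download do parse as XML, a proper parser would be the more robust choice.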

You can also download the raw data from WMT07, WMT08, WMT09, WMT10, WMT11, WMT12, WMT13, and WMT14.

WMT15 receives support from the European Union under the projects MosesCore (grant number 288487), Cracker and QT21.
