Results and Collected Judgments
17-18 September 2015
The results of the shared task are summarized in the paper:
Findings of the 2015 Workshop on Statistical Machine Translation
Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Barry Haddow, Matthias Huck, Chris Hokamp, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Matt Post, Carolina Scarton, Lucia Specia and Marco Turchi
The raw data is available for download:
- All system submissions, source files, and reference files for all tasks/systems, in both plain text and SGM.
- Human judgments (and documentation) collected during the manual evaluation, for all of the system ranking tasks. See also Matt Post's GitHub page.
- Metrics Task data and results, including metrics submissions and all scripts to reproduce the results published in the metrics task paper. (60 MB).
- Tuning Task data and results, including tuning submissions (moses.ini files) and the scripts to prepare the unoptimized system. (119 MB).
- Automatic Post-Editing (APE) task: training and test data, evaluation scripts, submitted runs and final results.
- Quality Estimation task: submissions, baselines and gold-standard data for the sentence-level, word-level and document-level subtasks.
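The submissions and references above are released in both plain text and SGM. As a minimal sketch of working with the SGM files, the snippet below extracts segments from NIST-style markup; the tag layout shown (`<seg id="...">` inside `<doc>` elements) follows the usual WMT/NIST convention and is an assumption here, not a guarantee about every released file.

```python
import re

def read_sgm_segments(sgm_text):
    """Extract (id, text) pairs from NIST-style SGM markup.

    Hypothetical helper: assumes segments are marked up as
    <seg id="...">...</seg>, the common WMT/NIST convention.
    """
    return [(m.group(1), m.group(2).strip())
            for m in re.finditer(r'<seg id="([^"]+)">(.*?)</seg>',
                                 sgm_text, re.S)]

# Illustrative input; not taken from the actual release.
sample = '''<srcset setid="newstest2015" srclang="en">
<doc docid="example" genre="news">
<seg id="1">Hello world .</seg>
<seg id="2">A second sentence .</seg>
</doc>
</srcset>'''

print(read_sgm_segments(sample))
# → [('1', 'Hello world .'), ('2', 'A second sentence .')]
```

A regex suffices here because SGM files in MT evaluation are machine-generated with a fixed, flat tag layout; for hand-edited SGML a real parser would be safer.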
You can also download the raw data from WMT07, WMT08, WMT09, WMT10, WMT11, WMT12, WMT13, and WMT14.
WMT15 receives support from the European Union under the MosesCore (grant number 288487), Cracker and QT21 projects.