EACL 2009 Fourth Workshop on Statistical Machine Translation

EACL 2009
FOURTH WORKSHOP ON
STATISTICAL MACHINE TRANSLATION

March 30 and 31, 2009
Athens, Greece
http://www.statmt.org/wmt09/

This workshop builds on three previous workshops on statistical machine translation: the NAACL-2006 Workshop on Statistical Machine Translation, the ACL-2007 Workshop on Statistical Machine Translation, and the ACL-2008 Workshop on Statistical Machine Translation. This year's workshop will feature three shared tasks: a shared translation task for 10 pairs of European languages, a shared evaluation task to test automatic evaluation metrics, and a new system combination task combining the output of all the systems entered into the shared translation task.

The workshop will also feature scientific papers on topics related to MT. Topics of interest include, but are not limited to:

word-based, phrase-based, syntax-based SMT
using comparable corpora for SMT
incorporating linguistic information into SMT
decoding
system combination
error analysis
manual and automatic methods for evaluating MT
scaling MT to very large data sets

We encourage authors to evaluate their approaches to the above topics using the common data sets created for the shared tasks.

SHARED TRANSLATION TASK

The first shared task will examine translation between the following language pairs:

English-German and German-English
English-French and French-English
English-Spanish and Spanish-English
English-Czech and Czech-English
English-Hungarian and Hungarian-English

Participants may submit translations for any or all of the language directions. In addition to the common test sets, the workshop organizers will provide optional training resources, including a newly expanded release of the Europarl corpora and out-of-domain corpora.

All participants who submit entries will have their translations evaluated. We will evaluate translation performance by human judgment. To facilitate the human evaluation we will require participants in the shared tasks to manually judge some of the submitted translations.

A more detailed description of the shared translation task (including information about the test and training corpora, a freely available MT system, and a number of other resources) is available from http://www.statmt.org/wmt09/translation-task.html. We also provide a baseline machine translation system, whose performance is comparable to the best systems from last year's shared task.

SYSTEM COMBINATION TASK

Participants in the system combination task will be provided with the 1-best translations from each of the systems entered in the shared translation task. We will endeavor to provide a held-out development set for system combination, which will include translations from each of the systems and a reference translation. Any system combination strategy is acceptable, whether it selects the best translation on a per-sentence basis or creates novel translations by combining the systems' translations. The quality of the system combinations will be judged alongside the individual systems during the manual evaluation, as well as scored with automatic evaluation metrics.
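
As an illustration only, the per-sentence selection strategy mentioned above could be sketched as follows. This is a minimal sketch under stated assumptions, not a method prescribed by the task: the function names (jaccard, select_consensus, combine) are hypothetical, and the choice of word-set overlap as the similarity measure is our own assumption. For each sentence, the sketch picks the system output that is on average most similar to the outputs of the other systems.

```python
def jaccard(a, b):
    """Word-set overlap between two sentences (an assumed similarity measure)."""
    words_a, words_b = set(a.split()), set(b.split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def select_consensus(candidates):
    """Return the candidate most similar, on average, to the other candidates."""
    if len(candidates) == 1:
        return candidates[0]

    def average_similarity(i):
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(jaccard(candidates[i], o) for o in others) / len(others)

    # On ties, max() keeps the first candidate with the highest score.
    best = max(range(len(candidates)), key=average_similarity)
    return candidates[best]


def combine(system_outputs):
    """Combine 1-best outputs.

    system_outputs is a list with one entry per system; each entry is a list
    of translated sentences, aligned by sentence index across systems.
    """
    return [select_consensus(list(sentences)) for sentences in zip(*system_outputs)]
```

A call such as combine([system_a, system_b, system_c]) returns one selected translation per input sentence. Strategies that create novel translations from parts of several outputs require more machinery and are not covered by this sketch.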

More details of the system combination task are available from http://www.statmt.org/wmt09/system-combination-task.html.

EVALUATION TASK

The evaluation task will assess automatic evaluation metrics' ability to:

Rank systems on their overall performance on the test set
Rank systems on a sentence-by-sentence level

Participants in the shared evaluation task will use their automatic evaluation metrics to score the output from the translation task and the system combination task. They will be provided with the output from the other two shared tasks along with reference translations. We will measure the correlation of automatic evaluation metrics with the human judgments.
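
To make the system-level comparison concrete, the following minimal sketch computes a rank correlation between metric scores and human scores for a set of systems. It assumes Spearman's rank correlation; the text above does not specify which correlation coefficient will be used, and the function names and example scores are hypothetical.

```python
def ranks(scores):
    """Rank scores from best (1) to worst, higher score being better.

    Tied scores receive the average of the ranks they span.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(metric_scores, human_scores):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    return pearson(ranks(metric_scores), ranks(human_scores))


# Hypothetical scores for four systems, in the same system order.
metric = [0.30, 0.25, 0.28, 0.20]
human = [0.60, 0.50, 0.40, 0.30]
rho = spearman(metric, human)
```

A value of rho close to 1 means the metric orders the systems much as the human judges do; a value close to -1 means it orders them in reverse.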

More details of the shared evaluation task (including submission formats and the collected manual evaluations from last year's workshop) are available from http://www.statmt.org/wmt09/evaluation-task.html.

PAPER SUBMISSION INFORMATION

Submissions will consist of regular full papers of max. 8 pages, formatted following the EACL 2009 guidelines. In addition, shared task participants will be invited to submit short papers (max. 4 pages) describing their systems or their evaluation metrics. Both submission and review processes will be handled electronically.

We encourage individuals who are submitting research papers to evaluate their approaches using the training resources provided by this workshop and past workshops, so that their experiments can be repeated by others using these publicly available corpora.

IMPORTANT DATES

Test set distributed for translation task	December 8, 2008
Results submissions for translation task	December 12, 2008
Translations released for system combination	December 22, 2008
System combinations due	January 5, 2009
Translations (and combinations) released for evaluation	January 12, 2009
Automatic evaluation scores due	January 23, 2009

Start of manual evaluation period	January 12, 2009
End of manual evaluation	January 30, 2009

Paper submissions (online)	January 9, 2009
Deadline for reviewing	January 23, 2009
Notification of acceptance	January 30, 2009
Camera-ready deadline	February 13, 2009

ANNOUNCEMENTS

Subscribe to the announcement list for WMT09 to receive updates. This list will be used to announce when the test sets are released, to indicate any corrections to the training sets, and to amend the deadlines as needed.

You can read past announcements on the Google Groups page for WMT09.

ORGANIZERS

Chris Callison-Burch (Johns Hopkins University)
Philipp Koehn (University of Edinburgh)
Christof Monz (University of London)
Josh Schroeder (University of Edinburgh)

INVITED TALK

Martin Kay (Stanford University and University Saarbrücken)

PROGRAM COMMITTEE

Lars Ahrenberg (Linköping University)
Yaser Al-Onaizan (IBM Research)
Necip Fazil Ayan (SRI)
Thorsten Brants (Google)
Chris Brockett (Microsoft Research)
Francisco Casacuberta (University of Valencia)
David Chiang (ISI/University of Southern California)
Colin Cherry (Microsoft Research)
Stephen Clark (Oxford University)
Trevor Cohn (Edinburgh University)
Brooke Cowan (MIT)
Mona Diab (Columbia University)
Andreas Eisele (University Saarbrücken)
Marcello Federico (FBK-irst)
George Foster (National Research Council Canada)
Alex Fraser (University of Stuttgart)
Michel Galley (Columbia University)
Jesus Gimenez (Technical University of Catalonia)
Keith Hall (Google)
John Henderson (MITRE)
Rebecca Hwa (University of Pittsburgh)
Doug Jones (Lincoln Labs MIT)
Damianos Karakos (Johns Hopkins University)
Katrin Kirchhoff (University of Washington)
Kevin Knight (ISI/University of Southern California)
Shankar Kumar (Google)
Philippe Langlais (University of Montreal)
Alon Lavie (Carnegie Mellon University)
Adam Lopez (Edinburgh University)
Daniel Marcu (ISI/University of Southern California)
Lambert Mathias (Johns Hopkins University)
Bob Moore (Microsoft Research)
Smaranda Muresan (Rutgers University)
Franz Josef Och (Google)
Miles Osborne (Edinburgh University)
Kay Peterson (NIST)
Mark Przybocki (NIST)
Chris Quirk (Microsoft Research)
Antti-Veikko Rosti (BBN Technologies)
Holger Schwenk (LIUM)
Jean Senellart (Systran)
Libin Shen (BBN Technologies)
Wade Shen (Lincoln Labs MIT)
Michel Simard (National Research Council Canada)
David Talbot (Edinburgh University)
Jörg Tiedemann (University of Groningen)
Christoph Tillmann (IBM Research)
Dan Tufiş (Romanian Academy)
Clare Voss (Army Research Labs)
Taro Watanabe (NTT)
Andy Way (Dublin City University)
Jinxi Xu (BBN Technologies)
Richard Zens (Google)

CONTACT

For questions, comments, etc. please send email to pkoehn@inf.ed.ac.uk.

supported by the EuroMatrix project, P6-IST-5-034291-STP
funded by the European Commission under Framework Programme 6