ACL 2014 
        NINTH WORKSHOP ON
        STATISTICAL MACHINE TRANSLATION
      
      
        26-27 June 2014
        Baltimore, USA
      
    
    
      This workshop builds on eight previous workshops on statistical machine
      translation, a series that is one of the most prestigious venues for
      research in computational linguistics.
      
    
 
    IMPORTANT DATES
    
      | Release of training data for translation task | Early December 2013 | 
      | Release of training data for quality estimation task | January 15, 2014 | 
      | Test set distributed for translation task | February 24, 2014 | 
      | Submission deadline for translation task | February 28, 2014 | 
      | System outputs distributed for metrics task | March 7, 2014 | 
      | Test sets distributed for quality estimation task | March 7, 2014 | 
      | Submission deadline for metrics task | March 28, 2014 | 
      | Submission deadline for quality estimation task | April 1, 2014 | 
      | Start of manual evaluation period | March 11, 2014 | 
      | End of manual evaluation | April 1, 2014 | 
      | Paper submission deadline | April 1, 2014 | 
      | Notification of acceptance | April 21, 2014 | 
      | Camera-ready deadline | April 28, 2014 | 
      
    
    OVERVIEW
    
      This year's workshop will feature four shared tasks:
- a translation task,
- a quality estimation task,
- a task to test automatic evaluation metrics,
- a medical text translation task.
      In addition to the shared tasks, the workshop will also feature scientific papers on topics related to MT.
      Topics of interest include, but are not limited to:
      
        -  word-based, phrase-based, syntax-based, semantics-based SMT
-  using comparable corpora for SMT
-  incorporating linguistic information into SMT
-  decoding
-  system combination
-  error analysis
-  manual and automatic methods for evaluating MT
-  scaling MT to very large data sets
We encourage authors to evaluate their approaches to the above topics
      using the common data sets created for the shared tasks.

    TRANSLATION TASK
    
      The first shared task will examine translation between the
      following language pairs:
      
        -  English-German and German-English
-  English-French and French-English
-  English-Hindi and Hindi-English (new)
-  English-Czech and Czech-English
-  English-Russian and Russian-English 
Participants may submit translations for any or all of the language
      directions. In addition to the common test sets the workshop organizers
      will provide optional training resources, including a newly expanded
      release of the Europarl corpora and out-of-domain corpora.
      All participants who submit entries will have their translations
      evaluated. We will evaluate translation performance by human judgment. To
      facilitate the human evaluation we will require participants in the
      shared tasks to manually judge some of the submitted translations. For each
      team, this will amount to ranking 300 sets of 5 translations per language
      pair submitted.
    
    
      We also provide baseline machine translation systems, with performance
      comparable to the best systems from last year's shared task.
    
    QUALITY ESTIMATION TASK
    A topic of increasing interest in MT is that of estimating the quality of translated texts. Unlike MT evaluation, quality estimation (QE) systems do not rely on reference translations, but rather predict the quality of an unseen translated text (document, sentence, phrase) at system run-time. This topic is particularly relevant from a user perspective: among other applications, it can (i) help decide whether a given translation is good enough for publishing as is (Soricut and Echihabi, 2010); (ii) filter out sentences that are not good enough for post-editing (Specia, 2011); (iii) select the best translation among options from multiple MT and/or translation memory systems (He et al., 2010); and (iv) inform readers of the target language whether they can rely on a translation (Specia et al., 2011).
    Although still very recent, research on this topic has shown promising results over the last couple of years. However, efforts are scattered across several groups and, as a consequence, comparing different systems is difficult, as there are neither well-established baselines nor standard evaluation metrics. In the quality estimation track of the WMT workshop and shared task, we will provide training and test sets, along with evaluation metrics and a baseline system. By providing a common ground for development and comparison, we expect to foster research on the topic, as well as to attract new people interested in the subject, who can build and evaluate new solutions using the provided resources.
    EVALUATION TASK
The evaluation task will assess automatic evaluation metrics' ability to:
-  Rank systems on their overall performance on the test set
-  Rank systems on a sentence-by-sentence level
Participants in the shared evaluation task will use their automatic evaluation metrics to score the output from the translation task and the system combination task. They will be provided with the output from the other two shared tasks along with reference translations. We will measure the correlation of automatic evaluation metrics with the human judgments.
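As a rough illustration of what measuring system-level correlation can look like, the sketch below computes Spearman's rank correlation between a metric's scores and human scores for the same systems. The choice of Spearman's coefficient is an assumption made here for illustration; this page does not state which correlation measure the organizers use. The function names and the four example systems are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 (highest) downward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(metric_scores, human_scores):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(metric_scores), average_ranks(human_scores))


# Hypothetical scores for four systems (not real WMT data).
human = [0.61, 0.55, 0.48, 0.40]   # assumed human judgment scores
metric = [27.3, 28.1, 24.0, 22.5]  # assumed automatic metric scores
rho = spearman(metric, human)
print(f"{rho:.2f}")  # prints 0.80
```

In this toy example the metric swaps the top two systems relative to the human ranking, which lowers the correlation from 1.0 to 0.8.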
  MEDICAL TEXT TRANSLATION TASK
  
  See the medical translation task page for details.
  
    PAPER SUBMISSION INFORMATION
    
      Submissions will consist of regular full papers of 6-10 pages, plus
      additional pages for references, formatted following the ACL 2013
      guidelines. In addition, shared task participants will be invited to
      submit short papers (4-6 pages) describing their systems or their
      evaluation metrics. Both submission and review processes will be handled
      electronically.
      Note that regular papers must be anonymized, while system descriptions 
      do not need to be.
    
    
      We encourage individuals who are submitting research papers to evaluate
      their approaches using the training resources provided by this workshop
      and past workshops, so that their experiments can be repeated by others
      using these publicly available corpora.
    
POSTER FORMAT
The posters will be attached to self-standing posterboards measuring 3 ft high and 4 ft wide. The posterboards sit on top of tables, so there will be laptop/handout space as well. We will provide pushpins, double-sided tape, adhesive putty, and clips to affix the posters to the posterboards.
ANNOUNCEMENTS
Subscribe to the announcement list for WMT14. This list will be used to announce when the test sets are released, to indicate any corrections to the training sets, and to amend the deadlines as needed.
You can read past announcements on the Google Groups page for WMT. These also include an archive of announcements from earlier workshops.
INVITED TALK
TBC
ORGANIZERS
Ondřej Bojar (Charles University in Prague)
Christian Buck (University of Edinburgh)
Christian Federmann (MSR)
Barry Haddow (University of Edinburgh)
Philipp Koehn (University of Edinburgh / Johns Hopkins University)
Matouš Macháček (Charles University in Prague)
Christof Monz (University of Amsterdam)
Pavel Pecina (Charles University in Prague)
Matt Post (Johns Hopkins University)
Herve Saint-Amand (University of Edinburgh)
Radu Soricut (Google)
Lucia Specia (University of Sheffield)
PROGRAM COMMITTEE
- Lars Ahrenberg (Linköping University)
- Alexander Allauzen (Université Paris-Sud / LIMSI-CNRS)
- Tim Anderson (Air Force Research Laboratory)
- Eleftherios Avramidis (German Research Center for Artificial Intelligence)
- Wilker Aziz (University of Sheffield)
- Daniel Beck (University of Sheffield)
- Jose Miguel Benedi (Universitat Politècnica de València)
- Nicola Bertoldi (FBK)
- Alexandra Birch (University of Edinburgh)
- Arianna Bisazza (University of Amsterdam)
- Graeme Blackwood (IBM Research)
- Phil Blunsom (University of Oxford)
- Fabienne Braune (University of Stuttgart)
- Chris Brockett (Microsoft Research)
- Hailong Cao (Harbin Institute of Technology)
- Michael Carl (Copenhagen Business School)
- Marine Carpuat (National Research Council)
- Francisco Casacuberta (Universitat Politècnica de València)
- Daniel Cer (Google)
- Boxing Chen (NRC)
- Colin Cherry (NRC)
- David Chiang (USC/ISI)
- Vishal Chowdhary (Microsoft)
- Steve DeNeefe (SDL Language Weaver)
- Michael Denkowski (Carnegie Mellon University)
- Jacob Devlin (Raytheon BBN Technologies)
- Markus Dreyer (SDL Language Weaver)
- Kevin Duh (Nara Institute of Science and Technology)
- Marcello Federico (FBK)
- Yang Feng (USC/ISI)
- Andrew Finch (NICT)
- Mark Fishel (University of Zurich)
- Jose A. R. Fonollosa (Universitat Politècnica de Catalunya)
- George Foster (NRC)
- Michel Galley (Microsoft Research)
- Juri Ganitkevitch (Johns Hopkins University)
- Katya Garmash (University of Amsterdam)
- Josef van Genabith (Dublin City University)
- Ulrich Germann (University of Edinburgh)
- Daniel Gildea (University of Rochester)
- Kevin Gimpel (Toyota Technological Institute at Chicago)
- Jesús Gonzalez-Rubio (Universitat Politècnica de València)
- Yvette Graham (The University of Melbourne)
- Spence Green (Stanford University)
- Francisco Guzmán (Qatar Computing Research Institute)
- Greg Hanneman (Carnegie Mellon University)
- Christian Hardmeier (Uppsala universitet)
- Eva Hasler (University of Edinburgh)
- Yifan He (New York University)
- Kenneth Heafield (Stanford)
- John Henderson (MITRE)
- Felix Hieber (Heidelberg University)
- Hieu Hoang (University of Edinburgh)
- Stephane Huet (Université d'Avignon)
- Young-Sook Hwang (SKPlanet)
- Gonzalo Iglesias (University of Cambridge)
- Ann Irvine (Johns Hopkins University)
- Abe Ittycheriah (IBM)
- Laura Jehl (Heidelberg University)
- Doug Jones (MIT Lincoln Laboratory)
- Maxim Khalilov (BMMT)
- Alexander Koller (University of Potsdam)
- Roland Kuhn (National Research Council of Canada)
- Shankar Kumar (Google)
- Mathias Lambert (Amazon.com)
- Philippe Langlais (Université de Montréal)
- Alon Lavie (Carnegie Mellon University)
- Gennadi Lembersky (NICE Systems)
- William Lewis (Microsoft Research)
- Lemao Liu (The City University of New York)
- Qun Liu (Dublin City University)
- Wolfgang Macherey (Google)
- Saab Mansour (RWTH Aachen University)
- José B. Mariño (Universitat Politècnica de Catalunya)
- Mauro Cettolo (FBK)
- Arne Mauser (Google, Inc)
- Jon May (SDL Language Weaver)
- Wolfgang Menzel (Hamburg University)
- Shachar Mirkin (Xerox Research Centre Europe)
- Yusuke Miyao (National Institute of Informatics)
- Dragos Munteanu (SDL Language Technologies)
- Markos Mylonakis (Lexis Research)
- Lluis Marquez (Qatar Computing Research Institute)
- Preslav Nakov (Qatar Computing Research Institute)
- Graham Neubig (Nara Institute of Science and Technology)
- Jan Niehues (Karlsruhe Institute of Technology)
- Kemal Oflazer (Carnegie Mellon University - Qatar)
- Daniel Ortiz-Martinez (Copenhagen Business School)
- Stephan Peitz (RWTH Aachen University)
- Sergio Penkale (Lingo24)
- Maja Popovic (DFKI)
- Stefan Riezler (Heidelberg University)
- Johann Roturier (Symantec)
- Raphael Rubino (Prompsit Language Engineering)
- Alexander M. Rush (MIT)
- Anoop Sarkar (Simon Fraser University)
- Hassan Sawaf (eBay Inc.)
- Lane Schwartz (Air Force Research Laboratory)
- Jean Senellart (SYSTRAN)
- Rico Sennrich (University of Zurich)
- Kashif Shah (University of Sheffield)
- Wade Shen (MIT)
- Patrick Simianer (Heidelberg University)
- Linfeng Song (ICT/CAS)
- Sara Stymne (Uppsala University)
- Katsuhito Sudoh (NTT Communication Science Laboratories / Kyoto University)
- Felipe Sanchez-Martínez (Universitat d'Alacant)
- Jörg Tiedemann (Uppsala University)
- Christoph Tillmann (TJ Watson IBM Research)
- Antonio Toral (Dublin City University)
- Hajime Tsukada (NTT Communication Science Laboratories)
- Yulia Tsvetkov (Carnegie Mellon University)
- Dan Tufis (Research Institute for Artificial Intelligence, Romanian Academy)
- Marco Turchi (Fondazione Bruno Kessler)
- Ferhan Ture (University of Maryland)
- Masao Utiyama (NICT)
- Ashish Vaswani (University of Southern California Information Sciences Institute)
- David Vilar (Pixformance GmbH)
- Haifeng Wang (Baidu)
- Taro Watanabe (NICT)
- Marion Weller (University of Stuttgart)
- Philip Williams (University of Edinburgh)
- Guillaume Wisniewski (Univ. Paris Sud and LIMSI-CNRS)
- Hua Wu (Baidu)
- Joern Wuebker (RWTH Aachen University)
- Peng Xu (Google Inc.)
- Wenduan Xu (Cambridge University)
- Francois Yvon (LIMSI/CNRS)
- Richard Zens (Google)
- Hao Zhang (Google)
- Liu Zhanyi (Baidu)
CONTACT
    
      For questions, comments, etc. please send email
      to pkoehn@inf.ed.ac.uk.
    
Supported by the European Commission (project grant number 288487)