The shared evaluation task of the workshop will examine automatic evaluation metrics for machine translation. We will provide all of the translations produced in the shared translation task, as well as the reference translations. You will return rankings for each of the translations at the system level and/or at the sentence level. We will calculate the correlation of your rankings with the human evaluation once it is completed.
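To illustrate what correlating a metric's rankings with human rankings can look like, here is a minimal sketch. The text above does not say which correlation coefficient will be used; Spearman's rank correlation is assumed here purely for illustration, and the function name `spearman_rho` is hypothetical. The simple formula used is only valid when neither ranking contains ties.

```python
def spearman_rho(metric_ranks, human_ranks):
    """Spearman's rank correlation for two tie-free rankings of the same systems.

    Assumption: each list holds the rank (1 = best) given to the same systems,
    in the same order, and neither list contains tied ranks.
    """
    n = len(metric_ranks)
    if n != len(human_ranks) or n < 2:
        raise ValueError("need two rankings of equal length >= 2")
    # Sum of squared rank differences between the two rankings.
    d_squared = sum((m - h) ** 2 for m, h in zip(metric_ranks, human_ranks))
    return 1 - (6 * d_squared) / (n * (n * n - 1))


# Identical rankings correlate perfectly; reversed rankings anti-correlate.
same = spearman_rho([1, 2, 3, 4], [1, 2, 3, 4])
reversed_order = spearman_rho([1, 2, 3, 4], [4, 3, 2, 1])
```

A value near 1 means the metric orders the systems much as the human judges do; a value near -1 means it orders them in reverse.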
Once we receive the system outputs from the shared translation task, we will post all of the system translations, along with source documents and reference translations, for you to evaluate with your metric. The translations will be available in two formats, XML and plain text. The XML format looks like this:
<tstset setid="wmt08-de-en-nc-test" srclang="German" trglang="English">
  <DOC docid="Speigel-doc1" sysid="UMD_de_en_primary">
    <seg id="1"> TRANSLATED ENGLISH TEXT </seg>
    <seg id="2"> TRANSLATED ENGLISH TEXT </seg>
    ...
  </DOC>
  <DOC docid="Speigel-doc2" sysid="UMD_de_en_primary">
    <seg id="13"> TRANSLATED ENGLISH TEXT </seg>
    <seg id="14"> TRANSLATED ENGLISH TEXT </seg>
    ...
  </DOC>
</tstset>
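As a rough sketch of how the XML format above might be read, the following uses Python's standard `xml.etree.ElementTree` module to pull out the identifiers and text of every segment. It assumes the released files are well-formed XML (the `...` placeholders in the example above are omitted here), and the function name `read_segments` and the `SAMPLE` string are illustrative, not part of the task materials.

```python
import xml.etree.ElementTree as ET

# A cut-down copy of the example above, without the "..." placeholders.
SAMPLE = """<tstset setid="wmt08-de-en-nc-test" srclang="German" trglang="English">
<DOC docid="Speigel-doc1" sysid="UMD_de_en_primary">
<seg id="1"> TRANSLATED ENGLISH TEXT </seg>
<seg id="2"> TRANSLATED ENGLISH TEXT </seg>
</DOC>
<DOC docid="Speigel-doc2" sysid="UMD_de_en_primary">
<seg id="13"> TRANSLATED ENGLISH TEXT </seg>
</DOC>
</tstset>"""


def read_segments(xml_text):
    """Yield (setid, sysid, docid, segid, text) for every <seg> element."""
    root = ET.fromstring(xml_text)  # root is the <tstset> element
    setid = root.get("setid")
    for doc in root.iter("DOC"):
        for seg in doc.iter("seg"):
            text = (seg.text or "").strip()
            yield (setid, doc.get("sysid"), doc.get("docid"), seg.get("id"), text)


segments = list(read_segments(SAMPLE))
```

Each tuple carries the identifiers needed to label a score for that segment, so a metric can be applied to the `text` field and the result reported under the same IDs.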
The output files for system-level rankings should be formatted in the following way:
<TEST SET> <SYSTEM> <SYSTEM LEVEL SCORE>
Where:
TEST SET
is the ID of the test set (given by the setid attribute of the tstset tag in the XML file, or by the directory structure in the plain text files).
SYSTEM
is the ID of the system being scored (given by the sysid attribute in the XML document, or as part of the filename for the plain text file).
SYSTEM LEVEL SCORE
is the overall system-level score.
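A system-level output file in this layout could be produced along the following lines. This is only a sketch: the field separator is not specified above, so a tab is assumed here, the four-decimal formatting is likewise an assumption, and the score value is a made-up placeholder rather than a real result. The function name `write_system_scores` is hypothetical.

```python
import io


def write_system_scores(rows, out):
    """Write one '<TEST SET> <SYSTEM> <SYSTEM LEVEL SCORE>' line per row.

    Assumption: fields are separated by a tab and the score is printed
    with four decimal places; the task description fixes neither choice.
    """
    for test_set, system, score in rows:
        out.write(f"{test_set}\t{system}\t{score:.4f}\n")


# Placeholder score for illustration only, not an actual metric result.
rows = [("wmt08-de-en-nc-test", "UMD_de_en_primary", 0.2731)]

buf = io.StringIO()  # stands in for an open output file
write_system_scores(rows, buf)
system_output = buf.getvalue()
```

In practice `out` would be a file opened for writing, with one row per test set and system pair.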
The output files for segment-level rankings should be formatted in the following way:
<TEST SET> <SYSTEM> <DOCUMENT ID> <SEGMENT ID> <SEGMENT SCORE>
Where:
TEST SET
is the ID of the test set.
SYSTEM
is the ID of the system being scored.
DOCUMENT ID
is the document ID (given by the docid attribute in the XML document, or identical to the test set ID if you're using the plain text input files).
SEGMENT ID
is the segment number of each segment (given by the id attribute of the seg tag in the XML file, or by the line number, counting from one, in the plain text input files).
SEGMENT SCORE
is the score for the particular segment.
Supported by the EuroMatrix project, P6-IST-5-034291-STP, funded by the European Commission under Framework Programme 6