Shared Task: Code-mixed Machine Translation (MixMT)


The mixing of words and phrases from two different languages within a single utterance of text or speech is a frequently observed phenomenon in multilingual communities such as India and Spain. This pattern of communication is broadly categorized as code-mixing or code-switching. In this shared task, we run two subtasks involving a code-mixed language, namely Hinglish (code-mixing of Hindi and English). A brief description of both subtasks is given below:

The shared task is hosted on Codalab. Please observe the following important guidelines:

Important dates

The training + validation phase starts Apr 01, 2022
The training + validation phase ends Jun 30, 2022
The test phase starts Jul 01, 2022
The test phase ends Jul 30, 2022
Paper submission deadline Sept 7, 2022
Paper notification Oct 9, 2022
Camera-ready deadline Oct 16, 2022
All deadlines are in AoE (Anywhere on Earth).

Note: System description papers should follow the WMT paper submission policy; please see the paper submission information section on the WMT homepage for more details.

Training datasets

We provide the following training datasets for both subtasks:

Subtask-1: Monolingual to code-mixed machine translation

For this subtask, HinGE is the primary training dataset. The dataset is part of an ongoing shared task (HinglishEval) at INLG 2022. We provide the HinglishEval training and validation data for building the machine translation system for this subtask. We strongly recommend that participating teams read the HinglishEval dataset description [here] for a better understanding of the dataset format. The download links for the datasets are:

Synthetic dataset: Download the training and validation data.

Human-generated dataset: Download the training and validation data.

Subtask-2: Code-mixed to monolingual machine translation

For this subtask, PHINC is the primary training dataset. It contains 13,738 parallel Hinglish-English sentence pairs. [Download]
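A parallel corpus like PHINC is typically distributed as a CSV file of sentence pairs. The sketch below shows one way to load such a file into (Hinglish, English) pairs; the column names `hinglish` and `english` are hypothetical placeholders, so check the actual header of the downloaded file and adjust accordingly.

```python
import csv
import io

def load_parallel_csv(fileobj, src_col="hinglish", tgt_col="english"):
    """Read a parallel corpus from a CSV file object into (source, target) pairs.

    NOTE: the column names are assumptions, not the documented PHINC
    schema -- inspect the downloaded file and pass the real header names.
    """
    reader = csv.DictReader(fileobj)
    return [(row[src_col].strip(), row[tgt_col].strip()) for row in reader]

# Tiny inline demo (made-up example row, not actual PHINC data).
sample = io.StringIO(
    "hinglish,english\n"
    "mujhe yeh movie bahut pasand hai,I like this movie a lot\n"
)
pairs = load_parallel_csv(sample)
```

With a real download, replace the `StringIO` demo with `open("phinc.csv", encoding="utf-8")`.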

Evaluation Metrics

We use two evaluation metrics for both subtasks: ROUGE-L (F1-score) and Word Error Rate (WER).
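The task page does not specify a reference implementation or tokenization for these metrics, so the following is a minimal sketch of sentence-level WER (word-level edit distance normalized by reference length) and ROUGE-L F1 (based on the longest common subsequence of words), assuming simple whitespace tokenization.

```python
def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Levenshtein distance over word sequences via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution/match
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

def rouge_l_f1(reference, hypothesis):
    """ROUGE-L F1: harmonic mean of LCS-based precision and recall."""
    ref, hyp = reference.split(), hypothesis.split()
    # Longest common subsequence length via dynamic programming.
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    lcs = dp[len(ref)][len(hyp)]
    if lcs == 0:
        return 0.0
    precision = lcs / len(hyp)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

For example, `wer("a b c d", "a x c d")` gives 0.25 (one substitution out of four reference words). The official scorer may tokenize or normalize differently, so treat these only as a sanity check for development.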


We use Google Translate as the baseline for both subtasks. For subtask-1, we translate the Hindi sentences (in Devanagari script) into English and evaluate against the reference Hinglish sentences. For subtask-2, we translate the Hinglish sentences into English by setting the source language of the Hinglish text to Hindi.
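The baseline procedure above reduces to a small harness: translate each source sentence, score it against the reference, and average. The sketch below uses hypothetical names (`translate_fn`, `metric_fn`) as placeholders for any MT client (e.g. a Google Translate call with the source language forced to Hindi) and any sentence-level metric such as WER or ROUGE-L; no real translation API is called here.

```python
def baseline_score(sources, references, translate_fn, metric_fn):
    """Average a sentence-level metric for an MT baseline over a test set.

    translate_fn: any text -> translation callable (placeholder for the
    actual MT service; its name and signature are assumptions).
    metric_fn: sentence-level metric taking (reference, hypothesis).
    """
    scores = [metric_fn(ref, translate_fn(src))
              for src, ref in zip(sources, references)]
    return sum(scores) / len(scores)

# Tiny demo wiring an identity "translator" to an exact-match metric,
# just to show the plumbing; replace both with real components.
demo = baseline_score(
    ["yeh kitab acchi hai"],
    ["yeh kitab acchi hai"],
    translate_fn=lambda s: s,
    metric_fn=lambda ref, hyp: 1.0 if ref == hyp else 0.0,
)
```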

Additional Resources

Participating teams are allowed and encouraged to use external datasets for both subtasks. Some references for obtaining external datasets are:

Submission Requirements

Please note that we only accept submissions that attempt both subtasks; submissions with a solution for only one subtask are not allowed. The following steps need to be followed to create the submission file in both phases:

Sample Submission Files



For any questions, feel free to drop an email to the organizers.