Translating Tense, Case, and Markers

Syntactic properties such as tense or case that add information about content words or define relationships between them may be encoded with morphological inflection or function words. Languages differ in this respect, thus posing a special challenge for translation between languages with different encoding schemes.

Translating Tense, Case, and Markers is the main subject of 32 publications, 14 of which are discussed here.

Publications

Schiehlen (1998) analyzes the translation of tense across languages and warns against a simplistic view of the problem. Murata et al. (2001) propose a machine learning method using support vector machines to predict target language tense. Ye et al. (2006) propose to use additional features in a conditional random field classifier to determine verb tenses when translating from Chinese to English. Ueffing and Ney (2003) use a pre-processing method to transform the English verb complex to match its Spanish translation more closely. A similar problem is the prediction of case markers in Japanese (Suzuki and Toutanova, 2006), which may be done using a maximum entropy model as part of a treelet translation system (Toutanova and Suzuki, 2007), or the prediction of aspect markers in Chinese, which may be framed as a classification problem and modeled with conditional random fields (Ye et al., 2007).
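
The classification view of tense prediction can be illustrated with a minimal sketch. It is not the method of any paper cited above: a small count-based (naive Bayes style) classifier stands in for the SVM and CRF models, and the feature names and the four training examples are invented for illustration.

```python
from collections import Counter, defaultdict
import math

# Toy training data: source-side cues (hypothetical feature names)
# paired with the English tense observed in the translation.
TRAIN = [
    ({"adv=yesterday", "aspect=le"}, "past"),
    ({"aspect=le"}, "past"),
    ({"adv=tomorrow", "modal=hui"}, "future"),
    ({"adv=now", "aspect=zai"}, "present"),
]


def train(examples):
    """Count labels and label-specific features."""
    label_counts = Counter()
    feat_counts = defaultdict(Counter)
    vocab = set()
    for feats, label in examples:
        label_counts[label] += 1
        for f in feats:
            feat_counts[label][f] += 1
            vocab.add(f)
    return label_counts, feat_counts, vocab


def predict(model, feats):
    """Return the tense label with the highest add-one-smoothed log score."""
    label_counts, feat_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        denom = sum(feat_counts[label].values()) + len(vocab)
        for f in feats:
            if f in vocab:  # ignore cues never seen in training
                score += math.log((feat_counts[label][f] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(predict(model, {"adv=yesterday"}))
```

A real system would replace the toy counts with features extracted from parsed or tagged source sentences and a discriminatively trained model.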

When noun compounds such as "finance minister" have to be translated into a language that explicitly marks the relationship between nouns, Paul et al. (2010) propose to first paraphrase the compound into a construction where this relationship is marked, here "minister of finance", using a rich source-side language model, before translating it into an under-resourced target language.

Paul, Soma and Mathur, Prashant and Kishore, Sushant (2010): Syntactic Construct: An Aid for translating English Nominal Compound into Hindi, Proceedings of the NAACL HLT Workshop on Extracting and Using Constructions in Computational Linguistics

@InProceedings{paul-mathur-kishore:2010:CONSTRUCT,
author = {Paul, Soma and Mathur, Prashant and Kishore, Sushant},
title = {Syntactic Construct: An Aid for translating {E}nglish Nominal Compound into {H}indi},
booktitle = {Proceedings of the NAACL HLT Workshop on Extracting and Using Constructions in Computational Linguistics},
month = {June},
address = {Los Angeles, California},
publisher = {Association for Computational Linguistics},
pages = {32--38},
url = {http://www.aclweb.org/anthology/W10-0805},
year = 2010
}
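
The paraphrasing idea attributed to Paul et al. (2010) can be sketched roughly as follows: generate candidate paraphrases of the compound with different prepositions and keep the one that a source-side language model scores highest. This is only an illustration, not their system; the preposition list and the phrase counts that stand in for a language model are made up.

```python
# Hypothetical phrase counts standing in for a source-side language model.
PHRASE_COUNTS = {
    "minister of finance": 120,
    "minister for finance": 35,
    "minister in finance": 1,
}

# Assumed set of candidate prepositions.
PREPOSITIONS = ["of", "for", "in", "from"]


def paraphrase_compound(modifier, head, counts=PHRASE_COUNTS):
    """Return the best-scoring 'head PREP modifier' paraphrase.

    Falls back to the original compound if no candidate was observed.
    """
    candidates = [f"{head} {prep} {modifier}" for prep in PREPOSITIONS]
    best = max(candidates, key=lambda c: counts.get(c, 0))
    if counts.get(best, 0) == 0:
        return f"{modifier} {head}"
    return best


print(paraphrase_compound("finance", "minister"))
```

The paraphrase, rather than the bare compound, would then be passed to the translation system.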

Linguistic analysis of some languages suggests the existence of empty categories. The most commonly known is pro-drop, the omission of pronouns when they are implied by the wider document context or by the verb inflection. Chung and Gildea (2010) use a syntactic parse of Chinese as the source language to detect empty categories and add special tokens in the training and test sets. Xiang et al. (2013) extend this work by using sparse features in the decoder to further improve translation performance, and also apply it to Korean.
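
Adding special tokens for detected empty categories can be pictured with a minimal sketch. It assumes that some detector (not shown, and not the parser-based method of the cited work) has already returned the token positions where a dropped pronoun is implied; the marker string and the romanized example are illustrative assumptions.

```python
def insert_empty_categories(tokens, positions, marker="*pro*"):
    """Insert a marker token before each position flagged by a detector.

    `positions` holds indices into `tokens`; an index equal to
    len(tokens) places the marker at the end of the sentence.
    """
    flagged = set(positions)
    out = []
    for i, tok in enumerate(tokens):
        if i in flagged:
            out.append(marker)
        out.append(tok)
    if len(tokens) in flagged:
        out.append(marker)
    return out


# Romanized toy sentence "xihuan pingguo" ("[I] like apples"), subject dropped.
print(insert_empty_categories(["xihuan", "pingguo"], [0]))
```

The same insertion would be applied to both training and test data so that the marker can be aligned to the target-language pronoun.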

Another example of syntactic markers that are more common in Asian than in European languages is numeral classifiers (as in "three sheets of paper"). Paul et al. (2002) present a corpus-based method to generate them for Japanese and Zhang et al. (2008) present a method for generating Chinese measure words.
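
One simple way to picture corpus-based classifier generation is to choose, for each noun, the classifier it most often co-occurs with in training data. The sketch below makes that simplifying assumption and is not the method of either cited paper; the (noun, classifier) observations are invented, with romanized Chinese measure words for readability.

```python
from collections import Counter, defaultdict

# Invented (noun, measure word) observations; a real system would
# extract such pairs from a parsed or aligned corpus.
OBSERVATIONS = [
    ("paper", "zhang"), ("paper", "zhang"), ("paper", "ge"),
    ("book", "ben"), ("book", "ben"),
    ("person", "ge"),
]

# General-purpose classifier used as a fallback for unseen nouns.
DEFAULT = "ge"


def build_table(observations):
    """Count how often each classifier occurs with each noun."""
    table = defaultdict(Counter)
    for noun, classifier in observations:
        table[noun][classifier] += 1
    return table


def choose_classifier(table, noun):
    """Return the most frequent classifier for a noun, or the default."""
    if noun not in table:
        return DEFAULT
    return table[noun].most_common(1)[0][0]


table = build_table(OBSERVATIONS)
print(choose_classifier(table, "paper"))
```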

Dorr (1994) presents an overview of linguistic differences between languages, called divergences (Gupta and Chatterjee, 2003).

Benchmarks

Discussion

New Publications

Marion Weller and Alexander Fraser and Sabine Schulte im Walde (2015): Target-Side Generation of Prepositions for SMT, Proceedings of the 18th Annual Conference of the European Association for Machine Translation
@InProceedings{W15-4923,
author = {Marion Weller and Alexander Fraser and Sabine Schulte im Walde},
title = {Target-Side Generation of Prepositions for SMT},
booktitle = {Proceedings of the 18th Annual Conference of the European Association for Machine Translation},
month = {May},
address = {Antalya, Turkey},
url = {http://aclweb.org/anthology/W15-4923},
editor = {{\.I}lknur Durgar El-Kahlout and Mehmed \"Ozkan and Felipe S\'anchez-Mart\'inez and Gema Ram\'irez-S\'anchez and Fred Hollowood and Andy Way},
pages = {177--184},
year = 2015
}
Weller et al. (2015)
Xu, Jinan and Liu, Jiangming and Chen, Yufeng and Zhang, Yujie and Ming, Fang and Li, Shaotong (2015): Integrating Case Frame into Japanese to Chinese Hierarchical Phrase-based Translation Model, Proceedings of the 1st Workshop on Semantics-Driven Statistical Machine Translation (S2MT 2015)
@InProceedings{xu-EtAl:2015:S2MT,
author = {Xu, Jinan and Liu, Jiangming and Chen, Yufeng and Zhang, Yujie and Ming, Fang and Li, Shaotong},
title = {Integrating Case Frame into {Japanese} to {Chinese} Hierarchical Phrase-based Translation Model},
booktitle = {Proceedings of the 1st Workshop on Semantics-Driven Statistical Machine Translation (S2MT 2015)},
month = {July},
address = {Beijing, China},
publisher = {Association for Computational Linguistics},
pages = {23--29},
url = {http://www.aclweb.org/anthology/W15-3503},
year = 2015
}
Xu et al. (2015)
Steele, David (2015): Improving the Translation of Discourse Markers for Chinese into English, Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop
@InProceedings{steele:2015:SRW,
author = {Steele, David},
title = {Improving the Translation of Discourse Markers for {Chinese} into English},
booktitle = {Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop},
month = {June},
address = {Denver, Colorado},
publisher = {Association for Computational Linguistics},
pages = {110--117},
url = {http://www.aclweb.org/anthology/N15-2015},
year = 2015
}
Steele (2015)
Marion Weller and Sabine Schulte im Walde and Alexander Fraser (2014): Using noun class information to model selectional preferences for translating prepositions in SMT, Proceedings of the Eleventh Conference of the Association for Machine Translation in the Americas (AMTA)
@inproceedings{AMTA-2014-Weller,
author = {Marion Weller and Sabine Schulte~im~Walde and Alexander Fraser},
title = {Using noun class information to model selectional preferences for translating prepositions in SMT},
pages = {275--287},
url = {http://www.mt-archive.info/10/AMTA-2014-Weller.pdf},
volume = {1},
booktitle = {Proceedings of the Eleventh Conference of the Association for Machine Translation in the Americas (AMTA)},
location = {Vancouver, BC, Canada},
year = 2014
}
Weller et al. (2014)
Elizabeth Baran and Nianwen Xue (2011): Singular or Plural? Exploiting Parallel Corpora for Chinese Number Prediction, Proceedings of the 13th Machine Translation Summit (MT Summit XIII)
@inproceedings{MTS-2011-Baran,
author = {Elizabeth Baran and Nianwen Xue},
title = {Singular or Plural? Exploiting Parallel Corpora for {Chinese} Number Prediction},
url = {http://www.mt-archive.info/MTS-2011-Baran.pdf},
pages = {207--214},
booktitle = {Proceedings of the 13th Machine Translation Summit (MT Summit XIII)},
publisher = {International Association for Machine Translation},
location = {Xiamen, China},
year = 2011
}
Baran and Xue (2011)
Lei Cui and Dongdong Zhang and Mu Li and Ming Zhou (2011): Function Word Generation in Statistical Machine Translation Systems, Proceedings of the 13th Machine Translation Summit (MT Summit XIII)
@inproceedings{MTS-2011-Cui,
author = {Lei Cui and Dongdong Zhang and Mu Li and Ming Zhou},
title = {Function Word Generation in Statistical Machine Translation Systems},
url = {http://www.mt-archive.info/MTS-2011-Cui.pdf},
pages = {139--146},
booktitle = {Proceedings of the 13th Machine Translation Summit (MT Summit XIII)},
publisher = {International Association for Machine Translation},
location = {Xiamen, China},
year = 2011
}
Cui et al. (2011)
Jianjun Ma and Degen Huang and Haixia Liu and Wenfeng Sheng (2011): POS Tagging of English Particles for Machine Translation, Proceedings of the 13th Machine Translation Summit (MT Summit XIII)
@inproceedings{MTS-2011-Ma-1,
author = {Jianjun Ma and Degen Huang and Haixia Liu and Wenfeng Sheng},
title = {POS Tagging of {English} Particles for Machine Translation},
url = {http://www.mt-archive.info/MTS-2011-Ma-1.pdf},
pages = {57--63},
booktitle = {Proceedings of the 13th Machine Translation Summit (MT Summit XIII)},
publisher = {International Association for Machine Translation},
location = {Xiamen, China},
year = 2011
}
Ma et al. (2011)
Gong, Zhengxian and Zhang, Min and Tan, Chew-lim and Zhou, Guodong (2012): Classifier-Based Tense Model for SMT, Proceedings of COLING 2012: Posters
@InProceedings{gong-EtAl:2012:POSTERS,
author = {Gong, Zhengxian and Zhang, Min and Tan, Chew-lim and Zhou, Guodong},
title = {Classifier-Based Tense Model for {SMT}},
booktitle = {Proceedings of COLING 2012: Posters},
month = {December},
address = {Mumbai, India},
publisher = {The COLING 2012 Organizing Committee},
pages = {411--420},
url = {http://www.aclweb.org/anthology/C12-2041},
year = 2012
}
Gong et al. (2012)
Chong, Tze Yuang and Banchs, Rafael and Chng, Eng Siong (2012): An Empirical Evaluation of Stop Word Removal in Statistical Machine Translation, Proceedings of the Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra)
@InProceedings{chong-banchs-chng:2012:ESIRMT-HyTra2012,
author = {Chong, Tze Yuang and Banchs, Rafael and Chng, Eng Siong},
title = {An Empirical Evaluation of Stop Word Removal in Statistical Machine Translation},
booktitle = {Proceedings of the Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra)},
month = {April},
address = {Avignon, France},
publisher = {Association for Computational Linguistics},
pages = {30--37},
url = {http://www.aclweb.org/anthology/W12-0104},
year = 2012
}
Chong et al. (2012)
Gong, Zhengxian and Zhang, Min and Tan, Chew Lim and Zhou, Guodong (2012): N-gram-based Tense Models for Statistical Machine Translation, Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
@InProceedings{gong-EtAl:2012:EMNLP-CoNLL,
author = {Gong, Zhengxian and Zhang, Min and Tan, Chew Lim and Zhou, Guodong},
title = {N-gram-based Tense Models for Statistical Machine Translation},
booktitle = {Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning},
month = {July},
address = {Jeju Island, Korea},
publisher = {Association for Computational Linguistics},
pages = {276--285},
url = {http://www.aclweb.org/anthology/D12-1026},
year = 2012
}
Gong et al. (2012)
Shilon, Reshef and Fadida, Hanna and Wintner, Shuly (2012): Incorporating Linguistic Knowledge in Statistical Machine Translation: Translating Prepositions, Proceedings of the Workshop on Innovative Hybrid Approaches to the Processing of Textual Data
@InProceedings{shilon-fadida-wintner:2012:HybridText2012,
author = {Shilon, Reshef and Fadida, Hanna and Wintner, Shuly},
title = {Incorporating Linguistic Knowledge in Statistical Machine Translation: Translating Prepositions},
booktitle = {Proceedings of the Workshop on Innovative Hybrid Approaches to the Processing of Textual Data},
month = {April},
address = {Avignon, France},
publisher = {Association for Computational Linguistics},
pages = {106--114},
url = {http://www.aclweb.org/anthology/W12-0514},
year = 2012
}
Shilon et al. (2012)
V, Jayan and R, Sunil and V K, Bhadran (2012): Disambiguation of pre/post positions in English - Malayalam Text Translation, Proceedings of the Workshop on Machine Translation and Parsing in Indian Languages
@InProceedings{v-r-vk:2012:MTPIL,
author = {V, Jayan and R, Sunil and V K, Bhadran},
title = {Disambiguation of pre/post positions in {English} - Malayalam Text Translation},
booktitle = {Proceedings of the Workshop on Machine Translation and Parsing in Indian Languages},
month = {December},
address = {Mumbai, India},
publisher = {The COLING 2012 Organizing Committee},
pages = {93--102},
url = {http://www.aclweb.org/anthology/W12-5609},
year = 2012
}
Jayan et al. (2012)
Chang, Pi-Chuan and Jurafsky, Daniel and Manning, Christopher D. (2009): Disambiguating "DE" for Chinese-English Machine Translation, Proceedings of the Fourth Workshop on Statistical Machine Translation
@InProceedings{chang-jurafsky-manning:2009:WMT-09,
author = {Chang, Pi-Chuan and Jurafsky, Daniel and Manning, Christopher D.},
title = {Disambiguating "{DE}" for {C}hinese-{E}nglish Machine Translation},
booktitle = {Proceedings of the Fourth Workshop on Statistical Machine Translation},
month = {March},
address = {Athens, Greece},
publisher = {Association for Computational Linguistics},
pages = {215--223},
url = {http://www.aclweb.org/anthology/W/W09/W09-0436},
year = 2009
}
Chang et al. (2009)
Setiawan, Hendra and Kan, Min Yen and Li, Haizhou and Resnik, Philip (2009): Topological Ordering of Function Words in Hierarchical Phrase-based Translation, Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP
@InProceedings{setiawan-EtAl:2009:ACLIJCNLP,
author = {Setiawan, Hendra and Kan, Min Yen and Li, Haizhou and Resnik, Philip},
title = {Topological Ordering of Function Words in Hierarchical Phrase-based Translation},
booktitle = {Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP},
month = {August},
address = {Suntec, Singapore},
publisher = {Association for Computational Linguistics},
pages = {324--332},
url = {http://www.aclweb.org/anthology/P/P09/P09-1037},
year = 2009
}
Setiawan et al. (2009)
Ramanathan, Ananthakrishnan and Choudhary, Hansraj and Ghosh, Avishek and Bhattacharyya, Pushpak (2009): Case markers and Morphology: Addressing the crux of the fluency problem in English-Hindi SMT, Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP
@InProceedings{ramanathan-EtAl:2009:ACLIJCNLP,
author = {Ramanathan, Ananthakrishnan and Choudhary, Hansraj and Ghosh, Avishek and Bhattacharyya, Pushpak},
title = {Case markers and Morphology: Addressing the crux of the fluency problem in English-Hindi SMT},
booktitle = {Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP},
month = {August},
address = {Suntec, Singapore},
publisher = {Association for Computational Linguistics},
pages = {800--808},
url = {http://www.aclweb.org/anthology/P/P09/P09-1090},
year = 2009
}
Ramanathan et al. (2009)
Turki Khemakhem, Ines and Jamoussi, Salma and Ben hamadou, Abdelmajid (2010): Arabic morpho-syntactic feature disambiguation in a translation context, Proceedings of the 4th Workshop on Syntax and Structure in Statistical Translation
@InProceedings{turkikhemakhem-jamoussi-benhamadou:2010:SSST,
author = {Turki Khemakhem, Ines and Jamoussi, Salma and Ben hamadou, Abdelmajid},
title = {Arabic morpho-syntactic feature disambiguation in a translation context},
booktitle = {Proceedings of the 4th Workshop on Syntax and Structure in Statistical Translation},
month = {August},
address = {Beijing, China},
publisher = {Coling 2010 Organizing Committee},
pages = {61--65},
url = {http://www.aclweb.org/anthology/W10-3808},
year = 2010
}
Khemakhem et al. (2010)
Meyer, Thomas (2011): Disambiguating temporal-contrastive connectives for machine translation, Proceedings of the ACL 2011 Student Session
@InProceedings{meyer:2011:SS,
author = {Meyer, Thomas},
title = {Disambiguating temporal-contrastive connectives for machine translation},
booktitle = {Proceedings of the ACL 2011 Student Session},
month = {June},
address = {Portland, OR, USA},
publisher = {Association for Computational Linguistics},
pages = {46--51},
url = {http://www.aclweb.org/anthology/P11-3009},
year = 2011
}
Meyer (2011)
Sudip Kumar Naskar and Sivaji Bandyopadhyay (2006): Handling of Prepositions in English to Bengali Machine Translation, Proceedings of the Third ACL-SIGSEM Workshop on Prepositions
@InProceedings{Naskar:2006,
author = {Sudip Kumar Naskar and Sivaji Bandyopadhyay},
title = {Handling of Prepositions in {English} to {Bengali} Machine Translation},
booktitle = {Proceedings of the Third ACL-SIGSEM Workshop on Prepositions},
month = {July},
address = {Sydney, Australia},
publisher = {Association for Computational Linguistics},
year = 2006
}
Naskar and Bandyopadhyay (2006)

MT Research Survey Wiki

A Comprehensive Survey of Neural and Statistical Machine Translation Research Publications
