Word Segmentation

Splitting sentences into word tokens is a particular problem for languages whose writing systems do not put spaces between words, as is the case for many Asian languages.
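To make the problem concrete, here is a minimal sketch of greedy forward maximum matching, a simple dictionary-based baseline. It is not the method of any publication discussed on this page. The function name `max_match`, the toy lexicon, and the example sentence are all assumptions made for illustration.

```python
def max_match(text, lexicon, max_len=4):
    """Greedy forward maximum matching (illustrative baseline only).

    At each position, take the longest lexicon entry of at most max_len
    characters; if none matches, emit a single character as its own token.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += length
                break
    return tokens


# Hypothetical toy lexicon and sentence ("machine translation system").
lexicon = {"机器", "翻译", "机器翻译", "系统"}
sentence = "机器翻译系统"

# Whitespace splitting finds no boundaries and returns the whole sentence.
whitespace_tokens = sentence.split()

# Dictionary matching recovers word boundaries from the lexicon.
matched_tokens = max_match(sentence, lexicon)
```

With this toy lexicon, whitespace splitting leaves the sentence as one token, while the matcher splits it into "机器翻译" and "系统". A real segmenter also has to deal with ambiguity and unknown words, which greedy matching ignores.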

Word Segmentation is the main subject of 16 publications. Five are discussed here.

Publications

Zhang and Sumita (2008) and Zhang et al. (2008) discuss different granularities for Chinese words and suggest a back-off approach. Bai et al. (2008) aim for a Chinese word segmentation of the training data that matches English words one-to-one, while Chang et al. (2008) adjust the average word length to optimize translation performance. Xu et al. (2008) also use the correspondence to English words in their Bayesian approach.
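The notions of granularity and average word length can be illustrated with a small sketch. It does not reproduce any of the cited methods: it only shows how one knob on a toy dictionary segmenter (here the assumed parameter `max_len`) changes the average token length of the output. The names `segment`, `average_word_length`, `LEXICON`, and `SENTENCE` are hypothetical.

```python
def segment(text, lexicon, max_len):
    """Toy greedy segmenter: longest lexicon match of at most max_len characters."""
    tokens, i = [], 0
    while i < len(text):
        length = min(max_len, len(text) - i)
        # Shrink the candidate until it is in the lexicon or is a single character.
        while length > 1 and text[i:i + length] not in lexicon:
            length -= 1
        tokens.append(text[i:i + length])
        i += length
    return tokens


def average_word_length(tokens):
    """Mean number of characters per token."""
    return sum(len(t) for t in tokens) / len(tokens)


# Hypothetical toy data, not taken from any of the cited papers.
LEXICON = {"机器", "翻译", "机器翻译", "系统"}
SENTENCE = "机器翻译系统"

coarse = segment(SENTENCE, LEXICON, max_len=4)      # prefers the compound
fine = segment(SENTENCE, LEXICON, max_len=2)        # splits the compound
characters = segment(SENTENCE, LEXICON, max_len=1)  # one token per character

for name, toks in [("coarse", coarse), ("fine", fine), ("characters", characters)]:
    print(name, toks, average_word_length(toks))
```

In a translation setting, each granularity would produce a differently tokenized training corpus. Which granularity translates best is an empirical question that this sketch does not answer.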

New Publications

  • Wang et al. (2014)
  • Zeng et al. (2014)
  • Al-Mannai et al. (2014)
  • Neubig et al. (2012)
  • Neubig et al. (2013)
  • Nguyen et al. (2010)
  • Paul et al. (2010)
  • Ma and Way (2009)
  • Xu et al. (2005)
  • Chung and Gildea (2009)
  • Xiao et al. (2010)
