machine translation

Advanced Features of the Decoder

The basic features of the decoder are explained in the Tutorial, but to get good results from Moses you will probably need some of the features described on this page.

Advanced Models
A basic SMT system contains a language model and a translation model. There are several ways to extend this, and potentially improve translation, by adding extra models. These may, for example, improve the modelling of reordering or capture similarities between related words.
Efficient Phrase and Rule Storage
To build a state-of-the-art translation system, Moses often requires huge phrase-pair or rule tables. The efficient storage and access of these tables requires specialised data structures and this page describes several different implementations.
Search
Given an MT model and a source sentence, finding the best translation is an intractable search problem. Moses implements several methods for taming this intractability.
Unknown Words
No matter how big your training data is, there will always be OOVs (out-of-vocabulary words) in the text you wish to translate. One approach is to transliterate them, which applies when your source and target languages use different character sets.
Hybrid Translation
Sometimes you need rules! If you want to add explicit knowledge to Moses models, for example for translating terminology or numbers, dates etc., Moses has a few ways of making this possible.
Moses as a Service
Moses includes a basic server which can deliver translations over XML-RPC.
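As a rough illustration of what an XML-RPC client for such a server might look like, the sketch below uses Python's standard xmlrpc.client module. The URL, the method name translate, and the text field in the request and response are assumptions made for this example; check them against the server documentation for your Moses build before relying on them.

```python
import xmlrpc.client

# Assumed endpoint: host, port, and path depend on how the server was started.
DEFAULT_URL = "http://localhost:8080/RPC2"


def build_request(text):
    """Return the parameter struct for one call (the 'text' key is an assumption)."""
    return {"text": text}


def encode_call(text, method="translate"):
    """Serialise the call to the XML-RPC wire format without sending it."""
    return xmlrpc.client.dumps((build_request(text),), methodname=method)


def translate(text, url=DEFAULT_URL):
    """Send one sentence to a running server and return the translated text.

    Assumes the server exposes a 'translate' method whose result is a
    struct with a 'text' field.
    """
    proxy = xmlrpc.client.ServerProxy(url)
    result = proxy.translate(build_request(text))
    return result["text"]
```

Calling encode_call("das ist ein haus") shows the request body that would be posted, which is handy for debugging without a live server; translate("das ist ein haus") performs the actual call and needs a server listening at the given URL.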
Incremental Training
The traditional Moses pipeline is a sequence of batch processes, but what if you want to add extra training data to a running system? Storing the phrase table in a suffix array makes this possible.
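To see why a suffix array suits this use case, consider the toy sketch below. It indexes a tokenised corpus, so phrase occurrences are looked up on demand instead of being read from a precomputed table. This is not the Moses implementation: the function names are hypothetical, word alignments and sentence boundaries are ignored, and the index is simply rebuilt from scratch when data is added.

```python
def build_suffix_array(tokens):
    """Return the start positions of all suffixes, sorted by the tokens they begin with."""
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def find_occurrences(tokens, suffix_array, phrase):
    """Binary-search the suffix array for every position where phrase occurs."""
    n = len(phrase)

    def prefix(rank):
        # First n tokens of the suffix stored at this rank.
        start = suffix_array[rank]
        return tokens[start:start + n]

    # Lower bound: first suffix whose prefix is >= phrase.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < phrase:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose prefix is > phrase.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= phrase:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[first:lo])


corpus = "the house is small the house is big".split()
index = build_suffix_array(corpus)
positions = find_occurrences(corpus, index, ["the", "house"])

# Adding training data only requires extending the corpus and updating the index.
corpus = corpus + "the house".split()
index = build_suffix_array(corpus)
positions_after = find_occurrences(corpus, index, ["the", "house"])
```

Because lookups go through the index and not through a fixed table, newly added text becomes visible to later queries as soon as the index is updated. A production system would update the index incrementally and would not re-sort the whole corpus.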
Domain Adaptation
When the training data differs in a systematic way from the test data, you have a domain problem. Several techniques have been proposed in the literature, and Moses includes implementations of many of them.
Constrained Decoding
In some applications, you already know the translation but need to know how the model derived it.
Cache-based Models
Cache-based models can be a useful way to let the document context influence the translation.
Pipeline Creation Language
A generic mechanism for managing pipelines of software components, such as Moses training.
Obsolete Features
Features that have been removed, with their documentation preserved for posterity.
Page last modified on March 11, 2015, at 12:06 PM