machine translation

Moses as a Service


Moses Server

The Moses server enables you to run the decoder as a server process, and send it sentences to be translated via XMLRPC. This means that one Moses process can service distributed clients coded in Java, perl, python, php, or any of the many other languages which have XMLRPC libraries.

To build the Moses server, you need to have XMLRPC-c installed and you need to add the argument --with-xmlrpc-c=<path-xmlrpc-c-config> to the configure arguments. It has been tested with the latest stable version, 1.33.17. You will also need to configure Moses for multi-threaded operation, as described above.

Compiling with xmlrpc-c library will create the moses executable that has a few extra arguments. To use it as a server, add the arggument --server. Use other arguments to specify the listening port and log-file (--server-port and --server-log). These default to 8080 and /dev/null respectively.

A sample client is included in the server directory (in perl), which requires the SOAP::Lite perl module installed. To access the Moses server, an XMLRPC request should be sent to http://host:port/RPC2 where the parameter is a map containing the keys text and (optionally) align. The value of the first of these parameters is the text to be translated and the second, if present, causes alignment information to be returned to the client. The client will receive a map containing the same two keys, where the value associated with the text key is the translated text, and the align key (if present) maps to a list of maps. The alignment gives the segmentation in target order, with each list element specifying the target start position (tgt-start), source start position (src-start) and source end position (src-end).

Note that although the Moses server needs to be built against multi-threaded moses, it can be run in single-threaded mode using the --serial option. This enables it to be used with non-threadsafe libraries such as (currently) irstlm.

Open Machine Translation Core (OMTC) - A proposed machine translation system standard

A proposed standard for machine translation APIs has been developed as part of the MosesCore project (European Commission Grant Number 288487 under the 7th Framework Programme). It is called Open Machine Translation Core (OMTC) and defines a service interface for MT interfaces. This approach allows software engineers to wrap disparate MT back-ends such that they look identical to others no matter which flavour of MT system is being wrapped. This provides a standard protocol for “talking” to MT back-ends. In applications where many MT back-ends are to be used, OMTC allows for easier integration of these back-ends. Even in applications where one MT back-end is used, OMTC provides highly cohesive, yet low coupled, interfaces that should allow the back-end to be replaced by another with little effort.

OMTC standardises the follow aspects of an MT system:

  • Resources: A resource is an object that is provided or constructed by a user action for use in an MT system. Examples of resources are: translation memory, glossary, MT engine, or a document. Two base resource types are defined, from which all other resource types are derived, they are primary and derived resources. Primary resources are resource which are constructed outside of the MT system and are made available to it, e.g., through an upload action. Primary resources are used to defined mono- and multi-lingual resources, translation memories and glossaries. Derived resources, on the other hand, are ones which have been constructed by user action inside of the MT system, e.g., a SMT engine.
  • Sessions: A session is the period of time in which a user interacts with the MT system. The session interface hierarchy supports both user identity and anonymity. Mixin interfaces are, also, defined, to integrate with any authentication system.
  • Session Negotiation: This is an optional part of the standard and, if used, shall allow a client and the MT server to come to an agreement about which features, resources (this includes exchange and document formats), pre-requisites (e.g. payment) and API version support is to be expected from both parties. If no agreement can be found then the client's session should be torn down, but this is completely application defined.
  • Authorisation: OMTC can integrate with an authorisation system that may be being used in an MT system. It allows users and roles to be mapped into the API.
  • Machine Translation Engines: Machine translation engines are derived resources which are capable of performing machine translation of, possibly, unseen sentences. An engine may be an SMT decoding pipeline, for instance. It is application defined as to how this part of the API is implemented. Optionally engine functionality can be mixed-in in order to add the following operations: composition, evaluation, parameter updating, querying, (re-)training, testing and updating. Potentially long running tasks return tickets in order for the application to track these tasks.
  • Translators: Translators, as defined in OMTC, are a derived resource and are a conglomeration of, at least one of the following, an MT engine, a collection of translation memories, and a collection of glossaries. The translator interface provides methods for translation with returned tickets due to the long running nature of these tasks.

A reference implementation of OMTC has been constructed in Java v1.7. It is available in the contrib/omtc directory of the mosesdecoder as a Git submodule. Please see the contrib/omtc/README for details.

Edit - History - Print
Page last modified on February 07, 2017, at 11:29 PM