Moses statistical machine translation system

Code Guide

GitHub, branching, and merging

If you want to work on the Moses code, create your own copy of the repository in one of the ways listed below.

If you are doing long-term research, the preferred route is to fork the repository. If you have fixed a bug, please commit it yourself or create a pull request.

  • Clone the Moses GitHub repository to your hard disk and work with it:
        git clone https://github.com/moses-smt/mosesdecoder.git mymoses
        cd mymoses
           edit files ....
        git commit -am "Check in"

You don't need a GitHub login or permission to do this. All changes are stored on your own hard disk.

  • Clone AND branch the repository:
        git clone https://github.com/moses-smt/mosesdecoder.git mymoses
        cd mymoses
        git checkout -b mybranch
           edit files ....
        git commit -am "Check in"

You still don't need a GitHub login or permission to do this.

  • Clone and branch AND push to GitHub:
        git clone https://github.com/moses-smt/mosesdecoder.git mymoses
        cd mymoses
        git checkout -b mybranch
           edit files ....
        git commit -am "Check in"
        git push origin mybranch
           edit files ....
        git commit -am "Check in again"
        git push

You need a GitHub account, and you have to ask one of the Moses administrators to add you as a committer to the Moses repository.

NB. To delete a LOCAL branch:

    git branch -D new-branch

To delete a branch on the GitHub server:

    git push origin --delete new-branch

  • Fork the repository. You need a GitHub account, but you don't need permission from the Moses administrators. Log into github.com and go to the Moses page:
         https://github.com/moses-smt/mosesdecoder

Press the Fork button. This creates a new repository that only you have write access to. Clone that repository and do whatever you want, e.g.:

         git clone https://github.com/hieuhoang/mosesdecoder.git
         cd mosesdecoder
           edit files ....
         git commit -am "Check in again"
         git push

  • Clone and check into master:
        git clone https://github.com/moses-smt/mosesdecoder.git mymoses
        cd mymoses
           edit files ....
        git commit -am "Check in"
        git push

You need a GitHub account and write permission to the Moses repository.

  • Create a pull request. Fork the repository and read the instructions here:
        https://help.github.com/articles/using-pull-requests

Working with multiple branches

Assuming you have forked the repository as described above, you can merge the latest changes from the main Moses repository into your fork with this command:

   git pull https://github.com/moses-smt/mosesdecoder.git

In your own repository, you can create branches and switch between them using:

   git checkout -b new-branch
   edit files...
   git commit -am "check in"

   git checkout master
   edit files...
   ...

To bring the latest changes from the main Moses repository into a branch of your fork:

   git checkout master
   git pull https://github.com/moses-smt/mosesdecoder.git
   git checkout new-branch
   git merge master
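
A common Git convention, not something this guide prescribes, is to register the main repository as a named remote so that the URL does not have to be retyped on every pull. The sketch below wraps the steps above in a small function. The function name `sync_with_upstream` and the remote name `upstream` are illustrative assumptions, and the merges are ordinary `git merge` calls, so conflicts still have to be resolved by hand.

```shell
# Sketch: bring a topic branch up to date with the main repository.
# Assumptions (not from the Moses guide): the helper name, the remote
# name "upstream", and "master" as the main branch are illustrative.
sync_with_upstream() {
    url=$1                  # URL or path of the main repository
    branch=$2               # your topic branch, e.g. new-branch
    remote=${3:-upstream}   # name to give the main repository
    main=${4:-master}       # main branch name

    # Register the remote once; skip this if it already exists.
    git remote get-url "$remote" >/dev/null 2>&1 ||
        git remote add "$remote" "$url" || return 1

    # Update the local main branch, then merge it into the topic branch.
    git fetch "$remote" &&
    git checkout "$main" &&
    git merge "$remote/$main" &&
    git checkout "$branch" &&
    git merge "$main"
}

# Usage, run from inside a clone of your fork:
# sync_with_upstream https://github.com/moses-smt/mosesdecoder.git new-branch
```

If any step fails, for example because of a merge conflict, the chain stops and the function returns a non-zero status, leaving the repository in the state Git reported.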

Regression test

If you've changed any of the C++ code and intend to check into the main Moses repository, please run the regression test to make sure you haven't broken anything:

  git submodule init
  git submodule update
  ./bjam --with-irstlm=... --with-srilm=... -a --with-regtest >& reg.out &

Check the output for any failures:

   grep FAIL reg.out
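
Because the build runs in the background, it can be handy to wrap this check in a function that reports the result through its exit status, for instance before pushing. The following is a hypothetical helper, not part of Moses. The name `check_regtest_log` is an assumption, and it relies only on the convention above that failed tests show up as lines containing `FAIL`.

```shell
# Hypothetical helper (not part of Moses): scan a regression-test log.
# Assumes failing tests appear as lines containing "FAIL".
# Returns 0 if no failures, 1 if failures were found, 2 if the log is unreadable.
check_regtest_log() {
    log=${1:-reg.out}
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 2
    fi
    # grep -c prints the number of matching lines; it exits non-zero
    # when there are none, hence the "|| true".
    failures=$(grep -c FAIL "$log" || true)
    if [ "$failures" -gt 0 ]; then
        echo "$failures failing line(s) in $log:"
        grep FAIL "$log"
        return 1
    fi
    echo "no failures found in $log"
}

# Usage, once the background build has finished:
# check_regtest_log reg.out
```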

Contact the administrators

Contact Hieu Hoang or Barry Haddow, or any of the other administrators you might know, if you need help or write permission to the GitHub repository.

The code

This section gives an overview of the code. All the source code is commented for Doxygen, so you can browse a current snapshot of the source documentation. Moses is implemented using object-oriented principles, and you can get a good idea of its class organization from this documentation.

The source code is in the following directories:

  • moses/util contains some shared utilities, such as code for reading and parsing files, and for hash tables. This was originally part of KenLM.
  • moses/lm contains KenLM, the default language model in Moses.
  • moses/src contains the code for the decoder.
  • moses/src/LM contains the language model wrappers used by the decoder
  • Other subdirectories of moses/src contain some more specialised parts of the decoder, such as alternative chart decoding algorithms.
  • moses-cmd/src contains code relevant to the command line version of the phrase-based decoder
  • moses-chart-cmd/src contains code relevant to the command line version of the chart-based decoder
  • mert contains the code for the Moses mert implementation, originally described here

In the following, we provide a short walk-through of the decoder.

Quick Start

  • The main function: moses-cmd/src/Main.cpp
  • Initialize the decoder
    • moses/src/Parameter.cpp specifies parameters
    • moses/src/StaticData.cpp contains globals, loads tables
  • Process a sentence
    • Manager.cpp implements the decoding algorithm
    • TranslationOptionCollection.cpp contains translation options
    • Hypothesis.cpp represents a partial translation
    • HypothesisStack.cpp contains viable hypotheses, implements pruning
  • Output results: moses-cmd/src/Main.cpp
    • moses-cmd/src/IOStream::OutputBestHypo prints the best translation
    • n-best lists generated in Manager.cpp, output in IOStream.cpp

Detailed Guides

Page last modified on July 28, 2013, at 08:33 AM