Moses statistical machine translation system

Coding Style

To ensure maintainability and consistency, please follow the recommendations below when developing Moses.

Formatting

Indent with 2 spaces; tab characters are not allowed in the code. To ensure that your code follows this format, run scripts/other/beautify.perl in the directory of the source code.

Opening braces are on the same line as the statement that opens the block, for instance:

       if (expr) {
         ...
       }

However, there is one special case, namely functions: they have the opening brace at the beginning of the next line, thus:

       int Function(int x)
       {
         // body of function
       }

Upper/lowercase: Start all function and class names with a capital letter. Start all variable names with a lowercase letter. Start all class member variables with m_. For instance:

      void CalcNBest(size_t count, LatticePathList &ret) const;
      Sentence m_source;

Use long, descriptive variable names, not names such as s, q, or qb7.

Do not use Hungarian notation.
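
The formatting and naming rules above can be seen together in the following minimal sketch. WordCounter and all of its members are hypothetical names chosen for illustration; they are not part of Moses. Placing the opening brace of the class on its own line is an assumption, since the rules above only cover statements and functions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical class for illustration only; not part of Moses.
class WordCounter
{
public:
  // Function names start with a capital letter; the opening brace of a
  // function goes at the beginning of the next line.
  void AddSentence(const std::vector<std::string> &sentenceWords)
  {
    // Statement braces stay on the same line; indentation is 2 spaces.
    for (std::size_t wordIndex = 0; wordIndex < sentenceWords.size(); ++wordIndex) {
      if (!sentenceWords[wordIndex].empty()) {
        ++m_wordCount;
      }
    }
    ++m_sentenceCount;
  }

  std::size_t GetWordCount() const
  {
    return m_wordCount;
  }

  std::size_t GetSentenceCount() const
  {
    return m_sentenceCount;
  }

private:
  // Class member variables start with m_ and carry descriptive names.
  std::size_t m_wordCount = 0;
  std::size_t m_sentenceCount = 0;
};
```

Empty strings are skipped when counting, so the word count can be smaller than the number of elements passed in.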

Comments

The code will be parsed by Doxygen to create online documentation. To support this, you have to add comments for each class, function, and class variable. More information is available at the Doxygen web site.

Class definitions in the *.h file need to be preceded by a comment block starting with /**, for instance:

 /** The Manager class implements a stack decoding algorithm.
 * Hypotheses are organized in stacks. One stack contains all hypotheses that have
 * the same number of foreign words translated.  The data structure for hypothesis
 * stacks is the class HypothesisStack. The data structure for a hypothesis
 * is the class Hypothesis.
 [...]
 **/
 class Manager

Class member variable definitions in *.h must be followed by a comment that starts with //!, for instance:

 size_t m_maxNumFactors;  //! max number of factors on both source and target sides

Functions in the *.cpp file need to be preceded by a comment block starting with /**, for instance:

 /**
  * Main decoder loop that translates a sentence by expanding
  * hypotheses stack by stack, until the end of the sentence.
  */
 void Manager::ProcessSentence()

Function parameters are described with \param, for instance:

 /** Create translation options that exactly cover a specific input span.
  * Called by CreateTranslationOptions() and ProcessUnknownWord()
  * \param decodeGraph list of decoding steps
  * \param startPos first position in input sentence
  * \param endPos last position in input sentence
  * \param adhereTableLimit whether phrase & generation table limits are adhered to
  */
 void TranslationOptionCollection::CreateTranslationOptionsForRange(
   const DecodeGraph &decodeGraph
   , size_t startPos
   , size_t endPos
   , bool adhereTableLimit)
 {

In addition, the function declaration in the *.h file must be preceded by a short comment that starts with //!. This comment is displayed at the beginning of the class documentation. For instance:

 //! load all language models as specified in ini file
 bool LoadLanguageModels();
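
As a rough sketch of how the comment rules above fit together, the following puts the declaration part (what would go in the *.h file) and the definition part (what would go in the *.cpp file) into one block. SpanLimiter and its members are hypothetical and not part of Moses. The trailing //! on the member variable follows the convention stated above; note that the Doxygen manual itself uses //!< for a comment placed after the member it documents.

```cpp
#include <cstddef>

/** The SpanLimiter class clamps a span of input positions to a maximum length.
 * Hypothetical example class for illustration; it is not part of Moses.
 **/
class SpanLimiter
{
public:
  //! create a limiter that allows spans of at most maxSpanLength words
  explicit SpanLimiter(std::size_t maxSpanLength);

  //! return the last position allowed for a span starting at startPos
  std::size_t ClampEndPos(std::size_t startPos, std::size_t endPos) const;

private:
  std::size_t m_maxSpanLength;  //! max number of words a span may cover
};

/** Create a limiter.
 * \param maxSpanLength max number of words a span may cover (assumed to be at least 1)
 */
SpanLimiter::SpanLimiter(std::size_t maxSpanLength)
  : m_maxSpanLength(maxSpanLength)
{
}

/** Clamp the end of a span so that it covers at most m_maxSpanLength words.
 * \param startPos first position of the span
 * \param endPos requested last position of the span (inclusive)
 * \return endPos, or the last position the limit allows, whichever is smaller
 */
std::size_t SpanLimiter::ClampEndPos(std::size_t startPos, std::size_t endPos) const
{
  const std::size_t lastAllowedPos = startPos + m_maxSpanLength - 1;
  if (endPos > lastAllowedPos) {
    return lastAllowedPos;
  }
  return endPos;
}
```

Each \param name matches a parameter in the signature it documents, which keeps Doxygen from warning about undocumented or unknown parameters.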

Data types and methods

  • Code for cross-platform compatibility.
  • Use object-oriented designs, including:
    • Create Get/Set functions rather than exposing class variables.
    • Label functions, variables and arguments as const where possible.
    • Prefer references over pointers.
  • General style:
    • Prefer enum types over integers.
    • Resolve compiler warnings as well as errors.
    • Delete tracing/debugging code once it is no longer needed.
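
Several of the recommendations in this list (Get/Set functions, const labels, references, enum types) can be illustrated with a small sketch. SearchAlgorithmType, DecoderSettings and DescribeSettings are hypothetical names invented for this example, as are the default values; none of them are taken from the Moses code base.

```cpp
#include <cstddef>
#include <string>

// All names below are hypothetical; they are not Moses types.

//! search algorithm selection; an enum instead of bare integers such as 0 and 1
enum SearchAlgorithmType {
  NormalSearch,
  CubePruningSearch
};

class DecoderSettings
{
public:
  DecoderSettings()
    : m_searchAlgorithm(NormalSearch), m_stackSize(100)
  {
  }

  // Get functions are const because they do not modify the object.
  SearchAlgorithmType GetSearchAlgorithm() const
  {
    return m_searchAlgorithm;
  }

  std::size_t GetStackSize() const
  {
    return m_stackSize;
  }

  // Set functions are the only way to change the private members.
  void SetSearchAlgorithm(SearchAlgorithmType searchAlgorithm)
  {
    m_searchAlgorithm = searchAlgorithm;
  }

  void SetStackSize(std::size_t stackSize)
  {
    m_stackSize = stackSize;
  }

private:
  SearchAlgorithmType m_searchAlgorithm;
  std::size_t m_stackSize;
};

// The settings are passed by const reference rather than by pointer.
std::string DescribeSettings(const DecoderSettings &settings)
{
  std::string description;
  // Every enumerator is handled, so the compiler has no missing case to warn about.
  switch (settings.GetSearchAlgorithm()) {
  case NormalSearch:
    description = "normal";
    break;
  case CubePruningSearch:
    description = "cube pruning";
    break;
  }
  description += ", stack size " + std::to_string(settings.GetStackSize());
  return description;
}
```

Because DescribeSettings takes a const reference, the caller cannot pass a null value and the function cannot modify the settings by accident.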

Source Control Etiquette

  • Do not check in code that does not compile or that reduces functionality.
  • If you need to ignore the rule above, let people know.
  • Check in your work often to avoid merge conflicts.
  • Add log messages to check-ins.
  • Check in make/project files. However, you are not required to update project files other than the ones you use.
Page last modified on May 26, 2015, at 05:41 PM