mirror of
https://github.com/moses-smt/mosesdecoder.git
synced 2025-01-01 00:12:27 +03:00
.. | ||
documentation/training-pipeline | ||
pcl | ||
python | ||
README |
Arrow Based Moses Training Pipeline =================================== This demonstration implements a training pipeline that is shown in the Dia diagram in documentation/training-pipeline/moses-pypeline.dia. The demo has been tested with: - Moses v1.0 - Giza++ v1.0.7 - IRSTLM v5.70.04 Setup ----- To use the demonstration you must first initialise the git submodules for this clone. Return to the top level directory and issue the following command: $ git submodule update --init --recursive This will clone PCL, available at Github (git://github.com/ianj-als/pcl.git), and Pypeline submodules, available at GitHub (git://github.com/ianj-als/pypeline.git). Return to the arrow-pipelines contrib directory: $ cd contrib/arrow-pipelines To use the PCL compiler and run-time set the following environment variables (assuming Bash shell): $ export PATH=$PATH:`pwd`/python/pcl/src/pclc:`pwd`/python/pcl/src/pcl-run $ export PYTHONPATH=$PYTHONPATH:`pwd`/python/pcl/libs/pypeline/src Three environment variables need to be set before the pipeline can be run, they are: - MOSES_HOME : The directory where Moses has been cloned, or installed, - IRSTLM : The installation directory of your IRSTLM, and - GIZA_HOME : The installation directory of GIZA++. Building the example training pipeline -------------------------------------- $ cd pcl $ make Running the example training pipeline ------------------------------------- To execute the training pipeline run the following command: $ pcl-run.py training_pipeline Once complete the output of the pipeline can be found in the directories: - test_data/tokenisation - test_data/model - test_data/lm - test_data/mert