Python interface to Moses

The idea is to expose some of Moses' internals to Python (inspired by pycdec).

What's been interfaced?

  • Binary tables:

      Moses::PhraseDictionaryTree
      OnDiskPt::OnDiskWrapper
    

Building

  1. Build the Python extension:

    First compile Moses with link=shared:

    ./bjam --libdir=path link=shared
    

    Then build the extension. If you passed --libdir=path above, pass the same path as --moses-lib=path below:

    python setup.py build_ext -i [--with-cmph] [--moses-lib=PATH] [--cython] [--max-factors=NUM] [--max-kenlm-order=NUM]
    

    Use --cython only if you want to re-compile the pyx files. They already come compiled, so you don't need Cython installed for a normal build.
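
    After the build, a quick import check can confirm that the extension is usable. The following is a minimal sketch, not part of the package: the helper name extension_available is hypothetical, and only the module name moses.dictree comes from the example code in this README. Because build_ext -i builds in place, the check presumably has to run from this directory.

```python
import importlib


def extension_available(module_name="moses.dictree"):
    """Return True if `module_name` can be imported, False otherwise."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # Raised when the module is missing; a shared library that cannot
        # be located at load time typically surfaces the same way.
        return False
    return True


if __name__ == "__main__":
    if extension_available():
        print("moses.dictree imported successfully")
    else:
        print("moses.dictree is not importable; try rebuilding with "
              "python setup.py build_ext -i")
```

    If the import fails even though the build succeeded, one thing worth checking is whether the shared Moses libraries are on the dynamic loader's search path.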

Example

Getting a phrase table

cd examples
export LC_ALL=C
cat phrase-table.txt | sort | ../../../bin/processPhraseTable -ttable 0 0 - -nscores 5 -alignment-info -out phrase-table
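
With LC_ALL=C set, sort compares lines by raw byte value instead of by locale collation rules. The sketch below reproduces that ordering in Python 3 for lines held in memory. The helper name c_locale_sort is hypothetical, and the sample lines are made up for illustration rather than taken from examples/phrase-table.txt.

```python
def c_locale_sort(lines):
    """Sort text lines by UTF-8 byte value, like `LC_ALL=C sort`.

    Lines are expected without their trailing newline.
    """
    return sorted(lines, key=lambda line: line.encode("utf-8"))


if __name__ == "__main__":
    sample = ["b ||| x", "B ||| y", "a ||| z"]
    for line in c_locale_sort(sample):
        print(line)
```

In byte order, uppercase ASCII letters sort before lowercase ones, so a table sorted under another locale can end up in a different order.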

Getting a rule table

cd examples
../../../bin/CreateOnDiskPt 0 0 5 20 2 rule-table.txt rule-table

Querying

  1. Phrase-based

     echo "casa" | python example.py examples/phrase-table 5 20
     echo "essa casa" | python example.py examples/phrase-table 5 20
    
  2. Hierarchical

     echo "i [X]" | python example.py examples/rule-table 5 20
     echo "have [X]" | python example.py examples/rule-table 5 20
     echo "[X][X] do not [X][X] [X]" | python example.py examples/rule-table 5 20
    

Code

# load abstracts away the choice of implementation by checking the available files
from moses.dictree import load
import sys

if len(sys.argv) != 4:
    # a wrong argument count is an error: report on stderr and exit non-zero
    sys.stderr.write("Usage: %s table nscores tlimit < query > result\n" % sys.argv[0])
    sys.exit(1)

path = sys.argv[1]
nscores = int(sys.argv[2])
tlimit = int(sys.argv[3])

table = load(path, nscores, tlimit)

for line in sys.stdin:
    f = line.strip()
    result = table.query(f)
    # you could simply print the matches
    # print('\n'.join([' ||| '.join((f, str(e))) for e in result]))
    # or you can use their attributes
    # (print with a single parenthesized argument runs on Python 2 and 3)
    print(result.source)
    for e in result:
        if e.lhs:
            # hierarchical rule: show the left-hand side before the target
            print('\t%s -> %s ||| %s ||| %s' % (e.lhs,
                    ' '.join(e.rhs),
                    e.scores,
                    e.alignment))
        else:
            # phrase-based entry: no left-hand side
            print('\t%s ||| %s ||| %s' % (' '.join(e.rhs),
                    e.scores,
                    e.alignment))
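
The formatting logic in the loop above can be factored into small helpers so that it can be tried without a compiled table. This is a sketch under stated assumptions: Entry and FakeTable are hypothetical stand-ins, and only the attribute names (lhs, rhs, scores, alignment) and the query() call are taken from the example. A real table returned by load should be usable in place of FakeTable, provided it behaves as the example suggests.

```python
from collections import namedtuple

# Hypothetical stand-in for the entries returned by table.query(); only the
# attribute names come from the example code.
Entry = namedtuple("Entry", "lhs rhs scores alignment")


def format_entry(e):
    """Format one entry the way the example prints it (without the tab)."""
    fields = [" ".join(e.rhs), str(e.scores), str(e.alignment)]
    if e.lhs:
        # Hierarchical rules carry a left-hand side; phrase entries do not.
        fields[0] = "%s -> %s" % (e.lhs, fields[0])
    return " ||| ".join(fields)


def query_all(table, phrases):
    """Map each phrase to the formatted entries returned by table.query()."""
    return {f: [format_entry(e) for e in table.query(f)] for f in phrases}


class FakeTable(object):
    """Hypothetical in-memory table, used only to exercise the helpers."""

    def __init__(self, entries):
        self._entries = entries

    def query(self, f):
        # Unknown phrases yield no entries.
        return self._entries.get(f, [])


if __name__ == "__main__":
    table = FakeTable({"casa": [Entry(None, ["house"], [0.5, 0.25], "0-0")]})
    for phrase, lines in query_all(table, ["casa"]).items():
        print(phrase)
        for line in lines:
            print("\t" + line)
```

Keeping the formatting separate from the table lookup means the same helpers can serve both the phrase-based and the hierarchical case.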

Changing the code

If you change the pyx files, you have to recompile the Cython code.

  1. Compile the Cython code:

    python setup.py build_ext -i --cython